Fixing a disk

Copyright Dr Alan Solomon (1986-1995)

Everyone has tools for fixing disks and recovering data.  These range
from CHKDSK, RECOVER and DEBUG, which come free with DOS (some
versions of DOS don't give you DEBUG) to Norton, Mace, PCTools and
Undelete, which cost between £15 and over £100.  The trouble with
tools is that people sometimes don't know how to use them, or use
them for inappropriate purposes.  And some of the more powerful tools
are a bit like letting an amateur run riot with a chainsaw in an
antique restoration shop, or like giving a pneumatic drill to a
trainee brain surgeon.

An excellent example is RECOVER.  I get about one RECOVERed disk per
month for data recovery;  usually the user started out with some
simple problem, ran RECOVER, and now his whole disk is useless.
RECOVER doesn't undelete files;  the manual tells you what it actually
does.  But some manufacturers have provided RECOVER but not documented
it in the manual, so people run it on spec.  Then they can't believe
that running it can have done so much damage.

Charles explained to me what happened to him.  He lost a file, he
said.  I'm not sure what that means;  it could mean he'd deleted it,
or else it could mean he'd forgotten where it was - I try not to
assume that the words that people use mean the same as they do when I
use them.

He had an Opus, so he phoned Opus technical support.  "Lost a file?
Run Norton," they said.  "Norton?" said Charles.  "What's Norton?"
At that point, Opus technical support should have realised that they
were dealing with a beginner, and should have told him to get support
from someone local.  Instead, the person manning the Opus help line
said something incredibly stupid.  "OK, run RECOVER then." Charles did
as he was told.  Imagine his surprise when he found that every file on
his hard disk had been converted to FILE0000.REC, FILE0001.REC,
FILE0002.REC and so on.  Hundreds of these .REC files, and no trace of
his data.  So Charles did what he should have done in the first place
- he called his in-house PC support team.  Did I mention that Charles
worked in a big bank, and there are PC specialists at his beck and
call?

The PC support man arrived, carrying something a bit more advanced
than RECOVER - he had Norton.  Unfortunately, there is nothing in the
Norton manual to cover hundreds of .REC files.  So he did what he
could.  First, he created a new subdirectory called \REC, and then he
copied all the .REC files into that subdirectory.  Let me tell you,
rule one of data recovery is this: never, never copy anything
onto the hard disk, as you never know where it might get copied.  You
could easily be copying over the very data that you're trying to
rescue.  PC support had good intentions - they were trying to make a
backup of the .REC files before working on them.  But separate
subdirectories don't stop data being copied on top of what you're
trying to get back, as there is only one disk.

The next thing he did was to reinstall the system.  A lot of people do
that when they're trying to get data back, and I've never really
understood why.  I suppose it is something that is vaguely low level,
that they know how to do.  Of course, reinstalling the system didn't
bring back any files, it just wrote on top of a bit more of the disk.
Then PC support did what he should probably have done in the first
place - he called for help.

It was at that point that I met the disk, and I can tell you that
after everything that had happened to it, I was pretty pessimistic
about getting Charles' database back.  Nor did it help that he used
IBM's Filing Assistant, which doesn't store data in ASCII, and uses an
incredibly complicated file structure, which isn't documented anywhere
except in my own little filing cabinet.  Charles had pretty much given
up, too, except that he didn't have any kind of backup, and he not
only wanted his deleted file, he now also wanted all the stuff that
RECOVER had hit as well.

It took me a couple of days' hard work, and I had to write a couple of
new programs to do it, but at the end of that time, I had Charles'
disk looking as if nothing had ever happened to it.  I also phoned up
a friend of mine fairly high up in Opus, and got that particular
technical support person moved to where he couldn't do any more
damage.

So what does RECOVER do?  If a diskette is damaged, then you might be
able to read the start of the file, and then you get to the damaged
bit.  The rest of the file might be fine, but DOS won't read past the
damaged bit.  RECOVER patches around the damage, so that you lose the
damaged bit, but you have the rest of the file.  This is pretty
useless for any highly structured file, like a spreadsheet, or an IBM
Filing Assistant database, but could be useful for text files if the
word processor uses a very simple file format, like Wordstar.  But
never, never run RECOVER on a hard disk.
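
The idea of patching around damage can be sketched in a few lines of
Python.  This is only an illustration of the principle, not the way
RECOVER itself is written;  read_sector and fake_read are made-up
names, and the 512 byte sector size is an assumption.

```python
SECTOR_SIZE = 512  # assumed bytes per sector


def salvage(read_sector, sector_count):
    """Collect every readable sector of a file, skipping unreadable ones.

    read_sector is a hypothetical callback: it returns the bytes of one
    sector, or raises OSError if that sector cannot be read.
    """
    good = bytearray()
    lost = []
    for n in range(sector_count):
        try:
            good += read_sector(n)
        except OSError:
            lost.append(n)  # drop the damaged sector and carry on
    return bytes(good), lost


# A pretend four-sector file whose third sector (number 2) is damaged.
def fake_read(n):
    if n == 2:
        raise OSError("unreadable sector")
    return bytes([ord("A") + n]) * SECTOR_SIZE


data, lost = salvage(fake_read, 4)
print(len(data), lost)
```

The salvaged copy is one sector shorter than the original, so
everything after the gap sits at a different offset.  Plain text
survives that;  a file full of internal pointers generally doesn't.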

Mace is a useful program, sometimes.  But sometimes not.  The trouble
with any program that goes below DOS to work, is that it has to make
assumptions, because once you go below DOS, you're in a kind of
uncharted never-never land.  Different programs have to make
assumptions about how things should work, and sometimes they make
different assumptions about the same things.  Or to put it simply, one
program might assume that all disks have 512 byte sectors (because
they all do) and another program might get round that assumption in
order to make it possible to have volumes that are greater than 32 mb,
and somehow kludge up 2k sectors.
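
The arithmetic behind the 32 mb limit and the 2k sector kludge can be
worked through quickly.  The sketch below assumes (it isn't stated
above) that the limit comes from DOS counting sectors with a 16-bit
number;  MAX_SECTORS and max_volume_mb are names made up for the
example.

```python
MAX_SECTORS = 2 ** 16  # assumption: sector numbers are 16-bit


def max_volume_mb(sector_size):
    """Largest volume, in megabytes, for a given sector size in bytes."""
    return MAX_SECTORS * sector_size // (1024 * 1024)


print(max_volume_mb(512))   # 32
print(max_volume_mb(2048))  # 128
```

On that assumption, pretending that sectors are 2k long is enough to
cover a 110 mb disk as a single volume, but only for programs that
don't insist on 512.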

Carol was using a 110 mb disk, because she had a very big database.
DOS 3.2 only allows 32 mb volumes, and her data file was about 12 mb.
It's a bit of a nuisance to have a file which is a substantial
percentage of a disk's total space - it makes indexing it, sorting it
and creating subsets of it more difficult.

So her dealer suggested that she use Vfeature, which lets you use
large disks like this as a single volume of 110 mb, and Carol went on
happily adding to her database.

One day, her PC support people suggested to her that things would run
a lot faster if the file were contiguous.  When DOS allocates disk
space for a file, it uses the lowest-numbered available cluster.  So a
file can start off at the middle of the disk, and grow towards the
centre.  But then, you might delete something at the edge, and then
the file starts using that space.  When that is filled, it will resume
growing towards the centre, but then it might get some more space at
the edge, and so on.  The end result is that the file is scattered
all over the disk, and whenever the database is searched, the disk
read head is skittering back and forth reading the file, which slows
things down.
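
Here is a toy model of that allocation rule, to show how a growing
file ends up in pieces.  It is a sketch only:  the "fat" here is just
a list saying who owns each cluster, which is simpler than a real
File Allocation Table, and the 12-cluster disk and file names are
invented.

```python
def allocate(fat, owner, count):
    """Give `count` clusters to `owner`, always taking the lowest free one.

    Returns the clusters taken (fewer than `count` if the disk fills up).
    """
    taken = []
    for cluster, used_by in enumerate(fat):
        if len(taken) == count:
            break
        if used_by is None:
            fat[cluster] = owner
            taken.append(cluster)
    return taken


def delete(fat, owner):
    """Free every cluster belonging to `owner`."""
    for cluster, used_by in enumerate(fat):
        if used_by == owner:
            fat[cluster] = None


fat = [None] * 12             # a toy disk of 12 clusters, all free
allocate(fat, "A", 3)         # file A takes clusters 0-2
db = allocate(fat, "DB", 3)   # the database takes clusters 3-5
allocate(fat, "B", 2)         # file B takes clusters 6-7
delete(fat, "A")              # frees clusters 0-2 at the edge
db += allocate(fat, "DB", 5)  # the database grows by five clusters
print(db)                     # [3, 4, 5, 0, 1, 2, 8, 9]
```

Reading the database in order now means going from cluster 5 back to
cluster 0, and then jumping from 2 out to 8.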

It should be much faster to access a file if it is all contiguous, so
that the read head never has to move far to get the next part of the
file.  There are various programs on the market that let you make all
your files contiguous (they are called defragmenters, organisers,
tidiers or whatever).  Carol got Mace, and ran that.

Mace works just fine, almost every time.  If there was a program that
trashed disks as often as one time in a thousand, it would soon be
withdrawn from the market while the bugs were fixed.  But Carol was
unlucky.  Mace maced her disk.  When I saw it, it looked as if someone
had shuffled it, just like a pack of cards, with each card being
131072 bytes of data.  The shuffle wasn't systematic, though - it was
pretty random.

"What about your backup, Carol?", I asked.  That was a pretty sad
story, too.  She'd been backing up onto a tape streamer, but the tape
streamer had stopped working, and was away for repair.  When it
returned, it worked fine, but it couldn't read any of its old tapes,
and neither could any other tape drive.  That is commoner than you
might think.  Tape drives are supposed to retry up to 12 times before
giving up on reading a block of data, and so even if you have a verify
after the write, the reading can be pretty marginal.  And every tape
streamer program I've ever seen will give up if there are parts of the
tape that it cannot read.
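
The difference between giving up and carrying on can be sketched like
this.  It is an illustration only, not any real tape program:
read_once stands for a hypothetical function that reads one block and
raises OSError on failure, and I've taken "retry up to 12 times" to
mean 12 attempts in all.

```python
MAX_ATTEMPTS = 12  # assumed meaning of "retry up to 12 times"


def read_block(read_once, block_no, attempts=MAX_ATTEMPTS):
    """Try one block up to `attempts` times; return its bytes or None."""
    for _ in range(attempts):
        try:
            return read_once(block_no)
        except OSError:
            continue
    return None


def read_tape(read_once, block_count, tolerant):
    """Read every block; a strict reader stops at the first failure."""
    blocks, unreadable = {}, []
    for n in range(block_count):
        data = read_block(read_once, n)
        if data is None:
            if not tolerant:
                raise OSError(f"block {n} unreadable, giving up")
            unreadable.append(n)  # note the hole and keep going
            continue
        blocks[n] = data
    return blocks, unreadable


# A pretend drive: block 1 never reads, block 2 reads on the 12th try.
attempts_made = {}


def flaky(n):
    attempts_made[n] = attempts_made.get(n, 0) + 1
    if n == 1 or (n == 2 and attempts_made[n] < 12):
        raise OSError("read error")
    return b"block%d" % n


blocks, unreadable = read_tape(flaky, 4, tolerant=True)
print(sorted(blocks), unreadable, attempts_made[2])
```

Block 2 is the marginal case:  it passes, but only just, and a
slightly different drive might never read it at all.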

I attacked Carol's problem on two fronts.  I started working on the
shuffled disk, with a view to putting the 128k chunks back into the
right order.  But after working with this for a while, I found that
quite a lot of over-writing had gone on, and only about 10% of the
data was actually there.  There were about ten copies of that 10%, but
that wasn't exactly useful.  So I had a go at the tapes.  I wrote a
program that can read tapes, irrespective of the damage that is on
them, and that has proved to be useful subsequently.  But it didn't
help with Carol's tapes.  I found that the problem was that the drive
head had gone out of alignment, so that the nine tracks on the tape
were not where they ought to be.  In order to read these tapes, I
would have to deliberately misalign the heads on my own drive (or
else borrow one to misalign).  I borrowed a drive from Carol's
company, which is one of the biggest in the UK, and played around with
it.  I soon found that, although there is only one correct alignment,
there are an infinite number of wrong ones, and I wasn't getting any
sort of clue how close I was to the right setting.  I decided that I
wasn't going to be able to read these tapes.

I gave Carol back the 10% of her database that I was able to rescue,
and she was grateful for that - 10% of a 12 mb database is still quite
a lot of data.  But the rest is going to have to be retyped, and I
feel sorry for whoever has to do it - I think it's Carol.

The next tale is a sad case of a Nortoning.  Norton have just issued a
press release, detailing some of the circumstances under which their
products can mess up a hard disk - if you want a copy of it, contact
the Hill and Knowlton PR agency.  Robert Schifreen has a sad tale of
Nortoning to tell - he reviewed it for a magazine, and in doing so he
managed to scramble his hard disk at the office and his hard disk at
home.  The editor couldn't believe it, so tried it out himself, and
his hard disk duly went west also.  But this particular case of
Nortoning quite surprised me.

I was contacted by a dealer.  He had visited a customer, because they
were reporting that they couldn't reformat paragraphs in their word
processor.  What he found horrified him.  They hadn't taken a backup
in seven months, he said, and the disk was very untidy.  The first
thing he did was copy Norton onto the disk, and run DIRSORT to tidy
up the directory.  DIRSORT sorts the directory entries into alphabetic
order;  I quite often see directories that have been made neat and
tidy that way.  Then he made a backup.

Good idea, but he did it in the wrong order.  DIRSORT had already
Nortoned the disk, and when he looked at the documents, they had
turned into bits of program instead of letters to customers.  Some of
the bits of program were Norton, to add insult to injury.  The backup,
of course, consisted of the same garbage.

The customer was peeved, extremely peeved.  They'd started off with a
fairly minor annoyance, and the dealer had turned it into a major
disaster.  At this point, the dealer thought of using Norton to try to
fix this mess, but he didn't think about it for very long, and he
contacted me.  He told me that he had a few scrambled files on a
backup disk, and I told him to send it along.  I didn't talk to him
for very long, because in my experience, it is best to talk to people
after they've sent their data disasters, so that you can have a look
at them first.  Plus, of course, some people decide not to send
anything for recovery, which means that the half-hour or so spent
talking to
them is (from my point of view) time wasted.

The diskette that arrived was full of bits of Norton, but the letter
that accompanied it made it clear why.  I couldn't recover the
customer's documents by working on that diskette - the data simply
wasn't there (I looked).  I spoke to the dealer, and explained that to
recover these documents, I really needed the hard disk, which would
have his customer's documents on there somewhere, and it was mostly a
matter of finding them, and correcting the pointers that had been
Nortoned to point in the wrong direction.

This is where the situation stands right now.  My fee for doing a hard
disk data recovery is pretty stiff, and the customer, quite rightly,
feels that he shouldn't have to pay to sort out a problem that his
dealer caused.  The dealer is blaming the customer on the one hand for
not doing a backup (and quite rightly so) and is blaming Norton on the
other hand for Nortoning the disk (again, you can see his point of
view).  Until someone somewhere decides that they are willing to pay
my fees, I can't really do very much.

The fourth case is one that involves DEBUG, although any powerful disk
tinkering tool would have led to the same problem.  I have to warn
you, though, that I don't fully understand two important areas with
this case.

Tim came to me with a disk that had two files on it, and he assured me
that yesterday it had hundreds.  First, he explained why there was no
backup, and that was the first bit that was so complicated, I couldn't
understand it.  Apparently, DOS BACKUP wouldn't work on this computer,
or perhaps it was RESTORE that didn't work, and he'd obviously never
heard of COPY.  Anyway, there was some really excellent reason why
there had been no backup made for over a year, and I didn't feel that
it was worth arguing, or even getting to the bottom of the alleged
problem with BACKUP.  I felt that I would have little difficulty
getting him to make backups in future, using some means or other.

The other complicated story he told me was about having to boot
sometimes under DOS 2, and sometimes under DOS 3.  Again, although I
listened to his explanation of why this was necessary several times
over, I never really grasped what he thought this was accomplishing.
He found, though, that booting the same computer under DOS 2 and DOS 3
isn't as simple as you might have thought, and for reasons that I
really couldn't follow, he had to get rid of his boot sector.  The
boot sector is the area on the hard disk that initiates the boot-up
process;  it also stores vital information like the number of sectors
on the disk, the size of the clusters, the length of the File
Allocation Table (FAT), the number of possible entries in the root
directory, and so on.  I hadn't actually realised you could run a hard
disk with a blank boot sector, and I certainly hadn't seen one
before.  I asked him how he got rid of the boot sector, but he'd done
it about a year ago, and he'd used DEBUG, but couldn't exactly
remember how.  I know how - DEBUG lets you load absolute sectors from
the disk, and the boot sector is number zero.  It also lets you change
them, and write them back.  You can do unlimited damage with DEBUG,
although it obviously isn't as user friendly as PC Tools, for example.
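
To show what lives in the boot sector, here is a short Python sketch
that picks out the values listed above from a 512 byte sector.  The
field offsets are the usual ones for a DOS boot sector, but treat them
as an assumption and check them against your own disk;  the names and
the sample numbers are made up for the example.

```python
import struct

# Assumed layout of the parameter block in a DOS boot sector: it starts
# at byte offset 11, little-endian, with no padding between fields.
BPB = struct.Struct("<HBHBHHBH")
FIELDS = ("bytes_per_sector", "sectors_per_cluster", "reserved_sectors",
          "fat_copies", "root_entries", "total_sectors", "media_byte",
          "sectors_per_fat")


def parse_boot_sector(sector):
    """Return the disk parameters held in a boot sector, or None if blank."""
    if not any(sector):
        return None  # every byte is zero, so there is nothing to read
    return dict(zip(FIELDS, BPB.unpack_from(sector, 11)))


# Build a pretend boot sector with made-up values, then read it back.
sector = bytearray(512)
BPB.pack_into(sector, 11, 512, 4, 1, 2, 512, 41616, 0xF8, 41)
info = parse_boot_sector(bytes(sector))
print(info["sectors_per_cluster"], info["sectors_per_fat"], info["root_entries"])

# A zeroed sector gives nothing back at all.
print(parse_boot_sector(bytes(512)))
```

With the sector zeroed, none of these numbers is recorded on the disk,
so whatever reads the disk has to get them from somewhere else.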

He told me that one day, he booted from his DOS 2 disk, but forgot to
check the disk before writing to it.  "Why do you have to check the
disk?" I asked.  "Well, sometimes it doesn't seem to recognise it
properly," he replied, "and then you have to reboot." So he knew he
was skating on thin ice, and without a backup, it was just a matter of
time before the disaster.

I looked at the disk.  His boot sector was all zeroes, sure enough.
His FAT looked normal, except that there was a third copy of the first
part of it, and I've never seen that before.  The directory was
somewhat messed up, too, as that extra FAT sector was written where
the directory should have started.
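
Why an extra piece of FAT lands on the directory is easier to see from
the layout.  The sketch below assumes the usual DOS arrangement, with
the FAT copies directly after the reserved sectors and the root
directory directly after the last FAT;  the figures are invented, not
taken from Tim's disk.

```python
DIR_ENTRY_SIZE = 32  # assumed bytes per directory entry


def volume_layout(reserved, fat_copies, sectors_per_fat, root_entries,
                  bytes_per_sector=512):
    """Return the first sector of each FAT copy, the root directory and data."""
    fats = [reserved + i * sectors_per_fat for i in range(fat_copies)]
    root_start = reserved + fat_copies * sectors_per_fat
    root_bytes = root_entries * DIR_ENTRY_SIZE
    root_sectors = (root_bytes + bytes_per_sector - 1) // bytes_per_sector
    return {"fats": fats, "root_dir": root_start,
            "data": root_start + root_sectors}


layout = volume_layout(reserved=1, fat_copies=2, sectors_per_fat=41,
                       root_entries=512)
print(layout)

# Where a third FAT copy would begin if something wrote one.
third_copy = 1 + 2 * 41
print(third_copy == layout["root_dir"])  # True
```

With two copies of the FAT, the sector where a third copy would begin
is exactly the first sector of the root directory.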

I straightened out the FAT, and rebuilt a new first directory sector,
by hand.  I got all the files off the disk, then completely
reinstalled DOS, and copied all the files back on.  This time, I
installed DOS 3.2, with a proper boot sector, and made him promise to
do proper backups - I even gave him software to do it with.  Imagine
my surprise when he phoned me a few days later.  Apparently, while
copying his spreadsheets on to his file server, he'd managed to
overwrite some of them.  And he still didn't have a backup, and could
I give him another copy?  Yes, I could, as I still had my copy of his
data, but I'll be erasing that in a month or so, and what will he do
then?  He won't do any backups, that's for sure.  After all, he's so
clever, he knows how to zero a boot sector using DEBUG.