Hello Grthumongous,
OK. I'll first describe a method that worked "way back when" and then
summarize how more modern disk duplicators work.
I ended up writing the program because a customer test failed. When we
analyzed the failure, it turned out that the "copy volume" utility we
were using on that system did not initialize the boot sector. Sigh.
The copy program itself was quite simple. It had a lookup table of
the disk types we used (300 Mbyte, 13 Mbyte, 5 Mbyte) and, based on the
type of disk, looked up the geometry of the disk:
- number of cylinders
- number of platters
- number of blocks per cylinder
Then there were three nested loops, one for the cylinder, one for the
platter, and the third for the block within the cylinder / platter.
Within the innermost loop, the steps were
- write formatting information to the block on destination disk
- read block from source disk
- write block to destination disk
The format step was needed since we occasionally received blank disks
which had no formatting information included.
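
The three nested loops can be sketched in Python roughly as follows.
This is not the original program: the FakeDisk class, the GEOMETRY
numbers, and the 512-byte block size are all invented for illustration,
with an in-memory dictionary standing in for a real drive.

```python
# Hypothetical lookup table: disk type -> (cylinders, platters, blocks).
# The numbers are made up for illustration, not real drive geometries.
GEOMETRY = {
    "small": (4, 2, 8),
    "large": (16, 4, 8),
}
BLOCK_SIZE = 512  # assumed block size in bytes


class FakeDisk:
    """In-memory stand-in for a drive addressed by (cylinder, platter, block)."""

    def __init__(self):
        self.blocks = {}  # a blank disk has no formatted blocks yet

    def format_block(self, addr):
        # Writing formatting information makes the block addressable.
        self.blocks[addr] = bytes(BLOCK_SIZE)

    def read_block(self, addr):
        return self.blocks[addr]

    def write_block(self, addr, data):
        if addr not in self.blocks:
            raise IOError(f"block {addr} is not formatted")
        self.blocks[addr] = data


def copy_disk(source, dest, disk_type):
    """Format and copy every block; return the number of blocks copied."""
    cylinders, platters, blocks = GEOMETRY[disk_type]
    copied = 0
    for c in range(cylinders):
        for p in range(platters):
            for b in range(blocks):
                addr = (c, p, b)
                dest.format_block(addr)          # format the destination block
                data = source.read_block(addr)   # read block from source
                dest.write_block(addr, data)     # write block to destination
                copied += 1
    return copied
```

Formatting each destination block before writing it is what lets the
same loop handle a blank disk.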
The compare disk program was the same code with the innermost loop replaced with:
- read block from source disk
- read block from destination disk
- compare the two blocks, byte by byte
An error message was generated if any byte did not match.
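
A comparable sketch of the compare loop, again in Python and again only
an illustration: here each disk is assumed to be a plain dictionary
mapping (cylinder, platter, block) to bytes, and the function name and
message format are hypothetical.

```python
def compare_disks(source, dest, geometry):
    """Return a list of error messages; an empty list means the disks match.

    source and dest are assumed to map (cylinder, platter, block) -> bytes.
    """
    cylinders, platters, blocks = geometry
    errors = []
    for c in range(cylinders):
        for p in range(platters):
            for b in range(blocks):
                src_data = source[(c, p, b)]
                dst_data = dest[(c, p, b)]
                if len(src_data) != len(dst_data):
                    errors.append(f"size mismatch at cyl {c} platter {p} block {b}")
                    continue
                # Byte-by-byte comparison; report the first mismatch per block.
                for offset, (x, y) in enumerate(zip(src_data, dst_data)):
                    if x != y:
                        errors.append(
                            f"mismatch at cyl {c} platter {p} block {b} "
                            f"byte {offset}: {x:#04x} != {y:#04x}"
                        )
                        break
    return errors
```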
These programs worked quite efficiently. The maximum data transfer
rate for the disks was about 3 Mbyte/sec, and with the largest disk
(300 Mbyte), the programs would take a few minutes to complete. After
we wrote these two programs, we never had the same problem with
customer testing again.
That was in the mid-1980s. Today, disks are a lot faster and store a
lot more information, but the basic approach is the same. The drive
mechanisms are a lot better now as well. For example, there is "bad
block replacement", where the drive automatically stops using a bad
(or marginal) disk block and replaces it with an unused block. Because
the drive hides these details, operating systems (and hardware
duplicators) generally treat the disk as a sequence of blocks starting
at zero and going up to the maximum block number. That simplifies the
design of disk duplication programs quite a bit.
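
To illustrate what "a sequence of blocks" means, here is one
conventional way to turn a (cylinder, platter, block) address into a
single block number and back. It is only a sketch: the function names
are hypothetical, and a real drive's internal mapping (especially with
bad block replacement) is hidden from the host.

```python
def to_linear(c, p, b, platters, blocks):
    """Map (cylinder, platter, block) to a single block number from zero."""
    return (c * platters + p) * blocks + b


def from_linear(n, platters, blocks):
    """Inverse of to_linear: recover (cylinder, platter, block)."""
    track, b = divmod(n, blocks)
    c, p = divmod(track, platters)
    return c, p, b
```

With a linear numbering like this, the three nested loops collapse into
a single loop from zero to the maximum block number.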
For example, on a Linux (or Unix) system, you can use a command such as
dd if=/dev/hda of=/dev/hdb bs=32768
to copy disk "/dev/hda" to "/dev/hdb" and can use a command such as
cmp /dev/hda /dev/hdb
to compare two disks. cmp prints nothing unless the disks differ or an
error occurs, and dd prints only a short summary of the records copied.
I have actually used commands like these within the last few years and
they work just fine.
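
For readers who want to see what those two commands do, here is a rough
Python equivalent that works on ordinary image files. It is a sketch,
not a replacement for dd or cmp: the function names are made up, and it
assumes regular files rather than raw devices (which normally need root
access).

```python
def copy_image(src_path, dst_path, block_size=32768):
    """Copy src_path to dst_path one block at a time; return bytes copied."""
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of input
                break
            dst.write(block)
            total += len(block)
    return total


def first_difference(path_a, path_b, block_size=32768):
    """Return the offset of the first differing byte, or None if identical.

    If one file is a prefix of the other, the offset is the shorter length.
    """
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if block_a != block_b:
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                return offset + min(len(block_a), len(block_b))
            if not block_a:  # both files ended with no difference found
                return None
            offset += len(block_a)
```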
As another example, you can download a package such as "Tom's Root Boot Disk" from
http://www.toms.net/rb/
It is described briefly at
http://www.toms.net/rb/tomsrtbt.FAQ
which explains how to download and create a boot floppy that has the
utilities described above. In this way, you can take any old PC and
turn it into a "poor man's" disk duplicator.
For hardware disk duplicators, the process is basically the same:
- copy the data as fast as you can from one disk to another
- compare the data as fast as you can (block by block / byte by byte)
Some devices are more capable because they understand the format of
the file system they are copying [but then the result won't
necessarily be an identical copy].
Here are a few examples:
http://www.aberdeeninc.com/abcatg/HDP620.htm
shows a one to two disk duplicator that also understands common
Microsoft Windows disk formats and can "scale" the disk partitions to
fit the destination drive. An option near the bottom also understands
the disk format and only copies the blocks in use.
http://www.ics-iq.com/show_item_267.cfm
shows a three disk duplicator that runs about 1000 times faster than
the old program I wrote.
http://www.greystoneds.com/downloads/dat600.pdf
a one to six disk duplicator which can use a PC (nice menus, etc.) to
control the duplication, report errors, etc.
A few software examples include:
http://www.softforall.com/Utilities/Backup/R-Drive_Image_Hard_Disk_Backup_Software09020020.htm
which is a more general utility that generates "disk image" files
which can be put on any media and then used to clone a disk (or recover
from a disk failure).
http://www.symantec.com/sabu/ghost/ghost_personal/
Norton "Ghost", which is a very capable utility for duplicating and
comparing disks. Scroll down to see several tutorials if you want to do
some advanced duplicating tasks.
For additional information on the two Linux programs I referred to, check out
http://www.die.net/doc/linux/man/man1/dd.1.html
http://www.die.net/doc/linux/man/man1/cmp.1.html
or search for
man page cmp linux
man page dd linux
To search for more general information on this topic, try phrases such as
disk image duplicator
hard disk duplicator software
hard disk duplicator hardware
[product name here] features
[product name here] problems
If this is not enough on the topic or some part is unclear, please
make a clarification request.
--Maniac
Clarification of Answer by maniac-ga on 14 Apr 2004 04:57 PDT
Hello Grthumongous,
If a third party needs to confirm the results (without a hash), the choices are:
- the third party repeats the comparison of the original disk to the copy
OR
- the third party reviews the process used to make the duplicate (to
ensure it works properly), and the duplication company follows that
approved process (and is audited)
This latter method, in brief, is what companies that get ISO 9001
certification go through.
Note however, the use of a hash (e.g., MD5, simple checksum) does not
guarantee that the original matches the copy. It only guarantees that
the hash is the same. It protects against accidental (or casual)
modification of the copy - but not a determined effort by a "bad guy".
What the bad guy can do is this:
- compute the hash value before modifying the copy
- make any modifications he wants to the copy
- compute the new hash value after modification
- find a location in the copy that is not used and modify it so the new
hash value matches the original value (straightforward for a simple
checksum; far harder for a cryptographic hash such as MD5)
This kind of technique is sometimes used [in a positive way] to apply
patches to flight software. If you have to patch (modify) the
executable in an Operational Flight Program (OFP), you also have to
update the checksum value after the patch is applied - so when the OFP
starts the next time, it will get the proper result from its power on
self test (which verifies the OFP is not damaged).
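
Here is a small Python sketch of that patch-and-compensate idea. It
assumes the simplest possible checksum (the sum of all bytes modulo
256) and a single known-unused byte; both the checksum and the function
names are illustrative, not taken from any real flight software. The
same one-byte trick does not carry over to a cryptographic hash.

```python
def checksum(data):
    """Simple additive checksum: the sum of all bytes modulo 256."""
    return sum(data) % 256


def patch_preserving_checksum(image, offset, new_bytes, spare_offset):
    """Apply a patch, then adjust one unused byte so the checksum is unchanged."""
    original = checksum(image)
    patched = bytearray(image)
    patched[offset:offset + len(new_bytes)] = new_bytes
    # How far the patch moved the checksum, modulo 256.
    drift = (checksum(patched) - original) % 256
    # Subtract the drift from the spare byte to cancel it out.
    patched[spare_offset] = (patched[spare_offset] - drift) % 256
    return bytes(patched)
```

After the call, the patched image differs from the original but
produces the same checksum, so a checksum-only check would not notice
the change.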
So in some ways, using a hash as a "control number" is more convenient
for a comparison (it does not require the original disk) but is not as
secure as a complete comparison of the data on disk.
--Maniac