True "Reread/Compare" of new Image File or Restored Partition


Harrison Scofield
New Member (24 reputation)
Group: Forum Members
Posts: 14, Visits: 32
There are some types of "intermittent" computer errors which are only detectable by rereading the new Image File or Restored Partition and comparing it against the original source used to create it.  The first question that needs answering, however, is whether anyone would actually USE such an option to further guarantee the "integrity" of the Backup Image File or Restored Partition?  But, before you answer, be aware of the following:
     1) It would DOUBLE the time required to Backup or Restore a partition!
     2) It would require VSS for Backup under Windows or maybe even a "Boot into PE".
         And, operations under PE are significantly slower than under Windows!

For me, "integrity" of the Backup/Restore process is of paramount importance.  I have data that is twenty years old that I do not want to lose or have become "corrupted" because I had an "intermittent" but undetected hardware error (which HAS happened to me on two separate occasions)!  And, because I may do a half-dozen Restores of the OS Partition every day, these MR functions just have to be as "perfect" as possible.  So, I am more than willing to wait the extra time for additional assurance that they are as accurate as possible "within the limitations of software detection".  Does anyone else feel that "integrity" is so important that they would actually USE such an option knowing that it would require at least twice as long for the Backup or Restore to complete?

Note to MR Support: I am NOT suggesting a Read following every Write which would be a performance disaster so please set aside any implementation criticisms for the moment.  I would only like to determine whether such an option would actually be USED enough to even be considered for adding to MR.

Nick
Macrium Representative (3K reputation)
Group: Administrators
Posts: 1.7K, Visits: 9.3K
Harrison

When an image is created, an MD5 digest (hash) of the source data is computed immediately after it is read from the source disk. Importantly, this digest is computed from the data read from the source disk, not from the data written. This hash is then saved in an index at the end of the image file. When you verify an image, the MD5 digest is recomputed from the data in the image file and compared with the original source digest. If the two digests differ, verification fails.
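
As a rough illustration of the scheme described above (a sketch only, not Macrium's actual code), the following hashes each block as it is "read" and recomputes the digests at verify time. The block size, the per-block index, and the function names `create_image` and `verify_image` are assumptions made for the example; the post only says that MD5 digests are stored in an index at the end of the image file.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def split_blocks(data, size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def create_image(source):
    """Return (image_blocks, digest_index).

    Each digest is computed from the block as it was read from the source,
    before the block is handed to the write side.
    """
    blocks, index = [], []
    for block in split_blocks(source):
        index.append(hashlib.md5(block).hexdigest())  # hash the read buffer first
        blocks.append(bytes(block))                   # then "write" it to the image
    return blocks, index


def verify_image(blocks, index):
    """Recompute every digest from the stored blocks and compare with the index."""
    if len(blocks) != len(index):
        return False
    return all(hashlib.md5(b).hexdigest() == d for b, d in zip(blocks, index))
```

With this model, any change to a stored block after its digest was taken makes `verify_image` return `False`.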

Re-Read/Compare is an inefficient and poor substitute. If you Re-Read the source data and have a different result from the first read  then there is a latent memory issue or a source disk problem so the backup fails and you end up with no backup! Not a positive result. At least by using the current method you have a verified backup of the data that was readable at the time the image was created. 

For further verification you can mount the image, run chkdsk on the mounted file system, or use viBoot to boot into a system image:

http://knowledgebase.macrium.com/display/KNOW/Macrium+viBoot


Kind Regards

Nick - Macrium Support

Edited 24 July 2015 11:16 PM by Nick
Harrison Scofield
New Member (24 reputation)
Group: Forum Members
Posts: 14, Visits: 32
Nick,  thank you for your response but I was aware of the existing validity checking in MR.  Unfortunately, it does not detect as many errors as "rereading and comparing" would.  The reason that I doggedly pursued this so hard is because on TWO separate occasions, I had an "intermittent" memory DIMM fault during which time MR Backup/Restore produced "corrupted" partitions without any error messages!  And, once I had been alerted to a problem, it took me months to definitively diagnose the problem.

Since the NT File System does not employ CRC and there are so many layers of OS services between the hardware and MR, the absolute BEST it, or any other software product, can do after creating a new Image File or Restored Partition is to read it again to compare against the source used to create it.  If there is any discrepancy at all, something undesirable occurred somewhere.  I hardly consider that "an inefficient and poor substitute" when it can detect errors that are undetectable by the existing MR code!  And, since the Image File or Restored Partition has already been created, there is nothing preventing MR from retaining it although I would certainly like a SEVERE warning message that it may be "corrupt".
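
The "reread and compare" idea can be sketched as a chunked comparison of two streams. This is only a minimal illustration: the function name `first_mismatch` and the chunk size are made up for the example, and it assumes the source has not changed since the copy was made (on a live system that would need a snapshot such as VSS). It also assumes both reads really reach the device; an operating system may serve a re-read from its cache unless that is bypassed.

```python
from typing import BinaryIO, Optional

CHUNK = 1024 * 1024  # hypothetical 1 MiB comparison chunk


def first_mismatch(source: BinaryIO, copy: BinaryIO, chunk: int = CHUNK) -> Optional[int]:
    """Re-read both streams; return the offset of the first difference, or None."""
    offset = 0
    while True:
        a = source.read(chunk)
        b = copy.read(chunk)
        if a != b:
            # Find the exact differing byte inside this chunk.
            for i, (x, y) in enumerate(zip(a, b)):
                if x != y:
                    return offset + i
            # All overlapping bytes match, so one stream is shorter.
            return offset + min(len(a), len(b))
        if not a:  # both streams ended together without a difference
            return None
        offset += len(a)
```

A caller could keep the image and raise a severe warning when the result is not `None`, as suggested above, rather than discarding the backup.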

For unfortunate historical reasons, the PC evolved into two grades of machines: consumer and commercial.  The vast majority of machines are and always will be consumer grade, lacking such things as memory PARITY/ECC and the like.  And, assuming you can even find and are willing to pay the premium for a commercial machine, even they lack many of the "data integrity" features found on mainframes.  I also find it ironic that manufacturers of HDDs have completely reversed these distinctions between their consumer and commercial grade products: Their consumer products try to recover "at all cost" while the commercial products immediately raise an error!  I cannot help wondering what information they had other than RAID requirements that caused them to diverge like this?  However, this is the arena in which MR operates so the product should be adapted accordingly.

Except for one study by Google on memory errors on their server farm machines, I am unaware of any studies that have been done on the nature of errors and the frequency of their occurrences on PCs.  So, it is not unexpected that there are differing opinions on this subject among technical people.  Everything done in software has a cost-benefit.  Due to lack of data, it is difficult to accurately estimate the benefit to customers and MR of implementing this suggestion.  The development and maintenance costs are much easier to quantify.  And, as with any software product, there are probably any number of other requirements and suggestions competing for the limited available resources.  I posted this suggestion hoping to generate comments from USERS as much as from MR.  The most important question to be answered is, "If there were such a feature, would you even USE it knowing that it would DOUBLE the time required to do a Backup/Restore?"  Without this information, no case can be made for adding such an option.

I have said everything I wanted so I will end it here.  Windows 10 becomes officially available in a few days so I will go torture it for a while!  ;-)  --- Harrison


Seekforever
Expert (718 reputation)
Group: Awaiting Activation
Posts: 461, Visits: 5.6K
If my understanding of Nick's reply is correct, I see little room where a RAM error could affect the integrity check. The checksum is computed independently from data coming off the disk, not the data read into RAM to be included in the image, so a RAM error in either the data block or the area where the checksum is being computed should result in a different checksum, which would be flagged in the Verify.

For those using live imaging rather than imaging a static disk, keeping track of disk changes during image creation adds another level of complexity to the re-read process since the disk has changed from the time of the image creation. This is not impossible to do but it is another issue.

I've used 4 different imaging programs over the years and they've all used this method. I've participated in several forums and I don't recall issues with bad data being restored. I admit that files could be restored with incorrect contents and they would not be detected if never accessed. Most of the restoration issues were due to failed verifications or restorations due to RAM errors or disk read errors caused either by the disk or bad cables. These were cases where the image had not been verified immediately after creation or in the case of products using Linux recovery CDs, the Linux would not read the archive properly in some cases but Windows would.  I had to go back about 3 images on a HD to find one that would restore because of bad clusters that developed some time after the image was created. I also had a verification failure due to a bad SATA cable - interestingly, this had been picked up earlier in the Windows Event Viewer and it actually told me to replace the cable! None of my problems occurred with Reflect.

My view on backups is that the only data that really needs to be secure is my personally created data that is available nowhere else at any cost. I can always reinstall Windows and applications from scratch if necessary. For this reason, I try to diversify although not totally. For my data files I have backups on USB drives that I rotate, an internal HD, a NAS and I write a few selected files to the cloud. I also write data files to a HD 2 or 3 times a year which I keep at a neighbour's house; I don't bother with images there, since if I need that drive it means they stole all my computers or my house burned down. The multiple backups mean that intermittent errors as you describe may not be in all copies. I also do not subscribe to keeping a minimum number of backups. I let them build until the backup disk is out of space so I have a long history.

I don't use Reflect for my data files, only for the C drive (OS/Apps), although I consider it to be quite adequate. For data I use a program that writes the backups with versioning in its native files and folders format, and it does re-reads for verifying. The re-read verification is not the reason I prefer it; I just don't like container file backups for data files. I know some people do run two different backup programs strictly for data security.

My personal view of the re-read/compare request is that my experience says it isn't required and while it may offer a minor improvement in security it may have the unintended consequence of people making fewer backups due to the time penalty although I don't know if it would be that much slower than including the auto-verify.








Edited 26 July 2015 4:15 PM by Seekforever
Harrison Scofield
New Member (24 reputation)
Group: Forum Members
Posts: 14, Visits: 32
I only wish that NTFS contained a CRC so that MR had something to guarantee integrity from "HDD to HDD".  MR does the best they can by computing one "some time after" the partition data has been read into a buffer by Backup and then using that on subsequent processing until Restore puts the data into a buffer to be written to the new partition.  My suggestion would guarantee that the data in newly created Image Files or Restored Partitions was identical to the source from which they were derived.  That is the absolute best that software can do!  Of course, it could not detect "permanent" errors on the HDD which are not caught by the device's CRC/ECC.  However, HDD manufacturers do everything possible to recover on consumer HDDs before raising an error.

Verify only recomputes and verifies information in the Image File created by Backup.  So, it cannot detect situations which could be detectable by my suggestion.  However, it utilizes a highly buffered I/O process which stresses machine resources.  The only reason that I became aware of a potential problem on my system which could have been in MR was because Verify would very infrequently fail processing an Image File that had previously Verified good.  But, when I reran the Verify, it was good again.  At that point, I knew there was a problem but not where.  It took me a lot of effort to isolate the problem to a memory DIMM with an "intermittent" fault.  Only a special memory diagnostic program that I downloaded was able to detect it.

Your comments about having to go back three images are very relevant.  It is absolutely essential to detect "intermittent" problems as early as possible to avoid the exact situation you encountered.  Unfortunately, there are also "internal inconsistencies" that can develop over time in the File System which are beyond the bounds of MR Backup/Restore to detect.  So, it is entirely possible that the situation arises where all of your Backups contain the "corruption".  Since this is a different topic, I won't discuss it except to say that MR has "Browse Image" and "Browse Backup" functions that would assist in "salvaging" your critical user data from a "corrupted" File System.

I could not tell from your post whether you have SEPARATE partitions for your OS and USER data.  In my opinion, there is nothing more important than isolating your USER data in such a manner.  As you indicate, it is the USER data that is the most precious and what you really don't want to lose.  I carry this strategy to the extreme by configuring everything so I can Restore my OS partition in less than TEN MINUTES and continue as if nothing happened.  I have found this to be the best solution to dealing with malware.  But, that is another topic also.

Nick
Macrium Representative (3K reputation)
Group: Administrators
Posts: 1.7K, Visits: 9.3K
Hi Harrison

Thank you for getting back. Your comments are very valid and introduce a different level of system integrity to image verification.

> MR does the best they can by computing one "some time after" the partition data has been read into a buffer by Backup and then using that on subsequent processing until Restore puts the data into a buffer to be written to the new partition

> Verify only recomputes and verifies information in the Image File created by Backup

I feel I must clear up a possible misunderstanding as to the method and purpose of image verification:

The MD5 digests stored in the image file are calculated from the source disk read buffer as soon as data is read. They are not derived from the image file write buffer. When an image file is verified, the data in the image file is read back and used to calculate another MD5 digest, which is compared with the original digest that was calculated when the data was read. This guarantees that the image is an accurate representation of the data that was read from the disk at that time.
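
The distinction being discussed can be shown with a toy model (purely illustrative, not how Reflect is implemented). It simulates a single flipped bit either after the digest is taken, which verification catches, or before the digest is taken, which verification cannot see because the digest then describes the already corrupted buffer. The names `flip_bit`, `backup`, and `verify`, and the `fault` parameter, are hypothetical.

```python
import hashlib


def flip_bit(data, bit):
    """Return a copy of data with one bit inverted, simulating a memory fault."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


def backup(on_disk, fault="none"):
    """Toy backup returning (stored_data, stored_digest).

    fault is 'none', 'before_hash' (corruption while reading the source)
    or 'after_hash' (corruption between hashing and writing the image).
    """
    read_buffer = on_disk
    if fault == "before_hash":
        read_buffer = flip_bit(read_buffer, 5)
    digest = hashlib.md5(read_buffer).hexdigest()  # digest of what was read
    write_buffer = read_buffer
    if fault == "after_hash":
        write_buffer = flip_bit(write_buffer, 5)
    return write_buffer, digest


def verify(stored, digest):
    """Recompute the digest from the stored data and compare."""
    return hashlib.md5(stored).hexdigest() == digest
```

In this model `verify` fails for `fault="after_hash"` but passes for `fault="before_hash"`, even though the stored data then differs from what is on disk. Only a second, fault-free read of the source would expose the second case.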

Now, if the accuracy of data read from the source disk cannot be guaranteed due to memory or other hardware problems, then this is not a problem that imaging can be guaranteed to detect, even if the data is read back and compared.  If the source (or target) disk is not reading data reliably, then the system has a fundamental problem that might only manifest itself in a low percentage of reads of the same data (so a single re-read is not enough). What could help in this case is a 'System Verification' function (not image) that attempts to stress the system by reading, writing, reading back and calculating data hashes randomly over the entire disk, not once or twice but perhaps tens or hundreds of times. This could take a very long time to run, perhaps many days, but it only needs to be run once (if at all), or rarely, to increase confidence in the system's reliability. I stress that this is not an image verification function but a system verification that wouldn't be required with the same frequency as backups.
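
A very small-scale sketch of such a stress loop is shown below. It is an assumption-laden illustration, not a description of any Macrium feature: it works on one scratch file rather than the entire disk, the function names and defaults are invented, and the read-back may be served from the operating system's cache, so it exercises memory and the I/O path more than the disk surface. A real tool would need unbuffered access to the device.

```python
import hashlib
import os
import tempfile


def stress_pass(path, size):
    """Write random data, read it back, and compare MD5 digests."""
    data = os.urandom(size)
    expected = hashlib.md5(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device
    with open(path, "rb") as f:
        actual = hashlib.md5(f.read()).hexdigest()
    return actual == expected


def stress_test(passes=10, size=1 << 20):
    """Run several write/read-back passes and return the number that failed."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return sum(0 if stress_pass(path, size) else 1 for _ in range(passes))
    finally:
        os.remove(path)
```

On a healthy system `stress_test()` should return 0; because intermittent faults are rare, a meaningful run would need far more passes and far more data than these defaults.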
 
We are always looking to improve Macrium Reflect and we may introduce such functionality in a future product.

Kind Regards

Nick - Macrium Support

Edited 26 July 2015 10:45 PM by Nick
Seekforever
Expert (718 reputation)
Group: Awaiting Activation
Posts: 461, Visits: 5.6K

Yes, my data is on a separate partition, actually a separate HD, since the OS/Apps C partition is alone on a SSD. Having the OS/apps in a separate partition on the original HD also made going to a 120GB SSD very easy! 

Nothing in the program would have helped with my need to go back 3 images to find a good one, since the drive itself developed bad sectors some time after the image was correctly written, which says that backups on multiple devices are essential. The reason I had to go back 3 was that doing so still gave me a more recent image than I had on the other backup drive. This woke me up and I now pay more attention to rotating the backup drives.

I have often thought that a chkdsk should be run before imaging/backing up - or at least more frequently than I do, in case the file structure is damaged.



