[This is the second excerpt from my whitepaper, The 7 Deadly Sins of Backup and Recovery.]
Allowing Backups To Go Untested
Organizations often spend a great deal of time making backups. But if the backup volumes they create cannot be reliably restored, the process has failed. The basic rule of disaster recovery is that backups are unproven until they have been fully tested and shown to be effective on a frequent, ongoing basis.
The typical organization checks that backup schedules are correct, that backup jobs ran, and that the error logs show the backups completed successfully. With those confirmations in hand, it assumes that data recovery will be possible when necessary. However, media degrades over time; operating environments change, making it harder to restore older backups successfully; and the error rates of the media in use make it uncertain that a restoration will actually work.
A solid disaster recovery plan must include redundant backups to compensate for normal media error rates, and its recovery time estimates must reflect real-world data from actual test restorations.
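To see why redundancy compensates for error rates, consider a minimal sketch: if each copy fails to restore with some probability, and the copies fail independently (an assumption, and the failure rates here are purely illustrative), the chance that every copy fails shrinks multiplicatively.

```python
# Probability that all n backup copies fail to restore, given a
# per-copy restore failure rate p. Assumes independent failures;
# the 5% figure below is hypothetical, for illustration only.
def all_copies_fail(p: float, n: int) -> float:
    return p ** n

# A 5% per-copy failure rate falls to 0.25% with two copies,
# and to about 0.0125% with three.
for n in (1, 2, 3):
    print(n, all_copies_fail(0.05, n))
```

In practice copies stored on the same media type or in the same location do not fail independently, which is one more reason the plan's numbers should come from actual test restorations rather than arithmetic alone.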
Why do most organizations not fully test their backups? The answer is simple: too little time to test them.
A surprising number of organizations still use tape as their backup medium, and restoring from tape requires an enormous amount of time to rebuild, test, and validate a system. Why? Because simply restoring the data from tape isn't enough. The IT staff must start with a new server and then locate the operating system, registry, and environmental component CDs. Next, they must re-install all applications before restoring user data from tape. Only if each of these steps works correctly can the organization be confident that data could be restored in a true disaster.
The cure for this deadly sin of not testing backups is to combine the right technology with best practices. The right technology means disk-based backup systems that incorporate modern bare metal backup technology.
It's well understood that disk-based systems are faster and more reliable storage media than tape. Bare metal technology captures a full snapshot of a system's operating environment and allows that environment to be restored in a fraction of the time needed to rebuild it from scratch. Using the right technology sets the stage for successful restorations.
The final element in avoiding this deadly sin is to implement today's best practices for data capture and restoration testing. These include performing a full bare metal restoration of each critical server at least four times a year and test-restoring a random file or folder on each server at least twice a year.
Best practices also call for testing something every month, since operating systems and environments change quickly enough that the data in the current backups could already be compromised.
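A cadence like this (quarterly bare metal restores, semiannual file-level tests, a monthly spot check) is easy to let slip without some tooling behind it. Here is a minimal sketch of an overdue-test checker; the server names, dates, and the policy table itself are hypothetical, not part of any particular product:

```python
from datetime import date, timedelta

# Hypothetical policy table reflecting the cadence described above.
INTERVALS = {
    "bare_metal_restore": timedelta(days=91),   # at least quarterly
    "file_level_restore": timedelta(days=182),  # at least twice a year
    "spot_check": timedelta(days=31),           # something every month
}

def overdue_tests(last_tested, today):
    """Return (server, test) pairs whose last run exceeds its interval.

    last_tested maps server name -> {test name -> date of last run};
    a test never run at all is also reported as overdue.
    """
    overdue = []
    for server, tests in last_tested.items():
        for test, interval in INTERVALS.items():
            last = tests.get(test)
            if last is None or today - last > interval:
                overdue.append((server, test))
    return overdue

# Example with hypothetical servers and dates.
history = {
    "mail01": {
        "bare_metal_restore": date(2010, 1, 15),
        "file_level_restore": date(2010, 5, 1),
        "spot_check": date(2010, 6, 20),
    },
}
print(overdue_tests(history, today=date(2010, 7, 1)))
# -> [('mail01', 'bare_metal_restore')]
```

Something this simple, run from a scheduled job, turns "test something each month" from an intention into a checklist.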
Finally, one more noteworthy best practice is to capture and analyze the error rates of the test restorations and the time they require. IT staff can then feed this knowledge into the disaster recovery plan, gaining real assurance that the plan will actually deliver when the chips are down.
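To make that measurement concrete, here is a minimal sketch of summarizing test-restoration records into a failure rate and an average restore time. The record format and all the figures are hypothetical, for illustration only:

```python
# Each record is one test restoration: (succeeded?, minutes taken).
# These figures are hypothetical.
records = [
    (True, 240), (True, 210), (False, 0), (True, 255), (True, 230),
]

def summarize(records):
    """Return (failure_rate, average_minutes_of_successful_restores)."""
    failures = sum(1 for ok, _ in records if not ok)
    successes = [minutes for ok, minutes in records if ok]
    failure_rate = failures / len(records)
    avg_minutes = sum(successes) / len(successes)
    return failure_rate, avg_minutes

rate, avg = summarize(records)
# Feed these into the DR plan: planned restore windows should be built
# from measured times, not vendor estimates, with padding that reflects
# the observed failure rate.
print(f"failure rate {rate:.0%}, average successful restore {avg:.0f} min")
```

Even a spreadsheet would do; the point is that the plan's recovery time objectives rest on numbers the organization has actually observed.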