Disaster Recovery Testing 101

A monkey caused a nationwide power outage in Kenya, disrupting millions of businesses. This just goes to show that a disaster can strike in any form, at any time.  

Disruptions are followed by expensive unplanned downtime. Business operations come to a screeching halt, resulting in idle IT administrators, slow or no service delivery, and revenue loss. The cost of downtime adds up quickly, to the point that 40 percent to 60 percent of businesses do not survive a major disaster.

That’s where a disaster recovery (DR) plan comes in. A well-crafted DR plan increases the likelihood of a business recovering lost data and resuming normal operations in a jiffy. In many ways, an IT disaster recovery plan is a core tenet of an effective business continuity plan. If a business lacks a DR plan, it risks losing revenue and reputation.

But do you know what’s worse than having no disaster recovery plan? Having one that is not tested.  

A well-crafted DR plan needs to move off paper and be put into action. Just think about it. You put in the hours and resources to come up with a good disaster recovery plan only to find out it’s far from “good” right when you’re in the middle of a disaster, which is exactly when you need your DR plan to work.

Let’s dig a bit deeper and understand the fundamentals of disaster recovery testing.

What is Disaster Recovery Planning? 

An IT disaster recovery plan is the part of a business continuity plan that covers the technologies and best practices used to recover lost data and system functionality, so that a business can keep operating after a natural or manmade disaster.

The obvious benefits of the DR plan are:  

  • Reduction of financial losses resulting from downtime  

  • Seamless data restoration process  

  • Bottom line employees know what to do in case of a disaster  

However, none of these benefits will be realized if you don’t test your DR plan. 

What Is Disaster Recovery Testing?

After you have created a DR plan, you should find out whether the plan works or not. Proper disaster recovery testing helps you avoid any rude surprises that may occur when a disaster strikes. Even if you think you have the best DR plan in the galaxy — test, test and test again.  

Smart businesses have started adding disaster recovery testing to their overall DR plans. The disaster recovery test plan outlines the various readiness and recovery tests with corresponding steps and procedures. The main objective of a DR testing plan is to verify that a business can restore operations within a predetermined timeframe. It gives you foresight into how your IT infrastructure behaves if any part (or all) of it stops operating.

Why Is Disaster Recovery Testing Important?

When Fergusson Medical Group in Missouri was hit with ransomware, they followed every step according to their DR protocol: activate recovery plan, secure network and notify law enforcement. But for some reason, three months of patient data from the previous year could not be recovered. In hindsight, disaster recovery testing could have saved the organization from losing more than 107,000 patient records.  

Test your disaster recovery plan before giving it the title of “The Most Incredible DR Plan Ever.” It’s a proactive approach to identifying and fixing inconsistencies in your DR strategy. As a result, businesses enjoy plenty of benefits. 

Avoid long hours of downtime

An unexpected, long restoration period leads to long hours of downtime, which costs businesses big. Depending on the size, downtime can cost the business anywhere from $10,000 per hour for smaller businesses to more than $5 million per hour for big enterprises. 

The problem isn’t so much the delay as it is the complete disconnect between the expected recovery time (as mentioned in the DR plan) and the actual time of recovery. A lapse in recovery time can have a reverberating effect across the business.    

Disaster recovery testing is a foolproof way of ensuring that businesses get operations and restoration up and running promptly when disaster strikes.
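One way to make the gap between expected and actual recovery time concrete is to record both during each test. The sketch below is illustrative only: the `RecoveryTestResult` class, the system names and the measured times are assumptions, and the flat $10,000 hourly rate simply reuses the small-business figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTestResult:
    """Outcome of one DR test run (hypothetical structure for illustration)."""
    system: str
    expected_minutes: float  # recovery time stated in the DR plan
    actual_minutes: float    # recovery time measured during the test

    @property
    def overrun_minutes(self) -> float:
        # Time by which the test exceeded the plan; zero if the target was met.
        return max(0.0, self.actual_minutes - self.expected_minutes)

    @property
    def met_target(self) -> bool:
        return self.actual_minutes <= self.expected_minutes


def overrun_cost(result: RecoveryTestResult, cost_per_hour: float) -> float:
    """Estimate the cost of the unplanned extra downtime at a flat hourly rate."""
    return result.overrun_minutes / 60 * cost_per_hour


# Made-up test results for two hypothetical systems.
results = [
    RecoveryTestResult("file-server", expected_minutes=60, actual_minutes=45),
    RecoveryTestResult("billing-db", expected_minutes=120, actual_minutes=210),
]

for r in results:
    status = "OK" if r.met_target else "MISSED"
    cost = overrun_cost(r, cost_per_hour=10_000)
    print(f"{r.system}: {status}, overrun {r.overrun_minutes:.0f} min, "
          f"estimated extra cost ${cost:,.0f}")
```

Tracking the overrun rather than just pass or fail shows how far the plan is from reality, which is the disconnect described above.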

Save your reputation

This is an extension of the previous point. Usually, you would commit to a standard recovery time (as mentioned in the DR plan) with clients during a disaster. Failing to deliver on what you promised damages your business reputation. Furthermore, anything that damages your reputation will also spike your churn rates.

Be compliant

A DR plan is no longer a convenience; it’s a requirement. Organizations in sectors like healthcare, finance and government need to follow strict compliance standards like HIPAA and FINRA that demand a disaster recovery plan with a specified uptime. Disaster recovery testing helps you stay compliant and avoid the penalties and legal fees that come with non-compliance.

How Often Should a Disaster Recovery Plan Be Tested?  

The frequency of your testing depends on the nature of your business, since factors such as a high turnover rate, rapid process changes or new regulations come into play. That being said, testing your DR plan once a year is the minimum recommended cadence. In a recent survey, researchers found that only 44 percent of businesses had a disaster recovery plan in place and, of those, only 31 percent ran tests at least once a year.
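A yearly minimum is easy to check automatically if you keep a log of when each plan was last tested. The following is a minimal sketch under that assumption; the plan names, dates and the `overdue_plans` helper are hypothetical.

```python
from datetime import date, timedelta

# Assumed policy: every plan must be tested at least once every 365 days.
MAX_TEST_AGE = timedelta(days=365)


def overdue_plans(last_tested: dict[str, date], today: date) -> list[str]:
    """Return the names of plans whose last test is older than MAX_TEST_AGE."""
    return sorted(
        name for name, tested_on in last_tested.items()
        if today - tested_on > MAX_TEST_AGE
    )


# Hypothetical test log: plan name -> date of the most recent DR test.
last_tested = {
    "ransomware-restore": date(2023, 11, 2),
    "power-outage": date(2022, 6, 15),
    "network-failover": date(2023, 1, 20),
}

overdue = overdue_plans(last_tested, today=date(2024, 3, 1))
print(overdue)  # ['network-failover', 'power-outage']
```

Businesses with rapid process changes or new regulations could shorten `MAX_TEST_AGE` for the affected plans.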

Disaster Recovery Testing Scenarios

DR testing scenarios businesses need to consider are: 

Human error

Human error is the primary cause of IT downtime. According to a survey from ITIC, more than 49 percent of IT downtime occurs due to human errors. Disaster recovery testing is a good way to gauge employee understanding of standard policies. Do they have the right knowledge about what to do during a disaster? DR testing provides a clear answer, which can help you determine whether you need to reinvest in employee training.

Software/Hardware failure

IT downtime can happen because of outdated software or aging hardware. Patching outdated software or replacing obsolete hardware after a disaster adds significantly more hours to the downtime clock. DR testing flags software and hardware that are hanging by a thread and pushes you to take action before an actual disaster occurs.

Network failure

Firmware upgrades, bugs, hard disk platter damage, power supply glitches and faulty RAM are just some of the common reasons for network instability. Testing for network preparedness is the only way to ensure that you will be able to resolve network issues when they occur. Apart from technology, run testing for disaster recovery scenarios that involve network administrators to see if they know what to do during a real disruption. 

Internet and power outages

Certain sectors have zero tolerance for downtime. For instance, healthcare institutions cannot be at the mercy of the utility provider to restore power.

Create protocols like the ones mentioned below to act swiftly and know exactly what to do when an outage occurs. But do test and review these protocols from time to time.  

  • Assessing if the outage is specific to the building or widespread   

  • Communicating with the utility provider to get ETAs for resolution   

  • Deploying backup power sources and ensuring they’re working properly   

  • Prioritizing critical services and personnel who will access backup power services 
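The last step, prioritizing who gets backup power, can be rehearsed on paper before an outage. Here is a rough sketch assuming each service has a priority rank and a known power draw; the service names, loads and generator capacity are invented for illustration.

```python
from typing import NamedTuple


class Service(NamedTuple):
    name: str
    priority: int   # 1 = most critical
    load_kw: float  # power the service draws


def allocate_backup_power(services: list[Service], capacity_kw: float) -> list[str]:
    """Power services in priority order, skipping any that no longer fit
    in the remaining backup capacity."""
    powered = []
    remaining = capacity_kw
    for svc in sorted(services, key=lambda s: s.priority):
        if svc.load_kw <= remaining:
            powered.append(svc.name)
            remaining -= svc.load_kw
    return powered


# Hypothetical services and a hypothetical 6 kW backup source.
services = [
    Service("office-lighting", priority=3, load_kw=4.0),
    Service("patient-records-server", priority=1, load_kw=3.0),
    Service("network-core", priority=2, load_kw=2.0),
]

powered = allocate_backup_power(services, capacity_kw=6.0)
print(powered)  # ['patient-records-server', 'network-core']
```

Running this kind of exercise during a test shows whether the backup source actually covers everything the plan labels critical.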

Cybersecurity threats

Cyberattacks are the Baba Yaga of IT disasters, ferociously disrupting your business operations. The motivation behind a cyberattack might be tactical or just plain sadistic. Regardless, it can cripple a business if left unchecked. Cyberattacks like ransomware cause an average of 16.2 days of downtime. To mitigate the impact of data loss, keep a backup copy of your data so you can restore it if the original data set is lost. However, you need to be certain that those copies can actually be restored, which is possible only through testing.
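A basic restore test checks that what comes back from a backup matches what went in. The sketch below compares SHA-256 digests of an original file and its restored copy; the file names and contents are placeholders, and a real test would run against your own backup tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True when both files produce the same SHA-256 digest."""
    return sha256_of(original) == sha256_of(restored)


# Demo with temporary files standing in for a real data set and its restores.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "records.db"
    good_restore = Path(tmp) / "records.restored.db"
    bad_restore = Path(tmp) / "records.corrupt.db"

    original.write_bytes(b"patient records 2023")
    good_restore.write_bytes(b"patient records 2023")
    bad_restore.write_bytes(b"patient records 20")  # simulated truncated restore

    good_ok = restore_matches(original, good_restore)
    bad_ok = restore_matches(original, bad_restore)

print(good_ok, bad_ok)  # True False
```

Because the original may be gone after a real attack, it helps to record the digest at backup time and compare restores against that stored value.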

Natural disasters

Natural disasters, such as tornadoes, hurricanes or earthquakes, destroy physical infrastructure, causing network failures and business shutdowns. Map out the geographical position of your production environment and identify the possible risks. Take this information into consideration for DR testing scenarios. For instance, you might build a DR testing scenario around a hurricane damaging your communication infrastructure.

Disaster Recovery Testing Methods

Here are different ways to test your disaster recovery plan:  

Plan review

IT managers and top-level management together review the plan. The components audited tend to be contact information, recovery coverage for desired business continuity, validity of recovery contracts and training material for new members of the DR team.  

Walk-through drill

Team members physically demonstrate the steps expected to be taken during a disruption. This can include moving to the right backup location, choosing the communication methods mentioned in the plan and contacting the right personnel. The test validates the team response and the DR processes.

Sandbox

With the help of third-party companies that offer disaster recovery as a service (DRaaS) solutions, you can “sandbox” or partition virtual machines so testing can be performed without affecting production servers.

Guaranteed Recovery Against Any Disruption

It’s almost impossible for businesses to manually test DR processes with the increasing frequency and versatility of disruptions in today’s environment. Whether it’s a monkey or a tornado, businesses of all sizes run the risk of unplanned downtime due to business disruptions or recovery failures.  

Unitrends Recovery Assurance delivers automated recovery testing, both onsite and offsite, with the ability to set recovery objectives and SLAs ahead of time, giving you peace of mind when it comes to ensuring business continuity.

 

 
