According to International Data Corporation’s Digital Universe Study, in 2012 less than a fifth of the world’s data was protected, even though 35 percent of it required such protection. Levels of data protection are significantly lagging behind the growth in data volume.
Over the years, I’ve read and heard many different ways to get data and data centers ready for recovery. Leveraging cloud services, using “hardened” storage devices and synchronizing NAS devices are all good ideas, but they all overlook one crucial point: the prep work is the most important part of recovery.
This may come as a shock, but backup is completely worthless if you cannot recover your data. Likewise, archiving data is completely useless if the data cannot be retrieved within the time you need it.
Sounds like a stupid statement, right? You’d be surprised. I have talked with far too many organizations that buy data protection products and are never quite sure whether their data can actually be recovered. All the bells and whistles are not worth the money if you cannot get your data back quickly and easily.
More than ever, robust data protection is imperative to recovery in the event of data loss. In fact, failure to safeguard company data can result in business disruption, devastating losses, and in some cases, catastrophic consequences to the business. Numerous reports and studies show that businesses that go through critical data loss often never recover.
Below are the four steps organizations should take to disaster-proof their data.
Tiering comes first
The first thing every business needs to do is plan. That’s right, plan. Classifying data is the most important part of any disaster recovery planning procedure. Planning for the worst means deciding which data is important and which data needs to be recovered first.
Tiering your data helps align the value of the data with the cost of protecting it. It stretches your backup budget and makes data protection and recovery more efficient. The recovery point objective (RPO) and recovery time objective (RTO) should vary for each application and its data, and not all data should be treated the same way during backup and recovery. Not all data is created equal.
Tiering data is not simple, as it requires many different parties to agree on which data is most important. For example, historical data (data not used on a daily or monthly basis) should sit in the lowest tier: it needs to come back after a disaster, but not until everything else is up and running. Tier-one data, by contrast, might stay on disk for fast restores.
Having three or four tiers of data breaks up the recovery into manageable parts, allowing the recovery process to be more focused. An example of how to prioritize the tiers of data:
- Tier 1: Data that’s essential to your daily operations and/or highly confidential.
- Tier 2: Data that only needs to be accessed from time to time.
- Tier 3: Information you rarely access, stored only until its retention date is met and it can be destroyed.
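As a deliberately simplified illustration, tiers like those above could be assigned from last-access age and sensitivity. The cutoffs here are pure assumptions; agreeing on the real boundaries is exactly the hard, multi-party part of the planning described above:

```python
from datetime import datetime, timedelta

# Hypothetical cutoffs -- each organization must set its own.
TIER_2_CUTOFF = timedelta(days=30)    # untouched for a month
TIER_3_CUTOFF = timedelta(days=365)   # untouched for a year

def classify(last_accessed, confidential=False):
    """Assign a data set to a recovery tier by access age and sensitivity."""
    age = datetime.now() - last_accessed
    if confidential or age < TIER_2_CUTOFF:
        return 1  # essential or sensitive: restore first
    if age < TIER_3_CUTOFF:
        return 2  # accessed from time to time: restore second
    return 3      # historical: restore last, destroy at retention date
```

Note that confidentiality trumps age here: a rarely touched but sensitive data set still lands in tier 1.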
Replicate to a second location

Replication is a key component of disaster recovery. The technology often works in combination with deduplication, virtual servers or the cloud to carry out its role in recovery. Copying data from a host computer to another computer at a remote location establishes redundant copies and ensures business continuity in the event of a disaster. When data replication is done over a computer network, changed data is copied to the remote location as soon as it changes.
Data needs to exist in at least two places at all times: onsite for near-line recovery and offsite for disaster recovery. The cloud is limited in the amount of data it can ingest on a daily basis and is even more limited on the amount of data it can recover. Placing data offsite to another raised floor or location, or even shipping tapes to a secondary site, is the only way to recover massive amounts of data.
In a disaster, the data center or backup technology may be gone or no longer functioning. Replication enables IT staff to restore the data that is stored in the data center on the given media in order to continue business operations. Without this data, organizations have no historical data to process. Having a copy of the data enables the business to pick up close to where it left off when the disaster struck.
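A toy sketch of that change-based copying, assuming the “remote location” is simply a mounted path (real replication products track changed blocks and ship them over the network, which this does not attempt):

```python
import shutil
from pathlib import Path

def replicate(source: Path, replica: Path) -> list:
    """Copy every file that is new or changed since the last sync."""
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = replica / src.relative_to(source)
        # Only ship files the replica lacks or that changed since last pass.
        if not dst.exists() or dst.stat().st_mtime < src.stat().st_mtime:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves the modification time
            copied.append(dst)
    return copied
```

Running it twice in a row copies nothing the second time, because `copy2` preserves timestamps; only files changed since the last pass get shipped.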
Get the timing right

The timing is important. You need to understand how much data you have and how long you have to recover it. Once those two numbers are known, planning the mechanics of the recovery can really begin. If 10 terabytes (TB) of data have to be recovered, but your connections will only push 5 TB per day, the restore will take two days. If two days is too long, then increased bandwidth must be purchased.
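That arithmetic is worth writing down explicitly; a minimal sketch, using the figures from the example above:

```python
def days_to_recover(data_tb, rate_tb_per_day):
    """How long a restore takes at a given sustained transfer rate."""
    return data_tb / rate_tb_per_day

def required_rate(data_tb, rto_days):
    """Minimum sustained throughput needed to meet a recovery deadline."""
    return data_tb / rto_days

# 10 TB over a connection that pushes 5 TB per day:
print(days_to_recover(10, 5))   # 2.0 days
# To bring that down to one day, the bandwidth must double:
print(required_rate(10, 1))     # 10.0 TB per day
```

The same two functions also expose the trade-off in the other direction: a fixed RTO dictates the bandwidth you must buy, and fixed bandwidth dictates the RTO you can honestly promise.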
Test, test and test again
Disaster recovery is no exception to the old adage, “practice makes perfect.” Testing is the only way to verify that all of your plans will work in an actual disaster. If you do not test at least once a year, then you have no idea what your timing looks like or whether you can even recover your data. You have to understand the weak spots and failure points.
It may seem simple, but just try it. Declare a disaster at your organization and see how fast you can line up the right resources to pull off the recovery. See how fast all parties could get to your recovery site. You do not even have to do the recovery. Perform this simple assessment, and you will start to see the need for thorough testing.
Truth be told, you cannot truly disaster-proof your systems; you can, however, plan for the worst. Planning and testing are the most important parts of “disaster proofing,” and the testing can never end.
Originally posted at: Data Center Knowledge