Back everything up, back it up again, then back it up once more for good measure: that’s the rallying cry of most IT professionals. Yet, in my opinion, backups can be useless in many disaster recovery situations without the right recovery strategy.
Yes, you heard me right. The sense of security that your SQL backups give you might be misplaced should things really hit the fan. Here’s why your disaster recovery plan needs more than just SQL backups, and how to recover data as properly and painlessly as possible.
The Brutal Truth About Your SQL Backup
Backups are an important part of disaster recovery, but they’re just that—a part. Think of a SQL backup as a picture of one set of data, say your order processing or your CRM system. When a critical outage causes that system to stop working, you have a (hopefully) recent snapshot of what things looked like just before the outage that you can revert to.
The problem is that data doesn’t exist in a vacuum. It interacts with and informs all sorts of other systems within your company. Take a manufacturing example: your CRM system logs orders and payments, and each new order instantly triggers a logistics or production action. If your CRM experiences an outage and is restored from a backup created before the order was logged, that order no longer exists in the CRM. But a ticket or message for the order has already been created within your production system. So a product is being produced (and costing you money) when there is no record of the order.
Maybe this item simply becomes a mysterious ‘extra’ in a warehouse, or maybe, due to synchronization between the production and logistics arms of your organization, that item is shipped off to a customer who ordered it but hasn’t paid. Even if they had paid and simply had a problem with their order, any attempt to talk to someone at your organization about it would be thwarted by the fact that the order simply doesn’t exist in the backup data that is now being used.
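To make the scenario concrete, here is a minimal Python sketch of how such "orphaned" work items could be detected after a restore. The ProductionTicket type, the find_orphaned_tickets function, and the sample IDs are all hypothetical and invented for illustration; a real check would read from your own CRM and production systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductionTicket:
    """A work item the production system created in response to a CRM order."""
    ticket_id: str
    order_id: str


def find_orphaned_tickets(restored_order_ids, tickets):
    """Return tickets whose order is missing from the restored CRM data."""
    known = set(restored_order_ids)
    return [t for t in tickets if t.order_id not in known]


# Hypothetical data: order 1003 was logged after the backup was taken,
# so it is missing from the restored CRM but still has a production ticket.
restored_orders = ["1001", "1002"]
tickets = [
    ProductionTicket("T-1", "1001"),
    ProductionTicket("T-2", "1002"),
    ProductionTicket("T-3", "1003"),
]

orphans = find_orphaned_tickets(restored_orders, tickets)
for ticket in orphans:
    print(f"Ticket {ticket.ticket_id} references missing order {ticket.order_id}")
```

Each ticket this reports is a candidate for manual follow-up: either the order is re-entered in the CRM or the production work is cancelled.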
The potential for headaches is not insignificant. But we can show you how to restore your system to avoid these issues.
What Should My Disaster Recovery System Look Like?
A truly useful disaster recovery system covers not just what you need to recover, but also how it needs to be done. And given the unique and often complicated data systems at any organization, the only way to find out the latter is to test it.
Our checklist for helping organizations create a truly robust disaster recovery system includes the following:
1. Create an overview of systems and interfaces
Want to know how to restore your system without losing anything critical? Then you need to know where systems interact with each other. If you can accurately chart where data flows and processes start, then it will be easy to follow this roadmap in the event of a disaster.
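One way to keep such a chart usable in an emergency is to store it as data. The sketch below is only an illustration: the INTERFACES map and the system names in it are made up, and the downstream_systems function is a hypothetical helper that walks the map to list every system fed, directly or indirectly, by the one being restored.

```python
from collections import deque

# Hypothetical interface map: each system lists the systems it sends data to.
INTERFACES = {
    "CRM": ["Production", "Accounting"],
    "Production": ["Warehouse"],
    "Warehouse": ["Logistics"],
    "Accounting": [],
    "Logistics": [],
}


def downstream_systems(interfaces, restored_system):
    """Breadth-first walk returning every system fed by the restored one."""
    seen = []
    queue = deque(interfaces.get(restored_system, []))
    while queue:
        system = queue.popleft()
        if system in seen or system == restored_system:
            continue  # already visited, or a loop back to the start
        seen.append(system)
        queue.extend(interfaces.get(system, []))
    return seen


affected = downstream_systems(INTERFACES, "CRM")
print(affected)  # ['Production', 'Accounting', 'Warehouse', 'Logistics']
```

The resulting list is the set of systems that may hold data newer than the restored backup and so need checking before interfaces are switched back on.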
2. Identify system backup/restore methods and implement automation
The easiest way to know how your system will react to disaster recovery is to test it. That means identifying the current system backup and restore methods, then creating a regular automated test of the system’s function. It’s not enough to test this once, as true peace of mind only comes through regularly performing test backup/restores to prove backups are valid and restores work in all conditions. (This is a functional test of your restore process.)
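As a rough sketch of what an automated restore test involves, the example below uses SQLite from Python’s standard library purely as a stand-in for your real database engine. The restore_test and table_counts functions and the orders table are hypothetical; a production version would use your own engine’s backup and restore tooling, but the shape is the same: back up, restore somewhere safe, then validate the restored copy.

```python
import os
import sqlite3
import tempfile


def table_counts(conn):
    """Return {table_name: row_count} for every table in the database."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names}


def restore_test(source, backup_path):
    """Back up `source`, restore into a scratch database, and validate it."""
    # 1. Take the backup.
    backup = sqlite3.connect(backup_path)
    source.backup(backup)
    backup.close()

    # 2. Restore the backup file into a fresh scratch database.
    backup = sqlite3.connect(backup_path)
    scratch = sqlite3.connect(":memory:")
    backup.backup(scratch)
    backup.close()

    # 3. Validate: structural integrity plus row counts against the source.
    integrity = scratch.execute("PRAGMA integrity_check").fetchone()[0]
    ok = integrity == "ok" and table_counts(scratch) == table_counts(source)
    scratch.close()
    return ok


# Hypothetical source database with a couple of orders.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
source.executemany("INSERT INTO orders (customer) VALUES (?)",
                   [("Acme",), ("Globex",)])
source.commit()

with tempfile.TemporaryDirectory() as tmp:
    result = restore_test(source, os.path.join(tmp, "orders.bak"))
print("restore test passed" if result else "restore test FAILED")
```

Run on a schedule, a check like this turns "we have backups" into "we have backups that are known to restore".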
Using the above, we can then understand how the different systems interact at restore time. This means deciding which interfaces need pausing and which numbering systems collide (e.g. invoice numbers may need resetting to avoid assigning the same invoice number twice). It also means outlining which business processes are involved: CRM, Accounting, Manufacturing, Warehouse, Material Supply Chain management, etc.
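The invoice-number collision can be illustrated with a few lines of Python. This is a sketch under assumptions: the safe_next_invoice_number function and the sample numbers are hypothetical, and it assumes invoice numbers are plain increasing integers that other systems have already recorded.

```python
def safe_next_invoice_number(restored_last_number, numbers_seen_elsewhere):
    """Pick a next invoice number that cannot collide with one already issued."""
    highest_issued = max([restored_last_number, *numbers_seen_elsewhere])
    return highest_issued + 1


# Hypothetical values: the restored database last issued 5040, but accounting
# already received invoices 5041 and 5042 before the outage.
next_number = safe_next_invoice_number(5040, [5039, 5041, 5042])
print(next_number)  # 5043
```

Without this step the restored system would hand out 5041 again, and two different invoices would share one number.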
3. Create a framework and/or a set of “runbooks”
A runbook is a detailed “how-to” guide for completing a repeated task or procedure in your organization’s IT operations. A runbook gives you a set of instructions to follow to recover from a disaster without any guesswork. Creating—and practicing—a runbook reduces the chances of errors or forgetting steps.
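A runbook does not have to be code, but writing it as an ordered list of steps makes the idea easy to see. The sketch below is illustrative only: the Step type, the run_runbook function, and the step descriptions are hypothetical, and the placeholder actions simply report success where real ones would call your own tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str
    action: Callable[[], bool]  # returns True on success


def run_runbook(steps: List[Step]) -> List[str]:
    """Run steps in order, stopping at the first failure; return a log."""
    log = []
    for number, step in enumerate(steps, start=1):
        succeeded = step.action()
        log.append(f"{number}. {step.description}: {'OK' if succeeded else 'FAILED'}")
        if not succeeded:
            log.append("Stopped: resolve the failed step before continuing.")
            break
    return log


# Hypothetical steps; real actions would call your own tooling.
runbook = [
    Step("Pause CRM-to-production interface", lambda: True),
    Step("Restore CRM database from latest backup", lambda: True),
    Step("Reconcile orders against production tickets", lambda: True),
    Step("Resume CRM-to-production interface", lambda: True),
]

for line in run_runbook(runbook):
    print(line)
```

Stopping at the first failed step is a deliberate choice here: resuming an interface after a failed restore is exactly the kind of mistake a runbook is meant to prevent.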
4. Run regular outage tests
Just as firefighters run drills, so should a company. Practicing these steps makes it less stressful when things really go wrong. Unfortunately, outage testing is the most commonly “forgotten” step in the initial process. Often this is because it costs time and effort and distracts from the day-to-day work. However, it really pays off when an issue arises and you can resolve it quickly.
This is not an exhaustive list because each data environment will have its own unique needs and considerations. But it serves as a good overview of some of the points that you may not be taking into account with your current disaster recovery system.
Do you need your disaster recovery system evaluated? We can help you out: just reach out to our team at info@datamasterminds.io. We can help ensure that the next data disaster doesn’t derail your day.