When setting up DR plans, are you covering all your potential options? Are you thinking about the entire location of your production site being unavailable? All too often when I talk to companies about their DR plans, it takes me all of about 5 seconds to poke some major holes in them. Maybe this is because I’m from California, and when we have a disaster (an earthquake) we really have a disaster.
So think about your DR plans for a minute. They are probably built around the assumption that the office and/or the data center (if the data center isn’t in the office) doesn’t exist any more. Right? So what kinds of disasters have you planned for?
- Building Fire
- Power Outage
- Roof Gives Way
- Earthquake
- Flood
- Hurricane
OK, that’s a decent list. But I’m going to say that you’ve probably only planned for 3 of the 6 that you think you’ve planned for. Can you figure out which ones I think you’ve actually planned for?
I’ll give you a minute.
Take your time.
If you guessed Building Fire, Power Outage and Roof Gives Way you are correct, but you only get half credit for Power Outage.
The reason I don’t think you’ve actually planned for an Earthquake, Flood or Hurricane comes down to one question: what sort of planning around people have you done? Many companies have the bulk of the IT team in a single site, with maybe just a couple of help desk guys and a sysadmin at whichever office is the secondary site. When it comes time to execute the disaster recovery plan, which staff members are you planning on having available? Are you planning on having the IT staff from your primary site available? If the answer you gave was “yes” then you answered incorrectly.
When a disaster hits that’s big enough to take out your primary site, the odds are that the people who work at the primary site won’t be available either.
If there’s an Earthquake, that’ll affect the entire region. Even if some staff members live far enough away that they aren’t impacted, how are they going to get to the office? The roads probably won’t be passable, the power will probably be out, the airports will be closed, trains won’t be running, etc.
The same applies to a flood. Think back to the flooding in New Orleans a few years ago. How many of the companies there expected their employees to be available to execute the DR plan? The employees were busy evacuating their families; odds are the company was the last thing on their minds. The same rules apply for a hurricane.
When planning for power outages, how large a power outage are you planning for? Here in the US the power grid is very unstable and it’s pretty easy to crash the entire grid. Hell, it’s happened a couple of times now in the last few years. That again means that airports are going to be having problems, trains will be slow or down, roads will be a mess, users won’t have power to VPN into the office, etc. And again, the staff members will be busy worrying about their family members being secure, fed and warm, not worrying about getting the company failed over to the DR data center.
While you probably don’t need to plan for every possible disaster (the odds of the zombie apocalypse are pretty small), there needs to be a lot of thought put into the plans to account for when the staff will and won’t be available. And the DR tests need to take this into account. Having the primary site staff who normally manage the systems run the DR test isn’t the best test, as those people won’t be available in a real disaster.
The best plan is to have the DR test done by people who don’t work at the primary site, and to have them do it without contacting anyone from the primary site. That’s a proper DR test. And I think you’ll be surprised just how quickly that sort of DR test fails. After all, do the people who don’t work at the primary site even have access to the DR run books without accessing the primary site?
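If you want a quick sanity check before scheduling that kind of test, here’s a rough sketch in Python. Everything in it is made up for illustration: the `Person` and `Runbook` types, the `plan_dr_test` function, the site names and the people. It simply assumes you can list where each person works and where each run book is hosted, then shows who is left to run the test and which run books disappear along with the primary site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    site: str  # office where this person normally works


@dataclass(frozen=True)
class Runbook:
    title: str
    hosted_at: str  # site whose systems must be up to read this run book


def plan_dr_test(staff, runbooks, primary_site):
    """Return (testers, unreachable_runbooks) for a test that assumes the
    primary site and everyone who works there are unavailable."""
    testers = [p for p in staff if p.site != primary_site]
    unreachable = [r for r in runbooks if r.hosted_at == primary_site]
    return testers, unreachable


# Hypothetical example data, not taken from any real plan.
staff = [
    Person("Alice", "Los Angeles"),
    Person("Bob", "Los Angeles"),
    Person("Carol", "Dallas"),
]
runbooks = [
    Runbook("Database failover", "Los Angeles"),
    Runbook("DNS cutover", "Dallas"),
]

testers, unreachable = plan_dr_test(staff, runbooks, primary_site="Los Angeles")
print("People left to run the test:", [p.name for p in testers])
print("Run books lost with the primary site:", [r.title for r in unreachable])
```

If the first list is short or the second list isn’t empty, you’ve found a hole in the plan before the test even starts.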