With Microsoft Azure now supporting Virtual Machines with NVMe storage; things get a little different when it comes to handling recoverability. Recoverability becomes very important because NVMe storage in Azure isn’t durable through reboots. This means that if your shutdown the server, or there is a host problem, or the VM host has to be patched and rebooted than anything on the NVMe drive will be gone when the server comes back up.
This means that to keep data on the VM past a shutdown you need to think about high availability and disaster recovery.
You need to have High Availability built into the solution (with Availability Sets or Availability Zones) which probably means Always On Availability Groups to protect the data. The reason that you need to have Availability Groups is that you need to be able to keep the data in place after a failover of the VM. When the VM comes back up, you’ll see the server is up, but it may not have any data. So what needs to be done at this point? You need to create a job on every node that will automatically look to see if the databases are missing and if they are then remove the databases from the AG, drop the databases, and reseed the databases from the production server.
Because of the risk of losing the data that you are protecting, you probably want at least three servers in your production site so that if one server goes down, you still have redundancy of your system.
You need to have Disaster Recovery built into your solution as well as high availability. Because of the risk of losing data if a set of VMs fails you need to plan for a failure of your production site. The servers that you have in DR may or may not need to have NVMe drives in them; it all depends on why you need NVMe drives. If you need the NVMe for reads then you probably don’t need NVMe in DR; if you need NVMe for writes, then you probably do need NVMe in DR.
While a full failure of your production Azure site is improbable, it is possible, and you need to plan for it correctly.
If you have NVMe in DR, then you’ll want to the same sort of scripts to reseed your databases in the event of a SQL Server restart.
But this is expensive
Yes, it yes.
If the system is important enough to your business that you need the speed of NVMe drives, then you can afford the extra boxes required to run the system probably. Not having HA and DR, then complaining that there was an emergency and the system wasn’t able to survive won’t get a whole lot of sympathy from me. By not having HA and DR you made the decision to have the data go away in the event of a failure. If these solutions are too expensive, then you need to decide that you don’t need this solution and that you should get something else to run the system.
Sorry to be brutal, but that’s the way it is.
Contact the Author | Contact DCAC