I’m writing this post because I’ve been mired in configuring a bunch of distributed availability groups for a client, and while the feature is technically solid, the lack of tooling can make it a challenge to implement. Specifically, I’m implementing these distributed AGs (please don’t use the term DAG, as you’ll piss off Allan Hirt, but more importantly it’s already used for Microsoft Exchange high availability, so it’s taken) in Azure, which adds a couple of extra steps because of the need for load balancers. You should note this feature is Enterprise Edition only, and is only available starting with SQL Server 2016.
First off, why would you implement a distributed availability group? Because you want to add a disaster recovery (DR) strategy on top of the high availability strategy your AG already provides. There’s limited benefit to this architecture if you don’t have at least four nodes in your design. But consider the following design:
In this scenario, there are two data centers with four nodes. All of the servers are in a single Windows Server Failover Cluster, and there are three streams from the transaction log on the primary, SQL1. Two of those streams cross the WAN, which means we are consuming double the network bandwidth to send data to our secondary site in New York. With the distributed availability group, each location gets its own Windows cluster and availability group, and we only send one transaction log stream across the WAN.
This benefits a few scenarios. The most obvious: it’s a really easy way to do a SQL Server upgrade or migration. While Windows clustering now supports rolling OS upgrades, it’s much easier with a distributed AG, because the clusters are independent and have no impact on each other. The second is that it’s very easy to fail back and forth between these distributed availability groups. You have also halved the amount of WAN bandwidth your configuration needs, which can represent a major cost savings in a cloud world, or even on-premises.
If you think this is cool, you’re in good company: this is the technology Microsoft has implemented for geo-replication in Azure SQL Database. The architecture is really robust, and if you think about the tens of thousands of databases in Azure, you can imagine all of the bandwidth saved.
That’s Cool, How Do I Start?
I really should have put this tl;dr at the start of this post. You’ll need this page at docs.microsoft.com. There’s no GUI, which kind of sucks, because you can make typos in your T-SQL and the commands can still validate, only to give you unhelpful error messages later (ask me how I know). But in a short list, here is what you do:
- Create your first WSFC on your first two nodes
- Create an Availability Group on your first WSFC, and create a listener. Add your database(s) to this AG
- If you are in Azure, ensure your ILB has port 5022 (or whatever port you use for your AG endpoint) open
- Create your second WSFC on the remaining two nodes
- Create the second AG and listener, without a database. If you really want to use the AG wizard, add a database to your AG and then remove it. (Or quit being lazy and use T-SQL to create your AG.)
- Create the distributed AG on the first AG/WSFC
- Add the second AG to your distributed Availability Group
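The last two steps boil down to two statements. Here’s a minimal sketch based on the docs page linked above; the distributed AG name, the AG names (AG1/AG2), and the listener DNS names are all placeholders you’ll swap for your own:

```sql
-- On the primary replica of the first AG: create the distributed AG.
-- All names and URLs below are placeholders.
CREATE AVAILABILITY GROUP [DistAG]
   WITH (DISTRIBUTED)
   AVAILABILITY GROUP ON
      'AG1' WITH
      (
         LISTENER_URL = 'tcp://ag1-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      ),
      'AG2' WITH
      (
         LISTENER_URL = 'tcp://ag2-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      );

-- On the primary replica of the second AG: join it to the distributed AG.
ALTER AVAILABILITY GROUP [DistAG]
   JOIN
   AVAILABILITY GROUP ON
      'AG1' WITH
      (
         LISTENER_URL = 'tcp://ag1-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      ),
      'AG2' WITH
      (
         LISTENER_URL = 'tcp://ag2-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      );
```

Note that a typo in a LISTENER_URL will not be caught here; both statements can succeed and the replicas will simply never connect, which is exactly the failure mode described below.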
This seems pretty trivial, and it is when all of your network connections work (you need to be able to hit ports 1433 and 5022 on the listeners’ IP addresses across both clusters). However, SQL Server has extremely limited documentation and management tooling around this feature. The one troubleshooting hint I will provide is to always check the error log of the primary node of the second AG (known as the global forwarder), which is where you will see any errors. The most common error I’ve seen is:
A connection timeout has occurred while attempting to establish a connection to availability replica ‘dist_ag_00’ with id [508AF404-ED2F-0A82-1B8A-EA23BA0EA27B]. Either a networking or firewall issue exists, or the endpoint address provided for the replica is not the database mirroring endpoint of the host server instance
Sadly, that error is a bit of a catch-all. In doing this work, I had a typo in my listener name on the secondary, and SQL Server still processed the command. (So there’s no validation that everything can connect when you create the distributed AG.) I’m sure in Azure this is all done via API calls, which means humans aren’t involved, but since there is no real GUI support for distributed AGs, you have to type code. So type carefully.
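A few queries that help narrow that timeout down, run on the forwarder (the primary of the second AG). This is just a sketch; the search string is the error above, and the distributed AG name in your environment will differ:

```sql
-- Search the current error log (log 0, SQL Server log type 1) for the timeout.
EXEC sp_readerrorlog 0, 1, 'connection timeout';

-- Confirm the database mirroring endpoint is STARTED and listening on the
-- port you used in LISTENER_URL (5022 by default).
SELECT name, port, state_desc
FROM sys.tcp_endpoints
WHERE type_desc = 'DATABASE_MIRRORING';

-- Check the connection state of each replica as this instance sees it.
SELECT ag.name AS ag_name,
       ar.replica_server_name,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_replicas AS ar ON ars.replica_id = ar.replica_id
JOIN sys.availability_groups   AS ag ON ars.group_id   = ag.group_id;
```

If the endpoint is started and the port is open but the replica still shows DISCONNECTED, go back and compare the LISTENER_URL you typed against the actual listener name, character by character.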
Overall, I think distributed availability groups are a nice solution for highly available database servers, but without more tooling there won’t be broader adoption, and in turn, there won’t be more investment from Microsoft in tooling. So it’s a bit of a catch-22. Hopefully this post helps you understand this feature, where it might be used, and how to troubleshoot it.
To truly use this feature as a DR solution, I would have liked to put a Global Traffic Manager (GTM) in front and access the system through it. The GTM would point to the appropriate AG listener. However, this does not work here, because both AG listeners’ IP addresses would appear as healthy and online.
My question: how do others implement such a solution? Do they require modifying the client connections after a distributed AG failover?
I’ve tried two approaches. The first was to create two listeners for each AG. The first listener in each AG is not used by the distributed AG; the application uses it to connect. The second listener in each AG carries the distributed AG’s sync traffic. On failover between regions, I renamed the first listener in the now-secondary region to “lsr1_old” (using WSFC Manager) and renamed the listener in the now-primary region to match the application connection string, i.e., “lsr1”. This worked most of the time, but on occasion it corrupted the cluster, and it was a nightmare to manage at the AD DNS level.
The second method, and the one currently in place, was to use the standard single listener in each AG, then create two A records in DNS, each resolving to one of the listeners. I then created a CNAME record aliased to one of the A records, and the application connects to the CNAME. On failover between regions, it’s just a case of changing which A record the CNAME aliases to, and the application will route to the other side. The downside is that you have to wait for the change to replicate to all DCs in your environment.
Some other options I considered were to update your application connection strings at failover time, or to go with the two-listener option and, at failover time, drop the first listener in the previous primary region and create it in the now-primary region. I didn’t like either solution.
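For completeness, the region failover itself (separate from the client redirection problem above) is a documented three-step sequence in SQL Server 2016. A sketch, assuming the distributed AG is named DistAG and the AGs are AG1 (current global primary) and AG2 (current forwarder); all names are placeholders:

```sql
-- 1. On the global primary: switch the distributed AG to synchronous commit
--    so the forwarder can harden everything before the role change.
ALTER AVAILABILITY GROUP [DistAG]
   MODIFY AVAILABILITY GROUP ON
      'AG1' WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT),
      'AG2' WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);

-- Wait until last_hardened_lsn matches on both sides
-- (sys.dm_hadr_database_replica_states), then:

-- 2. On the global primary: demote it.
ALTER AVAILABILITY GROUP [DistAG] SET (ROLE = SECONDARY);

-- 3. On the forwarder: complete the failover.
ALTER AVAILABILITY GROUP [DistAG] FORCE_FAILOVER_ALLOW_DATA_LOSS;
```

Even with the LSNs matched and no data lost, step 3 uses the forced-failover syntax, and none of this repoints clients, which is why the DNS tricks described above are still needed afterwards.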