SQL Server Replication and AlwaysOn Availability Groups

Published On: 2021-04-12

Using SQL Server Replication with AlwaysOn Availability Groups is a pretty straightforward process at this point. Having the publisher be a member of an Availability Group has been supported for years now. You can, however, run into trouble when trying to create a new replication publication on a server that isn’t the server that was the primary when you set up the first publication. When using a different server in the Availability Group, SQL Server Management Studio returns the error “Could not find server {old primary server} in sys.servers. Verify that the correct server name was specified. If necessary, execute the stored procedure sp_addlinkedserver to add the server to sys.servers. Error 7202”.

Let me explain what is happening, and how to work around it.

sql1a – Server that was the active replica when the initial publication was set up

sql1b – Server that is now the active replica

If you try to set up a new publication on sql1b, you’ll get the error message described above. If you script the replication commands out manually, you won’t get this error; it only happens within the GUI.
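For example, a heavily abbreviated sketch of scripting the publication with T-SQL instead of the GUI might look like this (the database name SalesDB and publication name SalesPub are placeholders, and a real script would go on to add articles and a snapshot agent):

# Run against the current primary replica; Invoke-Sqlcmd comes from the SqlServer PowerShell module
$createPublication = @"
USE SalesDB;
EXEC sp_replicationdboption @dbname = N'SalesDB', @optname = N'publish', @value = N'true';
EXEC sp_addpublication @publication = N'SalesPub', @status = N'active';
-- sp_addpublication_snapshot, sp_addarticle, etc. would follow here
"@
Invoke-Sqlcmd -ServerInstance 'sql1b' -Query $createPublication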

The easy solution is to fail the Availability Group back to sql1a and create the publication. But what if that isn’t an option because that server no longer exists? The workaround is to create a linked server on every server in the AG, using the name of the server that was used to set up replication initially, in this case sql1a, and configure that linked server to point to the Availability Group listener name. That way the linked server on any replica always resolves to the active server. If you are setting up replication you are already on the active server, so the linked server just points back to itself.

Setting up a linked server with this name that points back to the listener allows replication publications to be created while sql1b is the active replica within the Availability Group.
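Here’s a minimal sketch of what that linked server looks like in T-SQL (the listener name AG1-Listener is a placeholder for your environment, and the provider may differ depending on which SQL client driver is installed):

# Create a linked server named after the old primary that actually points at the AG listener.
# Repeat this on each replica in the Availability Group.
$createLinkedServer = @"
EXEC master.dbo.sp_addlinkedserver
    @server     = N'sql1a',         -- the name replication is looking for in sys.servers
    @srvproduct = N'',
    @provider   = N'SQLNCLI',       -- or N'MSOLEDBSQL' if the newer OLE DB driver is installed
    @datasrc    = N'AG1-Listener';  -- the AG listener, so the name always resolves to the active replica
"@
Invoke-Sqlcmd -ServerInstance 'sql1b' -Query $createLinkedServer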

It’s a quick and easy fix that’ll get you past the error message that SQL Server Management Studio is throwing.

Denny


Using Azure Site Recovery for VM DR, and a SQL Managed Instance Failover Group

Published On: 2021-04-06

I wanted to name this post “I built NAT between Azure Virtual Machines, this makes me an Azure Networking God”, but that title seemed a little wordy. I was working with a client recently with an interesting Azure setup and some interesting requirements. The database tier for this application is using Azure SQL Managed Instance and will be part of a Failover Group which allows the SQL Managed Instance to be failed over between regions as needed. The web and application tier of the application will be within Azure Virtual Machines which will be protected with Azure Site Recovery (ASR).

For Disaster Recovery tests, and for an actual Disaster Recovery failover, we wanted to make as few networking changes as possible.

The Setup

In general, everything was pretty easy to get set up. We set up ASR, and ASR created an Azure virtual network (vNet) automatically as part of the replication setup process. We then set up our production Managed Instance and connected it to the production vNet. When we went to set up the Disaster Recovery SQL Managed Instance, we ran into the root of our problem. In order to put the two Managed Instances into a Failover Group so that they can move data between them, the Managed Instances need to be able to talk to each other. But the vNets had the same IP space, which meant we couldn’t peer them; one of the requirements for peering vNets is that the networks have different, non-overlapping IP spaces.

To get around this requirement we configured a separate Azure vNet for the Disaster Recovery Managed Instance, which we could then peer to the production vNet. This allowed the production and Disaster Recovery Managed Instances to talk to each other.
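If you’d rather script the peering than click through the portal, a sketch with the Az PowerShell module looks something like this (the vNet and resource group names are placeholders, and a peering has to be created in each direction):

# Look up the two vNets (names and resource groups are assumptions for this example)
$prodVnet = Get-AzVirtualNetwork -Name 'Production' -ResourceGroupName 'Prod-RG'
$drMiVnet = Get-AzVirtualNetwork -Name 'DrManagedInstance' -ResourceGroupName 'DR-RG'

# Peer production to the DR Managed Instance vNet, and back again
Add-AzVirtualNetworkPeering -Name 'Production-to-DrMi' -VirtualNetwork $prodVnet -RemoteVirtualNetworkId $drMiVnet.Id
Add-AzVirtualNetworkPeering -Name 'DrMi-to-Production' -VirtualNetwork $drMiVnet -RemoteVirtualNetworkId $prodVnet.Id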

The Big Problem

The next requirement was that the Disaster Recovery Virtual Machines needed to be able to connect to the Disaster Recovery Managed Instance with as few network changes as possible. We couldn’t just peer the ASR-created vNet to the Disaster Recovery Managed Instance vNet, as that would create an IP conflict; the Disaster Recovery Managed Instance vNet can’t be peered to two vNets with the same IP ranges. The solution was to create an additional vNet containing a machine with the Windows Routing and Remote Access Service (RRAS) installed on it. This gives me a total of four Azure vNets peered together, as shown in the picture below.

To make RRAS happy, it needed two subnets within the NatvNet, and the RRAS machine needed two vNICs, one in each subnet. I set up two subnets in this vNet, 192.168.21.0/26 and 192.168.21.64/26. The first vNIC gets the IP address 192.168.21.4 and the second vNIC gets 192.168.21.68. When RRAS is set up you have to specify an IP range that will be used for the NAT, and this range should be part of the IP range of the second vNIC’s subnet. In this case I used the range 192.168.21.70-192.168.21.75. To make Azure happy (I’ll explain why in a minute) I also needed to add the IP addresses being used for the NAT to the vNIC that sits in the subnet they come from.
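Adding those NAT addresses to the second vNIC can be done in the portal or scripted; here’s a rough sketch with the Az module (the NIC name and resource group are assumptions, and you’d repeat the Add for each address in the range):

# Attach 192.168.21.70 as an additional static IP configuration on the RRAS machine's second vNIC
$nic = Get-AzNetworkInterface -Name 'natvm-nic2' -ResourceGroupName 'DR-RG'
Add-AzNetworkInterfaceIpConfig -Name 'nat-ip-70' -NetworkInterface $nic `
    -SubnetId $nic.IpConfigurations[0].Subnet.Id -PrivateIpAddress '192.168.21.70'
$nic | Set-AzNetworkInterface   # push the change to Azure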

The final change in Azure was to create a route table that sends traffic destined for the 192.168.20.0 network to a next hop of 192.168.21.4. This route table is then applied to all the subnets within the vNet that ASR created (called ProductionAsr in the image above). The network design and VM layout are shown below and should give a good view of the design.
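Scripted with the Az module, that route table looks roughly like this (the resource names, region, /24 mask, and subnet name are assumptions for the sketch):

# Route anything destined for the Managed Instance network through the RRAS/NAT VM
$route = New-AzRouteConfig -Name 'To-ManagedInstance' -AddressPrefix '192.168.20.0/24' `
    -NextHopType 'VirtualAppliance' -NextHopIpAddress '192.168.21.4'
$routeTable = New-AzRouteTable -Name 'ProductionAsr-to-NAT' -ResourceGroupName 'DR-RG' -Location 'westus2' -Route $route

# Associate the route table with a subnet in the ProductionAsr vNet (repeat for each subnet)
$asrVnet = Get-AzVirtualNetwork -Name 'ProductionAsr' -ResourceGroupName 'DR-RG'
Set-AzVirtualNetworkSubnetConfig -VirtualNetwork $asrVnet -Name 'default' `
    -AddressPrefix ($asrVnet.Subnets[0].AddressPrefix) -RouteTable $routeTable
$asrVnet | Set-AzVirtualNetwork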

The IP Requirement

One thing to keep in mind is that Azure networking is a bit stubborn. When IP addresses are in use but aren’t known to the Azure environment, Azure doesn’t know which machine on the network to send the traffic to. I could see this easily when I ran Wireshark on the NAT server (the machine in the NatvNet that was running RRAS).

I was using a machine in my ProductionAsr network with an IP address of 192.168.4.5. When I accessed 192.168.20.4 (the IP address of my Managed Instance) and looked at the network traffic (without the IP addresses 192.168.21.70-192.168.21.75 assigned to the vNIC), I could see the traffic going from 192.168.4.5 to 192.168.21.4, and I could see the traffic going from 192.168.21.68 to the Managed Instance IP address of 192.168.20.4, but the traffic wasn’t coming back from the Managed Instance.

I did some additional testing by putting a Virtual Machine in the Managed Instance network and giving it an IP address of 192.168.20.5. With Wireshark installed on that machine I could see the traffic arriving, and I could see the machine sending replies back to 192.168.21.68, but those replies never arrived. I also noticed that the NAT machine wasn’t actually sending from 192.168.21.68; it was sending traffic to the Managed Instance network from 192.168.21.70. So on a hunch, I added 192.168.21.70 as an IP address which the server in the NatvNet could use. About 15 seconds after the Azure Portal said that it had made the change, I was able to connect to the Managed Instance as I needed to.

Final Configuration

The final setup was a group of Virtual Machines in a vNet, all protected with ASR. For the database tier we’re using an Azure SQL Managed Instance which is set up in a Failover Group for Disaster Recovery.

With this NAT configuration we were able to keep the SQL Managed Instance data movement between the two instances working, while allowing the production servers to access either SQL Managed Instance. The Disaster Recovery Virtual Machines were able to access the SQL Managed Instance when doing a DR test, or in a DR failover. If a DR test requires that we be able to write to the DR Managed Instance, we can do that by simply removing the DR Managed Instance from the failover group and adding a hosts file entry so that the production server name resolves to the DR Managed Instance IP address, as sketched below. This connects all the Disaster Recovery VMs to the Disaster Recovery Managed Instance with just a couple of simple changes.
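A rough sketch of those two DR-test changes with the Az module, run from the DR side (the failover group name, resource group, region, IP, and host name are all placeholders):

# 1. Pull the DR Managed Instance out of the failover group so it becomes writable
Remove-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName 'DR-RG' -Location 'westus2' -Name 'sqlmi-fog'

# 2. On each DR VM, point the production Managed Instance host name at the DR Managed Instance IP
#    (replace both values with your own DR MI IP and production MI FQDN)
Add-Content -Path 'C:\Windows\System32\drivers\etc\hosts' -Value '192.168.20.4  prod-sqlmi.dnszone.database.windows.net'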

A Failover Event

For an actual failover event the process is even easier. The VMs are failed over using ASR, the SQL Managed Instance failover group is failed over, and that’s it. At that point everything is failed over and working.
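For the Managed Instance side of that failover, a sketch of the PowerShell (run against the secondary region; the names are placeholders) would be:

# Fail the instance failover group over to the secondary region's Managed Instance
Switch-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName 'DR-RG' -Location 'westus2' -Name 'sqlmi-fog'
# The VM side is handled by running the ASR failover from the Recovery Services vault.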

If you’d like to talk more about setting up your cloud environment, we’d love to set up a meeting with you.

Denny


Azure Site to Site VPN to a Cisco ASA v8.6+

Published On: 2021-03-22

Recently I was moving a client from their existing CoLo to Azure. Right now they are going with IaaS for their web servers and PaaS (Azure SQL Managed Instance) for their databases. This means that we needed to set up a Site-to-Site VPN between their existing on-prem environment at the CoLo and their Azure vNet so that they could transfer everything over, replicate AD, etc. Their on-prem router was a Cisco ASA running version 8.6 of the ASA software.

Our plan was to enable BGP on the connection, as we had several vNets that we wanted connected to the VPN and we only wanted one VPN gateway. However, with a Cisco ASA we found that BGP wasn’t an option, because Azure has a new option available within the configuration of the connection called “Use policy based traffic selectors”, which requires that BGP be disabled.

Site-To-Site VPN Connection Properties in the Azure Portal

The problem we were facing with our VPN connection was that even though all of the IKE Phase 1 and IKE Phase 2 settings were correct (we verified them against the Cisco ASA settings probably 10 times), we were still getting a protocol mismatch error when the VPN connection would attempt to come online.

We eventually tried turning on the “Use policy based traffic selectors” option on the connection (which required us to disable BGP) and the connection came up on the first try. For this, we were using a policy-based VPN endpoint, and these settings are pretty new (it’s March 2021); I don’t remember seeing them on previous VPN connections.

As you can see from the screenshot above, we specified the IKE Phase 1 and IKE Phase 2 settings explicitly. This doesn’t have to be done, as the connection will work with the settings Microsoft offers by default, but we wanted to take the guesswork out of the migration process.
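For reference, a sketch of enabling those options with the Az PowerShell module looks something like this (the connection name, resource group, and the specific IKE/IPsec values are assumptions; use whatever you’ve matched to the ASA):

# Build a custom IPsec/IKE policy matching what's configured on the ASA
$ipsecPolicy = New-AzIpsecPolicy -IkeEncryption AES256 -IkeIntegrity SHA256 -DhGroup DHGroup2 `
    -IpsecEncryption AES256 -IpsecIntegrity SHA256 -PfsGroup None `
    -SALifeTimeSeconds 27000 -SADataSizeKilobytes 102400000

# Apply the policy and turn on policy based traffic selectors for the existing connection
$connection = Get-AzVirtualNetworkGatewayConnection -Name 'Azure-to-ASA' -ResourceGroupName 'Network-RG'
Set-AzVirtualNetworkGatewayConnection -VirtualNetworkGatewayConnection $connection `
    -IpsecPolicies $ipsecPolicy -UsePolicyBasedTrafficSelectors $true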

The end result was success: the tunnel came up and we could set up things like AD replication so that we could continue with the migration process.

Denny


TDE, Azure SQL MI, and Failover Groups

Published On: 2021-03-09

Transparent Data Encryption (TDE) with a customer-managed key, Azure SQL Managed Instance (MI), and Failover Groups should all be easy to set up. Azure SQL MI with either of the other two is pretty straightforward. Getting all three working correctly together takes a little doing, because you as the admin need to manually move the key that TDE is using to encrypt the databases to the secondary region, and you need to tell the Azure SQL MI in the secondary region where to find the key.

Having a customer-managed key for TDE requires some special setup to ensure that the key is available to both instances. This is done by placing the key within Azure Key Vault. In order to set up a failover group you need not just the Azure Key Vault in the primary site, but also one in the secondary site that will be hosting your Azure SQL MI.

Once you have an Azure Key Vault in both sites, you can use the GUI to export the key from the primary site and restore it to the Azure Key Vault in the secondary site. After that is done you’ll need to give the Azure SQL MI in the secondary site access to the key, which is done through the Access Policies section of Azure Key Vault. The managed instance needs three key permissions: Get, Unwrap Key, and Wrap Key. No other rights need to be granted to the MI.

To grant rights to the Managed Instance click the Add Access Policy link.

Then select the three rights needed in the Key Permissions drop down.

Then click on “None Selected” next to “Select principal”. This will bring up a list of all users and applications which are registered with Azure Active Directory. Find the Managed Instance for the secondary environment and click “Select” at the bottom of the screen, then click “Add” on the prior screen (where you selected the permissions).
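If you’d rather script these steps than click through the portal, a sketch with the Az module looks something like this (the vault names, resource group, and key name are placeholders; note that Key Vault backups can only be restored within the same subscription and geography):

# Copy the TDE key from the primary vault to the secondary vault
Backup-AzKeyVaultKey -VaultName 'kv-primary' -Name 'tde-key' -OutputFile 'C:\temp\tde-key.blob'
Restore-AzKeyVaultKey -VaultName 'kv-secondary' -InputFile 'C:\temp\tde-key.blob'

# Grant the secondary Managed Instance's identity the three key permissions it needs
$mi = Get-AzSqlInstance -Name 'sqlmi-secondary' -ResourceGroupName 'DR-RG'
Set-AzKeyVaultAccessPolicy -VaultName 'kv-secondary' -ObjectId $mi.Identity.PrincipalId `
    -PermissionsToKeys get,wrapKey,unwrapKey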

Once this is done, go into the settings for the Managed Instance in the secondary site and select Transparent Data Encryption from the menu. On this screen change the setting from “Service-managed Key” to “Customer-managed Key”. Then either select a key, or enter the key identifier of the key from the key vault which will be used for new databases (if all your databases are protected with one key, just select that key).
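The scripted equivalent of pointing the secondary instance at the customer-managed key is roughly this (the instance name, resource group, and key URI are placeholders):

# Register the Key Vault key with the secondary Managed Instance, then make it the TDE protector
$keyId = 'https://kv-secondary.vault.azure.net/keys/tde-key/0123456789abcdef0123456789abcdef'
Add-AzSqlInstanceKeyVaultKey -ResourceGroupName 'DR-RG' -InstanceName 'sqlmi-secondary' -KeyId $keyId
Set-AzSqlInstanceTransparentDataEncryptionProtector -ResourceGroupName 'DR-RG' -InstanceName 'sqlmi-secondary' `
    -Type AzureKeyVault -KeyId $keyId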

Save these settings changes and edit the properties of your Azure SQL MI in the primary site. On this server select Failover Groups and click the “Add Group” button at the top of the blade that opens. Select all the correct information for the secondary instance and the group name and save the settings.
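Creating the failover group can also be scripted; a sketch (the group name, regions, instance names, and resource groups are placeholders) looks like:

# Create the instance failover group between the primary and secondary Managed Instances
New-AzSqlDatabaseInstanceFailoverGroup -Name 'sqlmi-fog' `
    -ResourceGroupName 'Prod-RG' -Location 'eastus2' -PrimaryManagedInstanceName 'sqlmi-primary' `
    -PartnerResourceGroupName 'DR-RG' -PartnerRegion 'westus2' -PartnerManagedInstanceName 'sqlmi-secondary' `
    -FailoverPolicy Manual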

It’ll take a while, but when it is done the map should look something like this (depending on what regions you selected of course). Once you’ve gotten to this point you’ve set up your Azure SQL Managed Instance using TDE with a Customer-managed Key and a Failover group.

You should have a beer at this point, you’ve earned it.

Denny
