Yes, that's a clickbait post title, sorry, but you're here now. The correct answer is that you should purchase MySQL as a database service from your favorite cloud provider (Google, Amazon, and Azure all offer prebuilt database-as-a-service offerings), because they have gone through the trouble of making their solution highly available. I'll speak to Azure, because I'm familiar with the platform: Microsoft doesn't employ MySQL high availability per se; however, both the storage and the VM are highly available. If there is a disk failure, the service is not impacted, and if there is a failure in the compute tier, a new VM is provisioned.
My second recommendation, if you really, really want to build your own thing, is to build a Windows Server Failover Cluster and use shared storage. Make the MySQL service a clustered resource, and assign a floating IP to the service that will fail over with it. (Yes, I know you have to pay Microsoft for the Windows Server licenses.)
Why shouldn’t you use an open source solution to make your MySQL database highly available? First let’s look at a picture of a common MySQL high availability architecture:
If we think about what we need a clustering solution to provide it comes down to a few things:
Providing a floating IP address to allow connection to the primary
Checking the health of the database services and initiating a failover in the event one of them isn't healthy
Executing a clean database failover and providing the ability to easily fail back
Ensuring the stability of the overall cluster, maintaining quorum, and avoiding split brain scenarios
If you are using a shared storage scenario, the clustering solution needs to manage the shared storage to coordinate failover with services.
If you are using SQL Server with Windows Server Failover Clustering, the cluster service takes care of all of the above, and more. When you look to do this on Linux for MySQL, there are about 10 different sets of components you can use to make the service highly available. At the base of all of these solutions is MySQL replication, which is fairly trivial transactional replication. MySQL's replication service is fairly robust, and the GTID implementation is pretty solid.
The problem is that the rest of the components are all mix and match. You could use HAProxy to float the IP address, but out of the box there's no way to do a smart health check of the database; it simply does a port connection test. That means if your primary goes away and then comes back, without some advanced configuration your floating IP is going to fail back to the original primary whether or not it's actually still the primary in your replication pair. This is but one example. You are going to end up with three or four different components to execute each of these functions, and congratulations: you are now in charge of a complex distributed system that you are responsible for administering for the rest of your life.
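To make that concrete, here is a hedged sketch of an HAProxy frontend for this topology; the listener name, check user, and IP addresses are all hypothetical. Even HAProxy's built-in `option mysql-check` only verifies that a node completes the MySQL handshake; it has no idea which node is currently the replication primary, which is exactly the gap you end up plugging with yet more tooling.

```
listen mysql-primary
    bind 10.0.0.100:3306
    mode tcp
    # mysql-check confirms the server answers the MySQL handshake for this
    # user -- it does NOT confirm the server is the replication primary.
    option mysql-check user haproxy_check
    server db1 10.0.0.11:3306 check
    server db2 10.0.0.12:3306 check backup
```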
But Joey, Facebook/Google/pick your other favorite online megacorp runs MySQL and supports it with five 9s. OK, sure, I don't disagree with this, and as databases, MySQL and PostgreSQL are generally fine. But look around at your engineering staff. Wait, do you have engineering staff? If you don't have a few people who have both really good Linux sysadmin skills and DBA skills, you are going to find yourself pretty quickly in a situation where support is a challenge.
Finally, consider whether you even need an HA solution. Are you running on a virtual machine? As long as your infrastructure is solid, that probably gets you to about 99.5% availability on a bad week. What you absolutely want to avoid is the Windows 2000 paradigm, where your high availability solution incurs more downtime than a standalone system would.
My day-to-day work machine is a Windows 10 HP laptop. It remains docked at my desk, where it happily helps me go about my daily business. However, my backup machine is an Apple MacBook Pro. Its small footprint and ease of use allow me to easily shift working locations if needed. Of course, these days I'm maintaining social distancing, so getting away from my desk to work elsewhere means working from my kitchen table for a change of scenery.
When I shift working locations I work on my MacBook, and I still need access to a SQL Server instance. Yes, I could install SQL Server directly on the OS, but it's my personal machine and I'm not that familiar with SQL Server on Linux (yet). Docker, however, is a great solution that allows me to run SQL Server containers on my MacBook without having to set up much of anything. Let's look at how easy it is to set up Docker.
Go out to docker.com, then download and install the Docker Desktop application. This is what allows you to run container images. An image defines the container, which in turn runs the application; in our case, that's SQL Server. Once you have it installed, you'll have a dashboard like the one shown below.
At this point, we don't have any containers running, so we first need to get a container image and then create a container from it. I got my image from Docker Hub, a library of container images that you can browse and download as needed. Searching for SQL Server images, we can see that Microsoft has its own registry with several available container images. The featured tags shown below are some often-used images. If you scroll further down the page, you can browse the full list of available tags that can be used to download an image.
For my purposes, using the latest and greatest for testing and validating things is perfect so I’m only interested in the “2019-latest” container image. To obtain the image, you can issue a pull command from within a terminal session. You’ll notice that Docker Hub provides you with the command in the above image.
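If you'd rather skip the screenshot, the pull works entirely from the terminal. A hedged sketch follows: the image tag is the one used throughout this post, and the `if` guard simply keeps the script from erroring on a machine where the docker CLI isn't installed yet.

```shell
# The 2019-latest SQL Server image from Microsoft's container registry.
IMAGE="mcr.microsoft.com/mssql/server:2019-latest"

# Pull the image if the docker CLI is available on this machine.
if command -v docker >/dev/null 2>&1; then
  docker pull "$IMAGE"
  # List matching local images to confirm the download.
  docker images "${IMAGE%%:*}"
fi
```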
We can check to make sure that the container image was downloaded as well as view any other images we might have. As shown below, you can see that I have three images available.
Starting the SQL Server Container
Now that we have a container image downloaded, we can set up the container itself. SQL Server is already installed within the container image; it just needs to be configured so it can start successfully. Essentially, we need to tell Docker to run the image and pass in some parameters.
Here’s what they mean:
-e sets environment variables. Here we are accepting the end user license agreement (EULA) and supplying an SA password (both are required).
-p represents the ports. The value on the left side of the colon is the port on the host and the value on the right side is the port within the container. In this example, both sides will be listening on port 1433, which is the default port for SQL.
-d runs the container in detached mode, meaning it runs in the background rather than tying up your terminal. The final argument (with no flag) is the image to run; in this example, that's the 2019-latest image pulled from Microsoft's registry.
docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=yourStrong(!)Password' -p 1433:1433 -d mcr.microsoft.com/mssql/server:2019-latest
Running this command, if all is correct, will return a long GUID-looking value, which is the new container's ID.
We can also see that the Docker dashboard now has a running instance.
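The same check works from the command line: `docker ps` lists running containers. A hedged sketch, again guarded so it is a no-op on a machine without Docker:

```shell
IMAGE="mcr.microsoft.com/mssql/server:2019-latest"

# Show running containers started from our SQL Server image; the output
# includes the container ID, image name, and the 1433 port mapping.
if command -v docker >/dev/null 2>&1; then
  docker ps --filter "ancestor=$IMAGE"
fi
```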
Using Azure Data Studio, I can now connect to the localhost on port 1433. We can see that the version is 2019 CU4 which is the latest release for SQL Server 2019.
It’s that easy!
If you are stuck on a command, simply run docker + command + "--help" to get a list of available parameters. For example, if I need to see what parameters are available for the run command, I would type in "docker run --help". The output displays all of the available parameters that you can supply with the "run" command.
Using Docker allows me to quickly and easily stand up an instance of SQL Server. Oftentimes I just need an instance to test a feature or figure out how something works. All in all, this process takes about 5-10 minutes to finish and is significantly faster than standing up a VM. If you are new to Docker, try it out. It works on Windows too!
Watch for future blog posts on how to use Docker with SQL Server! Subscribe for email notifications when new content is posted!
One of the new phrases coming out of Microsoft is that "SQL is just SQL," regardless of what operating system it resides on. This was echoed during the keynote at SQL Bits 2019 by the Microsoft team, which you can watch here. Later that weekend, I gave a session about database internals. My presentation covers how data is structured within a row and why that matters. Understanding the internals of table structures is important even in today's age of technology, including in SQL Server 2019 (which will be released in Q3/Q4 of 2019). During my session, a question came up about how a data page is structured if SQL Server is sitting on top of a Linux server, such as Ubuntu. Does the data page have the same size and shape on Linux as it does on Windows?
Honestly, I wasn’t quite sure. Thankfully, my colleague, Joey D’Antoni (B|T) was present in the room and he was able to confirm that it’s just SQL and a data page has the same size and shape in Linux as it does in Windows. Once the session was over, I wanted to set out in verifying that for my own learning!
SQL Server on Linux
First things first, I needed a SQL Server instance sitting on top of the Linux ecosystem. Naturally, I turned to Azure and quickly stood up a brand-new SQL Server 2019 CTP 2.2 instance on Ubuntu. You can see below that SQL Server is up and running, ready to service requests.
Now that I have the instance up and running, I can start to do some testing! First I need to create a database and then a table. Once that is complete, I will put a row into the table and then check the pages. All of this can be done via SSMS or Azure Data Studio. After all, SQL is just SQL so I can use the tools I’m familiar with to accomplish the work. Anyway, let’s do this!
Let’s create the database and table then insert a single record with a first name of “John” and an ID of “1”.
IF DB_ID('PagesOnLinux') IS NULL
    CREATE DATABASE PagesOnLinux;
GO
USE PagesOnLinux;
GO
CREATE TABLE dbo.table1 (id INT, fname VARCHAR(20));
INSERT dbo.table1 (id, fname) SELECT 1, 'John';
Once we have a row inserted, we can use DBCC IND and DBCC PAGE to locate and examine the necessary pages. You can also use the DMV sys.dm_db_database_page_allocations to locate the data page that has the row on it. Since there is only a single row (and it's much smaller than 8,060 bytes), there is only a single data page.
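As a hedged sketch (the database and table names match the script above; note that sys.dm_db_database_page_allocations is an undocumented function, so its columns could change between versions), locating the data page might look like this:

```sql
-- Find the allocated data page(s) for dbo.table1.
SELECT allocated_page_file_id, allocated_page_page_id, page_type_desc
FROM sys.dm_db_database_page_allocations(
        DB_ID('PagesOnLinux'), OBJECT_ID('dbo.table1'), NULL, NULL, 'DETAILED')
WHERE is_allocated = 1
  AND page_type_desc = 'DATA_PAGE';
```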
We can see that page 336 is the one we want to examine. Even on Linux, you still need to enable trace flag 3604. This trace flag pipes the output of the DBCC PAGE command to the results pane of SSMS instead of the SQL Server error log.
DBCC TRACEON (3604)
DBCC PAGE ('PagesOnLinux', 1, 336, 3)
DBCC TRACEOFF (3604)
The page header is where we will find the information we need. On the page header, there are two elements I want to point out:
m_freeCnt – We know from Paul Randal’s blog that this indicates how many bytes are free on the page. In this case, you can see I have 8,075 bytes free. That very quickly suggests the page size is most likely 8,192 bytes, or 8KB, as we suspected. However, let’s verify it just to make sure.
Slot 0 & Length – SQL Server is zero-based, so slot zero indicates the first row on the page. Since I only inserted a single row, this represents that inserted row. The length (in bytes) reflects how many bytes that row (or slot) consumes; in this case, we can see that the row is 19 bytes. That number checks out against the row format, too: 4 bytes of status and fixed-length-offset information, 4 bytes for the INT, 2 bytes for the column count, 1 byte for the NULL bitmap, 2 bytes for the variable-length column count, 2 bytes for the variable-length offset array, and 4 bytes for 'John' add up to 19.
Let’s do the math…..
8075 bytes free on the page. 19 bytes for the row.
8075 + 19 = 8094
8,094 bytes in total. We also know, again from Paul's blog, that a page is 8,192 bytes in size and that the page header is 96 bytes. That leaves 8,192 - 96 = 8,096 bytes, so we are two bytes short (8,096 - 8,094 = 2); however, we have not yet taken the slot array into account. The slot array uses 2 bytes for every row, so now we have 8,075 + 19 + 2 + 96 = 8,192 bytes.
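A quick sanity check of that arithmetic (plain shell arithmetic; the byte counts are the ones observed above):

```shell
# Page-size arithmetic: free space + row + slot array entry + page header.
free_bytes=8075    # m_freeCnt from DBCC PAGE
row_bytes=19       # length of slot 0
slot_array=2       # 2 bytes per row in the slot array
header_bytes=96    # fixed page header size
total=$((free_bytes + row_bytes + slot_array + header_bytes))
echo "$total"      # 8192
```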
This validates that a data page in SQL Server on Linux is indeed 8Kb!
In this aspect, it's true what they say: SQL is just SQL, regardless of the platform it might be running on. We've proven from our experiment here that a data page, whether it's on Windows or Linux, is still 8KB in size. How your tables are constructed can really make a difference in how your application behaves. Make sure to do your due diligence when designing tables, and others will thank you for it!
Now that SQL Server 2017 has gone GA and SQL Server on Linux is a reality, you may wonder how it affects your licensing bill. Well, there's good news: SQL Server on Linux has exactly the same licensing model as on Windows. And Docker, if you are using it for non-development purposes (pro tip: don't use Docker for production, yet), is licensed just like SQL Server in a virtual environment; you can either license all of the cores on the host, or simply the cores that your container is using.
But What About the OS?
So let’s compare server pricing:
Windows Server 2016 Standard Edition – $882
Red Hat Enterprise Linux with Standard Support – $799
SUSE Linux Enterprise with Standard Support – $799
Ubuntu – free as in beer
As you can see, the commercially supported options cost roughly the same whether you are on Windows or Linux. I'm not going to get into the religious war over which distro is best or whether you need paid support; I'm simply putting the numbers out there.
HA and DR
There’s still more to come on the HA and DR story, so stay tuned. But assume the standard free licensing (with Software Assurance) for an idle secondary.