Understanding Your Options for Running SQL Server on Azure

Published On: 2019-11-11By:

I was at Microsoft Ignite last week, and I spent the better of part of 3 days working in the “SQL Server” booth. I put SQL Server in quotes, because in 2019 that term is stretched. Our booth covered SQL Server 2019, Arc (which you should read about because it’s really cool, but is the subject of another post), SQL on Edge, SQL Server Big Data Clusters, Azure SQL Database, and Azure SQL Database Managed Instance, and a few other things that I’m too tired to type. And that doesn’t even include Microsoft Spectre (which is the artist formerly known as Azure SQL Data Warehouse)

image_thumb.png

My big takeaway from working the booth, was how most of the customers I spoke to, didn’t understanding the differences between the various platform options. In fact, I think I took the following question over 100 times: “We’re currently on-premises, but moving to the cloud, can you explain the difference between Azure SQL Database, Managed Instance, and SQL Server in a VM?” The fact that we had to answer that so much, isn’t a failing on the part of the customers, as it is Microsoft and us MVPs not communicating the message effectively. So, in order to do my part, I’m going to write a post (I also wrote a Redmond column last month, but this post will go into some more detail). I’m not going to talk about pricing in detail in this post, as it’s effectively the same as the two services.

First Things First—Do you Know What PaaS Means?

Azure SQL Database and Managed Instance are both known as Platform as a Service (PaaS) offerings. This means you the DBA will manage a lot less than you typically would on either an on-premises SQL Server, or even a VM running SQL Server in Azure on the Infrastructure as a Service (IaaS) offering. If you run one of the PaaS offerings, Microsoft is responsible for the following tasks:

  • Backup of your database and transaction logs
  • High Availability within your Azure region, with a financial guarantee (the number of 9s depends on config, but it’s pretty solid)
  • Installation, configuration, and patching of SQL Server software
  • Installation, configuration, and patching of Windows server
  • Consistency checks of your database
  • Performance monitoring of your database (This is configured for you, you still have to react to the numbers)

What tasks does that leave you to do?

  • Configuring disaster recovery across Azure regions—while this is just a few clicks, you get to convince your boss to pay for it, and it’s not turned on by default
  • Performance tuning—getting utilization down by query optimization can save your company money. It’s a quantifiable part of your job now
  • Network architecture—you need to be able to make sure your virtual networks in Azure can all talk to your database services, and in a secure fashion
  • User Management and Security (though there are some security benefits in PaaS)
  • Helping your business derive value from the data you are storing in the database

Ok, Tell Me About Cloud Offerings

I’m going to start out by talking about running SQL Server in an Azure VM. Why would you choose to do this, when the aforementioned PaaS services do all of this magic for you?  The main reason I can think of is vendor support. Your business runs some high quality ISV software that requires you to remain on SQL Server 2012 CU1 (just kidding, you know it’s actually SQL 2005 RTM, and Windows 2003), and you can only run this mission critical conference room projection software on that version of SQL Server.

The other reasons why you might choose to run SQL in a VM besides vendor support (which is common) is that the size of your database doesn’t work well with the current limits in Azure PaaS (which is 8 TB in Managed Instance, and effectively unlimited in Azure SQL Database Hyperscale), or that you are willing to trade off some availability for reduced cost of running in a standalone VM.

What do I mean by costs? While PaaS offerings include high availability, that’s something Microsoft has to include in your pricing (otherwise, their shareholders, including me, would murder them), so compared to a VM that is standalone, the PaaS offering will commonly be more expensive. To get a better cost estimate, you should really multiply your IaaS costs by 3. (Also, VMs have pretty good availability—it can be enough, especially for departmental apps).

Azure SQL Database/Azure SQL Database Hyperscale

This is truly one of the OG Azure offerings, dating back to 2009/10. It is really designed to be a standalone database offering. Think of an app, that needs a single database, and wants minimal effort around spinning that up, and you have the right answer for Azure SQL Database. The cost for getting started is also really cheap at $5/month for a basic DB (it can be way, way more than that, the P15 database is $15k USD/month, and there’s a new M-series offering that will be even more). Effectively, an Azure SQL Database is a single user database running in a SQL Server instance somewhere in the cloud. There’s a feature called elastic pools that confuses this a little bit.

Elastic pools are a group of Azure SQL Databases that share the same resources. This is effectively a SQL Server instance somewhere in the cloud that hosts multiple user databases. The elastic pool feature is mostly designed for applications where each customer gets a copy of the same database, and workload is mixed at different times of the day, so resources are utilized most effectively.

Azure SQL Database has some tools called elastic query, elastic transactions, and elastic jobs, that allow you to perform actions across multiple databases (elastic is a bit of a misnomer here, there’s no direct connection to elastic pools), but in general cross-database queries using classical three part naming database.schema.object do not work in Azure SQL Database, so if you need to joins or complex queries across databases, this probably isn’t your cloud product.

Way back when Azure SQL Database was size limited to 5 GB, then ultimately 4 TB, until the hyperscale architecture was developed. Hyperscale is a completely model for physical database infrastructure design (read more about it here, it’s really cool) and should be able to scale to near petabyte scale. If you’ve got more than that much data in a single database, we really need to talk. The one challenge with SQL DB and database size, is that you can’t restore to it, you have to use a database export and load data using a BACPAC, so it can be very slow for very large databases. My guidance here is that if you have a very large on-premises database that you’d like to move, Managed Instance may be a better choice, as the migration path is easier. If you are building a new application in the cloud, Azure SQL Database is frequently the best choices.

There are a few other limitations of on-premises features that Azure SQL Database doesn’t support. Those are as follows:

  • Common Language Runtime (CLR)
  • Change Data Capture
  • Cross-Database Transactions
  • Database Mail
  • Filestream
  • Linked Servers* (sort of—you can connect to external data sources)
  • Replication Source (you can publish to, but not from an Azure SQL DB)
  • Resource Governor
  • Service Broker
  • SQL Server Agent

Of those features, the biggest one is SQL Server Agent. There are a number of different ways to schedule jobs to run in Azure, the most common being Azure Automation, which allows for PowerShell or Python. You can also have an Agent job on a SQL VM running full-blown SQL Server, though that seems less practical.

I mentioned new application development as a target for Azure SQL Database. Since, SQL DB allows for your database to be a replication target, it can also be a good use case for reporting, particularly for field users who are afar from the home office. You can leverage either geo-replication, or just have multiple replication targets, and allow field users to report against them.

From a network perspective, your SQL Database is on the internet, with a firewall in front of it (either at the server or the database level). You can also elect to only allow traffic from a Virtual Network within Azure, however your database is still on shared networking infrastructure (albeit secured).

Azure SQL Database Managed Instance

This service came much later, being announced at Build in 2017, and going GA last year. Managed Instance was specifically designed to make it easier to move from on-premises SQL Server to a PaaS services. One of the initial differences was that Managed Instances uses a virtual core-based pricing model (Azure SQL Database still offers a legacy DTU based model), so you pay a price per core, depending on your tier.

The two tiers of Managed Instances (and vCore based Azure SQL Database) are general purpose and business critical. The two key differences between these tiers are performance and high availability models—general purpose (GP) uses networked SSDs and will generally have IO latency in the 5-10 ms range. If you are using GP, you need to read this blog post on how to optimally configure your storage, as other it can be really painful.  GP also uses the underlying storage model for its high availability. Business critical (BC) uses the local SSD that’s in each server host for storage, which is lower latency than a network call, and effectively uses an AlwaysOn Availability Group with four replicas for its availability model. You can also query the secondary replica on BC, by using the application intent=read only connection syntax in the SQL client.

In terms of migration, you can restore a backup (from Azure Blob Storage using the RESTORE FROM URL syntax) into Managed Instance. For a more real-time migration strategy, you can use the Data Migration Services tool which will allow for an almost zero-downtime migration. There’s been talk of allowing log shipping, or at least “log shipping” (via an API), but last I checked (and I’ll check again) you had to use DMS.

Features in MI are much closer to on-premises SQL Server. The big gap closers between MI and SQL DB are that the SQL Agent is there, CDC is there, and cross-database queries can be performed. The following list is the features that are included in Managed Instance and not in SQL DB:

  • Copy-only backup
  • CDC
  • CLR
  • Cross-Database Queries/Transactions
  • DBMail
  • Linked Servers (Only to MI and other SQL Servers)
  • Resource Governor
  • Service Broker (within instance)

The gaps to on-premises are as follows:

  • R Services
  • SQL Server Analysis/Integration* (there’s a way to do this with Azure Data Factory, but it costs money)/Reporting Services
  • Filestream

There are a few other differences, but I went for the ones that I see as most commonly affecting customers. One thing to note for both MI and Azure SQL DB is that they rely on Azure Active Directory and not on-premises Active Directory to provide authentication. I hadn’t seen this as a big gap with my customers, however it came up a few times at the booth last week, and Microsoft is researching options in this space.

The other limitation Managed Instance currently faces, is that there is a maximum of 8 TB database total size currently. The Managed Instance team is in the process of implementing hyperscale, and this limit will go away when that happens, which will likely be sometime next year.

So, I’ve almost written 2000 words on this. So therefore it needs a tl;dr

Azure SQL Database—Use It If

  • You are building a new cloud based application
  • You want to build a reporting option for field users
  • You are building a customer facing data platform and you want simplified networking and auth

Azure SQL Database Managed Instance—Use it if

  • You are migrating 3rd party ISV apps to a PaaS model
  • You want a PaaS model, but don’t want to change any of your code
  • Your app depends on SQL Agent, or cross-database queries

SQL Server in a VM—Use It If

  • you are stuck on an old version of SQL Server
  • You have to use on-prem Active Directory for authentication
  • 9% availability is good enough for your application
  • You are one of those people who run SSAS/SSIS/SSRS/R Services/RDBMS/DQS/MDS all on the same server to maximize every penny of your licensing
  • You are the person who uses Filestream

 

 


Contact the Author | Contact DCAC

Colorado DOT Post Mortem–Don’t Do Stupid #$%^

Published On: 2019-10-16By:

A good policy in your technology career, or life in general, is to, to quote a former president, “don’t do stupid shit”. If we look historically at major IT outages or data breaches, there’s always some universal tenants–some piece of critical infrastructure was left completely unsecured, or someone put the private keys on Github. In fact, just last week I wrote my column for Redmond Mag, on some common patterns, mostly specific to SQL Server, but applicable to many other types of applications. I wanted to call it Data Breach Bingo, but I didn’t enough space in my column for 25 vulnerabilities.

Stupid GIF - Find & Share on GIPHY

While we’re here, it’s important that when thinking about security, you think about the basics first. The GRU operatives working for the Russian government likely aren’t targeting your data–it’s some bot network looking for a blank SA password running on a SQL Server that’s got port 1433 open to the internet. If you are on twitter, I highly recommend following Swift on Security a rather serious parody account (of Taylor Swift) that focuses on good security practices. That twitter account also hosts an excellent website called DecentSecurity.com that talks about the basics of how you should be securing your personal and business environments. While you might not be protected from the GRU, you can avoid most of the other “hackxors” in the world that are really just dumb bots.

The reason why I wrote this post, was that my colleague Meagan Longoria (b|t), shared with me a post-mortem from a ransomware attack on the Department of Transportation in the state of Colorado. I would really like to applaud the state for putting out this document, even though it doesn’t paint the organization in the best light. I’m going to use this space to talk about some of the things highlighted in the report.

A virtual server was created on February 18, 2018. The virtual server was directly connected into the Colorado Department of Transportation (CDOT) network, as if it was a local on premise system. The virtual server instance also had an internet address and did not have OIT’s standardized security controls in place.”

Ok, this sounds pretty much like they deployed a VM to either Azure or AWS, and they had a network connection back into their on-premises network. This in and of itself is not problematic, but not applying security policy in the provider is a problem. Both Azure and Amazon have standard security templates and policies that can be configured and applied at deployment.

“The account utilized to establish the connection into the CDOT network was a domain administrator account – this is the highest level privileged account, and means that 1) the account cannot be disabled for too many failed login accounts, and 2) it provides the highest level of access to the agency domain controllers (gatekeepers for all access to everything in the department).”

There’s a lot going on here. I read this as that they joined the server to the Active Directory domain, and used a domain admin account to perform the join. Neither of these are problematic. The next sentence is wrong–domain admin accounts can absolutely be disabled, however this indicates to me, that the department had the “never disable this login” checkbox check. Bad move. Also, in an ideal world, you would never have a domain admin log into member servers, so its password was never in kernel memory, but that’s not how this (or many) hacks work.

“Later, OIT was informed by the vendor that when an external IP address is requested, the vendor automatically opens the Remote Desktop protocol to the internet. The Remote Desktop protocol is how this attack was initiated.”

This is my biggest complaint about both Amazon and Microsoft Azure, especially in the timeframe of this attack, both services defaulted to having a public IP address and didn’t actively discourage public ports from being opened. While it’s a totally bad practice that should be enforced by policy, up until a few months ago on the Azure side, it was very easy to do.

“An attacker discovered this system available on the internet, broke into the Administrator account using approximately 40,000 password guesses until the account was compromised. From there, the attacker was able to access CDOT’s environment as the domain administrator, installing and activating the ransomware attack.”

This is a statement that will sound somewhat impressive to lay people. 40,000 password guesses sounds like a whole lot, however for an automated script that would take seconds. This means they either had approximately 3 character passwords (I’m guessing they didn’t), or the domain administrator was using a common dictionary based password. You can use add-ons to Active Directory to limit the types of passwords users can use, and while this may be painful for normal users, it is absolutely critical for high privilege users like DBAs or System Admins. They should also have separate user accounts for logging in as domain admin, and multi-factor authentication configured.

This is also the part where I would usually say you should have network segmentation configured to block the malware from spreading, but if someone pwns your domain controllers all the segmentation in the world won’t help that much. An attacker could even go as far to install their malware via group policy. Good luck cleaning that one up.

The state of Colorado had a really robust backup solution, and some good network practices (smart firewalls) that were able to limit the spread of the malware. So they were able to recover from this attack relatively quickly. It’s important to worry about big things, but doing all of the little things right and most importantly not doing stupid things will go a long way to securing your environment.

 

 


Contact the Author | Contact DCAC

Azure SQL Database Price and Performance vs Amazon RDS

Published On: 2019-10-15By:

Yesterday, GigaOm published a benchmark of Azure SQL Database as compared to Amazon’s RDS service. It’s an interesting test case that tries to compare the performance of these platform as a service database offerings. One of the many challenges of this kind of a study is that the product offerings are not exactly analogous. However, GigaOm was able to build a test case where they used similar sized offerings as shown in the image below.

Screen Shot 2019-10-15 at 3.17.40 PM

GigaOm used TPC-E which is a blended OLTP workload which involves both read-only and update transactions. In the raw performance numbers, SQL DB was a little bit better in terms of overall throughput, but where Azure really shines in comparison to AWS is pricing.

In terms of raw pricing, there is a significant difference between Azure SQL DB and AWS pricing for these two instances. Azure is $25,000/month less, however it’s not quite an apples to apples comparison, as Azure SQL DB is single database service as opposed to RDS which acts more like Managed Instance, or on-premises SQL Server. If you use use the elastic pool functionality in Azure SQL DB which is the same cost as single database SQL DB, you get a better comparison.

 

Screen Shot 2019-10-15 at 3.54.24 PM

Where the real benefits come in, are that Microsoft allows you to bring on-premises licenses to Azure SQL Database, and the fact the Microsoft offers a much lower cost for 3 year reservations. (Which you can know pay for on a month to month basis). This is a big advantage that Microsoft has over Amazon is that they control the licensing for SQL Server. You can argue that this isn’t fair, but it’s Microsoft’s ball game, so they get to make the rules.

There will always be an inherent feature advantage to being on Azure SQL DB over AWS RDS, as Microsoft owns the code and can roll it out faster. While AWS is a big customer, and will be quick to roll out new updates and versions, they are still subject to Microsoft’s external release cycle whereas Azure is not. In my opinion, if you are building a new PaaS solution based on SQL Server, your best approach is to run on Azure SQL DB.


Contact the Author | Contact DCAC

Cloud Field Day–Solo.io #CFD6

Published On: 2019-10-14By:

As I mentioned, I was in Silicon Valley a couple of weeks ago for an analyst event, and got to meet with a variety of the companies. The final company we met with on Friday, was Solo.io, and I have to say they knocked it out of the park. Their technology was super interesting and their founder Idit Levine, and their CTO, Christian Posta were excellent presenters who were clearly enthusiastic about their product.

SoloIO

So what does Solo.io do? In the modern microservices oriented world, we have distributed systems which are nearly all API driven. Solo.io has a number of products in this space, but their core product Gloo is a modern API gateway that securely bridges modern applications like Lamba or Azure functions to both legacy monolithic applications as well as modern databases running in Kubernetes pods.

They also have another open source project called SuperGloo, which is an abstraction layer for service mesh architecture. A service mesh provides modern applications with monitoring, scaling, and high availability through APIs rather than discrete appliances. Istio from Google is best known tool in this space, and SuperGloo can work with it, and other service meshes in the same architecture.

The other really interesting tool that Solo.io highlighted was called Squash, which is a debugger for distributed systems. If you’ve ever tried to troubleshoot a distributed system, even figuring out where to start can be challenging. By acting as a bridge between Kubernetes (drink) and the IDE, you can choose which pods or containers you are debugging and set breakpoints, or change variables during runtime.

 


Contact the Author | Contact DCAC
1 2 3 29

Video

Globally Recognized Expertise

As Microsoft MVP’s and Partners as well as VMware experts, we are summoned by companies all over the world to fine-tune and problem-solve the most difficult architecture, infrastructure and network challenges.

And sometimes we’re asked to share what we did, at events like Microsoft’s PASS Summit 2015.

Awards & Certifications

Microsoft Partner    Microsoft MVP    Microsoft Certified Master    VMWare Partner    VMWare vExpert
   Best Full-Service Cloud Technology Consulting Company    Insights Sccess Award    Technology Headlines Award    Golden Bridge Gold Award    CIO Review Top 20 Azure Solutions Providers