Building Storage for SQL Server (and other database) Virtual Machines in the Cloud

Published On: 2019-09-18

I wrote a couple of weeks ago about what not to do with backups in Azure. Because I’ve seen a few improperly configured VMs lately, I wanted to talk about the way storage works in Azure compared to the way we traditionally did things on-premises.

Old School

If you still buy your storage from a three-letter company, and your sales rep drives an expensive German car and has better taste in shoes than Imelda Marcos, you might still configure your storage this way: a separate disk volume each for TempDB, the transaction log files, and the data files. Ideally, you are backing up to a separate storage appliance, and not to the same storage array where your data files live.

This architecture design dates back to when a storage LUN was literally built from a handful of physical disks, and we wanted to ensure that there were enough I/O operations per second to service the needs of SQL Server, because we only had the available I/O of those few disks.

As virtualization became popular, storage architectures changed, and a SAN LUN was carved into many small extents (typically 512 KB to 1 MB, depending on the vendor) spread across the entire array. What this meant was that with modern storage there was no longer a need to separate log and data files; some DBAs still did, but in an on-premises world there was no penalty for it.

Note: There is a scenario where you would still want multiple disk devices in Windows. Under very high I/O workloads, I/Os can queue at the Windows disk device level. In my experience this is an uncommon performance scenario.


Enter the Cloud

Instead of physical disks, in the cloud your “disk devices” are virtual hard drive files, which are stored across three different physical disks on the underlying infrastructure. All storage performance is controlled by quality-of-service settings on the Azure infrastructure: each disk you add increases both your IOPS and your storage capacity. Also, each virtual machine has a fixed limit on the number of IOPS available to it (while this is possible on-premises, it’s far less common).

We then translate this to the operating system level, in this specific case Windows Server. In order to get the maximum volume size and performance out of our disks, we use Storage Spaces in Windows to create pools of storage. The exciting part here is that you get to use RAID 0, since Azure’s (or Amazon’s) infrastructure is providing your redundancy. This means if we have twenty 1 TB disks with 5,000 IOPS each, we can have a 20 TB pool that theoretically supports 100,000 IOPS. (Most VMs in Azure don’t support that level of I/O performance, but a couple do.)

It’s also important to know that you need to specify the number-of-columns parameter when building your Storage Spaces pools in Windows. If you have more than four disks you need to use PowerShell for that; I’ll write more about that next week. But here’s some info from the product teams.

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sql/virtual-machines-windows-sql-performance

This post has good info on columns, but it’s from 2014 and the rest of its storage information is very dated (premium storage didn’t exist yet). I’m only including it because it’s the best explanation of columns that I’ve seen.

Best Practices & Disaster Recovery for Storage Spaces and Pools in Azure

What this means is that in order to maximize your database server’s I/O performance, you should create one large pool with all of the disks. Throw your system databases and your data and log files all on that volume. And please don’t write your backups to that disk (BACKUP TO URL was invented for this purpose).
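If it helps to see that end to end, here’s a rough sketch of what building a simple (striped) pool with an explicit column count can look like in PowerShell. The pool, virtual disk, and volume names, the F: drive letter, and the 64 KB interleave and allocation unit are just illustrative choices, and it assumes the data disks you attached are the only poolable disks on the VM.

# gather every disk that is eligible to join a pool (assumed here to be the SQL data disks)
$disks = Get-PhysicalDisk -CanPool $true

# create one pool across all of them
New-StoragePool -FriendlyName "SQLPool" -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName -PhysicalDisks $disks

# Simple = striping (RAID 0, since Azure is providing the redundancy); one column per disk
New-VirtualDisk -StoragePoolFriendlyName "SQLPool" -FriendlyName "SQLDisk" -ResiliencySettingName Simple -NumberOfColumns $disks.Count -Interleave 65536 -UseMaximumSize

# initialize, partition, and format with a 64 KB allocation unit
Get-VirtualDisk -FriendlyName "SQLDisk" | Get-Disk | Initialize-Disk -PartitionStyle GPT -PassThru | New-Partition -DriveLetter F -UseMaximumSize | Format-Volume -FileSystem NTFS -AllocationUnitSize 65536 -NewFileSystemLabel "SQLData"

The -NumberOfColumns value is the piece you can’t reliably control from the GUI once you’re past a few disks, which is why the PowerShell route comes up at all.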

You can also throw TempDB on the local D: drive, which is ephemeral (it goes away when your machine reboots, but so does TempDB), and can offer slightly lower latency.
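If you do put TempDB there, here’s a hedged sketch of the move (assuming a default instance, the SqlServer PowerShell module, and the default tempdev/templog logical file names; repeat the MODIFY FILE for any additional TempDB data files you have). The change takes effect at the next service restart, and the D:\TempDB folder has to exist before SQL Server starts, so pair this with a startup task or let the SQL Server IaaS Agent extension manage it.

# folder must exist before SQL Server starts (recreate it after reboots via a startup task)
New-Item -Path "D:\TempDB" -ItemType Directory -Force | Out-Null

# point the TempDB files at the ephemeral drive; takes effect at the next service restart
$tsql = @"
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'D:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'D:\TempDB\templog.ldf');
"@

Import-Module SqlServer
Invoke-Sqlcmd -ServerInstance "localhost" -Database "master" -Query $tsql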

Note: if you’re reading this and you are using Ultra Disk, I haven’t been able to test any of this with it yet. I suspect you may not need to stripe disks to achieve good performance.

 



SQL Server Statistics Health Reminder


I’ve written about statistics in SQL Server a few times now. Through conversations I am reminded that not everyone keeps their table statistics up to date, and keeping statistics current is one of the more critical components of SQL Server performance. Please be sure you are doing more than just index maintenance to maintain your statistics: index maintenance only maintains the statistics associated with your indexes. There are also column statistics, created from the predicates used in your queries or created manually, that need to be maintained. Every insert, update, or delete changes the distribution of your data and can skew the optimizer’s estimates, so keeping the Actual Rows and Estimated Rows in your execution plans closely aligned allows SQL Server to generate optimal execution plans.

Overestimating rows (Estimated > Actual) leads to:

  • Selection of a parallel plan when a serial plan might be more optimal
  • Inappropriate join strategy selections
  • Inefficient index navigation (scans versus seeks)
  • Inflated memory grants

Underestimating rows (Estimated < Actual) leads to:

  • Spills to disk, because not enough memory was requested
  • Selection of a serial plan when parallelism would be more optimal
  • Inappropriate join strategies
  • Inefficient index selection and navigation strategies

Maintain your statistics by doing the below at least weekly. (Note: some systems may require far more frequent updates; I’ve had to update stats every 10 minutes on a particularly troublesome table.)

Set AUTO_UPDATE_STATISTICS to TRUE for each database; however, this option will only update statistics created for indexes or for single columns used in query predicates. Optionally, also set AUTO_UPDATE_STATISTICS_ASYNC to TRUE for performance gains; you can read more on that in my prior blogs.
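If you want a quick look at where every user database currently stands on those two options, here’s a small sketch using Invoke-Sqlcmd from the SqlServer module (“YourServer” is just a placeholder for your instance name):

# report the auto-update statistics settings for every user database
Import-Module SqlServer

$query = "SELECT name, is_auto_update_stats_on, is_auto_update_stats_async_on
          FROM sys.databases
          WHERE database_id > 4;"

Invoke-Sqlcmd -ServerInstance "YourServer" -Query $query | Format-Table -AutoSize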

If you have larger tables in your environment and are not using SQL Server 2016 or higher, be sure to examine trace flag 2371, which is available to assist in keeping stats up to date on large tables. The default algorithm that triggers an automatic statistics update is 20% of the table’s rows + 500 rows of modifications; trace flag 2371 changes this to a sliding scale. Using the trace flag will drastically increase the frequency at which statistics updates occur on your larger tables, which in turn gives the optimizer much better estimates to work with.
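To put rough numbers on that sliding scale (the dynamic threshold is documented as roughly the square root of 1,000 times the table’s row count), here’s a quick back-of-the-napkin comparison; the 100 million row table is just an example:

# compare the default auto-update threshold with the TF 2371 / SQL Server 2016+ sliding scale
$rows = 100000000   # example: a 100 million row table

$defaultThreshold = [math]::Ceiling($rows * 0.20) + 500          # 20% + 500 rows
$dynamicThreshold = [math]::Ceiling([math]::Sqrt(1000 * $rows))  # roughly SQRT(1000 * rows)

"Default threshold: {0:N0} modified rows" -f $defaultThreshold   # 20,000,500
"Dynamic threshold: {0:N0} modified rows" -f $dynamicThreshold   # 316,228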

Run EXEC sp_updatestats to update all statistics routinely; I suggest creating a SQL Agent job for this.
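If you want a starting point for that job, here’s a hedged sketch of a PowerShell job step that runs sp_updatestats against every online user database (again, “YourServer” is a placeholder and the SqlServer module is assumed):

# run sp_updatestats in each online user database; suitable for a SQL Agent PowerShell job step
Import-Module SqlServer

$databases = Invoke-Sqlcmd -ServerInstance "YourServer" -Query "SELECT name FROM sys.databases WHERE database_id > 4 AND state_desc = 'ONLINE';"

foreach ($db in $databases) {
    Invoke-Sqlcmd -ServerInstance "YourServer" -Database $db.name -Query "EXEC sp_updatestats;" -QueryTimeout 65535
}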

OR

If you are using Ola’s index maintenance scripts, consider adding the parameters below.

@UpdateStatistics = 'ALL'

@OnlyModifiedStatistics = 'Y' (this can leave statistics stale if the columns have not been modified in a while; I am not a huge fan of this option)

Example:

sqlcmd -E -S $(ESCAPE_SQUOTE(SRVR)) -d master -Q "EXECUTE [dbo].[IndexOptimize] @Databases = 'USER_DATABASES', @UpdateStatistics = 'ALL', @OnlyModifiedStatistics = 'Y'"

 

More Information

Here are the links to my prior statistics blogs. I recommend reading them in the below order to help you better understand statistics and their role in SQL Server performance. Make sure you do your part as a DBA and remember to keep your statistics up to date as much as possible to help the optimizer better do its job.

Importance of Statistics

Synchronous VS Asynchronous Statistics Updates

Keeping Large Table Statistics Current -TF2371

 



Heading to the PASS Summit 2019? There’s still time to sign up for the SQL Karaoke Party!

Published On: 2019-09-17

Are you planning on heading to the PASS Summit in November 2019 (or will you just be in Seattle that week because you live there, work there, etc.)? There’s still time to sign up for the SQL Karaoke party on Tuesday, November 5th, 2019 at 9:30pm. It’s going to be a great night of hanging out with friends, watching some of them sing some great songs (they’ll have a great backup band), and having an altogether great time.

Go to the Event website, and register. Show up and have a great time.

Denny


Using Powershell for SQL Saturday Logos

Published On: 2019-09-13

Out of necessity are born the tools of laziness. This is a good thing. I have found that organizing and running a SQL Saturday event is a great way to create scripts and processes that make an organizer’s life that much easier. The 2019 Louisville SQL Saturday helped me to create a quick script that would download all of my sponsor logos into a single folder.

The script is written in PowerShell simply because PowerShell has great versatility and it suited my needs. The goal was to be able to download the logos and send them off to one of my volunteers, who was going to put them on signage and get them printed. At the time I had no easy way to do this without manually pulling each one off the site.

Let’s get to it!

The Script

First, I need to set some variables to make the script easier to use. I could turn these into parameters for reusability; however, since I only run this once a year, plain variables are acceptable for my needs.

The event number is the number that corresponds to the particular SQL Saturday event you want to download the logos for. Note that this would work for any event, not just the one you might be organizing.

$eventNum = "883"
$outputfolder = "c:\temp\SponsorLogos"
$compress = $True

Next, I need to fetch the XML feed from the event.  The XML feed has a wealth of information in it, including the image URL for all of the sponsors.

I will also get the sponsor name, what level they sponsored at (that’s the label column) and the URL for their logo.

#let's get the XML from the SQL Saturday website
[xml] $xdoc = Invoke-WebRequest -Uri "http://www.sqlsaturday.com/eventxml.aspx?sat=$eventNum" -UseBasicParsing

#we only need a subset of each node of the XML, mainly the sponsors
$sponsors = $xdoc.guidebookxml.sponsors.sponsor | select name, label, imageURL

We want to ensure that our output folder (the path from the variable above) exists; otherwise the process won’t work. If the folder doesn’t exist, it will be created for us.

If there is an error, there is a CATCH block that will capture the error and react accordingly.

#let's make sure the folder exists
"Checking folder existence...."
if (-not (test-path $outputFolder)) {
    try {
        New-Item -Path $outputFolder -ItemType Directory -ErrorAction Stop | Out-Null #-Force
    }
    catch {
        Write-Error -Message "Unable to create directory '$outputFolder'. Error was: $_" -ErrorAction Stop
    }
    "Successfully created directory '$outputFolder'."
}else{
    "Folder already exists...moving on"
}

Now that I have a destination folder, I can begin to download the logos into the folder.   In this block, I will loop through all of the sponsors.

First, I need to do some cleanup of the sponsor names. Some sponsors have commas or “.com” in their name, and I wanted to use the sponsor name as the file name so I knew who each logo belonged to. Once the cleanup is done, I use the INVOKE-WEBREQUEST cmdlet to fetch the file from the respective URL and output it into the destination directory.

#give me all of the logos
foreach ($sponsor in $sponsors){
    $filename = $sponsor.imageURL | split-path -Leaf

    #get the file name and clean up spaces, commas, and the dot coms
    $sponsorname = $sponsor.name.replace(" ", "").replace(",","").replace(".com","")
    invoke-webrequest -uri $sponsor.imageURL -outfile $outputfolder\$($sponsorName)_$($sponsor.label.ToUpper())_$($fileName)
}

Since I will be sending this to a volunteer to use, I wanted the process to automatically zip up the folder to make things easier. I name the archive the same as the folder, so I can use the SPLIT-PATH cmdlet to get the leaf of the directory path, which is the folder name.

Using the COMPRESS-ARCHIVE cmdlet, I can then compress the folder and put the archive into that same folder.

# zip things up if desired
If ($compress -eq $true){
    $filename = $outputfolder | split-path -Leaf
    compress-archive -path $outputFolder -DestinationPath $outputfolder\$($filename).zip
}

Finally, I wanted the process to open the folder when it was done. This is simply accomplished by calling “explorer” along with the destination folder name, which will launch the folder in Windows Explorer.

# show me the folder
explorer $outputfolder

Summary

PowerShell is a great way to quickly and easily accomplish tasks, whether that’s working with online data or manipulating things on your local computer. This was a quick and easy way to make my life as a SQL Saturday event organizer that much easier.

You can also find the full script on my GitHub Repository.

Enjoy!

© 2019, John Morehouse. All rights reserved.

