How to test for query regressions when upgrading

One of the things that you want to test for when doing an upgrade of SQL Server is query regressions. Regressions happen when the new version of SQL Server makes a poor decision when running the same code that you can before.  Testing an application against the new version should always be done, even though everything is supposed to work perfectly. You never know until you test, and since Microsoft hasn’t tested your specific workload, that leaves it up to you.

Prepping for the test

This testing does require a test machine to run your workload against, but since we have the cloud, doing testing like this is easy, and you can just destroy the VM when you’re done with your testing.

To do this testing, we’re going to need a few things.  The first is a machine to run our test databases on. In the perfect world, it’ll be the same size as the production server that you’re going to be running the new production system on. It doesn’t have to be, but the closer, the better.  We also need a  backup of the production database that we’re going to restore on the test machine.

We also need a SQL Profiler capture from the production system. If possible, you want the profiler capture to have every possible query that can be run.  This could be guaranteed by capturing data for a day or two, or it could be guaranteed by someone clicking all the options in the app while the trace is running.  You need a profiler trace as you need to play it back, and the easiest way to playback a capture is SQL Profiler.  It’ll take a lengthy profiler script to create the trace that it needed, but you can see the script at the bottom of this post.  You will want to change the path that the trace is writing to.  The script will output a number when it runs; you’ll need this number to stop the trace when you are ready to stop it.  The script will create a bunch of trace files that are 1 Gig in size when you run the script.  Make sure that you’ve got plenty of drive space for the capture.

Once you’ve captured the data, stop the trace (the stop script is at the bottom of this post as well) and move the capture to the test server.  On the test server, you’ll want to install the version of SQL Server that we’re going to move to (Hopefully SQL 2017 or newer).  Once SQL is installed, restore your production database then turn on the query store for that database (database properties).  You’ll also want to increase the size of the query store, 1000 Megs should be enough, but if it’s not, go ahead and make it bigger than that.

Once the query store is enabled (do NOT change the compatibility level yet), we’re ready for the first run of our profiler trace.

Checking for regressions

If you open SQL Profiler on the test server and open the first file of the trace.  Once that’s done, click the “Replay” drop-down menu, and then click “Start.” It’ll then prompt you to connect to a server. Enter the name of the test server.  It’s going to give you the option to use multiple threads, and you’ll probably want to switch to this option.

Once you are happy with these settings, click “OK” to start the replay.  Depending on how big the trace is, this might take a while. At a client, when I did this recently, we captured data for about 15 minutes, and it took about 7 hours to replay.

Once the replay is done, change the compatibility level to whatever version you plan to change the database to (usually the newest compatibility level).

At this point, we need to rerun the trace again.  Once the trace is done, we can go into the query store and see if things regressed.  Microsoft has made looking for regressed queries very easy; there’s a query store report you can open up. If you open the “Regressed Queries” report, it’ll show you all the queries that had plans that regressed to do changes in the SQL Server engine.  You can then investigate these queries and hint them or rewrite them to fix them.  Once those queries are all fixed, the system can be upgraded.

In the case of a client system that I was working on recently we had to hint a few queries, but those hints weren’t available in the old version of SQL Server that we were moving off of, so the hints couldn’t be applied until after the upgrade was done, but we just built that into the upgrade plan so there would be a couple more minutes of outage while we added hints to a bunch of procedures, then we opened the gates and let users onto the system.

Depending on the regressions that are found you might need some development time to see what’s going on (hint, we can help you figure out the regressions and what needs to be hinted and what needs to be rewritten).

Denny

 

To start the trace
declare @p1 int
declare @maxfilesize bigint = 1024
exec sp_trace_create @p1 output,2,N'c:\temp\Replay',@maxfilesize,NULL
select @p1
exec sp_trace_setevent @p1,14,1,1
exec sp_trace_setevent @p1,14,9,1
exec sp_trace_setevent @p1,14,10,1
exec sp_trace_setevent @p1,14,11,1
exec sp_trace_setevent @p1,14,6,1
exec sp_trace_setevent @p1,14,12,1
exec sp_trace_setevent @p1,14,14,1
exec sp_trace_setevent @p1,14,3,1
exec sp_trace_setevent @p1,14,35,1
exec sp_trace_setevent @p1,15,3,1
exec sp_trace_setevent @p1,15,35,1
exec sp_trace_setevent @p1,17,3,1
exec sp_trace_setevent @p1,17,35,1
exec sp_trace_setevent @p1,10,3,1
exec sp_trace_setevent @p1,10,35,1
exec sp_trace_setevent @p1,12,3,1
exec sp_trace_setevent @p1,12,35,1
exec sp_trace_setevent @p1,15,11,1
exec sp_trace_setevent @p1,15,6,1
exec sp_trace_setevent @p1,15,9,1
exec sp_trace_setevent @p1,15,10,1
exec sp_trace_setevent @p1,15,12,1
exec sp_trace_setevent @p1,15,13,1
exec sp_trace_setevent @p1,15,14,1
exec sp_trace_setevent @p1,15,15,1
exec sp_trace_setevent @p1,15,16,1
exec sp_trace_setevent @p1,15,17,1
exec sp_trace_setevent @p1,15,18,1
exec sp_trace_setevent @p1,17,1,1
exec sp_trace_setevent @p1,17,9,1
exec sp_trace_setevent @p1,17,10,1
exec sp_trace_setevent @p1,17,11,1
exec sp_trace_setevent @p1,17,6,1
exec sp_trace_setevent @p1,17,12,1
exec sp_trace_setevent @p1,17,14,1
exec sp_trace_setevent @p1,10,9,1
exec sp_trace_setevent @p1,10,2,1
exec sp_trace_setevent @p1,10,10,1
exec sp_trace_setevent @p1,10,6,1
exec sp_trace_setevent @p1,10,11,1
exec sp_trace_setevent @p1,10,12,1
exec sp_trace_setevent @p1,10,13,1
exec sp_trace_setevent @p1,10,14,1
exec sp_trace_setevent @p1,10,15,1
exec sp_trace_setevent @p1,10,16,1
exec sp_trace_setevent @p1,10,17,1
exec sp_trace_setevent @p1,10,18,1
exec sp_trace_setevent @p1,12,1,1
exec sp_trace_setevent @p1,12,9,1
exec sp_trace_setevent @p1,12,11,1
exec sp_trace_setevent @p1,12,6,1
exec sp_trace_setevent @p1,12,10,1
exec sp_trace_setevent @p1,12,12,1
exec sp_trace_setevent @p1,12,13,1
exec sp_trace_setevent @p1,12,14,1
exec sp_trace_setevent @p1,12,15,1
exec sp_trace_setevent @p1,12,16,1
exec sp_trace_setevent @p1,12,17,1
exec sp_trace_setevent @p1,12,18,1
exec sp_trace_setevent @p1,13,1,1
exec sp_trace_setevent @p1,13,9,1
exec sp_trace_setevent @p1,13,11,1
exec sp_trace_setevent @p1,13,6,1
exec sp_trace_setevent @p1,13,10,1
exec sp_trace_setevent @p1,13,12,1
exec sp_trace_setevent @p1,13,14,1
exec sp_trace_setevent @p1,13,3,1
exec sp_trace_setevent @p1,13,35,1
exec sp_trace_setevent @p1,70,1,1
exec sp_trace_setevent @p1,70,9,1
exec sp_trace_setevent @p1,70,11,1
exec sp_trace_setevent @p1,70,6,1
exec sp_trace_setevent @p1,70,10,1
exec sp_trace_setevent @p1,70,12,1
exec sp_trace_setevent @p1,70,14,1
exec sp_trace_setevent @p1,70,3,1
exec sp_trace_setevent @p1,70,35,1
exec sp_trace_setevent @p1,53,1,1
exec sp_trace_setevent @p1,53,9,1
exec sp_trace_setevent @p1,53,11,1
exec sp_trace_setevent @p1,53,6,1
exec sp_trace_setevent @p1,53,10,1
exec sp_trace_setevent @p1,53,12,1
exec sp_trace_setevent @p1,53,14,1
exec sp_trace_setevent @p1,53,3,1
exec sp_trace_setevent @p1,53,35,1
exec sp_trace_setevent @p1,11,1,1
exec sp_trace_setevent @p1,11,9,1
exec sp_trace_setevent @p1,11,11,1
exec sp_trace_setevent @p1,11,6,1
exec sp_trace_setevent @p1,11,10,1
exec sp_trace_setevent @p1,11,12,1
exec sp_trace_setevent @p1,11,14,1
exec sp_trace_setevent @p1,11,3,1
exec sp_trace_setevent @p1,11,35,1
exec sp_trace_setevent @p1,74,1,1
exec sp_trace_setevent @p1,74,9,1
exec sp_trace_setevent @p1,74,11,1
exec sp_trace_setevent @p1,74,6,1
exec sp_trace_setevent @p1,74,10,1
exec sp_trace_setevent @p1,74,12,1
exec sp_trace_setevent @p1,74,14,1
exec sp_trace_setevent @p1,74,3,1
exec sp_trace_setevent @p1,74,35,1
exec sp_trace_setevent @p1,71,1,1
exec sp_trace_setevent @p1,71,9,1
exec sp_trace_setevent @p1,71,11,1
exec sp_trace_setevent @p1,71,6,1
exec sp_trace_setevent @p1,71,10,1
exec sp_trace_setevent @p1,71,12,1
exec sp_trace_setevent @p1,71,14,1
exec sp_trace_setevent @p1,71,3,1
exec sp_trace_setevent @p1,71,35,1
exec sp_trace_setevent @p1,72,1,1
exec sp_trace_setevent @p1,72,9,1
exec sp_trace_setevent @p1,72,11,1
exec sp_trace_setevent @p1,72,6,1
exec sp_trace_setevent @p1,72,10,1
exec sp_trace_setevent @p1,72,12,1
exec sp_trace_setevent @p1,72,14,1
exec sp_trace_setevent @p1,72,3,1
exec sp_trace_setevent @p1,72,35,1
exec sp_trace_setevent @p1,100,1,1
exec sp_trace_setevent @p1,100,9,1
exec sp_trace_setevent @p1,100,11,1
exec sp_trace_setevent @p1,100,6,1
exec sp_trace_setevent @p1,100,10,1
exec sp_trace_setevent @p1,100,12,1
exec sp_trace_setevent @p1,100,14,1
exec sp_trace_setevent @p1,100,3,1
exec sp_trace_setevent @p1,100,35,1
exec sp_trace_setstatus @p1,1

To stop the trace
declare @p1 int = 2 --{Whatever value is outputted from the first query}
exec sp_trace_setstatus @p1,0
exec sp_trace_setstatus @p1,2

The post How to test for query regressions when upgrading appeared first on SQL Server with Mr. Denny.

Share

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trust DCAC with your data

Your data systems may be treading water today, but are they prepared for the next phase of your business growth?