Knowing your data is very important when it comes to writing code. I’ll admit that I am very far from being a developer; however, as a DBA, I spend much of my days performance tuning code. In doing so, I get to see many ways code can introduce excess database reads.
One of the most recent things I have come across has to do with NULLs. The environment I was working in had no default values in their table design, so it was riddled with NULL values. Over the years they had implemented coding standards to try and mitigate these NULL values within their code.
In every column search or join they would start with something like (VALUE IS NULL OR VALUE IS NOT NULL AND VALUE = @parameter). Adding this syntax to a query reduces SARGability, causing a lot of unnecessary reads and overhead. This is especially true when the column is defined as NOT NULL and the check was put in place only as part of a standard. For clarification, SARGable is short for "search argument able": a SARGable predicate is one the optimizer can use to seek within an index for faster performance.
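As a minimal sketch of why the extra check adds nothing, the script below uses a hypothetical temp table (#Orders and its columns are made up for illustration, not taken from the environment described above). An equality comparison against NULL never evaluates to true, so the IS NOT NULL guard cannot change which rows come back, even on a nullable column.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE #Orders
(
    OrderID      int         NOT NULL PRIMARY KEY,
    CustomerCode varchar(10) NULL
);

INSERT INTO #Orders (OrderID, CustomerCode)
VALUES (1, 'A100'), (2, NULL), (3, 'B200');

DECLARE @CustomerCode varchar(10) = 'A100';

-- "Standard" form with the defensive NULL check.
SELECT OrderID
FROM #Orders
WHERE CustomerCode IS NOT NULL
  AND CustomerCode = @CustomerCode;

-- Simplified form: NULL = @CustomerCode is never true,
-- so the NULL row is filtered out without the extra check.
SELECT OrderID
FROM #Orders
WHERE CustomerCode = @CustomerCode;

DROP TABLE #Orders;
```

Both statements return only OrderID 1.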
Another practice I’ve seen lately is the use of the LTRIM and RTRIM functions as a standard cleanup method. I highly recommend doing this on the application side, or in whatever insert method you use, as the data is being populated into your tables. It is a much better practice to clean up the data once rather than coding for the cleanup in every WHERE, JOIN or CASE expression in each query you write.
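One way to picture "clean it once" is a one-time UPDATE followed by plain comparisons. This is only a sketch against a hypothetical temp table (#Contacts and its columns are assumptions for illustration); in a real system the trimming would ideally happen in the application or load process before the row is ever written.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE #Contacts
(
    ContactID   int           NOT NULL PRIMARY KEY,
    ContactName nvarchar(100) NULL
);

INSERT INTO #Contacts (ContactID, ContactName)
VALUES (1, N'  Aile Mae'), (2, N'Aile Mae  '), (3, N'Aile Mae');

-- One-time cleanup: touch only rows whose stored bytes change when trimmed.
-- DATALENGTH is used because it counts trailing spaces.
UPDATE #Contacts
SET ContactName = LTRIM(RTRIM(ContactName))
WHERE DATALENGTH(ContactName) <> DATALENGTH(LTRIM(RTRIM(ContactName)));

-- Queries can now compare the column directly, with no function wrapped around it.
SELECT ContactID
FROM #Contacts
WHERE ContactName = N'Aile Mae';

DROP TABLE #Contacts;
```

After the cleanup, the final SELECT matches all three rows without any function on the column.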
Let’s look at a simple query that incorporates both of these methods. Note: we will be turning on STATISTICS IO, TIME to look at our performance improvements.
First let’s look at our table design and note a few things. We will be querying the WideWorldImporters Sales.Invoices table using SalespersonPersonID, which is an INT set to NOT NULL, IsCreditNote, a BIT NOT NULL, and ConfirmedReceivedBy, which is an nvarchar(4000) that allows NULLs.
After looking at our table, let’s take note of the WHERE clause. The first check validates that IsCreditNote IS NOT NULL, then we check SalespersonPersonID for a specific value, and lastly we are cleaning up our free-form field ConfirmedReceivedBy to match a specific person’s name.
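If you prefer to check the column definitions with a query rather than in a designer, something like the following sketch works. It reads the standard INFORMATION_SCHEMA.COLUMNS view and assumes the WideWorldImporters sample database is installed under that name.

```sql
USE WideWorldImporters;

-- List data type, length and nullability for the three columns used in the query.
SELECT COLUMN_NAME,
       DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH,
       IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = N'Sales'
  AND TABLE_NAME   = N'Invoices'
  AND COLUMN_NAME IN (N'SalespersonPersonID', N'IsCreditNote', N'ConfirmedReceivedBy');
```

A value of NO in IS_NULLABLE tells you a NULL check on that column can never filter anything out.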
SET STATISTICS IO, TIME ON;

SELECT BillToCustomerID, OrderID, InvoiceDate
FROM [WideWorldImporters].[Sales].[Invoices]
WHERE IsCreditNote IS NOT NULL
AND SalespersonPersonID = 7
AND LTRIM(RTRIM(ConfirmedReceivedBy)) = 'Aile Mae';
Let’s run this and see what the stats and query plan show us.
As I do with any performance tuning, the first things I note are CPU time and elapsed time, then I’ll note table scans and reads. We will use these metrics to measure our improvements. I would like you to take note of the tables it uses to complete the query. There is the Invoices table along with a Workfile and a Worktable it needed to use in order to return the desired results.
Next let’s look at our execution plan and note our seeks and scans. You can clearly see we are getting an Index Seek on the Primary Key as well as a Scan on our index for ConfirmedReceivedBy. It’s also letting us know that we have a missing index, but let’s ignore that for now.
Now that we have some performance data, let’s look at some changes we can implement to make it behave better.
Since we looked at our table design, we can comment out the IsCreditNote IS NOT NULL check, simply because the column does not allow NULL values, so there is no need to check for them.
SELECT BillToCustomerID, OrderID, InvoiceDate
FROM [WideWorldImporters].[Sales].[Invoices]
WHERE
--IsCreditNote IS NOT NULL
--AND
SalespersonPersonID = 7
AND LTRIM(RTRIM(ConfirmedReceivedBy)) = 'Aile Mae';
Executing the query again, you’ll note that the plan remains the same, but we see a significant reduction in CPU time and elapsed time just by removing unnecessary checks.
Finally, we will execute the query again, this time assuming our application managed the cleanup of our values prior to insertion for the ConfirmedReceivedBy column. We will remove the LTRIM and RTRIM functions, allowing the optimizer to seek on our index instead of scanning all the values.
SELECT BillToCustomerID, OrderID, InvoiceDate
FROM [WideWorldImporters].[Sales].[Invoices]
WHERE
--IsCreditNote IS NOT NULL
--AND
SalespersonPersonID = 7
--AND LTRIM(RTRIM(ConfirmedReceivedBy)) = 'Aile Mae';
AND ConfirmedReceivedBy = 'Aile Mae';
Now you can see a huge improvement. We are now only doing 2 scans with 50 logical reads versus 422, and CPU time is now 0 compared to 47 ms. Our elapsed time even dropped from 230 to 127, which is over 100 ms less. Furthermore, note that the missing index suggestion is now gone. It is important that you don’t just add indexes because the optimizer suggests them; wait until you are done tuning your code before addressing your index needs. Lastly, note that by making these changes the optimizer no longer requires the use of worktables or workfiles.
There is still one last thing we can do to improve performance, which is out of the scope and purpose of this blog, but let’s do it anyway: get rid of the Key Lookup. You can read more about what those are in my blog here. After adding the required included columns, you can see how much cleaner and faster the simple query now runs. We now run with only 20 logical reads, 0 CPU time and 90 ms elapsed time. That’s a 264.44% performance improvement.
This was a simple tuning exercise to show how implementing coding standards without making sure they are applicable to your data can cost you performance. As a developer you should always know your data. What works with one set of data may not work with another, so it is very important to know the differences and code for them.
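For readers who want to try the Key Lookup fix themselves, here is a hedged sketch of a covering index. The index name and the choice of key columns are my assumptions for illustration, not necessarily the exact change behind the numbers above; the INCLUDE list simply carries the three columns the query selects so the lookup back to the clustered index is no longer needed. Because ConfirmedReceivedBy is nvarchar(4000), SQL Server may warn about the maximum index key length when creating it.

```sql
USE WideWorldImporters;

-- Hypothetical covering index: key on the two filter columns,
-- INCLUDE the columns returned by the SELECT list.
CREATE NONCLUSTERED INDEX IX_Invoices_Salesperson_ConfirmedReceivedBy_Demo
ON Sales.Invoices (SalespersonPersonID, ConfirmedReceivedBy)
INCLUDE (BillToCustomerID, OrderID, InvoiceDate);

-- Clean up the demo index when finished:
-- DROP INDEX IX_Invoices_Salesperson_ConfirmedReceivedBy_Demo ON Sales.Invoices;
```

Rerun the final query with STATISTICS IO, TIME on and compare the plan before keeping an index like this; extra indexes also add write overhead.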