All to often when helping people look at query performance problems I’ll ask them to look at the execution plan, and see what it says. Most query performance problems can be resolved simply by looking at the execution plan and seeing where you need to add an index.
All to often I (and others) then get asked, how to I look at the execution plan, and what does it mean?
The first part of the question is pretty easy.
The setting to view the execution plan must be enabled before you run the command. If you need to get the execution plan after you have run the command you need to enable the setting, then rerun the command. (If you are using SQL 2005 or later you can query the system for the execution plan fairly easily, but I’ll cover that in another post later on.)
In Query Analyzer (SQL 7 / 2000) click on the little drop down to the left of the database drop down menu, and click on Show Execution Plan; or press <Ctrl> K which is the hot key to show the execution plan. You can also click on the Query drop down menu, and select “Show Execution Plan” from the menu list, or you can right click on the query pain and select the “Show Execution Plan” from the context menu. All these operations will have the same effect.
In Management Studio (SQL 2005 / 2008) click on the Include Actual Execution Plan button. It is sort of in the middle of the button menu by default. It is the one which I have depressed below with the two blue boxes and the one green box. There is a similar button which has a little cartoon dialog as well. That is not the correct button, that is the button to show the estimated execution plan, not the actual execution plan. You can also get the Execution Plan by clicking on the Query Drop down menu and selecting the “Include Actual Execution Plan” option; or by pressing the <Ctrl> M hot key (I have no idea why it was changed).
Now, put the query in question in the query window, and run the query. Once the query has completed (it may take several seconds longer than normal to run because of the execution plan being returned) you will get an extra tab returned to you. (This will look basically the same in Query Analyzer and Management Studio, so I have only included the SSMS screenshot.)
The Second part is a bit trickier, and requires a bit more expertise.
When reading the execution plan, start at the upper right, and move to the left. When you come to a branch, follow that branch to the end, and start reading from right to left. Repeat until you get to the single icon on the upper left of the execution plan.
You can mouse over the icons to see much more information about the execution plan object. In the case of our execution plan we see that the SQL Server is doing an Index Scan against the Address.IX_Address_AddressLine1_AddressLine2_City_StateProvinceID_PostalCode index. After SQL completes that Index Scan it does an Index Scan against the EmployeeAddress.AK_EmployeeAddress_rowguid index. It then does a Hash Match of the results of those two scans to get a single recordset to deal with. From here SQL does an Clustered Index Seek against the Employee.PK_Employee_EmployeeId index, hollowed by a Nested Loop to join the output of the prior Hash Match, and the results from this Clustered Index Seek together to get a single recordset which is then returned to the client application.
In order to optimize this query, I want to turn the Index Scans into Index Seeks. To do this we create two indexes. The first on the EmployeeAddress table, and the second on the Address table.
CREATE INDEX AE_EmployeeId_AddressId ON HumanResources.EmployeeAddress (AddressID, EmployeeID) GO CREATE INDEX A_City_AddressId ON Person.Address (City, AddressID) INCLUDE (AddressLine1, PostalCode) GO
After we add these indexes we can rerun out query, and see that the new execution plan shows seeks on all three tables now.
Another piece of the query which has been optimized is that we are now no longer using a Hash Match table. A Hash Match table takes all the records from the top input and creates a hash table. Each row from the bottom input scans the hash table looking for matches and outputting the matching rows. When we add the clustered indexes we use a Nested Loops process which allows the SQL Server to table the rows from the top input, and scan the bottom input outputting the matching rows.
While this is a fairly simple query that we are optimizing, it should give you the right idea to start working on on your queries. If you are working on optimizing a stored procedure the process is exactly the same, but for each query in the stored procedure it will be outputted in the execution plan. You will need to read through each queries execution plan looking for the one which needs to be optimized.
If you are doing a query without a filter of any sort then you will pretty must always do either a Table Scan or an Index Scan as SQL has to return all the rows from the table or index. Also when returning all rows via SELECT * you will also probably get a scan instead of a seek. This is one of the reasons that it is recommended that you not use SELECT * in your code.
You should really list the code for the original scanning query so that students can understand where you got the terms for the indexes.
The query is shown in the Screenshots.