Our application includes some search functionality that is pretty complex to deal with.  In a nutshell, the user can select multiple values from a couple of lists on the website and use those lists to filter down the rows being searched.  These lists are passed into SQL Server as a couple of XML documents.  We recently had a larger customer call and complain that the search was slow.  I fired up Profiler and grabbed the query.  They were right: 6 minutes is a long time for a query to take.

There’s some full text searching going on, so it’s never going to really scream: there are 70 million records in one table, 57 million records in a second table (with a one-to-many between them), and another 90 million records in a third table (this is a one-to-many from the 70 million record table, and this third table holds the full text index).

Needless to say, there was some tuning that I had to do.  The basic gist of the query was…

[sql]WHERE EXISTS (SELECT *
FROM OPENXML(@hDoc_Computer, '//computer')
WITH (ComputerId INT '@id') a
WHERE a.ComputerId = Application.ComputerId)
AND EXISTS (SELECT *
FROM OPENXML(@hDoc_Logon, '//login')
WITH (LogonId INT '@id') a
WHERE a.LogonId = Application.LogonId)[/sql]
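For context, the @hDoc_Computer and @hDoc_Logon handles above come from sp_xml_preparedocument, which parses the XML text into an in-memory document that OPENXML then reads. A minimal sketch of how one of those handles gets created (the @ComputerXML parameter name here is my assumption, not taken from the actual procedure):

[sql]DECLARE @hDoc_Computer INT

-- @ComputerXML is the procedure parameter holding the XML document text
EXEC sp_xml_preparedocument @hDoc_Computer OUTPUT, @ComputerXML[/sql]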

After getting nowhere working on indexes and tweaking things here and there (at one point actually making the query take 16 minutes to run against this customer’s data), I put a couple of table variables in the procedure and loaded them up with the values from the XML documents.

[sql]DECLARE @Computer TABLE
(ComputerId INT)

DECLARE @Logon TABLE
(LogonId INT)

INSERT INTO @Computer
(ComputerId)
SELECT ComputerId
FROM OPENXML(@hDoc_Computer, '//Computer', 2)
WITH (ComputerId INT '@ComputerId')

INSERT INTO @Logon
(LogonId)
SELECT LogonId
FROM OPENXML(@hDoc_Logon, '//Logon', 2)
WITH (LogonId INT '@LogonId')[/sql]
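One thing worth noting with this pattern (and a reasonable assumption about how the procedure would finish the job): each handle created with sp_xml_preparedocument keeps the parsed document in memory until it is explicitly released, so once the table variables are loaded the handles can be freed:

[sql]EXEC sp_xml_removedocument @hDoc_Computer
EXEC sp_xml_removedocument @hDoc_Logon[/sql]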


And I changed the WHERE clause to use the table variables instead.

[sql]WHERE EXISTS (SELECT * FROM @Computer a WHERE a.ComputerId = Application.ComputerId)
AND EXISTS (SELECT * FROM @Logon a WHERE a.LogonId = Application.LogonId)[/sql]

This got my query run time down to about 1 minute, with the execution plan showing that ~90% of the time is spent on the full text search. So while I wouldn’t normally consider a 1 minute query run time to be good, in this case it is. (This particular part of the application also goes out to the file server and uses the Microsoft Search service to search millions of files for text string matches, so this is now the fastest part of the search process.)

Now don’t take this post the wrong way. I love OPENXML; it’s a great tool and I use it all over the place so that we can pass in multiple values in a single variable (all our code has to be able to run on SQL 2005, so table-valued input parameters aren’t an option for me). OPENXML just wasn’t the right tool here, sort of.
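As an aside, on SQL 2005 and later the same multi-value pattern can also be handled with the XML data type’s nodes() method instead of OPENXML, which sidesteps the prepare/remove handle management entirely. A rough sketch of the equivalent load, assuming the parameter were typed as XML (parameter name is hypothetical):

[sql]DECLARE @ComputerXML XML

INSERT INTO @Computer
(ComputerId)
SELECT c.value('@ComputerId', 'INT')
FROM @ComputerXML.nodes('//Computer') AS t(c)[/sql]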

I wish we could have found this performance problem in QA, but we simply have no way to generate enough data to surface these kinds of performance problems. But the problem is fixed and the customer is hopefully happy (for now).
