Schema Design Changes shouldn’t just be done once

Just because you did a schema design change before doesn’t mean that you shouldn’t look into doing another one.  A while back, before I started working with Phreesia, they decided to fix some performance problems by federating some data.  The data is question is basically a large named value pair table which holds survey answers.  The performance problem that they were trying to solve was that they allow people to fill out the survey more than once, and when they change an answer only the changed values are saved.  Figuring out the newest set of answers became a very expensive query to run.  Because of this, and to make index rebuilding faster it was decided to federate the data across multiple tables basically named answer_1, answer_2, answer_3, etc.

This was working fine for a good year or two, but then the system grew.  As the database got more data loaded into the tables the number of tables that were a member of the federation grew to 125.  There was a view which was used for reading data which did a UNION ALL between the 125 tables.  Querying this view required massive amounts of memory just for storing the execution plans, a whole 8 Megs per execution plan that hit the view.  And there were two of these views.  This is because the blob data was being stored in another set of federated tables.  The first view was just the answer_{n} table and the other view joined all the answer_{n} tables to the tables holding the blob data called answer_lob_{n}.  The way the tables were designed was that if the meta data for the lob data was in answer_1 then the actual lob data was in answer_lob_1 and if the meta data was in answer_2 then the lob data was stored in answer_lob_2.  This made the definition of the two views to look something like this.

[sql]CREATE VIEW basedata
AS
SELECT *
FROM answer_1
UNION ALL
SELECT *
FROM answer_2
UNION ALL

GO
CREATE VIEW lobdata
AS
SELECT *
FROM answer_1
JOIN answer_lob_1 on answer_1.AnswerId = answer_lob_1.AnswerId
UNION ALL
SELECT *
FROM answer_2
JOIN answer_lob_2 on answer_2.AnswerId = answer_lob_2.AnswerId
UNION ALL

GO
[/sql]
Now when monitoring this server the CPU was running at about 30% during business hours and the PFE was running about 21k seconds (about 5.8 hours). The server has 24 cores in it running at about 2.8Ghz and 256 Gigs of memory.

Working with the development team we decided to move the data from the answer_{n} tables into a single table called answer and the data from the answer_lob_{n} tables into a single table called answer_lob. Making this change on this system was actually pretty easy. Thankfully 100% of the access to these tables was done through stored procedures within the database. Some of the procedures were using dynamic SQL, such as to insert the data so that the data would get in the correct table, but there weren’t all that may stored procedures in the database so there weren’t that many to deal with.

When we made this schema change there was another very important change that we made. We added a bit column to the answer table which specified identified the row as being the most current version of the person’s answer. This would then allow us to created filtered indexes on the table that had this bit column in the where clause so that we could create VERY small indexes when compared to the total size of the table.

Because of the changes in the schema, removing the views, ditching all the dynamic SQL, and filtering the indexes we got some major performance changes. The CPU on the SQL Server went from 30% to 5% during normal business hours (when we made the schema change, this was the only change that we released during that release so that we could be sure that there were no problems introduced by the change). On the PLE metric I mentioned above we went from a PLE of 21k to a massive 35k which is 9.7 hours. This is a major improvement on both metrics.

Now the big question is, why didn’t they make the same changes that I had them make before? The answer is that a lot of the SQL Server features that we used (specifically the filtered indexes) weren’t available before because back then the SQL Server was running on SQL Server 2005. One of the changes which I made to the system in order to get the prepped for these schema changes included upgrading to SQL Server 2008 R2 so that we could take advantage of filtered indexes and “optimize for ad hoc workloads” (there’s a lot of dynamic SQL as the developers like to use linq for a lot of the data access for smaller parts of the system), just not for these specific tables.

As you can hopefully see from this example, that just because schema changes were made by someone in the past (these changes were made by some VERY capable consultants) when new technology comes out that you now have available to you it can be really worth it to make changes to the schema to take advantage of this new technology. With these changes that have now been made this system can grow about 19-20x the current load before upgrading is required were with the old schema and performance the system could only grow 2-3x the current load. And that means a much longer life for the server and much less money needing to be spent upgrading the server. Also if we had waited until the server had been upgraded to make the schema changes moving the data from the old tables into the new table would have taken weeks or months instead of the 10 days or so that it took to migrate the data. You see the new answer table has 156 Gigs (500 million rows) of data plus 500 Gigs of indexes and the answer_lob table has 455 Gigs of data (believe it or not this table has only 1.5 Gigs of non-clustered indexes on it).

If you have been putting off doing a schema redesign even after upgrading your database application to a new version of SQL Server (or what ever your database platform is) it might be a good time to look at the system and see what changes you can make.

Denny

Share

One Response

  1. Great post on the benefits of database refactoring! Redesign to take advantage of filtered indexes… Now I’m going to have to go think about where I might have an opportunity for a similar solution. Thanks for putting this out there. –Audrey

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trust DCAC with your data

Your data systems may be treading water today, but are they prepared for the next phase of your business growth?