Dropping Unused Indexes on an Azure SQL Database

Index Maintenance is an important aspect of database health. Above and beyond regular index rebuilds and reorganizations it is important to understand the usage of the indexes on your database. Cleaning up unused indexes can save a lot of overhead on Insert/Update/Delete operations. To achieve that goal, I typically run a script like the one shown below and check out whether or not an index has had any seeks or scans against it as a starting point in my cleanup regiment.

SELECT d.name,

       OBJECT_NAME(i.[object_id]) AS [ObjectName],

       i.[name] AS [IndexName],

       s.user_seeks,

       s.user_scans

FROM sys.indexes AS i

    LEFT OUTER JOIN sys.dm_db_index_usage_stats AS s

        ON i.[object_id] = s.[object_id]

           AND i.index_id = s.index_id

           AND s.database_id = DB_ID()

    JOIN sys.databases d

        ON d.database_id = s.database_id

WHERE OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1

ORDER BY s.user_seeks + s.user_scans + s.user_lookups DESC

OPTION (RECOMPILE);

Azure SQL Database

If the index has not shown any utilization I investigate to determine if it is one that can be removed. However, this week something caught my attention. I was looking at a client’s indexes and noted the values for these were not as high as I would have expected. I know that these index statistics are reset upon every SQL Server Service restart, but in this case, I was working on an Azure SQL Database. which got me wondering exactly how that worked. With an Azure Virtual Machine or an on Prem SQL Server instance this is easy to figure out. But with an Azure SQL Database we do not have control over when restarts are done, and what about the Serverless offering (which pauses unutilized databases to reduce costs), how do those behave?  I really want to make sure before I remove any indexes from a database that I am examining the best data possible to make that decision. So, I did some digging.

What I found is that in Azure SQL Database, the Index statistics are reinitialized when the instance hosting your database or elastic pool is restarted. Since this occurs out of the database administrator’s hands it’s important that you run a query like the below to see the instance start time before making any determinations on whether to drop indexes or not.

SELECT sqlserver_start_time

FROM sys.dm_os_sys_info;

Serverless

As we know, a paused serverless database does not have a running instance of the database engine (it is just database files in storage), so index stats are also reinitialized every time a serverless database is resumed. This presents a big problem because I have a server that regularly pauses. There is no way I can use these index statistics to determine much of anything in regard to index cleanup and maintenance. Thus, I am writing this blog to make sure others are aware of this as well.

Using the Serverless option for your Azure SQL Database is a great way to save on resources, but it is very important that you know any drawbacks that it may have such as this. In the meantime, I would suggest trying to persist this data using DMV’s and writing it out to a table. I have not done this yet and not sure how well it will work, but it is worth a try.

Share

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trust DCAC with your data

Your data systems may be treading water today, but are they prepared for the next phase of your business growth?