Hmmm… What’s This?

OK So, I am doing some digging and peaking around again in SQL Server and came across a database option called Date Correlation Optimization Enabled = False. Honestly, I had no clue what it did, so I took it as a learning opportunity to look into it and do a little research. Who knows, it may actual help me solve one of the many problems I run into day to day for clients.

Syntax

ALTER DATABASE DEMO
   SET DATE_CORRELATION_OPTIMIZATION ON;

So, What Does It Do?

According to MSDN – The DATE_CORRELATION_OPTIMIZATION database SET option improves the performance of queries that perform an equi-join between two tables whose date or datetime columns are correlated, and which specify a date restriction in the query predicate.

How many of you read what MSDN says and thinks “wuuuuuttt, English please”? I do.

In English

Basically, it uses a foreign key relationship key between tables in SQL Server to enhance performance of date and date time queries when the dates fall within a certain defined range of each other (correlates). Ok that’s cool, but what’s the big deal? The power really comes in for things like reporting, validation, and data warehouses. With this option turned on, SQL Server maintains statistics between correlated columns and creates improved execution plans that reads less data.

Let’s See It in Action

Consider this, all internet orders that are received have a must send out by date (due date) of 10 days after order is received. Therefore, the OrderDate and DueDate are correlated, related to each other.

Here is a query you would normally run.

SELECT  *
FROM    InternetOrders AS i
        INNER JOIN InternetOrderDetail AS d ON i.InternetOrderID = d.InternetOrderID
WHERE   i.OrderDate BETWEEN '20170801' AND '20170901';

Without DATE_CORRELATION_OPTIMIZATION turn on the optimizer would create a plan just like anything else, however with it set to ON the optimizer can make more granular execution plans.

Here’s how

With each INSERT, UPDATE and DELETE between these two tables SQL Server is gathering statistics which helps the optimizer infer the query to be more like the one below. This is where the power comes in. The optimizer can better narrow down the records it needs to read and therefor returning faster results.

Here is the way SQL interprets the dates now that correlation is turned on and statistics are being gathered. It based on those statistics it can now infer that each DueDate is exactly 10 days after the OrderDate.

SELECT  *
FROM    InternetOrders AS i,
        InternetOrderDetail AS d
WHERE   i.InternetOrderID = d.InternetOrderID
        AND i.OrderDate BETWEEN '8/1/17' AND '8/15/17'
        AND d.DueDate BETWEEN CAST('20170801' AS DATETIME) + 10
                      AND     CAST('20170901' AS DATETIME) + 10;

Depending on the number of records in the two tables this can be a VERY significant decrease to execution times.

The Caution

You should not enable DATE_CORRELATION_OPTIMIZATION in update-intensive database environments. SQL Server keeps all the correlation information in statistics form, this means with every INSERT, UPDATE and DELETE you gain additional overhead.

As always, be sure to test it before you use it in production.

Share

One Response

  1. Nice, I’ve never seen that setting but have datamart that might benefit from this setting.
    Thanks for sharing.

Leave a Reply to LouCancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trust DCAC with your data

Your data systems may be treading water today, but are they prepared for the next phase of your business growth?