Dates can easily be the hardest datatype to work with.

The datetime data type can be one of the hardest to work with when it comes to index optimization.  Most queries that use a datetime data type for filtering (part of the where clause) only want to match the date portion of the value.  Most people handle this via a convert function around the date column.  This causes the index that you create to become useless as the convert function causes the index to be scanned not seeked.

You can see this yourself by creating a table, with a date field, then stuff some records into that table.

[sql]CREATE TABLE sample_table
(id INT IDENTITY(1,1) PRIMARY KEY,
dt DATETIME)
go
CREATE INDEX ix_sample_table ON dbo.sample_table
(dt)
go
INSERT INTO sample_table
(dt)
SELECT ‘1/1/2000 15:00:00’
UNION
SELECT ‘1/2/2000 12:00:00’
UNION
SELECT ‘1/3/2000 08:00:00′
go[/sql]

You can now query the dt column and use the convert function. When you view the execution plan you’ll see that an index scan is being used.
[sql]SELECT id, dt
FROM sample_table
WHERE CONVERT(varchar(10), dt, 101) = ’01/01/2000′[/sql]

But it you use the BETWEEN operator instead of the = operator you’ll see the engine using an index seek instead of the index scan.

[sql]SELECT id, dt
FROM sample_table
WHERE dt BETWEEN ’01/01/2000′ AND ’01/02/2000′[/sql]

Now the same thing applies when using other functions to the datetime column in the table. For example if you were to have a system where people need to search based on the timezone where the data is stored in UTC time, but searching needs to happen based on the users time zone.

The easiest way to do this would be to put a dateadd function around the dt field. However even when using the BETWEEN operator you still use an index scan, instead of the seek. In this case I’m searching against the pacific timezone.

[sql]SELECT id, dt
FROM sample_table
WHERE dateadd(mi, -420, dt) BETWEEN ’01/01/2000′ AND ’01/02/2000′[/sql]

Now, if you switch the -420 to 420, and put the DATEADD function around the dates in the BETWEEN operator the index seek will be preserved. The query looks a little strange, but that’s ok. It’s the end result that we are looking for here.

[sql]SELECT id, dt
FROM sample_table
WHERE dt BETWEEN dateadd(mi, 420, ’01/01/2000′) AND dateadd(mi, 420, ’01/02/2000’)[/sql]

Now if you have the UtcOffset that you want to use stored in another table such as an Account table. We’ll recreate the sample_table to go with this.

[sql]CREATE TABLE sample_table
(id INT IDENTITY(1,1) PRIMARY KEY,
dt DATETIME,
AccountId int)

CREATE TABLE Account
(AccountId INT PRIMARY KEY,
UtcOffset INT)
go
CREATE INDEX ix_sample_table ON dbo.sample_table
(dt)
INCLUDE (AccountId)
go
INSERT INTO sample_table
(dt, AccountId)
SELECT ‘1/1/2000 15:00:00’, 1
UNION
SELECT ‘1/2/2000 12:00:00’, 2
UNION
SELECT ‘1/3/2000 08:00:00′, 1

INSERT INTO Account
SELECT 1, -420
UNION
SELECT 2, -300[/sql]

In order to make this work, you have to take the UtcOffset value *-1 so that the date is adjusted in the correct direction to make the query work.

[sql]SELECT id, dt
FROM sample_table
JOIN Account on sample_table.AccountId = Account.AccountId
WHERE dt BETWEEN dateadd(mi, Account.UtcOffset*-1, ’01/01/2000′) AND dateadd(mi, Account.UtcOffset*-1, ’01/02/2000’)
[/sql]

(Ignore the Clustered Index Scan on the Account table, correcting that isn’t the goal of this post.

Hopefully you find this technique useful.

Denny

Share

6 Responses

  1. [B]WHERE dt BETWEEN ‘01/01/2000′ AND ‘01/02/2000′[/B]

    You need to be careful when using between and dates. That where clause will return rows from the table where the date is 01/02/2000 and the time is midnight. When working with dates, I prefer to use multiple conditions, like this…

    [B]WHERE dt >= ‘01/01/2000′ AND dt = for the start range and < for the end.

  2. This is true, BETWEEN is inclusive, so to use BETWEEN correctly you’d want to use

    [CODE]WHERE dt BETWEEN ‘01/01/2000′ AND dateadd(ms, -3, ‘01/02/2000′)[/CODE]

    Unless you were using SQL 2008’s new data types

    [CODE]WHERE dt BETWEEN ‘01/01/2000′ AND dateadd(ns, -1, ‘01/02/2000′)[/CODE]

  3. In my case i need to compare date portion alone from two different table’s datetime fields. I used convert in both sides of the comparision and this join itself resulting in 53 % of the overall cost of the query and results in performance degration. How to tune this query and can we use a syntax similar to this
    tab1.col1 between tab2.col1 and tab2.col1. Please advise asap. thanks a lot.

  4. You should be able to do something like this without two much overhead.

    [CODE]WHERE tab1.col1 BETWEEN CAST(convert(varchar(10, tab2.col1, 101) as datetime) and dateadd(ms, -3, dateadd(dd, 1, cast(convert(varchar(10), tab2.col1, 101) as datetime)))[/CODE]

    Check my Parens, I think they are correct, but I’m not sure.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trust DCAC with your data

Your data systems may be treading water today, but are they prepared for the next phase of your business growth?