Tuesday, December 11, 2012

Different ways to implement date range comparison logic in Transact SQL


There are multiple ways of implementing date range comparisons in Transact SQL. The purpose of this post is to explain one of the efficient ways by which we can implement date range comparison. The impetus for this post was the fact that of late, there have been lots of occasions where I've been tasked with optimizing code and have come across lots of inefficient ways for doing date comparison.
Now lets consider an example scenario where we require data from a table to be filtered based on date range ie. get data for current day. There are various ways by which we can implement this filter like below


1. SELECT *
FROM Table
WHERE CONVERT(varchar(10),dateField,121) = CONVERT(varchar(10),GETDATE(),121)

2. SELECT *
FROM Table
WHERE YEAR(dateField) = YEAR(GETDATE())
AND MONTH(dateField) = MONTH(GETDATE())
AND DAY(dateField) = DAY(GETDATE())

3. SELECT *
FROM Table
WHERE CONVERT(date,dateField) = CONVERT(date,GETDATE())

4. SELECT *
FROM Table
WHERE DATEDIFF(dd,0,dateField)=DATEDIFF(dd,0,GETDATE())

5. SELECT *
FROM Table
WHERE dateField >= DATEADD(dd,DATEDIFF(dd,0,GETDATE()),0)
AND dateField < DATEADD(dd,DATEDIFF(dd,0,GETDATE()),1)

Now lets compare the various methods
Method 1 has two problems. It involves a data type convertion to make value varchar whic strips off the time part and then do comparison between them. The convertion operation is costly and also use of CONVERT function in WHERE will make it Non Sargeable clause which will cause an index if present on dateField to be ignored. This can cause a performance hit especially when dateset is large.

Method 2 involves no conversion but uses integer comparison. The only problem with this is again use of functions in WHERE making them Non Sargeable predicates.

Method 3 is similar to Method 1 with only difference being conversion to date datatype rather than varchar. It has the same issues as Method 1

Method 4 is slightly better as it involves only an integer comparison based on DATEDIFF function. But it still has the issue of predicate being Non Sargeable due to use of functions in WHERE clause field.

Method 5 is an enhancement from Method 4 and relies upon integer sequence of dates. Those who've seen my earlier post on date generation from integers would already know fact that dates are stored as numbers internally in sql server with base date value of 1 Jan 1900 corresponding to 0 and unit value representing a day. This method has additional advantage of not using any functions over search field making it Sargeable.

Of all the above methods I prefer method 5 as it was found to be most effective out of the above in most occasions. The same method can be tweaked to get data over entire month,quarter,year etc.The concept is to make use of integer arithmetic to find period elapsed between required start date and base date and add it from base date to get interval start. Similarly for end date add period elapsed between base date and next period start date and use < operator to get data till closing interval preceding the resultant date.
My suggestion is to use Method 5 wherever possible to implement date comparison logic as it will enable the optimizer to use an index if present on date field which will result in better performance.