Thursday, February 11, 2010

Aggregating data over time slots

The purpose of this blog is to highlight two things

1. Generic solution for aggregating data over fixed time slots
2. Illustration of passing table valued parameters in SQL 2008

Quite often we've had occasions where we required to aggregate our data over time slots like average number of calls that came, summary of sales happened etc.This would typically for doing trend analysis over slots for identifying off peak and peak slots. Below given is a generic approach for dealing with such scenarios.
To illustrate this, I've created a function which accepts a table as a parameter. Table valued parameters are new feature that is introduced from SQL 2008 onwards.For creating a table valued parameter we need to first define a user defined table type in our db. For example scenario I'm creating a table type with structure as below

CREATE TYPE dbo.LogTable AS TABLE
(
ID int IDENTITY(1,1) NOT NULL,
LogTime datetime NOT NULL DEFAULT GETDATE(),
Value int,
PRIMARY KEY (ID)
)
GO

Consider the scenario where we have sales data with time and we require analysis of sales summary over fixed slots of time which we need to determine dynamically.
This can represented by the below sample data. Please note the type specified for the variable. It represents an instance of table type we created earlier.

DECLARE @MyTable LogTable

INSERT @MyTable (Value,LogTime)
SELECT FLOOR(100*RAND()+number) ,DATEADD(mi,-number,GETDATE())
FROM master..spt_values
where [type]='p'
and number BETWEEN 1 AND 10
SELECT * FROM @MyTable ORDER BY LogTime


Now create a function like below

IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME ='GetSlotTotals' AND ROUTINE_SCHEMA = 'dbo' AND ROUTINE_TYPE='FUNCTION')
DROP FUNCTION dbo.GetSlotTotals
GO
CREATE FUNCTION dbo.GetSlotTotals
(
@InputTable LogTable READONLY,
@SlotWidth int
)
RETURNS @RESULTS Table
(
ID int IDENTITY(1,1),
SlotStart datetime,
SlotEnd datetime,
ValSum int
)
AS
BEGIN


INSERT INTO @RESULTS (SlotStart,SlotEnd,ValSum)
SELECT DATEADD(mi,DATEDIFF(mi,0,LogTime)/@SlotWidth*@SlotWidth,0) As Start , DATEADD(mi,(DATEDIFF(mi,0,LogTime)/@SlotWidth+1)*@SlotWidth-1,0) AS [End], SUM(Value)
FROM @InputTable
GROUP BY DATEADD(mi,DATEDIFF(mi,0,LogTime)/@SlotWidth*@SlotWidth,0), DATEADD(mi,(DATEDIFF(mi,0,LogTime)/@SlotWidth+1)*@SlotWidth-1,0)

RETURN
END

The above function consists of two parameters one of table type created earlier to pass actual data table in example and another integer parameter representing span of slot we require

And call it like below for getting each of slot aggregates

DECLARE @MyTable LogTable

INSERT @MyTable (Value,LogTime)
SELECT FLOOR(100*RAND()+number) ,DATEADD(mi,-number,GETDATE())
FROM master..spt_values
where [type]='p'
and number BETWEEN 1 AND 10
SELECT * FROM @MyTable ORDER BY LogTime

SELECT SlotStart,SlotEnd,ValSum FROM dbo.GetSlotTotals(@MyTable,10) ORDER BY SlotStart




SELECT SlotStart,SlotEnd,ValSum FROM dbo.GetSlotTotals(@MyTable,5) ORDER BY SlotStart




SELECT SlotStart,SlotEnd,ValSum FROM dbo.GetSlotTotals(@MyTable,2) ORDER BY SlotStart