I have a set of transactions occurring at specific points in time:
CREATE TABLE Transactions (
TransactionDate Date NOT NULL,
TransactionValue Intege
John Gibb posted a fine answer, already accepted, but I wanted to expand on it a bit to:
This slight variation uses a recursive common table expression to establish the set of Dates representing the first of each month on or after from and to dates defined in DateRange. Note the use of the MAXRECURSION option to prevent a stack overflow (!); adjust as necessary to accommodate the maximum number of months expected. Also, consider adding alternative Dates assembly logic to support weeks, quarters, even day-to-day.
with
DateRange(FromDate, ToDate) as (
select
Cast('11/1/2008' as DateTime),
Cast('2/15/2010' as DateTime)
),
Dates(Date) as (
select
Case Day(FromDate)
When 1 Then FromDate
Else DateAdd(month, 1, DateAdd(month, ((Year(FromDate)-1900)*12)+Month(FromDate)-1, 0))
End
from DateRange
union all
select DateAdd(month, 1, Date)
from Dates
where Date < (select ToDate from DateRange)
)
select
d.Date, t.TransactionValue
from Dates d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.Date
order by TransactionDate desc
) t
option (maxrecursion 120);
To do it in a set-based way, you need sets for all of your data or information. In this case there's the overlooked data of "What months are there?" It's very useful to have a "Calendar" table as well as a "Number" table in databases as utility tables.
Here's a solution using one of these methods. The first bit of code sets up your calendar table. You can fill it using a cursor or manually or whatever and you can limit it to whatever date range is needed for your business (back to 1900-01-01 or just back to 1970-01-01 and as far into the future as you want). You can also add any other columns that are useful for your business.
CREATE TABLE dbo.Calendar
(
date DATETIME NOT NULL,
is_holiday BIT NOT NULL,
CONSTRAINT PK_Calendar PRIMARY KEY CLUSTERED (date)
)
INSERT INTO dbo.Calendar (date, is_holiday) VALUES ('2009-01-01', 1) -- New Year
INSERT INTO dbo.Calendar (date, is_holiday) VALUES ('2009-01-02', 1)
...
Now, using this table your question becomes trivial:
SELECT
CAST(MONTH(date) AS VARCHAR) + '/' + CAST(YEAR(date) AS VARCHAR) AS [Month],
T1.TransactionValue AS [Value]
FROM
dbo.Calendar C
LEFT OUTER JOIN dbo.Transactions T1 ON
T1.TransactionDate <= C.date
LEFT OUTER JOIN dbo.Transactions T2 ON
T2.TransactionDate > T1.TransactionDate AND
T2.TransactionDate <= C.date
WHERE
DAY(C.date) = 1 AND
T2.TransactionDate IS NULL AND
C.date BETWEEN '2009-01-01' AND '2009-12-31' -- You can use whatever range you want
-----Alternative way------
select
d.firstOfMonth,
MONTH(d.firstOfMonth) as Mon,
YEAR(d.firstOfMonth) as Yr,
t.TransactionValue
from (
select
dateadd( month, inMonths - 1, '1/1/2009') as firstOfMonth
from (
values (1), (2), (3), (4), (5), (7), (8), (9), (10), (11), (12)
) Dates(inMonths)
) d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.firstOfMonth
order by TransactionDate desc
) t
I'd start by building a Numbers table holding sequential integers from 1 to a million or so. They come in really handy once you get the hang of it.
For example, here is how to get the 1st of every month in 2008:
select firstOfMonth = dateadd( month, n - 1, '1/1/2008')
from Numbers
where n <= 12;
Now, you can put that together using OUTER APPLY to find the most recent transaction for each date like so:
with Dates as (
select firstOfMonth = dateadd( month, n - 1, '1/1/2008')
from Numbers
where n <= 12
)
select d.firstOfMonth, t.TransactionValue
from Dates d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.firstOfMonth
order by TransactionDate desc
) t;
This should give you what you're looking for, but you might have to Google around a little to find the best way to create the Numbers table.
If you do this type of analysis often, you might be interested in this SQL Server function I put together for exactly this purpose:
if exists (select * from dbo.sysobjects where name = 'fn_daterange') drop function fn_daterange;
go
create function fn_daterange
(
@MinDate as datetime,
@MaxDate as datetime,
@intval as datetime
)
returns table
--**************************************************************************
-- Procedure: fn_daterange()
-- Author: Ron Savage
-- Date: 12/16/2008
--
-- Description:
-- This function takes a starting and ending date and an interval, then
-- returns a table of all the dates in that range at the specified interval.
--
-- Change History:
-- Date Init. Description
-- 12/16/2008 RS Created.
-- **************************************************************************
as
return
WITH times (startdate, enddate, intervl) AS
(
SELECT @MinDate as startdate, @MinDate + @intval - .0000001 as enddate, @intval as intervl
UNION ALL
SELECT startdate + intervl as startdate, enddate + intervl as enddate, intervl as intervl
FROM times
WHERE startdate + intervl <= @MaxDate
)
select startdate, enddate from times;
go
it was an answer to this question, which also has some sample output from it.
I don't have access to BOL from my phone so this is a rough guide...
First, you need to generate the missing rows for the months you have no data. You can either use a OUTER join to a fixed table or temp table with the timespan you want or from a programmatically created dataset (stored proc or suchlike)
Second, you should look at the new SQL 2008 'analytic' functions, like MAX(value) OVER ( partition clause ) to get the previous value.
(I KNOW Oracle can do this 'cause I needed it to calculate compounded interest calcs between transaction dates - same problem really)
Hope this points you in the right direction...
(Avoid throwing it into a temp table and cursoring over it. Too crude!!!)