calendar tables

An Easter SQL

Posted on Updated on

Since Easter is nearly upon us, I got to thinking once again about how challenging it is to calculate the day that Easter Sunday falls on.  Since Calendar tables and their step-children Holiday tables are so interesting I thought I might write a sequel to my earlier blog link in this sentence.

There are at least three holidays (not necessarily business holidays) that are related to the calculation of Easter Sunday.  These are:

  • Good Friday is the Christian holiday that is the Friday just prior to Easter Sunday.
  • Ash Wednesday is another Christian holiday (the first day of Lent) that falls precisely 46 days prior to Easter Sunday.
  • Mardi Gras (a.k.a. Fat Tuesday) is a lesser known holiday celebrated by the more robust partyers among us.  I have a special affinity for it as I lived in New Orleans for a few years.

We will show you how to calculate them all and use that method to populate a Holidays lookup table so hopefully you’ll never have to do it again.  Happy Easter!

A Calendar and a Holiday Table

Discussions of Calendar tables in T-SQL are ubiquitous (here’s one by Todd Fifield), differing only in the columns actually stored in them.   We’ll draw upon the one we created in our blog to show you how easy it is here.

-- Create a Calendar table
SELECT [Date]=ISNULL([Date], 0), [Year]=ISNULL([Year], 0), [YrNN]=ISNULL([YrNN], 0)
    ,[YYYYMM]=ISNULL([YYYYMM], 0), [BuddhaYr], [Month], [Day]=ISNULL([Day], 0)
    ,[WkDNo], [WkDName], [WkDName2], [WkDName3], [JulDay], [JulWk]
    ,[WkNo], [Qtr], [Last], [LdOfMo], [LDtOfMo]
INTO dbo.Calendar 
-- This function can be obtained here: GenerateCalendar FUNCTION
FROM dbo.GenerateCalendar('1990-01-01', 65536);

-- Change column types to be NOT NULL so we can index them
--
-- Note: not needed thanks to the neat trick with ISNULL above
--ALTER TABLE dbo.Calendar ALTER COLUMN [Date] DATETIME NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [Year] INT NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [Month] INT NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [YYYYMM] INT NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [Day] INT NOT NULL;
GO

-- Build some representative INDEXes
ALTER TABLE dbo.Calendar 
ADD CONSTRAINT Cal_pk PRIMARY KEY([Date]);

ALTER TABLE dbo.Calendar 
ADD CONSTRAINT Cal_ix1 UNIQUE NONCLUSTERED([Year], [Month], [Day]);

ALTER TABLE dbo.Calendar 
ADD CONSTRAINT Cal_ix2 UNIQUE NONCLUSTERED([YYYYMM], [Day]);

SELECT TOP 10 [Date], [YYYYMM], [WkDName2], [WkNo], [Last]
FROM dbo.Calendar;

-- Now create a Holidays table, from the same blog
CREATE TABLE dbo.Holidays
(
    FromDate    DATETIME PRIMARY KEY
    ,ToDate     DATETIME
    ,Holiday    VARCHAR(100)
);

We’ve created a Holidays table that we’ll be using to further our sequel to Easter.  There are also a few other things in the above Data Definition Language (DDL) that are worthy of mention.

  • The GenerateCalendar FUNCTION is an in-line, Table Valued Function (iTVF) that appears in my earlier blog (and also in the resources file attached at the end of this article), which can be used in lieu of a Calendar table should you prefer to do so.  Once again, my thanks go out to SQL MVP Jeff Moden for his help optimizing the performance of the calculations done in that FUNCTION.
  • We’ve used a neat trick with ISNULL to set the NULL attribute of the column in the table we selected INTO (when Calendar gets CREATEd) to NOT NULL, so that we don’t need to run the subsequent ALTER statements on those columns.  I included those ALTER statements for your information.  Thanks again to Jeff Moden for showing me that trick (with honorable mention to SQL MCM Wayne Sheffield (MCM=Microsoft Certified Master) who came along just a few days later with the same information).

Since Easter Sunday is so diabolically difficult to calculate, many reputable sources have created look-up tables such as this one.  Drawing upon that source, we can create the dates for Easter Sunday from 1990 through 2020 in our Holidays table using the following script.

WITH EasterDates (d) AS
(
    SELECT '1990-04-15' UNION ALL SELECT '1991-03-31' UNION ALL SELECT '1992-04-19'
    UNION ALL SELECT '1993-04-11' UNION ALL SELECT '1994-04-03' 
    UNION ALL SELECT '1995-04-16' UNION ALL SELECT '1996-04-07' 
    UNION ALL SELECT '1997-03-30' UNION ALL SELECT '1998-04-12'
    UNION ALL SELECT '1999-04-04' UNION ALL SELECT '2000-04-23' 
    UNION ALL SELECT '2001-04-15' UNION ALL SELECT '2002-03-31' 
    UNION ALL SELECT '2003-04-20' UNION ALL SELECT '2004-04-11'
    UNION ALL SELECT '2005-03-27' UNION ALL SELECT '2006-04-16' 
    UNION ALL SELECT '2007-04-08' UNION ALL SELECT '2008-03-23' 
    UNION ALL SELECT '2009-04-12' UNION ALL SELECT '2010-04-04'
    UNION ALL SELECT '2011-04-24' UNION ALL SELECT '2012-04-08' 
    UNION ALL SELECT '2013-03-31' UNION ALL SELECT '2014-04-20' 
    UNION ALL SELECT '2015-04-05' UNION ALL SELECT '2016-03-27'
    UNION ALL SELECT '2017-04-16' UNION ALL SELECT '2018-04-01' 
    UNION ALL SELECT '2019-04-21' UNION ALL SELECT '2020-04-12'
)
INSERT INTO dbo.Holidays (FromDate, ToDate, Holiday)
SELECT FromDate=d, ToDate=d, Holiday='Easter Sunday ' + CAST(YEAR(d) AS CHAR(4))
FROM EasterDates;

Before we add Good Friday, Ash Wednesday and Fat Tuesday to our Holiday table, we’d like to check that the introductory description at least results in the correct day of the week on which the holiday falls.  We can use our Calendar table for this.

SELECT FromDate = a.FromDate+b.[days]
    ,ToDate     = a.FromDate+b.[days]
    ,b.holiday
    ,c.WkDName
FROM dbo.Holidays a
CROSS APPLY
(
    VALUES(-2, 'Good Friday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-46, 'Ash Wednesday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-47, 'Mardi Gras (Fat Tuesday) ' + CAST(YEAR(FromDate) AS CHAR(4)))
) b ([days], holiday)
JOIN dbo.Calendar c ON c.[Date] = a.FromDate+b.[days] 
WHERE a.Holiday LIKE 'Easter Sunday%';

This produces results like this (only the first two years are shown):

FromDate                ToDate                  holiday                        WkDName
1990-04-13 00:00:00.000 1990-04-13 00:00:00.000 Good Friday 1990               Friday   
1990-02-28 00:00:00.000 1990-02-28 00:00:00.000 Ash Wednesday 1990             Wednesday
1990-02-27 00:00:00.000 1990-02-27 00:00:00.000 Mardi Gras (Fat Tuesday) 1990  Tuesday  
1991-03-29 00:00:00.000 1991-03-29 00:00:00.000 Good Friday 1991               Friday   
1991-02-13 00:00:00.000 1991-02-13 00:00:00.000 Ash Wednesday 1991             Wednesday
1991-02-12 00:00:00.000 1991-02-12 00:00:00.000 Mardi Gras (Fat Tuesday) 1991  Tuesday

So now we can be confident to insert those holidays into our Holidays table.

INSERT INTO dbo.Holidays (FromDate, ToDate, Holiday)
SELECT FromDate = a.FromDate+b.[days]
    ,ToDate     = a.FromDate+b.[days]
    ,b.holiday
FROM dbo.Holidays a
CROSS APPLY
(
    VALUES(-2, 'Good Friday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-46, 'Ash Wednesday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-47, 'Mardi Gras (Fat Tuesday) ' + CAST(YEAR(FromDate) AS CHAR(4)))
) b ([days], holiday)
WHERE a.Holiday LIKE 'Easter Sunday%';

A couple of notes on the data types used in our Calendar and Holidays tables:

  • The Date column of the Calendar table was chosen to be DATETIME to retain compatibility with SQL 2005 (corresponding to the GenerateCalendar FUNCTION).  You could just as easily use a DATE datatype if you’re running SQL 2008 or later.
  • The FromDate/ToDate data types are also both DATETIME.  Here it could be considered important to keep the DATETIME data type to record cases where your business offers half-day holidays.  For example, I’ve seen this for Christmas Eve and New Year’s Eve.
  • You can also define a different FromDate than ToDate in the Holidays table to indicate a holiday that spans several days, like the Thailand Songkran holiday example from my blog.
“Since SQL is a database language, we prefer to do look ups and not calculations.” 
    –- Joe Celko (prolific author of books on things SQL)

There you go Joe, we’ve honored your preference!  Now we can do lookups of not just Easter Sunday, but also Good Friday, Ash Wednesday and Mardi Gras, the latter potentially being essential to planning your next trip to New Orleans.

The Calculation of Easter Sunday

A somewhat accurate description of the calculation of Easter Sunday is:

“The Sunday following the full Moon which falls on or after the equinox (the Paschal full moon) will give the lawful Easter.”
    –- From the Wikipedia link on Easter Sunday above

I say “somewhat accurate” because there’s all sorts of ifs, buts and caveats in that description, which you can read about in the Wikipedia link, however it will suffice for the time being.

The formal description of the calculation of Easter is known as Computus (no relation to Locutus of Borg), and is provided by Wikipedia at the link shown.  We shall observe this calculation in what follows.  While the formulas given are indicated to be valid for the years 1900 to 2199, since our Calendar table has been populated with years 1990-2169 we’ll limit our calculations of Easter Sunday to just those the years.  Years 1990 through 2020 will be checked against the lookup table we’ve already provided.

Of course, this calculation appears often enough in the literature and implemented in T-SQL (for example: here and here), often using methods I’d prefer to avoid due to performance.  Since our personal mantra is: “No Loops.  No Cursors.  No RBAR. Hoo-uh!” we shall endeavor to live up to that mantra today.  Think about that term “RBAR” and if you’ve never heard of it before, you’ll find it mentioned in the Conclusion to this article.

We’ll start by attempting to replicate the tabular example for calculating the Paschal full moon (PFMD) within the Gregorian calendar shown by Computus on that Wikipedia page.
 photo Gregorian Calendar for PFMD_zpsycqyrika.png
Note that the algorithm described above is a step-by-step or procedural algorithm. Here is where many SQL implementations fail because they don’t try to convert it to one that is set-based. To do this, you must think of each step in the procedural algorithm as a query, and as you proceed from one step to the next, each new step is another outer layer to your set-based query.

This is easier shown that described, so here is that query for 2013-2032.

SELECT [Year], [Y MOD 19], Addendum
    ,PFMD       = CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END
    ,[Month]    = CASE WHEN PFMD > 31 THEN 'Apr' ELSE 'Mar' END
    ,[MonthNo]  = CASE WHEN PFMD > 31 THEN 4 ELSE 3 END
FROM
(
    SELECT [Year], [Y MOD 19], Addendum
        ,PFMD   = (45-([Y MOD 19]*11)%30+Addendum)
    FROM
    (
        SELECT [Year]
            ,[Y MOD 19] = [Year]%19
            ,Addendum   = CASE [Year]%19 
                            WHEN 5 THEN 29 
                            WHEN 16 THEN 29 
                            WHEN 8 THEN 30 
                            ELSE 0 
                            END 
        FROM dbo.Calendar
        WHERE [Month] = 1 AND [Day] = 1
    ) a
) a
WHERE [Year] BETWEEN 2013 AND 2032;

Note how the Addendum (CASE) implements the first two IFs (second and third rows) of the procedural algorithm. The results from this query for just the first 10 years are:

Year  Y MOD 19  Addendum  PFMD  Month  MonthNo
2013  18        0         27    Mar    3
2014  0         0         14    Apr    4
2015  1         0         3     Apr    4
2016  2         0         23    Mar    3
2017  3         0         11    Apr    4
2018  4         0         31    Mar    3
2019  5         29        18    Apr    4
2020  6         0         8     Apr    4
2021  7         0         28    Mar    3
2022  8         30        16    Apr    4

If we take PFMD and add the first letter of the month column that follows (for example for 2013: 27M) we see exact correspondence with the row from the Computus table (graphic) labeled the “Paschal full moon date,” including for the years not shown in the results set.  So it appears that we have rather simply converted the formula for calculating PFMD from Computus to a SQL script without too much effort.

Now we’ll use our Calendar table yet again, along with a little additional magic:

  • Put the pieces of the Paschal full moon date back together into a date using a formula that includes a couple of calls to the T-SQL built-in function DATEADD.
  • Do a CROSS APPLY (explained in Part 1 and Part 2 by SQL MVP Paul White) to pick up the Sunday after the Paschal full moon, which is Easter Sunday.
WITH CalculatePaschalFullMoon AS
(
    -- Original query to calculate the components of the 
    -- Paschal Full Moon date
    SELECT a.[Year], [Y MOD 19], Addendum
        ,PFMD       = CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END
        ,[Month]    = CASE WHEN PFMD > 31 THEN 'Apr' ELSE 'Mar' END
        ,[MonthNo]  = CASE WHEN PFMD > 31 THEN 4 ELSE 3 END
        -- Put the date/time pieces together to get a real DATETIME for PFMD
        ,[PFMD2]    =
            DATEADD(day, CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END-1 -- PFMD
                ,DATEADD(month, CASE WHEN PFMD > 31 THEN 4 ELSE 3 END-1   -- [MonthNo]
                    , a.[Date]))
    FROM
    (
        SELECT [Year], [Y MOD 19], Addendum, [Date]
            ,PFMD   = (45-([Y MOD 19]*11)%30+Addendum)
        FROM
        (
            SELECT [Year]
                ,[Y MOD 19] = [Year]%19
                ,Addendum   = CASE [Year]%19 
                                WHEN 5 THEN 29 
                                WHEN 16 THEN 29 
                                WHEN 8 THEN 30 
                                ELSE 0 
                                END
                ,[Date] 
            FROM dbo.Calendar
            WHERE [Month] = 1 AND [Day] = 1
        ) a
    ) a
)
SELECT a.[Year]
    ,PaschalFullMoon=CAST(a.PFMD2 AS DATE)
    ,EasterSunday
FROM CalculatePaschalFullMoon a
-- Easter Sunday follows the Paschal Full Moon date
-- so pick that up from the Calendar table
CROSS APPLY
(
    SELECT TOP 1 EasterSunday=CAST([Date] AS DATE)
    FROM dbo.Calendar b
    WHERE b.[Date] > a.PFMD2 AND b.WkDName3 = 'Sun'
    ORDER BY [Date]
) b
WHERE [Year] BETWEEN 2013 AND 2032;

 

Now for a quiz on the above code where the lines are bold:

Why do we subtract 1 from each of the CASE statements that appear as the second argument to DATEADD?

We’ll give you the answer to this in the Conclusion section of this article, so stay tuned…

Here are the first ten years of the results set from the last T-SQL query shown above:

Year  PaschalFullMoon  EasterSunday
2013  2013-03-27       2013-03-31
2014  2014-04-14       2014-04-20
2015  2015-04-03       2015-04-05
2016  2016-03-23       2016-03-27
2017  2017-04-11       2017-04-16
2018  2018-03-31       2018-04-01
2019  2019-04-18       2019-04-21
2020  2020-04-08       2020-04-12
2021  2021-03-28       2021-04-04
2022  2022-04-16       2022-04-17

Our Computus Wikipedia page also has the inset table (at right) presenting the Western calculation of Easter Sunday, which corresponds quite nicely to our calculation of Easter Sunday.

That really wasn’t so hard was it? Now comes the packaging into a tool.

Since we like to keep things clean, we’re going to make this wonderful bit of SQL code into an in-line, table-valued function that we can call anytime we need to calculate Easter Sunday for a specific year.

We’ll then continue on to use our tool to calculate the other holidays by using the known number of days offset, just like we did before with our Easter Sunday lookup table, and insert that result into our Holidays table for years that are missing.

 photo Easter Sundays_zpsrruxck4b.png
CREATE FUNCTION CalculateEasterSunday
(
    @Year       INT
)
-- Calculate Easter Sunday for any given year
RETURNS TABLE WITH SCHEMABINDING
AS RETURN

WITH CalculatePaschalFullMoon AS
(
    -- Original query to calculate the components of the 
    -- Paschal Full Moon date
    SELECT a.[Year] --, [Y MOD 19], Addendum
        --,PFMD       = CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END
        --,[Month]    = CASE WHEN PFMD > 31 THEN 'Apr' ELSE 'Mar' END
        --,[MonthNo]  = CASE WHEN PFMD > 31 THEN 4 ELSE 3 END
        -- Put the date/time pieces together to get a real 
        -- DATETIME for PFMD
        ,[PFMD2]    =
            DATEADD(day, CASE WHEN PFMD > 31 
                                THEN PFMD-31 
                                ELSE PFMD 
                                END-1           -- PFMD
                ,DATEADD(month, CASE 
                                WHEN PFMD > 31 
                                THEN 4 
                                ELSE 3 
                                END-1           -- [MonthNo]
                    , a.[Date]))
    FROM
    (
        SELECT [Year], [Y MOD 19], Addendum, [Date]
            ,PFMD   = (45-([Y MOD 19]*11)%30+Addendum)
        FROM
        (
            SELECT [Year]
                ,[Y MOD 19] = [Year]%19
                ,Addendum   = CASE [Year]%19 
                                WHEN 5 THEN 29 
                                WHEN 16 THEN 29 
                                WHEN 8 THEN 30 
                                ELSE 0 
                                END
                ,[Date] 
            FROM dbo.Calendar
            WHERE [Month] = 1 AND [Day] = 1 AND [Year] = @Year
        ) a
    ) a
)
SELECT a.[Year]
    ,PaschalFullMoon=CAST(a.PFMD2 AS DATE)
    ,EasterSunday
FROM CalculatePaschalFullMoon a
-- Easter Sunday follows the Paschal Full Moon date
-- so pick that up from the Calendar table
CROSS APPLY
(
    SELECT TOP 1 EasterSunday=CAST([Date] AS DATE)
    FROM dbo.Calendar b
    WHERE b.[Date] > a.PFMD2 AND b.WkDName3 = 'Sun'
    ORDER BY [Date]
) b;

You can see where I commented out a few lines of code in the FUNCTION that represent intermediate results that our FUNCTION doesn’t really need to retain. Notice also how the filter on the @Year parameter was moved up into the innermost SELECT from the Calendar table, because when using the FUNCTION you’ll be selecting only one row at a time at that point. Calling the iTVF using CROSS APPLY in-lines the code to make this a non-RBAR query.

Now we can use our FUNCTION to check that the dates we’ve prior stored for Easter Sunday in our Holidays table are correct.

SELECT HolidayFromDate  = CAST(a.FromDate AS DATE)
    ,b.EasterSunday
    ,[Check]            = CASE 
                            WHEN a.FromDate = b.EasterSunday 
                            THEN 'MATCHED!' 
                            ELSE 'WRONG!' 
                            END
FROM dbo.Holidays a
CROSS APPLY dbo.CalculateEasterSunday(YEAR(a.FromDate)) b
WHERE Holiday LIKE 'Easter Sunday%';

The first ten years of our results show us all MATCHED! (and so do the others that are not shown).

HolidayFromDate  EasterSunday   Check
1990-04-15       1990-04-15     MATCHED!
1991-03-31       1991-03-31     MATCHED!
1992-04-19       1992-04-19     MATCHED!
1993-04-11       1993-04-11     MATCHED!
1994-04-03       1994-04-03     MATCHED!
1995-04-16       1995-04-16     MATCHED!
1996-04-07       1996-04-07     MATCHED!
1997-03-30       1997-03-30     MATCHED!
1998-04-12       1998-04-12     MATCHED!
1999-04-04       1999-04-04     MATCHED!

So we can be reasonably confident of our Easter Sunday calculation for the years indicated by Computus (1900-2199)!

Using our FUNCTION to Create Lookup Values in our Holidays Table

Now to keep Joe Celko happy (i.e., so we can do lookups instead of calculations), let’s insert the rest of our holidays, including Easter Sunday, Good Friday, Ash Wednesday and Mardi Gras, into our Holidays table, using all the years available to us in our Calendar table.

INSERT dbo.Holidays (FromDate, ToDate, Holiday)
SELECT FromDate, ToDate, holiday
FROM
(
    SELECT FromDate = DATEADD(day, c.days, b.EasterSunday)
        ,ToDate     = DATEADD(day, c.days, b.EasterSunday)
        ,c.holiday
    FROM dbo.Calendar a
    CROSS APPLY dbo.CalculateEasterSunday(a.[Year]) b
    CROSS APPLY
    (
        VALUES (0, 'Easter Sunday ' + CAST(a.[Year] AS CHAR(4)))
            ,(-2, 'Good Friday ' + CAST(a.[Year] AS CHAR(4)))
            ,(-46, 'Ash Wednesday ' + CAST(a.[Year] AS CHAR(4)))
            ,(-47, 'Mardi Gras (Fat Tuesday) ' + CAST(a.[Year] AS CHAR(4)))
    ) c ([days], holiday)
    WHERE a.[Month] = 1 AND a.[Day] = 1
) a
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Holidays x
    WHERE x.FromDate = a.FromDate AND x.ToDate = a.ToDate
);

This added 596 holidays to our table, and I’d guess we’re pretty well set for our lifetimes!

Notice how, in the above query we had to use DATEADD instead of simple arithmetic as we did before when using the DATETIME data type.  This is because simple arithmetic doesn’t work with the DATE data type returned by our CalculateEasterSunday function.

Conclusion and What we Learned

While Joe likes lookups, I find calculations in T-SQL to be quite a bit of fun.  In this article (as in my blog on Calendar tables), we got to do a little of both.

If you went ahead and read my blog on Calendar tables in conjunction with this article, I am pleased because you probably learned a lot about how useful they can be, particularly in conjunction with a Holidays table. In this article (in the FUNCTION) we used our Calendar table to find the next Sunday occurring after our calculated date of the Paschal full moon.

Besides learning about the trivialities of the relationship between Easter, Good Friday, Ash Wednesday, and planning for your next visit to New Orleans for Mardi Gras, we’d like to think that you may have learned a few other things as well:

  • You can use simple arithmetic on a DATETIME data type to add or subtract days, however you must use DATEADD to do the same thing with a DATE data type.
  • You now have efficient tools in your toolbox to generate a calendar table (GenerateCalendar) and CalculateEasterSunday, whenever you have a need to use them.  The latter function is also included in the resources file attached at the end of this article.
  • You learned how to CROSS APPLY VALUES to do an UNPIVOT operation, even though I didn’t specifically mention that was what we were doing when we used it (where we created rows with dates for Good Friday, Ash Wednesday and Mardi Gras).  If you haven’t seen that method before you may want to look into that link as well. That is something that will only work in SQL 2008 (or later), but there is an equivalent construct that will work in SQL 2005.
  • We learned how to use ISNULL on a SELECT/INTO operation, to set the attribute of the column created in the target table to NOT NULL.
  • You learned how to think your way through a rather complex, procedural calculation and convert it into set-based code, by wrapping each successive step in another layer of query.
  • You learned my personal SQL mantra, in case you hadn’t seen that before.

In case you’ve been wondering what RBAR stands for, it means “Row By Agonizing Row” and it is a term coined by Jeff Moden which I fortuitously picked up on before I got to know him.

Oh yes, then there’s the quiz I mentioned. There are two cases:

  • If we’re starting with 01 Jan of a year as our date, and the Paschal full moon month is three (meaning it should fall in Mar), if we add three months to 01 Jan it puts us in Apr. So we subtract one month to get us back to where we need to be.
  • If we’re starting with 01 Mar of a year as our date, and the Paschal full moon day within that month is (say) 14 (meaning it should fall on 14 Mar), if we add 14 days to 01 Mar it puts us in 15 Mar. So we subtract one day to get us back to where we need to be.

After all this, I still have no clue what a Paschal full moon is but what the heck, at least I know I can calculate when it is!  That’s all for today folks.  I hope you found this article useful and just a bit of fun.

Follow me on Twitter: @DwainCSQL

© Copyright Dwain Camps 01 Apr 2015.  All rights reserved.

Resources for An Easter SQL.doc

The Resources above are stored in a Microsoft Word (TM) document. There are two sections:

-- Create GenerateCalendar, Calendar and Holiday Tables and Populate.sql
... page break ...
-- Create CalculateEasterSunday and Add Holidays.sql
Advertisement

Calendar Tables in T-SQL

Posted on

In an earlier blog, we covered a type of auxiliary table (the Tally Table) that can provide a lot of querying flexibility if you have one in your database or construct one in-line to your query. Today we’re going to talk about another: the Calendar table.

The basic concept behind a Calendar table is that each row is a date and the columns in the table represent complex date calculations that otherwise you’d need to perform manually in your query. If the calculations are done beforehand, they can tremendously speed up your query because you can avoid them when you need them.

We’re also going to show you a really neat and fast FUNCTION for generating a Calendar table that you could use instead of having a pre-defined, auxiliary table in your database.

The GenerateCalendar FUNCTION

We’ll start by proposing a utility FUNCTION for performing various complex date calculations all in one shot. Originally I had the idea that this might be nearly as fast as having a separate Calendar table. A friend of mine, SQL MVP Jeff Moden, got wind of what I was doing and he rewrote my version into one that was significantly faster. So without further ado, here is that FUNCTION.

CREATE FUNCTION [dbo].[GenerateCalendar] 
        (
        @FromDate   DATETIME 
        ,@NoDays    INT			
        )
-- Generates a calendar table with sequential day numbering (@FromDate = SeqNo 1).
-- See RETURNS table (comments) for meaning of each column.
-- Notes:       1) Max for NoDays is 65536, which runs in just over 2 seconds.
--
-- Example calls to generate the calendar:
-- 1) Forward for 365 days starting today:
--             DECLARE @Date DATETIME
--             SELECT @Date = GETDATE()
--             SELECT * 
--             FROM dbo.GenerateCalendar(@Date, 365) 
--             ORDER BY SeqNo;
-- 2) Backwards for 365 days back starting today:
--             DECLARE @Date DATETIME
--             SELECT @Date = GETDATE()
--             SELECT * 
--             FROM dbo.GenerateCalendar(@Date, -365) 
--             ORDER BY SeqNo;
-- 3) For only the FromDate:
--             DECLARE @Date DATETIME
--             SELECT @Date = GETDATE()
--             SELECT * 
--             FROM dbo.GenerateCalendar(@Date, 1);
-- 4) Including only the last week days of each month:
--             Note: Seq no in this case are as if all dates were generated
--             DECLARE @Date DATETIME
--             SELECT @Date = GETDATE()
--             SELECT * 
--             FROM dbo.GenerateCalendar(@Date, 365) 
--             WHERE Last = 1 ORDER BY SeqNo;
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== High speed code provided courtesy of SQL MVP Jeff Moden (idea by Dwain Camps)
--===== Generate sequence numbers from 1 to 65536 (credit to SQL MVP Itzik Ben-Gen)
   WITH  E1(N) AS (SELECT 1 UNION ALL SELECT 1), --2 rows
         E2(N) AS (SELECT 1 FROM E1 a, E1 b),    --4 rows
         E4(N) AS (SELECT 1 FROM E2 a, E2 b),    --16 rows
         E8(N) AS (SELECT 1 FROM E4 a, E4 b),    --256 rows
        E16(N) AS (SELECT 1 FROM E8 a, E8 b),    --65536 rows
   cteTally(N) AS (
SELECT TOP (ABS(@NoDays)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E16)
        -- [SeqNo]=Sequential day number (@FromDate always=1) forward or backwards
 SELECT [SeqNo]     = t.N,
        -- [Date]=Date (with 00:00:00.000 for the time component)                              
        [Date]      = dt.DT,  
        -- [Year]=Four digit year                                  
        [Year]      = dp.YY,
        -- [YrNN]=Two digit year                                    
        [YrNN]      = dp.YY % 100,
        -- [YYYYMM]=Integer YYYYMM (year * 100 + month)                              
        [YYYYMM]    = dp.YY * 100 + dp.MM,
        -- [BuddhaYr]=Year in Buddhist calendar                      
        [BuddhaYr]  = dp.YY + 543, 
        -- [Month]=Month (as an INT)                             
        [Month]     = dp.MM, 
        -- [Day]=Day (as an INT)                                   
        [Day]       = dp.DD,
        -- [WkDNo]=Week day number (based on @@DATEFIRST)                                    
        [WkDNo]     = DATEPART(dw,dt.DT),
        -- Next 3 columns dependent on language setting so may not work for non-English  
        -- [WkDName]=Full name of the week day, e.g., Monday, Tuesday, etc.                     
        [WkDName]   = CONVERT(NCHAR(9),dp.DW), 
        -- [WkDName2]=Two characters for the week day, e.g., Mo, Tu, etc.                 
        [WkDName2]  = CONVERT(NCHAR(2),dp.DW),  
        -- [WkDName3]=Three characters for the week day, e.g., Mon, Tue, etc.                
        [WkDName3]  = CONVERT(NCHAR(3),dp.DW),  
        -- [JulDay]=Julian day (day number of the year)                
        [JulDay]    = dp.DY,
        -- [JulWk]=Week number of the year                                    
        [JulWk]     = dp.DY/7+1,
        -- [WkNo]=Week number                                
        [WkNo]      = dp.DD/7+1,
        -- [Qtr]=Quarter number (of the year)                                
        [Qtr]       = DATEPART(qq,dt.Dt),                       
        -- [Last]=Number the weeks for the month in reverse      
        [Last]      = (DATEPART(dd,dp.LDtOfMo)-dp.DD)/7+1,
        -- [LdOfMo]=Last day of the month                  
        [LdOfMo]    = DATEPART(dd,dp.LDtOfMo),
        -- [LDtOfMo]=Last day of the month as a DATETIME
        [LDtOfMo]   = dp.LDtOfMo                                
   FROM cteTally t
  CROSS APPLY 
  ( --=== Create the date
        SELECT DT = DATEADD(dd,(t.N-1)*SIGN(@NoDays),@FromDate)
  ) dt
  CROSS APPLY 
  ( --=== Create the other parts from the date above using a "cCA"
    -- (Cascading CROSS APPLY (cCA), courtesy of Chris Morris)
        SELECT YY        = DATEPART(yy,dt.DT), 
                MM        = DATEPART(mm,dt.DT), 
                DD        = DATEPART(dd,dt.DT), 
                DW        = DATENAME(dw,dt.DT),
                Dy        = DATEPART(dy,dt.DT),
                LDtOfMo   = DATEADD(mm,DATEDIFF(mm,-1,dt.DT),-1)

  ) dp;

If you’ve read my blog on Tally tables, you’ll recognize the Ben-Gan style in-line Tally table that this FUNCTION uses (which will work in SQL 2005), and that it is limited to generating 65,536 rows (days) for your calendar table. You may also note that it is a schema-bound, in-line Table Valued FUNCTION (iTVF), done that way to get the best possible performance out of it. Let’s see how we would use it to populate a Calendar table starting on 01 Jan 2010.

SELECT [Date], [Year], [YrNN], [YYYYMM], [BuddhaYr], [Month], [Day]
    ,[WkDNo], [WkDName], [WkDName2], [WkDName3], [JulDay], [JulWk]
    ,[WkNo], [Qtr], [Last], [LdOfMo], [LDtOfMo]
INTO dbo.Calendar 
FROM dbo.GenerateCalendar('2010-01-01', 65536);

-- Change column types to be NOT NULL so we can index them
ALTER TABLE dbo.Calendar ALTER COLUMN [Date] DATETIME NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [Year] INT NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [Month] INT NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [YYYYMM] INT NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [Day] INT NOT NULL;
GO
-- Build some representative INDEXes
ALTER TABLE dbo.Calendar ADD CONSTRAINT Cal_pk PRIMARY KEY([Date]);
ALTER TABLE dbo.Calendar ADD CONSTRAINT Cal_ix1 UNIQUE NONCLUSTERED([Year], [Month], [Day]);
ALTER TABLE dbo.Calendar ADD CONSTRAINT Cal_ix2 UNIQUE NONCLUSTERED([YYYYMM], [Day]);

SELECT TOP 10 [Date], [YYYYMM], [WkDName2], [WkNo], [Last]
FROM dbo.Calendar;

The final SELECT returns selected columns that we’ll use in some examples that follow for just the first few rows of the table. Note that the CLUSTERED INDEX on [Date] makes it pretty likely that these rows will be returned in date sequence. We’ve also included a couple of UNIQUE NONCLUSTERED INDEXes to support the ways we may want to retrieve data from this table.

Date                    YYYYMM   WkDName2 WkNo  Last
2010-01-01 00:00:00.000 201001   Fr       1     5
2010-01-02 00:00:00.000 201001   Sa       1     5
2010-01-03 00:00:00.000 201001   Su       1     5
2010-01-04 00:00:00.000 201001   Mo       1     4
2010-01-05 00:00:00.000 201001   Tu       1     4
2010-01-06 00:00:00.000 201001   We       1     4
2010-01-07 00:00:00.000 201001   Th       2     4
2010-01-08 00:00:00.000 201001   Fr       2     4
2010-01-09 00:00:00.000 201001   Sa       2     4
2010-01-10 00:00:00.000 201001   Su       2     4

The last entry in this calendar table is 2189-06-06, which is probably far enough into the future for any reasonable use. Note also that the columns [Date] and [LDtOfMo] could as easily have been CAST to a DATE data type, assuming we’re working in SQL 2008 or later because we’ll never be interested in the time component.

Simple Cases of Using our Calendar Table

While these are “simple” cases for using our Calendar table, try doing them without one and you might come to the conclusion that they are not so simple.

Suppose we are interested in finding the first Monday of each month in the year 2014. The WkNo makes this really simple.

SELECT [Date]
FROM dbo.Calendar
WHERE [Date] BETWEEN ‘2014-01-01’ AND ‘2014-12-31’ AND
    [WkDName2] = ‘MO’ AND [WkNo] = 1;

Using the [Date] field as we have done ensures this query uses a Clustered Index Seek that returns these results.

Date
2014-01-06 00:00:00.000
2014-02-03 00:00:00.000
2014-03-03 00:00:00.000
2014-05-05 00:00:00.000
2014-06-02 00:00:00.000
2014-08-04 00:00:00.000
2014-09-01 00:00:00.000
2014-10-06 00:00:00.000
2014-11-03 00:00:00.000
2014-12-01 00:00:00.000

The interested reader may confirm that these are all Mondays.

Likewise the [Last] column, which numbers the days backwards as weeks within the month, is designed specifically to address a requirement like “list all of the dates that are the last Friday in a month.”

SELECT [Date]
FROM dbo.Calendar
WHERE [Date] BETWEEN ‘2014-01-01’ AND ‘2014-12-31’ AND
    [WkDName2] = ‘FR’ AND [Last] = 1;

Once again, because of the way we are using the [Date] in the WHERE filter, the result is a Clustered Index Seek, returning these results.

Date
2014-01-31 00:00:00.000
2014-02-28 00:00:00.000
2014-03-28 00:00:00.000
2014-04-25 00:00:00.000
2014-05-30 00:00:00.000
2014-06-27 00:00:00.000
2014-07-25 00:00:00.000
2014-08-29 00:00:00.000
2014-09-26 00:00:00.000
2014-10-31 00:00:00.000
2014-11-28 00:00:00.000
2014-12-26 00:00:00.000

Another requirement that can be a bit challenging is to calculate the last date of a particular month. Consider the query and results below listing the first and last day of each month, and specifically how it returns the leap day for February.

SELECT [Date], [LDtOfMo]
FROM dbo.Calendar
WHERE [Date] BETWEEN '2012-01-01' AND '2012-12-31' AND
    [Day] = 1;
Date                    LDtOfMo
2012-01-01 00:00:00.000 2012-01-31 00:00:00.000
2012-02-01 00:00:00.000 2012-02-29 00:00:00.000
2012-03-01 00:00:00.000 2012-03-31 00:00:00.000
2012-04-01 00:00:00.000 2012-04-30 00:00:00.000
2012-05-01 00:00:00.000 2012-05-31 00:00:00.000
2012-06-01 00:00:00.000 2012-06-30 00:00:00.000
2012-07-01 00:00:00.000 2012-07-31 00:00:00.000
2012-08-01 00:00:00.000 2012-08-31 00:00:00.000
2012-09-01 00:00:00.000 2012-09-30 00:00:00.000
2012-10-01 00:00:00.000 2012-10-31 00:00:00.000
2012-11-01 00:00:00.000 2012-11-30 00:00:00.000
2012-12-01 00:00:00.000 2012-12-31 00:00:00.000

The real utility of having a Calendar table will be clearer from the example described in the next section.

Using the Calendar Table to Fill in a Missing Month

Let’s suppose that you have a table that contains sales orders. We’ll simplify the situation slightly by ignoring the header/detail complexity of a true sales order, and just assume that the total for the order appears on the Sales order’s header.

CREATE TABLE #SalesOrders
(
    SONumber        INT IDENTITY PRIMARY KEY
    ,SODate         DATE
    ,Customer       VARCHAR(20)
    ,SalesAmount    MONEY
);

INSERT INTO #SalesOrders
VALUES('2013-01-15', 'AIS', 25000)
    ,('2013-01-31', 'AIS', 15000)
    ,('2013-03-01', 'AIS', 20000)
    ,('2013-03-02', 'AIS', 12000)
    ,('2013-03-05', 'AIS', 10000)
    ,('2013-01-15', 'DTAC', 25000)
    ,('2013-03-01', 'DTAC', 5000)
    ,('2013-03-15', 'DTAC', 11000);

SELECT *
FROM #SalesOrders;

We can see by inspecting the results that follow, that no sales were made in February.

SONumber  SODate       Customer  SalesAmount
1         2013-01-15   AIS       25000.00
2         2013-01-31   AIS       15000.00
3         2013-03-01   AIS       20000.00
4         2013-03-02   AIS       12000.00
5         2013-03-05   AIS       10000.00
6         2013-01-15   DTAC      25000.00
7         2013-03-01   DTAC      5000.00
8         2013-03-15   DTAC      11000.00

Suppose that our requirement was to list total sales by month for the first quarter of 2013. We can use our Calendar table to fill in the missing month.

SELECT [YYYYMM], Sales=ISNULL(SUM(SalesAmount), 0)
FROM dbo.Calendar a
LEFT JOIN #SalesOrders b ON a.[Date] = DATEADD(month, DATEDIFF(month, 0, SODate), 0)
WHERE [Date] BETWEEN '2013-01-01' AND '2013-03-31' AND
    [Day] = 1
GROUP BY [YYYYMM];

In our Tally tables blog entry, we showed you how you could truncate a DATETIME value to either the day or hour. DATEADD(month, DATEDIFF(month, 0, SODate), 0) simply truncates the DATE to the month (setting the day to the first of the month). It should come as no surprise to our readers that the results from this query exactly matches the requirement that we specified.

YYYYMM Sales
201301 65000.00
201302 0.00
201303 58000.00

We’ll leave it as an exercise for our interested readers to figure out how to list sales by company by month (so that there should be two rows for February each with a value of zero).

Note that just like a Tally table; a Calendar table can also be used to expand a row set if that happens to be what you need to do.

Using the Calendar Table in Conjunction with a Holidays Table

Suppose we have a table of company holidays for 2014 (this company is located in Thailand):

CREATE TABLE dbo.Holidays
(
    FromDate    DATETIME PRIMARY KEY
    ,ToDate     DATETIME
    ,Holiday    VARCHAR(100)
);

INSERT INTO dbo.Holidays
SELECT '2014-01-01', NULL, 'New Year''s Day'
UNION ALL SELECT '2014-02-14', NULL, 'Makha Bucha Day'
UNION ALL SELECT '2014-04-07', NULL, 'Substitution for Chakri Day'
UNION ALL SELECT '2014-04-14', '2014-04-15', 'Songkran Festival'
UNION ALL SELECT '2014-05-01', NULL, 'National Labor Day'
UNION ALL SELECT '2014-05-05', NULL, 'Coronation Day'
UNION ALL SELECT '2014-05-13', NULL, 'Wisakha Bucha Day'
UNION ALL SELECT '2014-07-11', NULL, 'Asarnha Bucha Day'
UNION ALL SELECT '2014-08-12', NULL, 'HM the Queen''s Birthday'
UNION ALL SELECT '2014-10-23', NULL, 'Chulalongkorn Day'
UNION ALL SELECT '2014-12-05', NULL, 'HM the King''s Birthday'
UNION ALL SELECT '2014-12-10', NULL, 'Constitution Day'
UNION ALL SELECT '2014-12-31', NULL, 'New Year''s Eve';

SELECT *
FROM dbo.Holidays;
FromDate     ToDate       Holiday
2014-01-01   NULL         New Year's Day
2014-02-14   NULL         Makha Bucha Day
2014-04-07   NULL         Substitution for Chakri Day
2014-04-14   2014-04-15   Songkran Festival
2014-05-01   NULL         National Labor Day
2014-05-05   NULL         Coronation Day
2014-05-13   NULL         Wisakha Bucha Day
2014-07-11   NULL         Asarnha Bucha Day
2014-08-12   NULL         HM the Queen's Birthday
2014-10-23   NULL         Chulalongkorn Day
2014-12-05   NULL         HM the King's Birthday
2014-12-10   NULL         Constitution Day
2014-12-31   NULL         New Year's Eve

Let’s suppose that our assignment is to calculate pay day for each month in 2014. This company pays its employees on the last day of the month, the Friday before that if the last day of the month occurs on a weekend or the previous work day if the normal pay day happens to be a holiday. Let’s do this in two parts.

Firstly, we know that none of the holidays plus weekend days can span more than seven days, so we just need to find (using the [Last] column of our Calendar table) all of the work days in the last week of the month.

SELECT [Date], [WkDName2], FromDate
FROM dbo.Calendar a
LEFT JOIN dbo.Holidays b ON a.[DATE] BETWEEN FromDate AND ISNULL(ToDate, FromDate) 
WHERE [Date] BETWEEN '2014-01-01' AND '2014-12-31' AND
    [Last] = 1 AND
    -- Remove Saturdays and Sundays
    [WkDName2] NOT IN ('SA', 'SU') AND
    -- Remove holidays
    FromDate IS NULL;

The above query actually generates 72 rows of results so let’s just look at a few of them (for May and December).

Date                     WkDName2 FromDate
<snip>
2014-05-26 00:00:00.000 Mo       NULL
2014-05-27 00:00:00.000 Tu       NULL
2014-05-28 00:00:00.000 We       NULL
2014-05-29 00:00:00.000 Th       NULL
2014-05-30 00:00:00.000 Fr       NULL
<snip>
2014-12-25 00:00:00.000 Th       NULL
2014-12-26 00:00:00.000 Fr       NULL
2014-12-29 00:00:00.000 Mo       NULL
2014-12-30 00:00:00.000 Tu       NULL

It should be no surprise that FromDate is always NULL, because of the WHERE filter. What’s important to note is that the last date shown for May is the Friday (31 May is a Saturday so it was eliminated) and that the last date for December is the 30th (because the New Year’s Eve holiday was eliminated).

It now becomes a relatively simple matter to use ROW_NUMBER() to identify the last working day of the month (note the DESC sort), like this:

SELECT [Date]
FROM
(
    SELECT [Date], [WkDName2], FromDate
        ,rn=ROW_NUMBER() OVER (PARTITION BY [YYYYMM] ORDER BY [Date] DESC) 
    FROM dbo.Calendar a
    LEFT JOIN dbo.Holidays b ON a.[DATE] BETWEEN FromDate AND ISNULL(ToDate, FromDate) 
    WHERE [Date] BETWEEN '2014-01-01' AND '2014-12-31' AND
        [Last] = 1 AND
        -- Remove Saturdays and Sundays
        [WkDName2] NOT IN ('SA', 'SU') AND
        -- Remove holidays
        FromDate IS NULL
) a
WHERE rn=1;

So here is a list of this Thai company’s pay days in 2014:

Date
2014-01-31 00:00:00.000
2014-02-28 00:00:00.000
2014-03-31 00:00:00.000
2014-04-30 00:00:00.000
2014-05-30 00:00:00.000
2014-06-30 00:00:00.000
2014-07-31 00:00:00.000
2014-08-29 00:00:00.000
2014-09-30 00:00:00.000
2014-10-31 00:00:00.000
2014-11-28 00:00:00.000
2014-12-30 00:00:00.000

Now that’s a query we can all be interested in!

But what is really important here is that by combining our two tables, we’ve got a really nice way of identifying working days and numbering them within a month if need be.

Conclusion

We have described how to create an auxiliary Calendar table in your SQL database and provided a high-performance, alternative FUNCTION (GenerateCalendar) that you can use instead if you want to. Since the FUNCTION returns a table, it can be substituted pretty much wherever the Calendar table is used (after FROM), but you do need to take care to specify the correct number of days you’ll need to drive the results.

Note that the GenerateCalendar FUNCTION is also somewhat unique in that it will generate days backwards from a starting date (by supplying a negative number of days). Some sample queries are embedded in its comments to guide you.

We’ve described three cases where a Calendar table can be useful:

  • To simplify what would otherwise be complex date calculations.
  • To fill in dates (e.g., a month) where those dates are missing from your data.
  • To combine with a Holidays table to calculate working days.

Note that there are some people that advocate marking days in your main Calendar table as holidays. I personally do not like that approach, mainly because Thailand has the tendency to create new holidays on a whim. I’d rather insert a new row into my Holidays table than update an existing row in my Calendar table, but there are other reasons why you may want to keep them separate.

Just like Tally tables, it is possible to create an in-line Calendar table that contains only the dates you need for your specific query. While you may still need to perform any specific date calculations you need within that query (like obtaining the month or day as an integer), it may still offer some usability in cases where you need to fill in dates like in our Sales order example. We’ll also leave this as an exercise for our interested readers.

We hope you found this blog entry on Calendar tables to be interesting, informative and at least a little bit of fun.

For additional reading on Calendar tables, you can look at these three articles by Todd Fifield:

Follow me on Twitter: @DwainCSQL

Copyright © Dwain Camps 2014 All Rights Reserved