Month: March 2015

An Easter SQL

Posted on Updated on

Since Easter is nearly upon us, I got to thinking once again about how challenging it is to calculate the day that Easter Sunday falls on.  Since Calendar tables and their step-children Holiday tables are so interesting I thought I might write a sequel to my earlier blog link in this sentence.

There are at least three holidays (not necessarily business holidays) that are related to the calculation of Easter Sunday.  These are:

  • Good Friday is the Christian holiday that is the Friday just prior to Easter Sunday.
  • Ash Wednesday is another Christian holiday (the first day of Lent) that falls precisely 46 days prior to Easter Sunday.
  • Mardi Gras (a.k.a. Fat Tuesday) is a lesser known holiday celebrated by the more robust partyers among us.  I have a special affinity for it as I lived in New Orleans for a few years.

We will show you how to calculate them all and use that method to populate a Holidays lookup table so hopefully you’ll never have to do it again.  Happy Easter!

A Calendar and a Holiday Table

Discussions of Calendar tables in T-SQL are ubiquitous (here’s one by Todd Fifield), differing only in the columns actually stored in them.   We’ll draw upon the one we created in our blog to show you how easy it is here.

-- Create a Calendar table
SELECT [Date]=ISNULL([Date], 0), [Year]=ISNULL([Year], 0), [YrNN]=ISNULL([YrNN], 0)
    ,[YYYYMM]=ISNULL([YYYYMM], 0), [BuddhaYr], [Month], [Day]=ISNULL([Day], 0)
    ,[WkDNo], [WkDName], [WkDName2], [WkDName3], [JulDay], [JulWk]
    ,[WkNo], [Qtr], [Last], [LdOfMo], [LDtOfMo]
INTO dbo.Calendar 
-- This function can be obtained here: GenerateCalendar FUNCTION
FROM dbo.GenerateCalendar('1990-01-01', 65536);

-- Change column types to be NOT NULL so we can index them
--
-- Note: not needed thanks to the neat trick with ISNULL above
--ALTER TABLE dbo.Calendar ALTER COLUMN [Date] DATETIME NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [Year] INT NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [Month] INT NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [YYYYMM] INT NOT NULL;
--ALTER TABLE dbo.Calendar ALTER COLUMN [Day] INT NOT NULL;
GO

-- Build some representative INDEXes
ALTER TABLE dbo.Calendar 
ADD CONSTRAINT Cal_pk PRIMARY KEY([Date]);

ALTER TABLE dbo.Calendar 
ADD CONSTRAINT Cal_ix1 UNIQUE NONCLUSTERED([Year], [Month], [Day]);

ALTER TABLE dbo.Calendar 
ADD CONSTRAINT Cal_ix2 UNIQUE NONCLUSTERED([YYYYMM], [Day]);

SELECT TOP 10 [Date], [YYYYMM], [WkDName2], [WkNo], [Last]
FROM dbo.Calendar;

-- Now create a Holidays table, from the same blog
CREATE TABLE dbo.Holidays
(
    FromDate    DATETIME PRIMARY KEY
    ,ToDate     DATETIME
    ,Holiday    VARCHAR(100)
);

We’ve created a Holidays table that we’ll be using to further our sequel to Easter.  There are also a few other things in the above Data Definition Language (DDL) that are worthy of mention.

  • The GenerateCalendar FUNCTION is an in-line, Table Valued Function (iTVF) that appears in my earlier blog (and also in the resources file attached at the end of this article), which can be used in lieu of a Calendar table should you prefer to do so.  Once again, my thanks go out to SQL MVP Jeff Moden for his help optimizing the performance of the calculations done in that FUNCTION.
  • We’ve used a neat trick with ISNULL to set the NULL attribute of the column in the table we selected INTO (when Calendar gets CREATEd) to NOT NULL, so that we don’t need to run the subsequent ALTER statements on those columns.  I included those ALTER statements for your information.  Thanks again to Jeff Moden for showing me that trick (with honorable mention to SQL MCM Wayne Sheffield (MCM=Microsoft Certified Master) who came along just a few days later with the same information).

Since Easter Sunday is so diabolically difficult to calculate, many reputable sources have created look-up tables such as this one.  Drawing upon that source, we can create the dates for Easter Sunday from 1990 through 2020 in our Holidays table using the following script.

WITH EasterDates (d) AS
(
    SELECT '1990-04-15' UNION ALL SELECT '1991-03-31' UNION ALL SELECT '1992-04-19'
    UNION ALL SELECT '1993-04-11' UNION ALL SELECT '1994-04-03' 
    UNION ALL SELECT '1995-04-16' UNION ALL SELECT '1996-04-07' 
    UNION ALL SELECT '1997-03-30' UNION ALL SELECT '1998-04-12'
    UNION ALL SELECT '1999-04-04' UNION ALL SELECT '2000-04-23' 
    UNION ALL SELECT '2001-04-15' UNION ALL SELECT '2002-03-31' 
    UNION ALL SELECT '2003-04-20' UNION ALL SELECT '2004-04-11'
    UNION ALL SELECT '2005-03-27' UNION ALL SELECT '2006-04-16' 
    UNION ALL SELECT '2007-04-08' UNION ALL SELECT '2008-03-23' 
    UNION ALL SELECT '2009-04-12' UNION ALL SELECT '2010-04-04'
    UNION ALL SELECT '2011-04-24' UNION ALL SELECT '2012-04-08' 
    UNION ALL SELECT '2013-03-31' UNION ALL SELECT '2014-04-20' 
    UNION ALL SELECT '2015-04-05' UNION ALL SELECT '2016-03-27'
    UNION ALL SELECT '2017-04-16' UNION ALL SELECT '2018-04-01' 
    UNION ALL SELECT '2019-04-21' UNION ALL SELECT '2020-04-12'
)
INSERT INTO dbo.Holidays (FromDate, ToDate, Holiday)
SELECT FromDate=d, ToDate=d, Holiday='Easter Sunday ' + CAST(YEAR(d) AS CHAR(4))
FROM EasterDates;

Before we add Good Friday, Ash Wednesday and Fat Tuesday to our Holiday table, we’d like to check that the introductory description at least results in the correct day of the week on which the holiday falls.  We can use our Calendar table for this.

SELECT FromDate = a.FromDate+b.[days]
    ,ToDate     = a.FromDate+b.[days]
    ,b.holiday
    ,c.WkDName
FROM dbo.Holidays a
CROSS APPLY
(
    VALUES(-2, 'Good Friday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-46, 'Ash Wednesday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-47, 'Mardi Gras (Fat Tuesday) ' + CAST(YEAR(FromDate) AS CHAR(4)))
) b ([days], holiday)
JOIN dbo.Calendar c ON c.[Date] = a.FromDate+b.[days] 
WHERE a.Holiday LIKE 'Easter Sunday%';

This produces results like this (only the first two years are shown):

FromDate                ToDate                  holiday                        WkDName
1990-04-13 00:00:00.000 1990-04-13 00:00:00.000 Good Friday 1990               Friday   
1990-02-28 00:00:00.000 1990-02-28 00:00:00.000 Ash Wednesday 1990             Wednesday
1990-02-27 00:00:00.000 1990-02-27 00:00:00.000 Mardi Gras (Fat Tuesday) 1990  Tuesday  
1991-03-29 00:00:00.000 1991-03-29 00:00:00.000 Good Friday 1991               Friday   
1991-02-13 00:00:00.000 1991-02-13 00:00:00.000 Ash Wednesday 1991             Wednesday
1991-02-12 00:00:00.000 1991-02-12 00:00:00.000 Mardi Gras (Fat Tuesday) 1991  Tuesday

So now we can be confident to insert those holidays into our Holidays table.

INSERT INTO dbo.Holidays (FromDate, ToDate, Holiday)
SELECT FromDate = a.FromDate+b.[days]
    ,ToDate     = a.FromDate+b.[days]
    ,b.holiday
FROM dbo.Holidays a
CROSS APPLY
(
    VALUES(-2, 'Good Friday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-46, 'Ash Wednesday ' + CAST(YEAR(FromDate) AS CHAR(4)))
        ,(-47, 'Mardi Gras (Fat Tuesday) ' + CAST(YEAR(FromDate) AS CHAR(4)))
) b ([days], holiday)
WHERE a.Holiday LIKE 'Easter Sunday%';

A couple of notes on the data types used in our Calendar and Holidays tables:

  • The Date column of the Calendar table was chosen to be DATETIME to retain compatibility with SQL 2005 (corresponding to the GenerateCalendar FUNCTION).  You could just as easily use a DATE datatype if you’re running SQL 2008 or later.
  • The FromDate/ToDate data types are also both DATETIME.  Here it could be considered important to keep the DATETIME data type to record cases where your business offers half-day holidays.  For example, I’ve seen this for Christmas Eve and New Year’s Eve.
  • You can also define a different FromDate than ToDate in the Holidays table to indicate a holiday that spans several days, like the Thailand Songkran holiday example from my blog.
“Since SQL is a database language, we prefer to do look ups and not calculations.” 
    –- Joe Celko (prolific author of books on things SQL)

There you go Joe, we’ve honored your preference!  Now we can do lookups of not just Easter Sunday, but also Good Friday, Ash Wednesday and Mardi Gras, the latter potentially being essential to planning your next trip to New Orleans.

The Calculation of Easter Sunday

A somewhat accurate description of the calculation of Easter Sunday is:

“The Sunday following the full Moon which falls on or after the equinox (the Paschal full moon) will give the lawful Easter.”
    –- From the Wikipedia link on Easter Sunday above

I say “somewhat accurate” because there’s all sorts of ifs, buts and caveats in that description, which you can read about in the Wikipedia link, however it will suffice for the time being.

The formal description of the calculation of Easter is known as Computus (no relation to Locutus of Borg), and is provided by Wikipedia at the link shown.  We shall observe this calculation in what follows.  While the formulas given are indicated to be valid for the years 1900 to 2199, since our Calendar table has been populated with years 1990-2169 we’ll limit our calculations of Easter Sunday to just those the years.  Years 1990 through 2020 will be checked against the lookup table we’ve already provided.

Of course, this calculation appears often enough in the literature and implemented in T-SQL (for example: here and here), often using methods I’d prefer to avoid due to performance.  Since our personal mantra is: “No Loops.  No Cursors.  No RBAR. Hoo-uh!” we shall endeavor to live up to that mantra today.  Think about that term “RBAR” and if you’ve never heard of it before, you’ll find it mentioned in the Conclusion to this article.

We’ll start by attempting to replicate the tabular example for calculating the Paschal full moon (PFMD) within the Gregorian calendar shown by Computus on that Wikipedia page.
 photo Gregorian Calendar for PFMD_zpsycqyrika.png
Note that the algorithm described above is a step-by-step or procedural algorithm. Here is where many SQL implementations fail because they don’t try to convert it to one that is set-based. To do this, you must think of each step in the procedural algorithm as a query, and as you proceed from one step to the next, each new step is another outer layer to your set-based query.

This is easier shown that described, so here is that query for 2013-2032.

SELECT [Year], [Y MOD 19], Addendum
    ,PFMD       = CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END
    ,[Month]    = CASE WHEN PFMD > 31 THEN 'Apr' ELSE 'Mar' END
    ,[MonthNo]  = CASE WHEN PFMD > 31 THEN 4 ELSE 3 END
FROM
(
    SELECT [Year], [Y MOD 19], Addendum
        ,PFMD   = (45-([Y MOD 19]*11)%30+Addendum)
    FROM
    (
        SELECT [Year]
            ,[Y MOD 19] = [Year]%19
            ,Addendum   = CASE [Year]%19 
                            WHEN 5 THEN 29 
                            WHEN 16 THEN 29 
                            WHEN 8 THEN 30 
                            ELSE 0 
                            END 
        FROM dbo.Calendar
        WHERE [Month] = 1 AND [Day] = 1
    ) a
) a
WHERE [Year] BETWEEN 2013 AND 2032;

Note how the Addendum (CASE) implements the first two IFs (second and third rows) of the procedural algorithm. The results from this query for just the first 10 years are:

Year  Y MOD 19  Addendum  PFMD  Month  MonthNo
2013  18        0         27    Mar    3
2014  0         0         14    Apr    4
2015  1         0         3     Apr    4
2016  2         0         23    Mar    3
2017  3         0         11    Apr    4
2018  4         0         31    Mar    3
2019  5         29        18    Apr    4
2020  6         0         8     Apr    4
2021  7         0         28    Mar    3
2022  8         30        16    Apr    4

If we take PFMD and add the first letter of the month column that follows (for example for 2013: 27M) we see exact correspondence with the row from the Computus table (graphic) labeled the “Paschal full moon date,” including for the years not shown in the results set.  So it appears that we have rather simply converted the formula for calculating PFMD from Computus to a SQL script without too much effort.

Now we’ll use our Calendar table yet again, along with a little additional magic:

  • Put the pieces of the Paschal full moon date back together into a date using a formula that includes a couple of calls to the T-SQL built-in function DATEADD.
  • Do a CROSS APPLY (explained in Part 1 and Part 2 by SQL MVP Paul White) to pick up the Sunday after the Paschal full moon, which is Easter Sunday.
WITH CalculatePaschalFullMoon AS
(
    -- Original query to calculate the components of the 
    -- Paschal Full Moon date
    SELECT a.[Year], [Y MOD 19], Addendum
        ,PFMD       = CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END
        ,[Month]    = CASE WHEN PFMD > 31 THEN 'Apr' ELSE 'Mar' END
        ,[MonthNo]  = CASE WHEN PFMD > 31 THEN 4 ELSE 3 END
        -- Put the date/time pieces together to get a real DATETIME for PFMD
        ,[PFMD2]    =
            DATEADD(day, CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END-1 -- PFMD
                ,DATEADD(month, CASE WHEN PFMD > 31 THEN 4 ELSE 3 END-1   -- [MonthNo]
                    , a.[Date]))
    FROM
    (
        SELECT [Year], [Y MOD 19], Addendum, [Date]
            ,PFMD   = (45-([Y MOD 19]*11)%30+Addendum)
        FROM
        (
            SELECT [Year]
                ,[Y MOD 19] = [Year]%19
                ,Addendum   = CASE [Year]%19 
                                WHEN 5 THEN 29 
                                WHEN 16 THEN 29 
                                WHEN 8 THEN 30 
                                ELSE 0 
                                END
                ,[Date] 
            FROM dbo.Calendar
            WHERE [Month] = 1 AND [Day] = 1
        ) a
    ) a
)
SELECT a.[Year]
    ,PaschalFullMoon=CAST(a.PFMD2 AS DATE)
    ,EasterSunday
FROM CalculatePaschalFullMoon a
-- Easter Sunday follows the Paschal Full Moon date
-- so pick that up from the Calendar table
CROSS APPLY
(
    SELECT TOP 1 EasterSunday=CAST([Date] AS DATE)
    FROM dbo.Calendar b
    WHERE b.[Date] > a.PFMD2 AND b.WkDName3 = 'Sun'
    ORDER BY [Date]
) b
WHERE [Year] BETWEEN 2013 AND 2032;

 

Now for a quiz on the above code where the lines are bold:

Why do we subtract 1 from each of the CASE statements that appear as the second argument to DATEADD?

We’ll give you the answer to this in the Conclusion section of this article, so stay tuned…

Here are the first ten years of the results set from the last T-SQL query shown above:

Year  PaschalFullMoon  EasterSunday
2013  2013-03-27       2013-03-31
2014  2014-04-14       2014-04-20
2015  2015-04-03       2015-04-05
2016  2016-03-23       2016-03-27
2017  2017-04-11       2017-04-16
2018  2018-03-31       2018-04-01
2019  2019-04-18       2019-04-21
2020  2020-04-08       2020-04-12
2021  2021-03-28       2021-04-04
2022  2022-04-16       2022-04-17

Our Computus Wikipedia page also has the inset table (at right) presenting the Western calculation of Easter Sunday, which corresponds quite nicely to our calculation of Easter Sunday.

That really wasn’t so hard was it? Now comes the packaging into a tool.

Since we like to keep things clean, we’re going to make this wonderful bit of SQL code into an in-line, table-valued function that we can call anytime we need to calculate Easter Sunday for a specific year.

We’ll then continue on to use our tool to calculate the other holidays by using the known number of days offset, just like we did before with our Easter Sunday lookup table, and insert that result into our Holidays table for years that are missing.

 photo Easter Sundays_zpsrruxck4b.png
CREATE FUNCTION CalculateEasterSunday
(
    @Year       INT
)
-- Calculate Easter Sunday for any given year
RETURNS TABLE WITH SCHEMABINDING
AS RETURN

WITH CalculatePaschalFullMoon AS
(
    -- Original query to calculate the components of the 
    -- Paschal Full Moon date
    SELECT a.[Year] --, [Y MOD 19], Addendum
        --,PFMD       = CASE WHEN PFMD > 31 THEN PFMD-31 ELSE PFMD END
        --,[Month]    = CASE WHEN PFMD > 31 THEN 'Apr' ELSE 'Mar' END
        --,[MonthNo]  = CASE WHEN PFMD > 31 THEN 4 ELSE 3 END
        -- Put the date/time pieces together to get a real 
        -- DATETIME for PFMD
        ,[PFMD2]    =
            DATEADD(day, CASE WHEN PFMD > 31 
                                THEN PFMD-31 
                                ELSE PFMD 
                                END-1           -- PFMD
                ,DATEADD(month, CASE 
                                WHEN PFMD > 31 
                                THEN 4 
                                ELSE 3 
                                END-1           -- [MonthNo]
                    , a.[Date]))
    FROM
    (
        SELECT [Year], [Y MOD 19], Addendum, [Date]
            ,PFMD   = (45-([Y MOD 19]*11)%30+Addendum)
        FROM
        (
            SELECT [Year]
                ,[Y MOD 19] = [Year]%19
                ,Addendum   = CASE [Year]%19 
                                WHEN 5 THEN 29 
                                WHEN 16 THEN 29 
                                WHEN 8 THEN 30 
                                ELSE 0 
                                END
                ,[Date] 
            FROM dbo.Calendar
            WHERE [Month] = 1 AND [Day] = 1 AND [Year] = @Year
        ) a
    ) a
)
SELECT a.[Year]
    ,PaschalFullMoon=CAST(a.PFMD2 AS DATE)
    ,EasterSunday
FROM CalculatePaschalFullMoon a
-- Easter Sunday follows the Paschal Full Moon date
-- so pick that up from the Calendar table
CROSS APPLY
(
    SELECT TOP 1 EasterSunday=CAST([Date] AS DATE)
    FROM dbo.Calendar b
    WHERE b.[Date] > a.PFMD2 AND b.WkDName3 = 'Sun'
    ORDER BY [Date]
) b;

You can see where I commented out a few lines of code in the FUNCTION that represent intermediate results that our FUNCTION doesn’t really need to retain. Notice also how the filter on the @Year parameter was moved up into the innermost SELECT from the Calendar table, because when using the FUNCTION you’ll be selecting only one row at a time at that point. Calling the iTVF using CROSS APPLY in-lines the code to make this a non-RBAR query.

Now we can use our FUNCTION to check that the dates we’ve prior stored for Easter Sunday in our Holidays table are correct.

SELECT HolidayFromDate  = CAST(a.FromDate AS DATE)
    ,b.EasterSunday
    ,[Check]            = CASE 
                            WHEN a.FromDate = b.EasterSunday 
                            THEN 'MATCHED!' 
                            ELSE 'WRONG!' 
                            END
FROM dbo.Holidays a
CROSS APPLY dbo.CalculateEasterSunday(YEAR(a.FromDate)) b
WHERE Holiday LIKE 'Easter Sunday%';

The first ten years of our results show us all MATCHED! (and so do the others that are not shown).

HolidayFromDate  EasterSunday   Check
1990-04-15       1990-04-15     MATCHED!
1991-03-31       1991-03-31     MATCHED!
1992-04-19       1992-04-19     MATCHED!
1993-04-11       1993-04-11     MATCHED!
1994-04-03       1994-04-03     MATCHED!
1995-04-16       1995-04-16     MATCHED!
1996-04-07       1996-04-07     MATCHED!
1997-03-30       1997-03-30     MATCHED!
1998-04-12       1998-04-12     MATCHED!
1999-04-04       1999-04-04     MATCHED!

So we can be reasonably confident of our Easter Sunday calculation for the years indicated by Computus (1900-2199)!

Using our FUNCTION to Create Lookup Values in our Holidays Table

Now to keep Joe Celko happy (i.e., so we can do lookups instead of calculations), let’s insert the rest of our holidays, including Easter Sunday, Good Friday, Ash Wednesday and Mardi Gras, into our Holidays table, using all the years available to us in our Calendar table.

INSERT dbo.Holidays (FromDate, ToDate, Holiday)
SELECT FromDate, ToDate, holiday
FROM
(
    SELECT FromDate = DATEADD(day, c.days, b.EasterSunday)
        ,ToDate     = DATEADD(day, c.days, b.EasterSunday)
        ,c.holiday
    FROM dbo.Calendar a
    CROSS APPLY dbo.CalculateEasterSunday(a.[Year]) b
    CROSS APPLY
    (
        VALUES (0, 'Easter Sunday ' + CAST(a.[Year] AS CHAR(4)))
            ,(-2, 'Good Friday ' + CAST(a.[Year] AS CHAR(4)))
            ,(-46, 'Ash Wednesday ' + CAST(a.[Year] AS CHAR(4)))
            ,(-47, 'Mardi Gras (Fat Tuesday) ' + CAST(a.[Year] AS CHAR(4)))
    ) c ([days], holiday)
    WHERE a.[Month] = 1 AND a.[Day] = 1
) a
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Holidays x
    WHERE x.FromDate = a.FromDate AND x.ToDate = a.ToDate
);

This added 596 holidays to our table, and I’d guess we’re pretty well set for our lifetimes!

Notice how, in the above query we had to use DATEADD instead of simple arithmetic as we did before when using the DATETIME data type.  This is because simple arithmetic doesn’t work with the DATE data type returned by our CalculateEasterSunday function.

Conclusion and What we Learned

While Joe likes lookups, I find calculations in T-SQL to be quite a bit of fun.  In this article (as in my blog on Calendar tables), we got to do a little of both.

If you went ahead and read my blog on Calendar tables in conjunction with this article, I am pleased because you probably learned a lot about how useful they can be, particularly in conjunction with a Holidays table. In this article (in the FUNCTION) we used our Calendar table to find the next Sunday occurring after our calculated date of the Paschal full moon.

Besides learning about the trivialities of the relationship between Easter, Good Friday, Ash Wednesday, and planning for your next visit to New Orleans for Mardi Gras, we’d like to think that you may have learned a few other things as well:

  • You can use simple arithmetic on a DATETIME data type to add or subtract days, however you must use DATEADD to do the same thing with a DATE data type.
  • You now have efficient tools in your toolbox to generate a calendar table (GenerateCalendar) and CalculateEasterSunday, whenever you have a need to use them.  The latter function is also included in the resources file attached at the end of this article.
  • You learned how to CROSS APPLY VALUES to do an UNPIVOT operation, even though I didn’t specifically mention that was what we were doing when we used it (where we created rows with dates for Good Friday, Ash Wednesday and Mardi Gras).  If you haven’t seen that method before you may want to look into that link as well. That is something that will only work in SQL 2008 (or later), but there is an equivalent construct that will work in SQL 2005.
  • We learned how to use ISNULL on a SELECT/INTO operation, to set the attribute of the column created in the target table to NOT NULL.
  • You learned how to think your way through a rather complex, procedural calculation and convert it into set-based code, by wrapping each successive step in another layer of query.
  • You learned my personal SQL mantra, in case you hadn’t seen that before.

In case you’ve been wondering what RBAR stands for, it means “Row By Agonizing Row” and it is a term coined by Jeff Moden which I fortuitously picked up on before I got to know him.

Oh yes, then there’s the quiz I mentioned. There are two cases:

  • If we’re starting with 01 Jan of a year as our date, and the Paschal full moon month is three (meaning it should fall in Mar), if we add three months to 01 Jan it puts us in Apr. So we subtract one month to get us back to where we need to be.
  • If we’re starting with 01 Mar of a year as our date, and the Paschal full moon day within that month is (say) 14 (meaning it should fall on 14 Mar), if we add 14 days to 01 Mar it puts us in 15 Mar. So we subtract one day to get us back to where we need to be.

After all this, I still have no clue what a Paschal full moon is but what the heck, at least I know I can calculate when it is!  That’s all for today folks.  I hope you found this article useful and just a bit of fun.

Follow me on Twitter: @DwainCSQL

© Copyright Dwain Camps 01 Apr 2015.  All rights reserved.

Resources for An Easter SQL.doc

The Resources above are stored in a Microsoft Word (TM) document. There are two sections:

-- Create GenerateCalendar, Calendar and Holiday Tables and Populate.sql
... page break ...
-- Create CalculateEasterSunday and Add Holidays.sql

Stupid T-SQL Tricks – Part 2: Retrieving Columns Set to the Current DEFAULT CONSTRAINT Value

Posted on Updated on

“Heeeeeere’s Johnny!” – Ed McMahon introducing Johnny Carson for the Tonight Show

If you’re old enough to remember the Tonight Show hosted by Johnny Carson, how can you not remember the iconic Stupid Pet Tricks skits, and the later tributes to those done by David Letterman?

My Stupid T-SQL Tricks are hopefully a bit more useful, but just as amusing as Carson’s Stupid Pet Tricks were! Hopefully everyone knows what a DEFAULT CONSTRAINT is for a column in a T-SQL table.  Let’s first create a table with such a column and populate it with some sample rows.

CREATE TABLE #SampleData
(
    MyPrimaryKey    VARCHAR(10) NOT NULL
    ,MyDataColumn   INT
    ,MyDataColumn2  INT 
    ,CONSTRAINT sd_pk PRIMARY KEY (MyPrimaryKey)
);

ALTER TABLE #SampleData
ADD CONSTRAINT sd_df1
DEFAULT (50) FOR MyDataColumn2;

INSERT INTO #SampleData (MyPrimaryKey, MyDataColumn)
SELECT 'Dwain.C', 41 UNION ALL SELECT 'Joe.C', 54
UNION ALL SELECT 'Chris.M', 44 UNION ALL SELECT 'Jeff.M', 33;

SELECT *
FROM #SampleData;

We can see from the following results that when the column is not specified on INSERT, the default value applies to each row created.

MyPrimaryKey  MyDataColumn  MyDataColumn2
Chris.M       44            50
Dwain.C       41            50
Jeff.M        33            50
Joe.C         54            50

Let’s change a couple of the rows to a different value.

UPDATE #SampleData
SET MyDataColumn2 = 52
WHERE MyPrimaryKey IN ('Dwain.C','Jeff.M');

SELECT *
FROM #SampleData;

So now we have this:

MyPrimaryKey  MyDataColumn  MyDataColumn2
Chris.M       44            50
Dwain.C       41            52
Jeff.M        33            52
Joe.C         54            50

Suppose we want to write a generalized query that returns all rows that are set to the current default value for the column that has a default constraint.  We know that column is MyDataColumn2.

DECLARE @MyColDefault INT;

BEGIN TRANSACTION T1;

UPDATE TOP (1) #SampleData
SET @MyColDefault = MyDataColumn2 = DEFAULT;

ROLLBACK TRANSACTION T1;

SELECT *
FROM #SampleData
WHERE MyDataColumn2 = @MyColDefault;

Returns us these results:

MyPrimaryKey  MyDataColumn  MyDataColumn2
Chris.M       44            50
Joe.C         54            50

Using the TRANSACTION/ROLLBACK, we’ve pretended to set one row back to its DEFAULT value, while saving that DEFAULT value into a local variable which we can then use in the subsequent, filtered SELECT. Using the TOP (1) option on the UPDATE ensures that we’ve only touched a single row to get the DEFAULT value that applies to all rows.

You’ll find that your code is now impervious to some pesky DBA changing the default constraint value for that column.

ALTER TABLE #SampleData
DROP CONSTRAINT sd_df1;

ALTER TABLE #SampleData
ADD CONSTRAINT sd_df1
DEFAULT (52) FOR MyDataColumn2;

DECLARE @MyColDefault INT;

BEGIN TRANSACTION T1;

UPDATE TOP (1) #SampleData
SET @MyColDefault = MyDataColumn2 = DEFAULT;

ROLLBACK TRANSACTION T1;

SELECT *
FROM #SampleData
WHERE MyDataColumn2 = @MyColDefault;

So now we get different results:

MyPrimaryKey  MyDataColumn  MyDataColumn2
Dwain.C       41            52
Jeff.M        33            52

What we learned from these examples:

  • How to create a named DEFAULT constraint for a column.
  • How to change that constraint, by first dropping it and then creating it again.
  • How to write a query to return all rows where the value of a column is set to its current default.

Now how is that for a stupid, T-SQL trick?

Follow me on Twitter: @DwainCSQL

© Copyright Dwain Camps 19 Mar 2015.  All rights reserved.

Lessons Learned from a Poor-performing VIEW

Posted on Updated on

It is a “common knowledge” among SQL practitioners that VIEWs don’t perform well. Recently during an application’s development I put this to the test. I needed to construct a VIEW for a report the system was producing. It was highly complex. Let’s take a look at the general structure of the VIEW that I ended up with, along with how it was called by the report.

CREATE VIEW dbo.v_VeryComplex
WITH SCHEMABINDING
AS
WITH aCTE AS
(
    SELECT something
    FROM dbo.a_table a
    JOIN dbo.b_table b ON a.key = b.key
)
SELECT col1, col2, col3
-- many columns including a few calculated ones
FROM
(
    SELECT col, col2, col3-  - plus many more
    FROM dbo.table1 a
    JOIN dbo.table2 b ON a.key = b.key
    -- plus about 14 more JOINs and CROSS APPLYs
    -- among them the aCTE defined above and at 
    -- least one call to an schema-bound, in-line
    -- table valued function
    WHERE a.col = 'abc'
) a;
GO
-- The report calls it thusly 
SELECT col1, col2, col3 -- plus many more
FROM v_VeryComplex
WHERE col2 = 'cde' AND col3 = 'def'
ORDER BY sortcol1, sortcol2;

When I checked the Execution Plan for calling this VIEW (as the report does) during development, I was quite happy to see that every table it hit did an INDEX SEEK. So all of my JOINs, etc. were in order and it looked like it would happily perform with swiftness. And while I was on-site for the Go Live of the application, it most certainly performed with alacrity.

Then about four months later comes the dreaded call from the user. My beloved VIEW with all of those SEEKs was taking 30-40 seconds to return its results.

My first reaction was to be incredulous of course. Perhaps the report rendering to the browser was taking a long time. So I set about to testing it. I captured a query from the report using SQL Profiler and then ran it on the production server, where by this time some 12,000+ transactions had been entered into Table2 (from the VIEW above). Table1 is a configuration table and is quite small.

Sure enough it was dog-slow, returning in about 40 seconds. Not good.

A quick check of the execution plan on the production server showed that I was now suddenly getting an INDEX SCAN on Table2, which means that the VIEW was performing all of its work on all of the transactions in that table including a whack of complex calculations, summaries of detail records in child tables, etc. The final filtering on the VIEW then was not being transferred into an INDEX SEEK like it used to be!

A little more about the application so you can better appreciate my position. First of all, the application itself is mission critical for this customer. It has to work, and all of the work it processes is quite time sensitive. The report itself isn’t used for every transaction in the table, in fact probably less than 20-25% use this particular report to be processed. Nonetheless, those transactions are just as time-sensitive as all the rest. And because it involves some governmental reporting, it must be 100% accurate. Inaccuracies in the calculated amounts could lead to fines to the customer.

With more than a little trepidation, I set about to rewriting the VIEW in the hopes that I could remove the outer derived table (a) to perhaps return my call to the VIEW to doing an INDEX SEEK on Table2. Fortunately, I had at my disposal a local copy of an older version of the production database that exhibited the same problem – an INDEX SCAN on Table2.

After not a large amount of effort, I was able to do that rewrite and get my INDEX SEEK! So I happily created the new VIEW in the production server to continue my testing to ensure the accuracy of my results. The way I did that was to run the original VIEW, followed by EXCEPT, followed by the results from the new VIEW. Since I was running both VIEWS against all rows in Table2, this should highlight any discrepancies (after checking first that both returned the same number of rows). The result of that test were no rows returned, establishing with a very high degree of certainty that my rewrite was producing the correct results. I was nearly beside myself with glee!

Alas, that emotional state was short-lived because when I checked the execution plan of the call to the new VIEW in production, I found that I was back to the INDEX SCAN on Table2.

In fact, when I modified the filtering criteria to return the single transaction it was intended to return, I got virtually the same execution time. Not good!

What’s a performance evangelist like me to do? I certainly couldn’t go back to the client and tell him there’s nothing that can be done.

So I did a quick re-evaluation of the situation. My first thought was “perhaps I can INDEX the VIEW.”

That was not an option because:

  • You can’t INDEX a VIEW that uses common table expressions (CTE).
  • While I could modify the VIEW and move that CTE into a derived table, I was still left with several CROSS APPLYs, and you can’t INDEX a VIEW that uses APPLY either!

Rewriting all of those CROSS APPLYs into JOINs was simply not an appealing prospect to me.

Another rather annoying aspect of INDEXing the VIEW, were I able to accomplish it, is that the VIEW is schema-bound and it calls that pesky schema-bound, in-line table valued function (iTVF) mentioned in the comments in the code shown above. There is already a specific hierarchy to the deployment when one of the lowest level schema-bound objects needs changing (like that iTVF) that is relatively a pain. It involves dropping the VIEW and re-creating it after altering the iTVF. If I were to INDEX the VIEW, I’d always need to remember to recreate that INDEX anytime I had to deploy anything in that hierarchy. Not good either.

Then it hit me like a 2×4 to the head! When the report calls my VIEW, it does it for a single transaction only. So that gave me the solution.

I could copy all of the original code from the VIEW and instead create a schema-bound, ITVF. That would look something like this.

CREATE FUNCTION dbo.VeryComplexFcn
(
    @parm1      VARCHAR(10)
    ,@parm2     VARCHAR(10)
)
RETURNS TABLE WITH SCHEMABINDING
RETURN
WITH aCTE AS
(
    SELECT something
    FROM dbo.a_table a
    JOIN dbo.b_table b ON a.key = b.key
)
SELECT col1, col2, col3
-- many columns including a few calculated ones
FROM
(
    SELECT col, col2, col3  -- plus many more
    FROM dbo.table1 a
    JOIN dbo.table2 b ON a.key = b.key
    -- plus about 14 more JOINs and CROSS APPLYs
    -- among them the aCTE defined above and at 
    -- least one call to an schema-bound, in-line
    -- table valued function
    WHERE a.col = 'abc' AND a.col2 = @parm1 AND b.col3 = @parm2
) a;
GO
-- The report calls the iTVF like this 
SELECT col1, col2, col3 -- plus many more
FROM dbo.VeryComplexFcn ('cde', 'def')
ORDER BY sortcol1, sortcol2;

It was in short order that I deployed this new iTVF to the production server and checked the execution plan, to find that I was back to my beloved INDEX SEEK!

The query elapsed time calling the iTVF instead of the VIEW for a single transaction was impressively improved.

New (iTVF):
 SQL Server Execution Times:
 CPU time = 62 ms, elapsed time = 64 ms.
Old (VIEW):
 SQL Server Execution Times:
 CPU time = 35194 ms, elapsed time = 44423 ms.

I’m keeping the VIEW around for posterity’s sake, and of course in the event that we ever need to run it for multiple transactions.

Now I’ve just got to wait for the development team to modify the report to call the iTVF instead of the VIEW, so we can re-deploy the application. I am left with the same hierarchical deployment constraints, but those I can live with because I won’t need to remember to recreate an INDEX on a VIEW every time (SQL Server will remind me about everything else).

The lesson to be learned here is not to be beguiled by execution plans you see during development. Those can change as transactions accumulate. Even though we generated an impressive number of transactions during the development (I had some automated SQL scripts that did this), they were insufficient to reach that critical threshold where the execution plan changed.