The One Million Row T-SQL Test Harness

So far in our blogs, we have talked a bit about performance, but today we’re going to show you a way you can confirm without doubt that you’re writing high-performance T-SQL code. This extremely valuable technique is something Developers and Testers alike should be familiar with.

Why one million (1M) rows? Well, the answer to that question is really three-fold.

  • First of all, 1M rows represent sort of a threshold, where you can start to consider your tables as being “big.”
  • Second, if your query handles 1M rows in a reasonably short period of time, it will probably do OK even if the table grows much larger (which is what we hope for, because table growth is a measure of how popular our application is).
  • Finally, depending on what you are testing, the 1M row test may make the difference between deciding on the right query (the one that is the fastest) and the wrong query, because at 1M rows it usually separates the winners from the losers quite clearly.

Along the way we’re going to learn about a variety of ways that test data can be generated. Unfortunately we won’t be able to cover every possible case. We’ll need to leave some of them to your imagination. Once you understand the basics, you should be able to extend the concept to other types of test data relatively quickly.

Case Study 1: Verify Which Formula is Fastest

Suppose you have a table that contains an INTEGER ID number column. Your application requires that when that number is shown on a form, it displays with leading zeroes. There are many ways to do this but we’ll look at three.

DECLARE @Num INT = 2342;

SELECT @Num
    ,RIGHT('000000'+CAST(@Num AS VARCHAR(7)),7)         -- Method 1
    ,RIGHT(10000000+@Num, 7)                            -- Method 2
    ,STUFF(@Num, 1, 0, REPLICATE('0', 7-LEN(@Num)));    -- Method 3

Each of these returns the same result, 0002342, and will work for any positive integer <= 9999999. But which one is fastest?

To answer that question we’ll construct a 1M row test harness based on the Numbers table we created in our blog on Tally Tables. We’ll also show an example of one of those methods using an in-line Tally table (from the same blog), so you can see the difference between using a permanent vs. an in-line tally table.
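The harness below relies on the dbo.Numbers table from the Tally Tables blog, which isn’t reproduced here. If you don’t have one handy, a minimal sketch along these lines should do. The column name N matches the queries in this blog, but the constraint name and the way the rows are generated are assumptions rather than the original definition. It also assumes that sys.all_columns cross joined with itself yields at least 1M rows, which is normally the case.

```sql
-- Hypothetical stand-in for the Numbers table from the Tally Tables blog.
-- Skip this if dbo.Numbers already exists in your test database.
CREATE TABLE dbo.Numbers
(
    N INT NOT NULL CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED
);

-- Generate 1..1,000,000 by numbering the rows of a large cross join.
WITH Tally(N) AS
(
    SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns a CROSS JOIN sys.all_columns b
)
INSERT INTO dbo.Numbers(N)
SELECT N
FROM Tally;
```

The clustered primary key on N keeps a range filter such as WHERE N <= 1000000 cheap.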


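DECLARE @NumResult VARCHAR(7);

PRINT 'Method 1:';
SET STATISTICS TIME ON;
SELECT @NumResult = RIGHT('000000'+CAST(N AS VARCHAR(7)),7)
FROM dbo.Numbers
WHERE N <= 1000000;
SET STATISTICS TIME OFF;

PRINT 'Method 2:';
SET STATISTICS TIME ON;
SELECT @NumResult = RIGHT(10000000+N, 7)
FROM dbo.Numbers
WHERE N <= 1000000;
SET STATISTICS TIME OFF;

PRINT 'Method 3:';
SET STATISTICS TIME ON;
SELECT @NumResult = STUFF(N, 1, 0, REPLICATE('0', 7-LEN(N)))
FROM dbo.Numbers
WHERE N <= 1000000;
SET STATISTICS TIME OFF;

PRINT 'Method 2 w-in line Tally Table:';
SET STATISTICS TIME ON;
WITH Tally(N) AS
(
    -- In-line Tally table: number the rows of a large cross join
    SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns a CROSS JOIN sys.all_columns b
)
SELECT @NumResult = RIGHT(10000000+N, 7)
FROM Tally
WHERE N <= 1000000;
SET STATISTICS TIME OFF;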

-- Results:
Method 1:
SQL Server Execution Times:
CPU time = 265 ms, elapsed time = 274 ms.

Method 2:
SQL Server Execution Times:
CPU time = 250 ms, elapsed time = 250 ms.

Method 3:
SQL Server Execution Times:
CPU time = 452 ms, elapsed time = 460 ms.

Method 2 w-in line Tally Table:
SQL Server Execution Times:
CPU time = 234 ms, elapsed time = 227 ms.

You should always run the test harness a few times and record the results of each run. By looking at the results, we see that methods 1 and 2 are pretty close, but over the 4-5 runs that I did, method 2 was consistently just a little faster in elapsed time. Using an in-line Tally table in this case was just a little faster than using the permanent Tally table (that may not always be the case).

You may be saying to yourself that this little 9% improvement doesn’t mean much, but picture it in the context of a much more complex query, where every slight improvement you can make counts for something.

Let’s now look at some key aspects of the test harness we used:

  • It contains multiple queries that return the identical results for the same number of rows. This is important because you’re trying to compare solutions.
  • We SET STATISTICS TIME ON before each query and OFF when it was complete. You could turn it ON once at the start and OFF at the end, but that is not always convenient.
  • We printed out a description of the method we are using just before setting STATISTICS TIME ON. That’s the reason we turned it OFF after each query: so you wouldn’t see the statistics for the PRINT statement, which are immaterial.
  • Finally, we created a local variable @NumResult and assigned our calculated result (the returned columns) to it. This is important to eliminate the time that SQL Server Management Studio (SSMS) would otherwise take to render the results to the Results pane. If you don’t do that, it can bias the results. We’re interested in clocking the raw query speed here.
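The first point above, that the candidate queries must return identical results, is worth checking before you time anything. Here is a minimal sketch of one way to do that for the three formatting methods. It uses an in-line Tally table so it doesn’t depend on dbo.Numbers, and the column alias Mismatches is just an illustrative name.

```sql
-- Sketch: count the rows where the three formulas disagree.
-- If the methods are equivalent over 1..1,000,000 this should return zero.
WITH Tally(N) AS
(
    SELECT TOP (1000000) CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT)
    FROM sys.all_columns a CROSS JOIN sys.all_columns b
)
SELECT Mismatches = COUNT(*)
FROM Tally
WHERE RIGHT('000000'+CAST(N AS VARCHAR(7)),7) <> RIGHT(10000000+N, 7)
   OR RIGHT(10000000+N, 7) <> STUFF(N, 1, 0, REPLICATE('0', 7-LEN(N)));
```

Note that a comparison involving NULL is never counted as a mismatch, so if any of your candidate formulas can return NULL you would want to add explicit IS NULL checks.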

CPU time can be important sometimes, so you may also want to look at that. It turns out that method 2 also appears better than method 1 in terms of CPU time, but that may not always be the case and over the 4-5 runs we did it often was a tie.

After this test, we can say pretty clearly that method 2 was the highest performance method among the ones we looked at. We are also now well on our way to being able to say definitively that we’ve written the best possible query to solve our business problem.
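As noted above, the harness should be run several times and the timings recorded. If you’d rather not do that by hand, the following sketch repeats one candidate a few times and averages the elapsed time. The names @Runs, @Run and @Timings are illustrative, Method 2 is used only as an example, and the timing is done with DATEDIFF (the approach shown in Case Study 2) rather than STATISTICS TIME so the results can be captured in a table.

```sql
-- Sketch: repeat one candidate several times and average the elapsed time.
-- Assumes dbo.Numbers exists with at least 1,000,000 rows.
DECLARE @Runs INT = 5, @Run INT = 1;
DECLARE @NumResult VARCHAR(7), @StartDT DATETIME2(7);
DECLARE @Timings TABLE (RunNo INT PRIMARY KEY, ElapsedMS INT);

WHILE @Run <= @Runs
BEGIN
    SET @StartDT = SYSDATETIME();

    -- The query being timed (Method 2 from the harness above)
    SELECT @NumResult = RIGHT(10000000+N, 7)
    FROM dbo.Numbers
    WHERE N <= 1000000;

    INSERT INTO @Timings(RunNo, ElapsedMS)
    VALUES (@Run, DATEDIFF(millisecond, @StartDT, SYSDATETIME()));

    SET @Run += 1;
END;

-- Individual runs, then the average across all runs
SELECT RunNo, ElapsedMS FROM @Timings ORDER BY RunNo;
SELECT AvgElapsedMS = AVG(1.0 * ElapsedMS) FROM @Timings;
```

The WHILE loop here only drives the repetitions; the query inside it is still set-based.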

Case Study 2: Removing Numbers from a Character String

In this case study, we will illustrate a few additional concepts you should know when constructing a test harness:

  • How to construct random character strings of any length.
  • Using an alternate means to capture the elapsed time of the queries we’re comparing.
  • Using an alternative to a local variable to avoid rendering results to SSMS.

The business problem is that we wish to remove numbers from character strings that contain only letters and numbers. For this task, we have written two T-SQL FUNCTIONs that basically do the job but need to be called differently.

-- Scalar Valued Function that removes characters based on a pattern match
CREATE FUNCTION dbo.CleanString
    (@pString VARCHAR(8000), @pPattern VARCHAR(100))
RETURNS VARCHAR(8000) AS
BEGIN

DECLARE @Pos SMALLINT;
SELECT @Pos = PATINDEX(@pPattern,@pString COLLATE Latin1_General_BIN);

-- Remove one matching character per iteration until none remain
WHILE @Pos > 0
SELECT @pString = STUFF(@pString,@Pos,1,''),
    @Pos = PATINDEX(@pPattern,@pString COLLATE Latin1_General_BIN);

RETURN @pString;

END
GO

-- In-line, schema-bound Table Valued Function
CREATE FUNCTION dbo.RemoveMatchedPatterns
(
    @Str        VARCHAR(8000)
    ,@Pattern   VARCHAR(100)
)
RETURNS TABLE WITH SCHEMABINDING
RETURN

WITH Tally(n) AS
(
    -- In-line Tally table: one row per character of @Str (up to 8000)
    SELECT TOP (ISNULL(LEN(@Str), 0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(n)
    CROSS JOIN(VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
    CROSS JOIN(VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
    CROSS JOIN(VALUES(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
),
SplitString AS
(
    -- Keep only the characters that do not match the pattern
    SELECT n, s=SUBSTRING(@Str, n, 1)
    FROM Tally
    WHERE PATINDEX(@Pattern, SUBSTRING(@Str COLLATE Latin1_General_BIN, n, 1)) = 0
)
SELECT ReturnedString=
    (
        SELECT s + ''
        FROM SplitString b
        ORDER BY n
        FOR XML PATH(''), TYPE
    ).value('.', 'VARCHAR(8000)');
GO

For now, it is not necessary to fully understand how these FUNCTIONs work, but it is necessary to understand how to use them. Note that the CleanString FUNCTION uses a WHILE loop to do its work.

DECLARE @TestString VARCHAR(8000) = '123122adflajsdf34a23aa333w';

SELECT TestString=@TestString
    ,SVFReturns=dbo.CleanString(@TestString, '%[0-9]%')
    ,iTVFReturns=ReturnedString
FROM dbo.RemoveMatchedPatterns(@TestString, '%[0-9]%');

-- Results:
TestString                   SVFReturns     iTVFReturns
123122adflajsdf34a23aa333w   adflajsdfaaaw   adflajsdfaaaw

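Both FUNCTIONs take the same arguments (the string and the pattern of characters you want removed) and both return the same results (only the alphabetic characters). The difference is in how they are called: the scalar function appears in the SELECT list, while the table valued function appears in the FROM clause.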

Before we generate a test harness full of random strings, we need a way to generate random numbers in T-SQL. Consider the following statement:

SELECT IntRN=1+ABS(CHECKSUM(NEWID()))%100
    ,RNwDec=.01*(1+ABS(CHECKSUM(NEWID()))%10000);

-- Results:
IntRN  RNwDec
31     40.74

If you run this statement multiple times, you’ll get different results each time. In each case, you get a number between 1 and 100. The RNwDec column will have 2 decimal digits. This is the standard method in SQL to generate a Uniform Random Number (URN). If you need a wider range of numbers, change 100 (or 10000 for RNwDec) to something larger.
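The same formula extends to any inclusive integer range. The sketch below is only an illustration of that idea: @Low and @High are hypothetical names, and the VALUES list simply stands in for a real table so you can see that NEWID() produces a fresh value on each row.

```sql
-- Sketch: uniform random integer in the inclusive range @Low..@High.
-- @Low and @High are illustrative names, not part of the original example.
DECLARE @Low INT = 50, @High INT = 75;

-- A single random value
SELECT IntRN = @Low + ABS(CHECKSUM(NEWID())) % (@High - @Low + 1);

-- One random value per row
SELECT RowNo
    ,IntRN = @Low + ABS(CHECKSUM(NEWID())) % (@High - @Low + 1)
FROM (VALUES(1),(2),(3),(4),(5)) v(RowNo);
```

The modulo produces values from 0 to @High - @Low, so adding @Low shifts the result into the range we want.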

To generate a random string of characters, you can use the URN formula as follows:

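SELECT REPLICATE(
        SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 1+ABS(CHECKSUM(NEWID()))%26, 1)
            ,1+ABS(CHECKSUM(NEWID()))%20) +
    REPLICATE(
        SUBSTRING('0123456789', 1+ABS(CHECKSUM(NEWID()))%10, 1)
            ,1+ABS(CHECKSUM(NEWID()))%20);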

You can try this one yourself. Each time you get a different string consisting of 1 to 20 letters (always the same letter repeated) and 1 to 20 numbers (always the same number repeated). The string will be of varying length between 2 and 40 characters. To get a string of maximum length 8000, all you need to do is replicate the pattern a random number of times. Since each pattern is at most 40 characters, the final (outer) replication can repeat it up to 200 times (40 × 200 = 8000), like this:

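SELECT REPLICATE(
    REPLICATE(
        SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 1+ABS(CHECKSUM(NEWID()))%26, 1)
            ,1+ABS(CHECKSUM(NEWID()))%20) +
    REPLICATE(
        SUBSTRING('0123456789', 1+ABS(CHECKSUM(NEWID()))%10, 1)
            ,1+ABS(CHECKSUM(NEWID()))%20)
    ,1+ABS(CHECKSUM(NEWID()))%200);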

Now that we understand the tools we’ll need, here is the test harness.

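-- Create 1000 rows of random strings
SELECT s=REPLICATE(
    REPLICATE(
        SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 1+ABS(CHECKSUM(NEWID()))%26, 1)
            ,1+ABS(CHECKSUM(NEWID()))%20) +
    REPLICATE(
        SUBSTRING('0123456789', 1+ABS(CHECKSUM(NEWID()))%10, 1)
            ,1+ABS(CHECKSUM(NEWID()))%20)
    ,1+ABS(CHECKSUM(NEWID()))%200)
INTO #TestStrings
FROM dbo.Numbers
WHERE N <= 1000;

-- Start the clock for the first query
DECLARE @StartDT DATETIME = GETDATE();

SELECT SVFReturns=dbo.CleanString(s, '%[0-9]%')
INTO #Test1
FROM #TestStrings;

-- Display elapsed time for the scalar-valued User-defined Function (UDF)
SELECT SVFElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());

-- Restart the clock for the second query
SELECT @StartDT = GETDATE();

SELECT iTVFReturns=ReturnedString
INTO #Test2
FROM #TestStrings
CROSS APPLY dbo.RemoveMatchedPatterns(s, '%[0-9]%');

-- Display elapsed time for the in-line Table Valued Function (iTVF)
SELECT iTVFElapsedMS=DATEDIFF(millisecond, @StartDT, GETDATE());
GO

DROP TABLE #TestStrings;
DROP TABLE #Test1;
DROP TABLE #Test2;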

For this test, we’ll illustrate the fact that sometimes you don’t need to go up to 1M rows to distinguish between two queries. In this case, the difference becomes quite apparent at 1000 rows. Here are the timings that are displayed by the two SELECT statements:



That’s already a pretty large difference so you can imagine how long it would take to run at 1M rows. Here’s an explanation of the differences between this test harness and the one from before:

  • We ran at 1000 rows of test data instead of 1M (by filtering N from our Numbers table with WHERE N <= 1000).
  • Instead of assigning the results to a local variable, we instead SELECT … INTO a temporary table. Since both queries absorb the same overhead for this operation, the results are still comparable.
  • Instead of using SET STATISTICS TIME ON/OFF, we’ve simply used the @StartDT local variable to capture the elapsed time (calculated using DATEDIFF).
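If you need finer resolution than GETDATE() offers, the same stopwatch pattern works with SYSDATETIME(). This is only a sketch: the workload shown is a placeholder, and @Dummy and ElapsedMicroseconds are illustrative names.

```sql
-- Sketch: the same stopwatch pattern with a higher-resolution clock.
-- SYSDATETIME() returns DATETIME2(7); GETDATE() returns DATETIME,
-- which is rounded to increments of roughly 3 milliseconds.
DECLARE @StartDT DATETIME2(7) = SYSDATETIME();

-- Placeholder workload: substitute the query you want to time.
DECLARE @Dummy INT;
SELECT @Dummy = COUNT(*)
FROM sys.all_columns;

SELECT ElapsedMicroseconds = DATEDIFF(microsecond, @StartDT, SYSDATETIME());
```

DATEDIFF returns an INT, so measuring in microseconds will overflow for anything that runs longer than about 35 minutes. For long-running tests, stick with milliseconds.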

The latter method of capturing elapsed time is used because of a quirky behavior of STATISTICS in some cases when you are timing a scalar-valued, user-defined function. SQL MVP Jeff Moden explains this in How to Make Scalar UDFs Run Faster.

This example also serves to demonstrate the well-known fact that a good set-based query will almost always be faster than a loop. If you remember our introduction to DelimitedSplit8K in our Tally Tables blog, Jeff Moden uses the same basic methodology (an in-line Tally table) to make DelimitedSplit8K extremely fast.

Case Study 3: Timing Solutions for Running Totals

In this example, we’re going to perform a test on two solutions to the Running Totals (RT) problem in T-SQL. The second solution is only valid in SQL Server 2012 or later. First, we’ll set up a test table and populate it with some random data.

CREATE TABLE dbo.RunningTotalsTest
(
    [Date]          DATETIME PRIMARY KEY
    ,Value          INT
    ,RunningTotal1  INT
    ,RunningTotal2  INT
);

WITH SomeDates AS
(
    SELECT d=DATEADD(hour, N, '2010-01-01')
    FROM dbo.Numbers
    WHERE N <= 10000
)
INSERT INTO dbo.RunningTotalsTest([Date], Value)
SELECT d, 1+ABS(CHECKSUM(NEWID()))%100
FROM SomeDates;
GO

We have populated our table with a series of date values that increases in an hourly fashion; including a “Value” that is simply a random integer between 1 and 100. We’ll be calculating the two RunningTotal columns from our Value. Note that at the end of this “batch” we have included the batch separator (GO). The test harness includes only 10,000 rows because we know in advance that this will be a sufficient number to distinguish between our solutions (1M rows is still recommended for most normal cases).

The first running totals solution we’ll look at is what is known as a triangular JOIN, because for each row it adds up all of the prior rows using a correlated sub-query. Once again, notice the batch separator (GO) at the end of the batch.

-- RT by Triangular JOIN
UPDATE a
SET RunningTotal1 =
    (
        SELECT SUM(value)
        FROM dbo.RunningTotalsTest b
        WHERE b.[Date] <= a.[Date]
    )
FROM dbo.RunningTotalsTest a;
GO

The next solution, which only works in SQL Server 2012 or later, uses the window frame facility Microsoft has kindly provided to us for calculating running totals (and a host of other things).

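-- RT with SQL 2012 window frame
WITH RunningTotal AS
(
    SELECT [Date], Value, RunningTotal2
        ,rt=SUM(Value) OVER (ORDER BY [Date] ROWS UNBOUNDED PRECEDING)
    FROM dbo.RunningTotalsTest
)
UPDATE a
SET RunningTotal2 = rt
FROM RunningTotal a;
GO

-- Final SELECT
SELECT [Date], Value, RunningTotal1, RunningTotal2
FROM dbo.RunningTotalsTest
ORDER BY [Date];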

We’ve also included a final SELECT (in the last batch) to show that both running totals were calculated correctly.
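Eyeballing the first few rows is fine for a blog, but with 10,000 rows you may prefer to let SQL Server do the comparison. Here is a minimal sketch, assuming both UPDATE batches have already been run; the alias Mismatches is just an illustrative name.

```sql
-- Sketch: count rows where the two running totals disagree or are missing.
-- Should return zero if both solutions produced the same running totals.
SELECT Mismatches = COUNT(*)
FROM dbo.RunningTotalsTest
WHERE RunningTotal1 <> RunningTotal2
   OR RunningTotal1 IS NULL
   OR RunningTotal2 IS NULL;
```

The IS NULL checks matter because a comparison against NULL never evaluates to true, so a solution that failed to update some rows would otherwise go unnoticed.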

To run this code and obtain our timings, we’re going to use SQL Server Profiler (Tools > SQL Server Profiler in SSMS). This brings up a window allowing you to name the trace if desired.

After you click Run, the Profiler session will begin.

You can now execute the four batches of T-SQL we created above, two of which contain the solutions of interest. Once the run is complete, the Profiler window shows the two results of interest.

Notice how the comment we placed at the beginning of each batch clearly shows up on the BatchCompleted lines with our desired results:

-- RT by Triangular JOIN
-- RT with SQL 2012 window frame

The results show that the new SQL 2012 method for running totals completed in only 117 milliseconds, while the triangular JOIN took 11267 milliseconds. Imagine what the triangular JOIN approach would have taken had we run against 1M rows, or better yet imagine a customer waiting on an application’s web form for that result to be displayed!

The first ten rows of results displayed show that both of our running totals solutions worked correctly, yet the timing results tell us that they are definitely not equivalent!

Date                    Value   RunningTotal1  RunningTotal2
2010-01-01 01:00:00.000   63    63             63
2010-01-01 02:00:00.000   75    138            138
2010-01-01 03:00:00.000   2     140            140
2010-01-01 04:00:00.000   27    167            167
2010-01-01 05:00:00.000   73    240            240
2010-01-01 06:00:00.000   71    311            311
2010-01-01 07:00:00.000   17    328            328
2010-01-01 08:00:00.000   64    392            392
2010-01-01 09:00:00.000   40    432            432
2010-01-01 10:00:00.000   56    488            488

SQL Profiler is a very useful way to time a batch that contains multiple SQL statements (for example, if you want to compare the performance of a CURSOR vs. a set-based solution). Take care when setting up each batch to avoid unnecessary overhead in one batch vs. the other.

In a future blog, we’ll describe a method for calculating running totals that works in any version of SQL Server and is faster than both of these solutions.

Summary and What We Learned

Firstly and most importantly, we’ve learned how to create a one million row test harness so we can compare the performance of two queries that return identical results, and why this is an essential step to verifying query performance.

We’ve learned how to generate random test data:

  • Using Tally tables
  • Using a formula to generate random numbers within a specified range
  • Generating random character strings

We’ve learned three methods to time the queries we’re comparing as they process the test harness:

  • Using SET STATISTICS TIME ON/OFF
  • Using an elapsed time calculator and displaying the results in a SELECT
  • Using SQL Profiler

Some hints to follow when using the 1M row test harness (not all of which are mentioned):

  • After creating your test harness, start out with a lower row count than a million. Once you’re done debugging the test harness, ramp up the row count until one of the solutions clearly distinguishes itself from the others.
  • Try to run each test multiple times at the highest row count you need to prove the result and then average the timings.
  • Avoid using SET STATISTICS TIME ON/OFF when comparing code that includes a call to a Scalar-valued, User-defined FUNCTION. Use one of the other techniques instead (Profiler works fine in this case).
  • You can as easily compare 3-4 solutions as two using any of these methods. The more solutions you have to the same problem, the more chance you’ll have of identifying the best performing.
  • There are cases where using DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS will improve the accuracy of your measurement; however this is a topic that is more advanced than we planned to explore in this blog.
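For the last hint, the usual pattern looks something like the sketch below. Treat it with care: these commands affect the whole SQL Server instance and require elevated server-level permissions, so they belong on a test server only.

```sql
-- Sketch: reset caches before a timed run. TEST SERVERS ONLY.
CHECKPOINT;               -- write dirty pages to disk so they can be dropped
DBCC DROPCLEANBUFFERS;    -- empty the buffer pool (forces reads from disk)
DBCC FREEPROCCACHE;       -- clear cached execution plans (forces recompiles)
```

Running these before each candidate query means every solution starts from the same cold state, rather than the second query benefiting from data the first one already pulled into memory.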

Additional reading:

  • Just about any article that SQL MVP Jeff Moden writes for the SQL Server Central web site provides an example of a 1M row test harness to test his proposed solution against alternatives, but these two articles are specifically directed to this subject.

o   Generating Test Data: Part 1 – Generating Random Integers and Floats
o   Generating Test Data: Part 2 – Generating Sequential and Random Dates

Now you’re ready to start learning how to write high-performance T-SQL code, because now you have a tool that is essential to proving that you’re writing queries that are the best that they can be!

Follow me on Twitter: @DwainCSQL

Copyright © Dwain Camps 2014 All Rights Reserved


6 thoughts on “The One Million Row T-SQL Test Harness”

    台灣大樂透 said:
    September 21, 2014 at 9:39 am

    Awesome site, thanks a lot !!

    treasure4developer said:
    March 25, 2015 at 2:45 pm

    This is indeed good practice to go for such test for performance in development period only to avoid issues in Live. Thanks for sharing.

    […] harness here because this is just a blog, and I’m sure if you’ve read my previous blog on the One Million Row Test Harness, you’re capable of doing this yourself given everything we’ve given you […]

    […] If the two queries produce identical results, when you run the above it should produce zero rows in the results set!  Then all you’re left with is to verify that the newer query runs faster, and that’s something I covered in The One Million Row Test Harness. […]

    g003pk said:
    May 21, 2015 at 6:55 pm

    DBforge has a very affordable data generator tool that just came out. You can get a license as a blogger if you post a review. You might look into that to also help as you can generate a truly random set of varying data (linked with FK) and get 2-3 million row test tables easily to benchmark your testing against. Might be helpful

    […] naturally we’ll turn to one of our favorite performance tools, the One Million Row Test Harness. It is so easy to create one of these for this case it is almost embarrassing, the embarrassing […]
