The T-SQL ROW_NUMBER() Function

Posted on Updated on

If you learn one new T-SQL (i.e., Microsoft SQL Server) concept today it should be ROW_NUMBER().  Introduced in SQL 2005, this function is one of 4 window ranking functions (the others are RANK(), DENSE_RANK() and NTILE()).  Oracle SQL has a similar capability.

Let’s first create some sample data we can use for a demonstration.

CREATE TABLE #ROWNUMBER_Demo
(
    ID      INT
    ,MyDate DATETIME
    ,Price  MONEY
    ,PRIMARY KEY (ID, MyDate)
);

INSERT INTO #ROWNUMBER_Demo
SELECT 1, '2012-03-04', 23.22
UNION ALL SELECT 1, '2012-03-15', 25.15
UNION ALL SELECT 1, '2012-05-10', 28.47
UNION ALL SELECT 2, '2012-02-28', 15.10
UNION ALL SELECT 2, '2012-03-22', 18.22
UNION ALL SELECT 2, '2012-05-01', 21.43
UNION ALL SELECT 3, '2012-04-01', 45.06
UNION ALL SELECT 3, '2012-05-12', 48.23
UNION ALL SELECT 3, '2012-06-01', 51.66;

SELECT *
FROM #ROWNUMBER_Demo;

The results in our sample table are:

ID   MyDate                    Price
1    2012-03-04 00:00:00.000   23.22
1    2012-03-15 00:00:00.000   25.15
1    2012-05-10 00:00:00.000   28.47
2    2012-02-28 00:00:00.000   15.10
2    2012-03-22 00:00:00.000   18.22
2    2012-05-01 00:00:00.000   21.43
3    2012-04-01 00:00:00.000   45.06
3    2012-05-12 00:00:00.000   48.23
3    2012-06-01 00:00:00.000   51.66

To use the ROW_NUMBER() function you must supply an OVER predicate consisting of PARTITION (optional) and ORDER BY (required).  Let’s take a look at an example and see the results.

SELECT ID, MyDate
    ,rn1=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY MyDate)
    ,rn2=ROW_NUMBER() OVER (ORDER BY MyDate)
FROM #ROWNUMBER_Demo
ORDER BY ID, MyDate;

The results returned are:

ID  MyDate                    rn1   rn2
1   2012-03-04 00:00:00.000   1     2
1   2012-03-15 00:00:00.000   2     3
1   2012-05-10 00:00:00.000   3     7
2   2012-02-28 00:00:00.000   1     1
2   2012-03-22 00:00:00.000   2     4
2   2012-05-01 00:00:00.000   3     6
3   2012-04-01 00:00:00.000   1     5
3   2012-05-12 00:00:00.000   2     8
3   2012-06-01 00:00:00.000   3     9

For rn1 (where PARTITION is applied) you see that it assigns row numbers 1, 2, 3 to the rows within each ID (the column specified to PARTITION on) based on the ordering of MyDate.  For the case without PARTITION (rn2), the entire set is the partition so the row numbers are 1, 2, …, 9, again based on the ordering of the MyDate column.

Eliminating Duplicates

ROW_NUMBER() is a particularly fast way to eliminate duplicate records.  Suppose you want to return only one record within each ID; specifically the one whose date is the latest.  You must note that ROW_NUMBER() cannot be used on the WHERE clause, so it is necessary to wrap this query in an outer query as follows:

SELECT ID, MyDate
FROM
(
    SELECT ID, MyDate
        ,rn1=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY MyDate DESC)
    FROM #ROWNUMBER_Demo
) a
WHERE rn1 = 1
ORDER BY ID;

Note the DESC sort applied to MyDate.  These results are:

ID   MyDate
1    2012-05-10 00:00:00.000
2    2012-05-01 00:00:00.000
3    2012-06-01 00:00:00.000

Of course, you’re probably saying you can achieve the same results using a GROUP BY (and you’d be correct), like this.

SELECT ID, MyDate=MAX(MyDate)
FROM #ROWNUMBER_Demo
GROUP BY ID
ORDER BY ID;

But try using that query to also return the Price column that corresponds to the MAX date.  You cannot!  But you can when you use ROW_NUMBER().

SELECT ID, MyDate, Price
FROM
(
    SELECT ID, MyDate, Price
        ,rn1=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY MyDate DESC)
    FROM #ROWNUMBER_Demo
) a
WHERE rn1 = 1
ORDER BY ID;

Results:

ID     MyDate Price
1      2012-05-10 00:00:00.000    28.47
2      2012-05-01 00:00:00.000    21.43
3      2012-06-01 00:00:00.000    51.66

Conclusion

ROW_NUMBER() is a very versatile T-SQL window ranking function.  Besides using it to eliminate duplicates, it has a great many other very practical purposes that we’ll explore in future entries on this blog.

Follow me on Twitter: @DwainCSQL

Copyright © Dwain Camps 2014 All Rights Reserved

Advertisements

One thought on “The T-SQL ROW_NUMBER() Function

    Tally Tables in T-SQL | dwaincsql said:
    March 27, 2014 at 6:39 pm

    […] SQL Server 2005 came along and introduced the Common Table Expression (CTE) and the ROW_NUMBER() window function (both the subject of earlier blogs as linked), it opened up some new ways to create […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s