One of the most obvious and useful set of window functions are ranking functions where rows from your result set are ranked according to a certain scheme. There are three ranking functions:

  • ROW_NUMBER()
  • RANK()
  • DENSE_RANK()

 

ROW_NUMBER()

… assigns unique numbers to each row within the PARTITION given the ORDER BY clause. So you’d get:

SELECT v, ROW_NUMBER() OVER()
FROM t

Note that some SQL dialects (e.g. SQL Server) require an explicit ORDER BY clause in the OVER() clause:

SELECT v, ROW_NUMBER() OVER(ORDER BY v)
FROM t

SELECT ROW_NUMBER() OVER(PARTITION BY recovery_model_desc ORDER BY name ASC) AS Row#, name, recovery_model_desc FROM sys.databases WHERE database_id < 5;

The above query returns:

| V | ROW_NUMBER |
|---|------------|
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |

 

RANK()

… behaves like ROW_NUMBER(), except that “equal” rows are ranked the same. If we substitute RANK() into our previous query:

SELECT v, RANK() OVER(ORDER BY v)
FROM t

… then the result we’re getting is this:

| V | RANK |
|---|------|
| a |    1 |
| a |    1 |
| a |    1 |
| b |    4 |
| c |    5 |
| c |    5 |
| d |    7 |
| e |    8 |

As you can see, much like in a sports ranking, we have gaps between the different ranks. We can avoid those gaps by using

 

DENSE_RANK()

Trivially, DENSE_RANK() is a rank with no gaps, i.e. it is “dense”. We can write:

SELECT v, DENSE_RANK() OVER(ORDER BY v)
FROM t

… to obtain

| V | DENSE_RANK |
|---|------------|
| a |          1 |
| a |          1 |
| a |          1 |
| b |          2 |
| c |          3 |
| c |          3 |
| d |          4 |
| e |          5 |

 

One interesting aspect of DENSE_RANK() is the fact that it “behaves like” ROW_NUMBER() when we add the DISTINCT keyword.

SELECT DISTINCT v, DENSE_RANK() OVER(ORDER BY v)
FROM t

… to obtain

| V | DENSE_RANK |
|---|------------|
| a |          1 |
| b |          2 |
| e |          5 |
| d |          4 |
| c |          3 |

 

In fact, ROW_NUMBER() prevents you from using DISTINCT, because ROW_NUMBER() generates unique values across the partition before DISTINCT is applied:

SELECT DISTINCT v, ROW_NUMBER() OVER(ORDER BY v)
FROM t
ORDER BY 1, 2

DISTINCT has no effect:

| V | ROW_NUMBER |
|---|------------|
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |

 

Share This Via: