Systems Engineering and RDBMS

Common Problem – getting the first in a Group

Posted by decipherinfosys on June 18, 2008

This is one of the very common problems in the SQL world and also one that has many different solutions available. The problem description is this:

You have a table with say this structure (using SQL Server Syntax):

set nocount on
go
declare @test table (store_nbr int, product_name nvarchar(20), product_date datetime, aisle_nbr int)
insert into @test values (1, ‘A’, ‘8-12-2004’, 6)
insert into @test values (1, ‘B’, ‘7-10-2005’, 2)
insert into @test values (1, ‘C’, ‘6-11-2006’, 3)
insert into @test values (1, ‘D’, ’11-10-2007′, 4)
insert into @test values (2, ‘E’, ’10-12-2004′, 2)
insert into @test values (2, ‘F’, ‘5-9-2005’, 4)

And what you want to find out is that record in each store which was introduced the last so essentially in each group (i.e. each store), you want to find out the product that was introduced most recently. There are a lot of ways to solve this problem and the solutions vary among RDBMS based on syntax only. One can use analytic functions to resolve this, use a derived table or use a co-related sub-query. Here is one way to solve this issue:

select * from @test A
where A.product_date = (select top 1 B.product_date
from @test as B
where A.store_nbr = B.store_nbr
order by product_date desc)

store_nbr   product_name         product_date            aisle_nbr
----------- -------------------- ----------------------- -----------
1           D                    2007-11-10 00:00:00.000 4
2           F                    2005-05-09 00:00:00.000 4

In this code, we are sorting based on the product date in a descending order in the co-related sub-query and getting the equality match based on the TOP 1 record. We could have also used the ROW_NUMBER() function as well, example:

select store_nbr, product_name, product_date, aisle_nbr
from
(select ROW_NUMBER() over (partition by store_nbr order by product_date desc) as rn, *
from @test A
) as IV
where IV.rn = 1

One can also choose to do a MAX operation in the co-related sub-query, example:

select * from @test as A
where A.product_date = (select MAX(B.product_date)
from @test as B
where A.store_nbr = B.store_nbr)

And in SQL Server, one can also use the CROSS APPLY function in SQL Server 2005.  We will look at the execution plans and the performance characteristics of these different options in an upcoming post.

One Response to “Common Problem – getting the first in a Group”

  1. […] This can be done in many different ways and kind of matches up with the first in a group type of scenario which we had described here. […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: