Systems Engineering and RDBMS

TOP N clause giving different results even with the same order by clause

Posted by decipherinfosys on June 23, 2008

One of our readers had asked a question recently that we would like to share on the blog. The reader was using SQL Server and was using a TOP N clause in the query with an order by on a non-unique column. The results of these two statements:

select top 10 * from x, y where x.col1 = y.col2 order by x.col3 asc

select top 20 * from x, y where x.col1 = y.col2 order by x.col3 asc

showed that in the results from the second query, the top 10 records were not always the same as the top 10 records returned by the first query. The explanation for that is that TOP N clause is deterministic only when you are sorting based on a unique column. If you are going to be sorting on a non-unique column, then the TOP N query is non deterministic.

So, the solutions were two fold:

a) Include the unique column in the sort after the actual column on which you want to sort on. That way, there is no change to your functionality.

b) If you still want to sort on the non-unique column, then you can choose to include the “WITH TIES” clause to make TOP N be deterministic. What this does is that in the above query, if you wanted top 20 rows ordered by col3, it will give you additional rows as well that match the value of col3 in those top 20 rows. This is not always an acceptable solution for obvious reasons. For the reader, the solution (a) worked out well.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: