TOP N clause giving different results even with the same order by clause
Posted by decipherinfosys on June 23, 2008
One of our readers had asked a question recently that we would like to share on the blog. The reader was using SQL Server and was using a TOP N clause in the query with an order by on a non-unique column. The results of these two statements:
select top 10 * from x, y where x.col1 = y.col2 order by x.col3 asc
select top 20 * from x, y where x.col1 = y.col2 order by x.col3 asc
showed that in the results from the second query, the top 10 records were not always the same as the top 10 records returned by the first query. The explanation for that is that TOP N clause is deterministic only when you are sorting based on a unique column. If you are going to be sorting on a non-unique column, then the TOP N query is non deterministic.
So, the solutions were two fold:
a) Include the unique column in the sort after the actual column on which you want to sort on. That way, there is no change to your functionality.
b) If you still want to sort on the non-unique column, then you can choose to include the “WITH TIES” clause to make TOP N be deterministic. What this does is that in the above query, if you wanted top 20 rows ordered by col3, it will give you additional rows as well that match the value of col3 in those top 20 rows. This is not always an acceptable solution for obvious reasons. For the reader, the solution (a) worked out well.