Systems Engineering and RDBMS

Archive for June 21st, 2008

Analytic Functions and sorting on a constant value

Posted by decipherinfosys on June 21, 2008

We have covered analytic functions (ROW_NUMBER(), RANK(), DENSE_RANK() etc.) in our posts and whitepapers before. These window functions require an order by clause, and for an obvious reason: the values they generate depend on the order of the specified column(s) (and, if there is also a partition by clause, on the order of those column(s) within each partition).

Recently, while tuning a query at a client engagement, we found that the client did not care about the ordering column and just wanted the row numbers in any order. In that case, there is no need to pay the sort penalty by including order by colx in the SQL code. Let’s see whether that is even possible:

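set nocount on
go
-- test table: identity primary key plus a short string column to order by
create table dbo.test (col1 int not null identity, colx nvarchar(10) not null)
alter table dbo.test add constraint pk_test primary key (col1)
go

-- load the object names ten times over;
-- left() keeps each name within nvarchar(10) so the insert is not rejected for truncation
declare @i int, @j int
select @i = 1, @j = 10
while (@i <= @j)
begin
    insert into dbo.test (colx) select left(name, 10) from sys.objects
    set @i = @i + 1
end
go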

And now, let’s try to see the execution plan of this query:

select ROW_NUMBER() over (order by colx) as RN, * from dbo.test

StmtText
------------------------------------------------------------------------------------------
  |--Sequence Project(DEFINE:([Expr1003]=row_number))
       |--Segment
            |--Sort(ORDER BY:([DECIPHER_TEST].[dbo].[test].[colx] ASC))
                 |--Clustered Index Scan(OBJECT:([DECIPHER_TEST].[dbo].[test].[pk_test]))
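
For readers who want to reproduce plan text like the output above, here is a minimal sketch using SET SHOWPLAN_TEXT. It assumes the dbo.test table created earlier and a tool that understands the go batch separator (SSMS or sqlcmd). The expression names in your output (Expr1003 and so on) and the database name may differ.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch
set showplan_text on
go
-- while the setting is on, the query is not executed;
-- SQL Server returns the estimated plan as rows of StmtText instead
select ROW_NUMBER() over (order by colx) as RN, * from dbo.test
go
set showplan_text off
go
```

Because the statement is only compiled and not run, this is a cheap way to check for a Sort operator before running the query against a large table.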

You will see the sort operation above, as expected. One can argue that we can create an index and minimize the sort costs:

create index test_ind_1 on dbo.test (colx)
/*filegroup clause*/
go

Now, the execution plan looks like this:

StmtText
------------------------------------------------------------------------------------------
  |--Sequence Project(DEFINE:([Expr1003]=row_number))
       |--Segment
            |--Index Scan(OBJECT:([DECIPHER_TEST].[dbo].[test].[test_ind_1]), ORDERED FORWARD)

The Sort operator is gone, but the cost of ordering is still there: the plan now relies on an ordered scan of test_ind_1 (note the ORDERED FORWARD), and that index has to be created and maintained just to serve this query. The example also oversimplifies the real-world problem, since real queries usually join many tables and apply filter conditions, and an index will not always save us from the sort there. So, why pay the price of ordering when we don’t even need it? Remember that in this case the client did not care about the order in which the row numbers were generated.

So, let’s see if we can avoid it by sorting on a constant value:

select ROW_NUMBER() over (order by 1) as RN, * from dbo.test

This time, we will get an error:

Msg 5308, Level 16, State 1, Line 1
Windowed functions do not support integer indices as ORDER BY clause expressions.
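
The phrase "integer indices" in the message refers to the way a regular order by treats an integer literal: as the ordinal position of a column in the select list, not as a constant. A small sketch against the same dbo.test table, for illustration only:

```sql
-- In a regular ORDER BY, the literal 1 means "the first column in the select list",
-- so this query sorts by colx.
select colx, col1 from dbo.test order by 1
go
```

Inside OVER(), SQL Server accepts neither reading of the bare literal, which is why the statement above is rejected with Msg 5308.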

There is a workaround for this issue. Instead of doing an order by 1, we can do an order by (select 1), and the query becomes valid:

select ROW_NUMBER() over (order by (select 1)) as RN, * from dbo.test

And here is the execution plan for it:

Without the index:

StmtText
------------------------------------------------------------------------------------------
  |--Sequence Project(DEFINE:([Expr1005]=row_number))
       |--Segment
            |--Compute Scalar(DEFINE:([Expr1004]=(1)))
                 |--Clustered Index Scan(OBJECT:([DECIPHER_TEST].[dbo].[test].[pk_test]))

and with the index:

StmtText
------------------------------------------------------------------------------------------
  |--Sequence Project(DEFINE:([Expr1005]=row_number))
       |--Segment
            |--Compute Scalar(DEFINE:([Expr1004]=(1)))
                 |--Index Scan(OBJECT:([DECIPHER_TEST].[dbo].[test].[test_ind_1]))

As you can see by comparing these execution plans with the earlier ones, the sort has been taken out, and the index scan is no longer marked ORDERED FORWARD. Keep in mind that the row numbers are now assigned in whatever order the rows happen to arrive, so the numbering is not guaranteed to be the same from one execution to the next. This was a very specific case for a very specific query. Typically, one would want to generate the numbers based on a particular order of a column (or columns), but if you ever run into a situation like ours, the above solution will work and will ensure that you do not pay the price of sorting on a column unnecessarily.
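
One way to check the difference on your own data is to run both forms with the statistics settings turned on and compare the CPU time, elapsed time, and logical reads reported in the messages output. This is only a sketch against the small dbo.test table from this post. On a table this size the difference may be negligible, and the numbers will vary by system.

```sql
set statistics time on
set statistics io on
go
-- ordered numbering: requires a Sort or an ordered index scan
select ROW_NUMBER() over (order by colx) as RN, * from dbo.test
go
-- arbitrary numbering: constant subquery, no ordering requirement
select ROW_NUMBER() over (order by (select 1)) as RN, * from dbo.test
go
set statistics time off
set statistics io off
go
```

Another constant subquery that is commonly used the same way is order by (select null).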

Posted in SQL Server

Some more acronyms for you

Posted by decipherinfosys on June 21, 2008

Ahhh… Acronyms. What would the IT world be without these wonderful acronyms? Yesterday, a good friend mentioned that he is working on a LAMP project. So, do you know what it stands for?

LAMP stands for:

L --> Linux
A --> Apache
M --> MySQL
P --> PHP

That’s the complete open source platform that he was using. Likewise, if you are using a Windows-only implementation, another acronym to be familiar with is WISA:

WISA stands for:

W --> Windows
I --> IIS
S --> SQL Server
A --> ASP.Net

Then, there are different permutations of these, like WISP, WASP, WIMP or WIMA (the letters stand for the same technologies as stated above).

Posted in Open Source, SQL Server, Windows

 