Systems Engineering and RDBMS

TOP (N) clause in SQL Server 2005

Posted by decipherinfosys on August 31, 2007

In SQL Server 2005, Microsoft enhanced the TOP clause to be used in the DML statements as well. Before 2005, TOP was allowed only in SELECT statements to restrict the output to the number of rows specified by the TOP operator. Now we can apply TOP operator to INSERT, UPDATE or DELETE statements as well.

Connect to the database using Query Analyzer or Management Studio and create the following table.

CREATE TABLE dbo.TEST
(
TEST_ID   INT IDENTITY(1,1) PRIMARY KEY,
TEST_NAME VARCHAR(50),
TEST_DATE DATETIME
)
GO

We would like to insert only 5 records in the table. In SQL Server 2000, we have to use following syntax.

INSERT INTO DBO.TEST(TEST_NAME)
SELECT TOP 5 TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
ORDER BY TABLE_NAME
GO

But in SQL Server 2005, we can use following syntax. We can even parameterize also as shown below in separate query.

INSERT TOP (5) INTO DBO.TEST(TEST_NAME)
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
ORDER BY TABLE_NAME
GO

OR even better (since we can use a parameter and change the value at execution time):

DECLARE @I INT
SET @I = 5

INSERT TOP (@I) INTO DBO.TEST(TEST_NAME)
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
GO

Pay close attention to syntax. In INSERT clause, value of TOP is enclosed with (). If you leave out parenthesis, you will get an error. Microsoft suggests to use () even for the SELECT also. Current syntax of ‘TOP n’ is supported only for backward compatibility. Parameterized TOP value gives lots of flexibility during DML operations.

Another thing worth noticing is ORDER BY clause of SELECT statement. When TOP is used in SELECT statement with ORDER BY clause, query will return TOP (n) rows from the ordered result set. In our very first example, first 5 Rows from the ordered result set will be inserted into TEST table. While in 2nd and 3rd example, where TOP (n) is used with an INSERT statement, arbitrarily 5 rows will be picked up from the ordered list and will be inserted. This does not guarantee that records inserted will always be first 5 records sorted as per the ORDER BY clause even if SELECT is ordered.  If we have to make sure that TOP (n) rows in correct sort order should be inserted than TOP (n) should always be used with SELECT statement.

Now we will update first three records.

UPDATE TOP (3) TEST
SET TEST_DATE = GETDATE()
GO

Above query will update three records in the table. We can also use TOP percent clause. Following is the delete statement which deletes 20 percent records from the table.

DELETE TOP (20) PERCENT test
FROM dbo.TEST t
INNER JOIN INFORMATION_SCHEMA.TABLES ist
ON ist.table_Name = t.TEST_NAME
GO

In SQL server 2000, we can achieve same results using SET ROWCOUNT syntax. Our update statement shown above can be written as shown below if we are using SQL Server 2000.

SET ROWCOUNT 3
UPDATE TEST
SET TEST_DATE = GETDATE()
SET ROWCOUNT 0
GO

One major pitfall of the above query is row count remains in effect unless specifically it is set back to 0. If we forget to set row count back to 0, all SQL statements defined after setting of non-zero row count value will be affected. This can create unwanted results in the application. In such a scenario, TOP (n) can be very useful.

Another very important use of TOP (n) in DML statements is to limit the batch size when we are performing update or delete on large number of rows say more than a million. Updating or deleting million records can create lock timeout/ lock escalation issues eventually resulting into blocking issues. Moreover transaction log also gets filled up rapidly. By limiting records to a specific batch size of 10000 or 20000 records, not only we can reduce the locking contention, it also speeds up the query as it has to work with a smaller subset of the data.

One Response to “TOP (N) clause in SQL Server 2005”

  1. […] recently that we would like to share on the blog. The reader was using SQL Server and was using a TOP N clause in the query with an order by on a non-unique column. The results of these two […]

Sorry, the comment form is closed at this time.

 
%d bloggers like this: