Systems Engineering and RDBMS

Using a SELECT 1 vs a SELECT * in the EXISTS/NOT EXISTS query

Posted by decipherinfosys on February 13, 2008

Could not come up with a better title for this post. After reading some of our blog posts, one of our readers asked the question – “Why do you always make use of a “SELECT 1” rather than a “SELECT *” when using an EXISTS or a NOT EXISTS clause”.

The answer is that during the query plan stages, the * is expanded to bind with the list of columns. And then since the semantics of the query is such (EXISTS or NOT EXISTS), since it does not need any of those columns, it will remove all of them. Using a SELECT 1, avoids having to look at any of the meta-data that is not even needed for that table during query compilation time.

Please do NOTE that at runtime BOTH of them will have the same plans. We can quickly check that with an example (using SQL Server Syntax):

CREATE TABLE TBL1 (COL1 INT IDENTITY PRIMARY KEY, COL2 INT)
GO
CREATE TABLE TBL2 (COL3 INT IDENTITY PRIMARY KEY, COL4 INT, COL5 INT, COL6 INT, COL7 INT, COL8 INT, COL9 INT, COL10 INT, CONSTRAINT FK_TBL2_TO_TBL1 FOREIGN KEY (COL4) REFERENCES TBL1(COL1))
GO

CREATE INDEX TBL2_IND_1 ON TBL2 (COL4)
GO

SET NOCOUNT ON
GO
DECLARE @I INT, @J INT
SELECT @I = 1, @J = 1000
WHILE (@I <= @J)
BEGIN
INSERT INTO TBL1 (COL2) VALUES (@I)

IF @I%7 = 0
INSERT INTO TBL2 (COL4, COL5) VALUES (@I, @I + 10)

SET @I = @I + 1
END
GO

SET SHOWPLAN_TEXT OFF
GO

SELECT * FROM TBL1 AS A
WHERE EXISTS (SELECT * FROM TBL2 WHERE COL4 = A.COL1)

———————————————————————————————————————————————————————–
|–Merge Join(Left Semi Join, MERGE:([A].[COL1])=([master].[dbo].[TBL2].[COL4]), RESIDUAL:([master].[dbo].[TBL2].[COL4]=[master].[dbo].[TBL1].[COL1] as [A].[COL1]))
|–Clustered Index Scan(OBJECT:([master].[dbo].[TBL1].[PK__TBL1__17E28260] AS [A]), ORDERED FORWARD)
|–Index Scan(OBJECT:([master].[dbo].[TBL2].[TBL2_IND_1]), ORDERED FORWARD)

SET SHOWPLAN_TEXT OFF
GO

DBCC FREEPROCCACHE
GO

SET SHOWPLAN_TEXT ON
GO

SELECT * FROM TBL1 AS A
WHERE EXISTS (SELECT 1 FROM TBL2 WHERE COL4 = A.COL1)

StmtText
———————————————————————————————————————————————————————–
|–Merge Join(Left Semi Join, MERGE:([A].[COL1])=([master].[dbo].[TBL2].[COL4]), RESIDUAL:([master].[dbo].[TBL2].[COL4]=[master].[dbo].[TBL1].[COL1] as [A].[COL1]))
|–Clustered Index Scan(OBJECT:([master].[dbo].[TBL1].[PK__TBL1__17E28260] AS [A]), ORDERED FORWARD)
|–Index Scan(OBJECT:([master].[dbo].[TBL2].[TBL2_IND_1]), ORDERED FORWARD)

As you can see from above, the execution plan at runtime are exactly the same regardless of whether a “SELECT 1” is used or a “SELECT *” is used. We use SELECT 1 because of the reason mentioned in the start of this post.

One Response to “Using a SELECT 1 vs a SELECT * in the EXISTS/NOT EXISTS query”

  1. […] Some examples I see uses SELECT * instead of SELECT 1. But some argue that it is better to use SELECT 1. For further reading on SELECT 1 vs SELECT * : Using a SELECT 1 vs a SELECT * in the EXISTS/NOT EXISTS query […]

Sorry, the comment form is closed at this time.

 
%d bloggers like this: