Systems Engineering and RDBMS

OPTIMIZE FOR query hint in SQL Server 2005

Posted by decipherinfosys on October 23, 2007

In one of our previous blog post, we had covered bind variable peeking (Oracle) and parameter sniffing (SQL Server) behavior that happens when one is using a bind variables in SQL queries on data-sets that are not uniformly distributed. One can choose to use plan guides (SQL Server) and stored outlines (Oracle) in order to enforce a specific query plan if needed for such queries or for these specific scenarios, one can also choose to go with the use of constant/literal values instead of the bind variables. That is one scenario where in a typical transactional system, the usage of bind variables might not yield the desired results.

In this post, we will discuss a new query hint that was introduced in SQL Server 2005 which is called the “OPTIMIZE FOR” query hint. This is another way to address the parameter sniffing problem in SQL Server. This option ties in with the usage of the plan guides or can be used independently to alter the execution plan formation by the optimizer. This query hint instructs the optimizer to use a particular value for a local variable when the query goes through the optimization phase.

Let’s look at this using an example:

CREATE TABLE DECIPHER_TEST (COL1 INT IDENTITY NOT NULL, STAT_CODE INT NOT NULL)
GO

CREATE INDEX DECIPHER_TEST_IND_1 ON DECIPHER_TEST (STAT_CODE)
GO

SET NOCOUNT ON
GO

DECLARE @I INT
SET @I = 1
WHILE (@I <= 1000000)
BEGIN
IF @I <=10
INSERT INTO DECIPHER_TEST (STAT_CODE) VALUES (0)

IF (@I > 10 AND @I <= 10000)
INSERT INTO DECIPHER_TEST (STAT_CODE) VALUES (30)

IF (@I > 10000 AND @I <= 1000000)
INSERT INTO DECIPHER_TEST (STAT_CODE) VALUES (90)

SET @I = @I + 1
END

We used the above script to create a test table in our database and populated it with 1 million rows. Here is the data distribution:

SELECT STAT_CODE, COUNT(*) CNT_DISTR FROM DECIPHER_TEST GROUP BY STAT_CODE

STAT_CODE   CNT_DISTR
 ----------- -----------
 0           10
 30          9990
 90          990000

And stats were collected:

UPDATE STATISTICS DECIPHER_TEST WITH SAMPLE 100 PERCENT, ALL

Now, let us use a parameterized query and use the value of 90:

SET SHOWPLAN_TEXT ON
GO

DECLARE @STAT_CODE INT
SET @STAT_CODE = 90

SELECT * FROM DECIPHER_TEST WHERE STAT_CODE = @STAT_CODE

StmtText
——————————————————————————————————————————————
|–Table Scan(OBJECT: ([AdventureWorks].[dbo].[DECIPHER_TEST]), WHERE: ([AdventureWorks].[dbo].[DECIPHER_TEST].[STAT_CODE]=[@STAT_CODE]))

The optimizer does a complete table scan as it should since the value of 90 qualifies for a majority of the data so going through an index gives no benefits whatsoever. Now, let us replace that with the STAT_CODE value of 0 and see whether the optimizer changes the execution plan to go through the index this time since that value qualifies for only 0.001% of the records in the table.

DECLARE @STAT_CODE INT
SET @STAT_CODE = 0

SELECT * FROM DECIPHER_TEST WHERE STAT_CODE = @STAT_CODE

StmtText
——————————————————————————————————————————————
|–Table Scan(OBJECT: ([AdventureWorks].[dbo].[DECIPHER_TEST]), WHERE: ([AdventureWorks].[dbo].[DECIPHER_TEST].[STAT_CODE]=[@STAT_CODE]))

As you can see, the execution plan still remains the same as before – that is because the execution plan for the parameterized query was cached when the previous run was done. Now, let us make use of the “OPTIMIZE FOR” query hint and see if we can force the optimizer to optimize the query for the STAT_CODE value of 0:

DECLARE @STAT_CODE INT
SET @STAT_CODE = 0

SELECT * FROM DECIPHER_TEST WHERE STAT_CODE = @STAT_CODE OPTION (OPTIMIZE FOR (@STAT_CODE = 0))

StmtText
————————————————————————————————————————————————————————————
|–Nested Loops(Inner Join, OUTER REFERENCES: ([Bmk1000]))
|–Index Seek(OBJECT: ([AdventureWorks].[dbo].[DECIPHER_TEST].[DECIPHER_TEST_IND_1]), SEEK: ([AdventureWorks].[dbo].[DECIPHER_TEST].[STAT_CODE]=[@STAT_CODE]) ORDERED FORWARD)
|–RID Lookup(OBJECT: ([AdventureWorks].[dbo].[DECIPHER_TEST]), SEEK: ([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)

As you can see from the execution plan shown above, the optimizer picked up the index this time and did a RID look-up to get the data that was not in the index. This makes sense since the selectivity of the data value is such that it should favor an index seek operation. So, what would happen now if we execute the query with a value of 90? Since we have forced the optimizer to always form an optimized execution plan for the given value of 0 for the status code, when we use 90, it will still use the index plan that is shown above. So, how is this good? In this specific scenario, say the STAT_CODE values represent this:

0: Record is ready to be processed
30: Interim status code for processing
90: Processing was successfull
99: Failure/Error

In a majority of the cases, the code that interacts with this table will be doing look-ups based on the status code value of either 0 or 30. Places where 90 or 99 will be used will be typically only in queries that check for the data or in reports and those will be getting executed less number of times as compared to the transactional system queries. So, in the actual transactional system where one needs to parse once and execute many times, one can still make use of the bind variables (parameterized queries) and have an execution plan that best represents the data distribution and the optimal execution plan and in reporting system, one can use a query based on either constants/literals or one without the “OPTIMIZE FOR” hint. Here is an execution with a constant/literal value execution of the same query and it’s execution plan as rendered by the optimizer:

SELECT * FROM DECIPHER_TEST WHERE STAT_CODE = 90

StmtText
———————————————————————————————————————————-
|–Table Scan(OBJECT: ([AdventureWorks].[dbo].[DECIPHER_TEST]), WHERE: ([AdventureWorks].[dbo].[DECIPHER_TEST].[STAT_CODE]=(90)))

2 Responses to “OPTIMIZE FOR query hint in SQL Server 2005”

  1. […] the usage of this new hint. We had covered that in some of our posts before – you can read them here and […]

  2. […] on February 16, 2009 We had covered the “Optimize For” query hint in one of blog post before and had covered what it does and where it can be of use.  We had also discussed bind […]

Sorry, the comment form is closed at this time.

 
%d bloggers like this: