Systems Engineering and RDBMS

Archive for July 31st, 2007


Posted by decipherinfosys on July 31, 2007

If you are an Oracle DBA, chances are you are familiar with the DBMS_STATS package and its most common subroutines for collecting statistics for the entire database, for a specific schema, or for a specific user. If not, we would advise taking a look at it as a replacement for the ANALYZE command. To learn why using DBMS_STATS rather than ANALYZE is the better way to collect statistics, please see our previous blog post.
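As a quick illustration, a typical schema-level collection call looks like the sketch below (the schema name SCOTT is just a placeholder):

```sql
-- Gather optimizer statistics for one schema; SCOTT is a placeholder name.
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname          => 'SCOTT',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE   -- also gather statistics on the indexes
  );
END;
/
```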

Using the DBMS_STATS package, we can also delete statistics, and export and import statistics from one database to another in case we need to reproduce a specific performance problem. In this blog post, we are going to talk about how we can trick the optimizer by using one of the subprograms of the DBMS_STATS package.
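Copying statistics between databases typically involves a staging table together with the export and import subprograms. A sketch, assuming schema-level statistics and an illustrative staging table name:

```sql
-- On the source database: create a staging table and export into it.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(ownname => USER, stattab => 'STATS_STAGE');
  DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => USER, stattab => 'STATS_STAGE');
END;
/
-- Move STATS_STAGE to the target database (e.g. via export/import), then:
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => USER, stattab => 'STATS_STAGE');
END;
/
```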

Normally, when we perform benchmark or load tests, one of the major requirements is to make sure that the optimizer uses an optimal execution plan for the queries in the application. A normal task for the DBA or the database developer, then, is to create volume data for the benchmark. That works well for automated benchmark tests done via LoadRunner or other load-test suites.

However, if a developer also wants to test code for performance during development, one cannot get into the habit of creating volume data every time a query changes or new code is written. It would be useful to have an option that lets one see how the query execution plan changes as the data set grows. Here the SET_TABLE_STATS procedure comes to the rescue. Using this procedure, one can set the number of rows and the number of blocks to some large values, which makes the optimizer believe that the table contains a large number of rows and that the data distribution is different. When the query is run against these new values, the optimizer may change the query plan based on the available data. There are similar procedures to set index-level and column-level statistics as well; please refer to the Oracle manuals for complete details on those. Let us check this with the help of a working example. First we will create a table. Connect to the appropriate schema with proper authentication using SQL*Plus.

SQL> CREATE TABLE TEST
  2  (
  3  TEST_ID    NUMBER(10),    -- illustrative column; only TEST_NAME is used below
  4  TEST_NAME  VARCHAR2(20),
  5  TEST_DESC  VARCHAR2(50)   -- illustrative column
  6  )
  7  /

Table created.

SQL> CREATE INDEX TEST_IND_1 ON TEST (TEST_NAME)
  2  /

Index created.

Now let us turn autotrace on in traceonly mode and check the execution.

SQL> SET AUTOTRACE TRACEONLY EXPLAIN
SQL> SELECT * FROM TEST WHERE TEST_NAME = 'DECIPHER';

The execution plan for the query is as follows:

Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    30 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST |     1 |    30 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEST_NAME"='DECIPHER')

The plan indicates that even though an index is present, a full table scan is performed. This is expected, since the cost of scanning the small table is lower than going through the index and then using a rowid to visit the table. Now let us set the number of rows and number of blocks to some higher values and see whether the optimizer changes its mind about the access path. First we will set the values using the SET_TABLE_STATS procedure.

SQL> exec dbms_stats.set_table_stats( user, 'TEST', numrows => 10000, numblks => 1000 );

PL/SQL procedure successfully completed.
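To confirm what the optimizer now sees, you can query the data dictionary; statistics set this way show up just like gathered ones:

```sql
-- NUM_ROWS and BLOCKS now reflect the values we set, not gathered values.
SELECT num_rows, blocks, last_analyzed
  FROM user_tables
 WHERE table_name = 'TEST';
```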

In the call above, we set the number of rows to 10000 and the number of blocks to 1000. Based on this data, the optimizer may choose a different plan. Now we will re-execute the same query.

SQL> SELECT * FROM TEST WHERE TEST_NAME = 'DECIPHER';
Execution Plan
----------------------------------------------------------
Plan hash value: 272677607

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |   100 |  3000 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST       |   100 |  3000 |     5   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | TEST_IND_1 |    40 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TEST_NAME"='DECIPHER')

In our example, once we updated the table statistics, the optimizer opted for an index range scan rather than a full table scan.
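Once the experiment is done, the fabricated values can be removed (or simply overwritten by a regular gather) so they do not linger:

```sql
-- Drop the fabricated statistics; the next GATHER_TABLE_STATS call
-- (or the automatic statistics job) will replace them with real values.
BEGIN
  DBMS_STATS.DELETE_TABLE_STATS(ownname => USER, tabname => 'TEST');
END;
/
```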

This method can also be used to set statistics on temporary tables, to improve performance when temporary tables are used for larger data sets. Oracle guru Tom Kyte has explained in detail at his site how statistics can be utilized for temporary tables - here and here.
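A minimal sketch of that idea on a global temporary table (the table and column names are illustrative):

```sql
-- Global temporary tables are not analyzed automatically, so the optimizer
-- must guess their cardinality; SET_TABLE_STATS lets us supply realistic values.
CREATE GLOBAL TEMPORARY TABLE TEMP_RESULTS
(
  RESULT_ID   NUMBER(10),
  RESULT_NAME VARCHAR2(30)
) ON COMMIT PRESERVE ROWS
/
BEGIN
  DBMS_STATS.SET_TABLE_STATS(ownname => USER, tabname => 'TEMP_RESULTS',
                             numrows => 100000, numblks => 5000);
END;
/
```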

By no means should this procedure be used to simulate production volume merely by setting the number of rows, distinct values, or number of blocks for a specific table, index, or column. It only helps us figure out what plan a query might get, or how the join conditions will change the plan, if the data volume changes. For proper testing, one must create a data volume in all the tables that reflects the production data volume, and perform thorough load and performance testing to ensure a smooth rollout. Still, having this option is very handy for database developers to check code performance at design and development time.

In a future post, we will cover how one can do this in SQL Server (both 2000 and 2005).

Posted in Oracle | 3 Comments »