Systems Engineering and RDBMS

Export/Import of Optimizer Statistics in Oracle

Posted by decipherinfosys on February 19, 2008

We had talked about DBMS_STATS.SET_TABLE_STATS in one of our previous Oracle blog posts and had also covered the equivalent options in SQL Server 2005 in another post. That post also pointed to Joe Chang's article on exporting/importing statistics in a SQL Server 2000 environment, since the option of having the optimizer consider a different data distribution did not exist in that version. Based on that post, a reader asked whether the export/import of statistics is feasible in Oracle as well. Yes, it is, and this post addresses that question.

As you know, the DBMS_STATS package lets us modify the statistics gathered for objects (tables, indexes etc.) in the Oracle dictionary (this package is also used to generate statistics – you can search for DBMS_STATS on our blog to get all the related posts on the topic). These statistics can reside in the dictionary or in a table created in a user's schema, and we can collect and maintain a history of them in such user-defined statistic tables. We can also import statistics from a user-defined statistic table into the Oracle dictionary tables to influence execution plans.

NOTE: Please resort to this approach only when troubleshooting performance issues or when trying to see how a plan changes under a different data distribution – that is, only if you are seeing sub-optimal execution plans for your queries and you know that a different plan should be used rather than the one Oracle came up with. This process is intended solely for troubleshooting.

Please note that there are other, often better, ways to force Oracle to use the execution plan we want, for example: using stored outlines, using hints, or turning bind value peeking off at the session or statement level if that is causing the issue.
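For instance, a hint referencing the table and index used later in this post (CASE_HDR and its CASE_HDR_IND_4 index) can pin the access path for a single statement directly, without touching the statistics at all – a quick sketch:

SQL> select /*+ INDEX(case_hdr case_hdr_ind_4) */ stat_code, count(*) from case_hdr where stat_code = 95 group by stat_code;

Unlike the statistics manipulation shown below, a hint affects only the one statement it is embedded in.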

1. Steps to Export/Import Optimizer Statistics

The following steps explain how to maintain (export/import) statistics in a user-defined statistic table. We can also maintain different sets of statistics within a single user-defined table. In our example we will be using the CASE_HDR table to collect and maintain statistics.

1.1. Creating User-defined Statistic table
Create a user-defined table to store and maintain statistics for a given table. It is advised to associate a user-defined statistic table for every physical table for which you want to collect and maintain statistics.

DBMS_STATS.CREATE_STATS_TABLE (ownname => USER, stattab => 'CASE_HDR_STAT_TAB', tblspace => 'LLMDATA');

This procedure creates a table named CASE_HDR_STAT_TAB in the current schema that can be used to store statistics. The columns that compose this table are not relevant, as they are accessed solely through the DBMS_STATS package. Please make sure that the schema/user you are using has the privilege to create tables and also has quota on the tablespace mentioned.
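Since the user-defined statistic table is a normal table in your schema, you can see which sets of statistics it currently holds by querying it directly – for example, by its STATID column (one of the columns that DBMS_STATS creates in it):

SQL> SELECT DISTINCT statid FROM case_hdr_stat_tab;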

1.2. Exporting Statistics into User-defined table

There are two ways of collecting statistics into the user-defined table: using either the DBMS_STATS.GATHER_TABLE_STATS or the DBMS_STATS.EXPORT_TABLE_STATS procedure. When you use the cascade option with these procedures, you collect statistics on tables, columns and indexes.

1.2.1. Using DBMS_STATS.GATHER_TABLE_STATS procedure

When you run the DBMS_STATS.GATHER_TABLE_STATS procedure as shown below, you generate the statistics on the CASE_HDR table and store them both in the Oracle dictionary tables and in the user-defined statistic table. If you omit the stattab and statid arguments, the statistics are generated and stored in the Oracle dictionary tables only. The statid argument allows you to store multiple sets of statistics; please make sure that you use a unique value for this argument.

DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', method_opt => 'for all indexed columns', cascade => true, stattab => 'CASE_HDR_STAT_TAB', statid => 1);

1.2.2. Using DBMS_STATS.EXPORT_TABLE_STATS procedure

When you run the DBMS_STATS.EXPORT_TABLE_STATS procedure, you retrieve the statistics for the CASE_HDR table from the Oracle dictionary tables and store them in the user-defined statistic table. The statid argument allows you to store multiple sets of statistics; make sure that you use a unique value for it.

DBMS_STATS.EXPORT_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', stattab => 'CASE_HDR_STAT_TAB', statid => 1, cascade => true);

1.3. Importing Statistics from User-defined table into Dictionary

Using the DBMS_STATS.IMPORT_TABLE_STATS procedure, you can retrieve the statistics for a particular table from the user-defined stat table identified by the stattab argument (and the statid argument, if multiple sets of statistics are available) and store them in the Oracle dictionary tables. Setting the cascade argument to TRUE imports the index and column statistics associated with the specified table as well. Also consider setting the no_invalidate argument to TRUE so that any current cursors that depend on the base table are not invalidated; set it to FALSE if you do want the dependent cursors to be invalidated.

DBMS_STATS.IMPORT_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', stattab => 'CASE_HDR_STAT_TAB', statid => 1, cascade => true, no_invalidate => true);
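Once you are done with the troubleshooting exercise, the user-defined statistic table can be removed with the companion procedure in the same package:

SQL> EXEC DBMS_STATS.DROP_STAT_TABLE (ownname => USER, stattab => 'CASE_HDR_STAT_TAB');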

2. Test Case

In our test case we will be using a table called CASE_HDR to demonstrate how the plan changes when we manipulate the optimizer statistics available in the dictionary tables. This table is used in one of our client's systems to store all the data pertaining to cases in the warehouse. We will be filtering on the CASE_HDR.STAT_CODE column (in the WHERE clause) and using the index on the STAT_CODE column.

  • To start with, we have populated the table with the following distribution of data, which is a good representation of the data in this table in a production environment:

 STAT_CODE   COUNT(*)
---------- ----------
        30       3423
        45       1233
        50        323
        90      23456
        95       2655

Here is the execution plan produced when we run the following query:

SQL> select stat_code, count(*) from case_hdr where stat_code = 95 group by stat_code;

-----------------------------------------------------------------------------
 | Id  | Operation            |  Name           | Rows  | Bytes | Cost (%CPU)|
 -----------------------------------------------------------------------------
 |   0 | SELECT STATEMENT     |                 |     1 |     3 |     3  (34)|
 |   1 |  SORT GROUP BY NOSORT|                 |     1 |     3 |     3  (34)|
 |*  2 |   INDEX RANGE SCAN   | CASE_HDR_IND_4  |  2655 |  7965 |     8   (0)|
 -----------------------------------------------------------------------------
  • Before we add more data to the CASE_HDR table, we will gather the statistics and save them in the user-defined stat table:

Step 1:
Create the user-defined stat table:

SQL> EXEC DBMS_STATS.CREATE_STATS_TABLE (ownname => USER, stattab => 'CASE_HDR_STAT_TAB', tblspace => 'LLMDATA');

LLMDATA is the Large Locally Managed DATA tablespace.

Step 2:
Gather statistics on the table and also save the statistics in the user-defined stat table:

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', method_opt => 'for all indexed columns', cascade => true, stattab => 'CASE_HDR_STAT_TAB', statid => 1);

  • Now we will add a considerable amount of data (> 10 million records) to the CASE_HDR table for STAT_CODE = 95, gather statistics into the Oracle dictionary tables and then export the statistics into the user-defined stat table (to demonstrate the usage of both the GATHER_TABLE_STATS and EXPORT_TABLE_STATS procedures).

Step 1:
Insert > 10 million rows with STAT_CODE = 95 into the CASE_HDR table.

Step 2:
Gather statistics on the newly inserted data.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', method_opt => 'for all indexed columns', cascade => true);

Step 3:
Export the statistics into the user-defined stat table from the dictionary tables.

SQL> EXEC DBMS_STATS.EXPORT_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', stattab => 'CASE_HDR_STAT_TAB', statid => 2, cascade => true);

Note that we have used a different statid value (a second set of statistics) for the newer statistics.

Following is the new execution plan when we re-run the query against the data now present in the CASE_HDR table:

-------------------------------------------------------------------------
 | Id  | Operation             |  Name           | Rows  | Bytes | Cost  |
 -------------------------------------------------------------------------
 |   0 | SELECT STATEMENT      |                 |     1 |     3 |  6917 |
 |   1 |  SORT GROUP BY NOSORT |                 |     1 |     3 |  6917 |
 |*  2 |   INDEX FAST FULL SCAN| CASE_HDR_IND_4  |    10M|    28M|  6917 |
 -------------------------------------------------------------------------

Notice the change in the execution method.

  • Now we will import the older statistics (statid = 1) from the user-defined stat table back into the Oracle dictionary tables and see how the execution plan changes.

SQL> EXEC DBMS_STATS.IMPORT_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', stattab => 'CASE_HDR_STAT_TAB', statid => 1, cascade => true, no_invalidate => true);

Re-running the query shows the earlier execution plan:

-----------------------------------------------------------------------------
 | Id  | Operation            |  Name           | Rows  | Bytes | Cost (%CPU)|
 -----------------------------------------------------------------------------
 |   0 | SELECT STATEMENT     |                 |     1 |     3 |     3  (34)|
 |   1 |  SORT GROUP BY NOSORT|                 |     1 |     3 |     3  (34)|
 |*  2 |   INDEX RANGE SCAN   | CASE_HDR_IND_4  |  2655 |  7965 |     8   (0)|
 -----------------------------------------------------------------------------

Even though we now have more than 10 million records with STAT_CODE = 95, we are making the Oracle optimizer believe that there are only 2655 such records, so it uses our older plan.
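Once the troubleshooting is done, we can put the current statistics back in place by importing the second set we saved earlier (statid = 2), or simply by re-gathering the statistics on the table:

SQL> EXEC DBMS_STATS.IMPORT_TABLE_STATS(ownname => USER, tabname => 'CASE_HDR', stattab => 'CASE_HDR_STAT_TAB', statid => 2, cascade => true, no_invalidate => true);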

So, when can I use this approach?

You can use this approach during the development stages to see how your code would perform under a different data distribution. Since development data is typically much smaller than production data, you can simulate the production environment by exporting the statistics from production and loading them into your development environment. Assuming that everything else (Oracle parameter settings, hardware etc.) is the same, you can then troubleshoot your programs and your SQL code.
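To actually move a set of statistics between databases, the user-defined stat table itself can be moved with the export/import utilities and then loaded into the dictionary on the target side. A sketch (the usernames and file name below are hypothetical):

exp prod_user/password tables=CASE_HDR_STAT_TAB file=case_hdr_stats.dmp
imp dev_user/password file=case_hdr_stats.dmp full=y

Then run DBMS_STATS.IMPORT_TABLE_STATS in the development database as shown in section 1.3.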

An alternate (and better) way of doing that is to use DBMS_STATS.SET_TABLE_STATS as we had shown before. This post is just to show that export/import of stats is another option.
