Scripting Statistics and Histograms in SQL Server
Posted by decipherinfosys on December 2, 2008
In one of previous posts, we had discussed the differences between statistics and indexes and had also covered in another post, how one can make the optimizer think that the data distribution is different than what it actually is in the database – this is especially useful for simulating production/benchmark issues. You can read more on it here.
While working on a client project one of my colleagues pointed out that one can very easily script out the stats and the histograms using the scripting wizard. I typically never use wizards and like to know exactly what they do behind the scenes (for learning purposes, I trace them out to see exactly what gets fired on the DB side) and had not noticed this option existed in SQL Server 2005. So, if you open up SSMS and right click on a database and go to Tasks/Generate Scripts and select “Script all objects in the selected database” check box on the very first screen. When you click next, you will see the different options and you will see an option for scripting statistics which by default is off:
And when you click on the drop down, you will be presented with these three options:
And when I select the first option, it will give me a message that it will increase the scripting time – just click on OK and move on 🙂 :
The next thing to do would be to go under the Table/View Options and set the “Script Indexes” to be true:
And then you can either script it to a file or a new query window. Once the output is presented, you will see Update Statistics commands like these:
UPDATE STATISTICS [owner].[tablename]([IndexName])
WITH STATS_STREAM = you will see a hex value here,
ROWCOUNT = 10000,
PAGECOUNT = 35
So, when these commands run, even if you do not copy over the records for that table from the source to the destination, the stats command will put the histograms into the stats as scripted from the source and will also update the rowcount and the pagecount. And as we had indicated in the previous post where we had covered the Update Statistics options – you can use it to simulate your production load to test how the execution plan changes based on the data distribution – all that without having to actually copy over the terabytes of data that you might have in the source database.