Systems Engineering and RDBMS

Archive for December 9th, 2008

Using NUMERIC(p,s) for the Identity Column

Posted by decipherinfosys on December 9, 2008

One of our readers had asked this question about a design that he saw in a schema that he had inherited:

“In the schema that I inherited, the Primary Key column is of data type NUMERIC(7,0) and has the Identity property associated with it. Is there an overhead of doing this vs using INT? Also, I was not aware that one could use the Identity property for data types other than INT and BIGINT. What other data types can it be used with?”

The answer to the last part is that the Identity column must be of data type int, bigint, smallint, tinyint, or decimal or numeric with a scale of 0, and constrained to be non-nullable. The answer to the first part is: “It Depends” 🙂

As you know, Numeric(p, s) has a precision (p) and a scale (s) part. The storage of the data depends upon the precision used for the numeric data type –

Precision		Bytes Required
1-9		 		 5
10-19				 9
20-28				13
29-38				17

An INT requires 4 bytes and BIGINT requires 8 bytes for storage for a single value. Now, it is not just the storage increase that becomes an issue in the case of a large table – if you have an index on this column – this means that you can store less records per index page since the page size is fixed and your storage is high. The smaller the data type storage, the more records you can fit into a page (both data as well as the index page) which means lesser read I/O in the case of range scans. If the application never does range scans and always looks up for a particular value only, it does not matter performance wise.

INT allows for data in the range of 2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647). BIGINT is much more.

So, coming back to the question, if you are using NUMERIC(7,0) instead of INT, was the intent of the designer/developer/DBA to restrict the values up-to 7 digits only? Does not seem like it since you also have this as an auto-incremental number which means it is a surrogate number. Was the intent to maintain the same precision as allowed by other RDBMS like Oracle – Oracle has the NUMBER data type and NUMBER(7) correctly translates to NUMERIC(7,0) in the case of SQL Server. So, was this design done to accommodate the migration and keep it the same as the other RDBMS for compatibility?

If the answer to both the questions is No – then you are incurring an un-necessary overhead. Also, if the application does do range scans and the tables are huge, then also you are incurring an overhead in terms of I/Os.

Posted in SQL Server | 1 Comment »