Systems Engineering and RDBMS

Chosing between a CLOB or a VARCHAR2 data-type

Posted by decipherinfosys on March 17, 2007

One of the recent projects that we worked on had a requirement of storing strings which sometimes might cross the 4K limit but that was not a frequent occurence.  The question asked by the client was :

a) whether to de-normalize the design and store it in two VARCHAR2(4000) data-type columns, or

b) to use the CLOB data-type, or

c) yet another option of using the same column but another table to chunk the data into two or more records.  This option was being proposed since the client application code did not support handling of the CLOB data-types.  But there is a big problem with chunking of the records since this field can be searched upon as well and while searching for strings like this, it means concatenating the different fields together in order to look for the exact match of the string – though there are ways to optimize that search (by just concatenating the searchable string length + the next row) when data is stored in such a chunking fashion, it is far from optimal.

The answer was that today’s need is 4K,  a few months from now, it can exceed 8K as well since this is a healthcare system and data length can very easily change since this is related to patient’s records and the provider might be interested in tracking more information in the days to come.  A better solution thus was to use the CLOB data-type and not de-normalize the design.   A CLOB up-to 4K is stored inline as a VARCHAR2.  When it exceeds the 4K limit, theCLOB will be moved out of line.  Hence, in majority of the scenarios in the above applicatio, it will behave as a VARCHAR2 behind the scenes and in an infrequent case, it will be stored out of the line.  This design also allows for easy searching and ensures no application code changes if tomorrow the requirements for that field grow to say 16K.

Sorry, the comment form is closed at this time.

 
%d bloggers like this: