Systems Engineering and RDBMS

Handling a comma separated list in DB code

Posted by decipherinfosys on May 22, 2007

SQL Server MVP Erland Sommarskog has an excellent post about arrays and lists in SQL Server – he also covers one of the solutions for handling comma separated list in DB code and covers the different options that are available (XML, comma separated lists, IN operators etc.).  It is an excellent read and is very well written with the pros and cons of each option.  Let’s take one of the examples that he had and extend it to Oracle as well.

Let’s see the code first and then we can evaluate how this operates.  The driver of this whole logic is the pipeline function : “f_SequenceNumbers”.  It is based on a type and it’s purpose is to return a list of numbers from the starting number that is passed in to the end number that is passed in.  That logic helps in creating those new records in an array format so that the comma separated list can be converted from a big fat string to a column with different rows.

/*Oracle Syntax*/

CREATE OR REPLACE TYPE t_SequenceNumbers IS TABLE OF INT;
/

CREATE OR REPLACE FUNCTION f_SequenceNumbers(startNumber INT,
endNumber   INT)
RETURN t_SequenceNumbers PIPELINED
IS
BEGIN
FOR i IN startNumber..endNumber
LOOP
PIPE ROW(i);
END LOOP;

RETURN;
EXCEPTION
WHEN OTHERS THEN
RAISE_APPLICATION_ERROR(-20002, ‘Error in getting Sequence Numbers’ || CHR(10) || SQLERRM);
END f_SequenceNumbers;
/

So, an execution like:

select column_value row_num from table(f_sequencenumbers(1, 1000))

will return you a column called ROW_NUM with 1000 rows with the starting number as 1 and the ending number as 1000.  Now, if you have more than 1000 values in your comma separated string, then you can change this number to accordingly.

ROW_NUM
———-
1
2
3
4
5
6
7
8
9
10
11
.
.
.
1000

I doubt that anyone would ever want to pass more values than that (since if that is the case, there is bound to be something wrong with the design) but in case you do have a requirement to do so, you can just put the right number in the above function.  With this pipeline function in mind, now let’s take a look at the actual code that does this conversion:

set autoprint on
variable per1 refcursor;
declare
v_string varchar2(1000) := ‘E050160059,E050342378,E050384514,E050384518,E050384519,E050384520’;
begin
open :per1 for
SELECT ltrim(rtrim(substr(‘,’ || v_string || ‘,’, IV.Row_Num + 1,
INSTR(‘,’ || v_string || ‘,’, ‘,’, IV.Row_Num + 1) – IV.Row_Num – 1))) COL_VALUES
FROM   (select column_value row_num from table(f_sequencenumbers(1, 1000))) IV
WHERE  IV.Row_Num <= length(‘,’ || v_string || ‘,’) – 1
AND  substr(‘,’ || v_string || ‘,’, IV.Row_Num, 1) = ‘,’;
end;
/

–Output
COL_VALUES
—————-
E050160059
E050342378
E050384514
E050384518
E050384519
E050384520

6 rows selected.

The first thing to observe is the expression  ‘,’ || v_string || ‘,’  that re-occurs no less than four times. By adding the delimiter in beginning and at the end of the string, the first and last items in the string appear in the same context as all other items.

Next, let’s turn to the WHERE clause. The expression:

substr(‘,’ || v_string || ‘,’, IV.Row_Num, 1) = ‘,’

evaluates to TRUE for all positions in the string where the delimiter appears. The expression:

IV.Row_Num <= length(‘,’ || v_string || ‘,’) – 1

simply sets an upper limit of which numbers we are using.

Let’s now look at the SELECTed column Value. We extract the strings with substr, starting on the position after the delimiter. We find the length of the substr by searching for the next delimiter with INSTR, taking use of its third parameter which tells INSTR where to start searching. Once the next delimiter is found, we subtract the position for the current delimiter and then one more, as the delimiter itself should not be included in the extracted string, to get the length of the list item.  Finally, the code also puts a ltrim and a rtrim to make sure that in case there were leading and trailing spaces after/before the comma was specified, those are taken care of as well.

This approach can be used for creating arrays out of a comma separated list provided by the application and then this output can very easily be treated as if you were querying/joining or operating upon a table in itself and it makes it very easy to manipulate the records on this basis and apply conditional logic as needed.  Please do note that whenever possible, instead of passing a string of comma delimited values, one should strive to use that criteria that qualifies for that list of values and use that in the SQL.  In other words, if the application is going to pass say 1000 values in a comma separate string values, then if the application can specify the criteria (say stat_code = 30 and some_column_val = 1) which qualified those 1000 values, then that should be used in the SQL statements.  However, there are scenarios where such a criteria cannot be applied especially when the user can pick and choose the values that they need to process and the above piece of code will do the trick for you then.

4 Responses to “Handling a comma separated list in DB code”

  1. […] Handling a comma separated list in DB code […]

  2. […] and support of TYPES (this feature has been available in Oracle for ages now and will help make array processing and nesting of procedures a lot easier)., FILESTREAM (for better binary data storage) and there […]

  3. […] The table valued parameters. In order to pass an array into a stored procedure (see previous post – here), one has to use XML in SQL Server 2005 and then de-construct it using XQUERY features or OPENXML. […]

  4. […] before how to go about writing such a UDF – you can read more on it here (for SQL Server) and here (for Oracle).  For this post, we will use sample code for SQL Server 2008. USE TESTDB GO […]

Sorry, the comment form is closed at this time.

 
%d bloggers like this: