Systems Engineering and RDBMS

Generating a list of missing numbers

Posted by decipherinfosys on January 3, 2011

A reader recently asked a question about generating a list of missing numbers from a column.  They were using an auto-incremental number column and due to deletes done on the table by different transactions, they had gaps in the values in this column.  They were not interested in re-using those values but wanted to generate a quick report to see the patterns of the gaps created.

Generating such a list is pretty simple by using a number table or by using a UDF to simulate such a number table.  We had covered in our blog posts before how to go about writing such a UDF – you can read more on it here (for SQL Server) and here (for Oracle).  For this post, we will use sample code for SQL Server 2008.

USE TESTDB
GO
/*Declare a table variable and insert some records into it
Simulate the gaps by creating records with missing numbers - 4, 6, 8, 12, 13 and 14*/
DECLARE @TABLE TABLE (ID_VAL INT)
INSERT INTO @TABLE VALUES (1), (2), (3), (5), (7), (9), (10), (11), (15), (16);

Now, let’s make use of that UDF that we had shown in one of our previous posts and generate a list of running numbers.  You will see that that function takes in two input parameters – one for the number of rows that need to be returned and the other one which feeds it the starting value from which that number list starts from.  Since in this case, the reader wanted it to be starting from 1, the second parameter has a value of 1 and the first parameter i.e. total number of rows is the maximum value in the table which in our example above is 16.

So, now all that we need to do is do a left outer join between the UDF and the table variable above:

declare @max_val int
select @max_val = MAX(id_val) from @table


select x.data_value, y.id_val
from dbo.UDF_GEN_SEQUENCE (@max_val, 1) as x
left outer join @TABLE as y
on x.data_value = y.ID_VAL

Based on this, we will get all the data from the UDF which will provide us with the list of the running numbers and only the matching records from the table variable.  Now, it is as simple as adding a IS NULL check in the where clause to get only those numbers that are missing from our table variable.

select x.data_value, y.id_val
from dbo.UDF_GEN_SEQUENCE (@max_val, 1) as x
left outer join @TABLE as y
on x.data_value = y.ID_VAL
where y.ID_VAL is null

And here is the output:

data_value    id_val
4             NULL
6             NULL
8             NULL
12            NULL
13            NULL
14            NULL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: