Systems Engineering and RDBMS

Enforcing uniqueness across tables

Posted by decipherinfosys on May 14, 2007

Creating a primary key or a unique index (constraint) allows us to enforce a uniqueness business requirement at the table level. A key thing to note here, and a very common interview question as well, is the difference between a primary key constraint and a unique constraint: the columns involved in a primary key constraint cannot be nullable, whereas they can be null in a unique constraint. The behavior is a bit different across Oracle, SQL Server and DB2 LUW though. Oracle allows multiple null values in a unique constraint column, whereas SQL Server and DB2 allow only one null value.
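As a quick illustration of the SQL Server behavior (a sketch; the table and constraint names below are our own, not from any standard schema):

```sql
create table test_null (col1 int null, constraint uq_test_null unique (col1))

insert into test_null values (null)   -- succeeds
insert into test_null values (null)   -- fails in SQL Server: duplicate key
                                      -- (Oracle would permit this second null)
```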

So, for enforcing a uniqueness constraint on a column or a set of columns, one can easily define a unique constraint. At times, however, at client sites we have seen a requirement to have such a business rule enforced across multiple tables. The majority of the time this clearly signifies an issue with the schema design, since keys spread across tables imply a de-normalized schema structure. However, not every client will be willing to invest in fixing these issues.

In order to make this bad design work, most of the time we have seen that folks take care of this at the code level (.Net, Java, PL/SQL, T-SQL etc.) rather than enforcing something at the data tier. The application code performs validation checks, prevents the duplicates, and logs error messages when code tries to insert dupes. However, this does not prevent the QA staff or other members of the team from creating bad data in the schema, not to mention the burden of enforcing these rules in any and all code that interacts with that schema. This bad data can then result in spurious defects, since the underlying business requirement of uniqueness on columns across different tables is not met.

There are a couple of ways this can be implemented at the data tier. One of the ways is of course the use of triggers, with the logic encapsulated within the trigger's definition to check for such uniqueness and take corrective action. Another method is the use of indexed views (SQL Server lingo; the analogous feature in Oracle is called a materialized view). One can define a view across those tables and then create a unique index on those columns. That way, whenever an application tries to insert dupes, it will get an error. Let's follow this up with an example (SQL Server syntax):

/*****************************************************************************************
Create two dummy tables: test_1 and test_2 and create the FK link between them
******************************************************************************************/
create table test_1 (col1 varchar(10) not null, col2 int not null, col3 smallint not null, constraint pk_test_1 primary key (col1))
create table test_2 (col1 varchar(10) not null, col4 varchar(10) not null, col5 int not null, col6 smallint not null, constraint pk_test_2 primary key (col1, col5))
alter table test_2 add constraint fk_test_2_to_test_1 foreign key (col1) references test_1 (col1)

/***************************************************************************************************
Now, create the view on those tables and select col2 and col4 which are not part of the PK columns
And then create a unique clustered index on the view to enforce the cross table uniqueness
on the combination of those 2 columns.
****************************************************************************************************/
create view dbo.test_3
with schemabinding
as
select test_1.col2, test_2.col4
from dbo.test_1
inner join dbo.test_2
on test_1.col1 = test_2.col1
go

create unique clustered index test_3_ind_1 on dbo.test_3 (col2, col4)

/***************************************************************************************************
Now, create some test data
****************************************************************************************************/
insert into test_1 values ('abc', 1, 2)
insert into test_1 values ('suv', 1, 2)

insert into test_2 values ('abc', 'xyz', 10, 20)

/***************************************************************************************************
Notice how we get the unique-index violation on the view when we try to create a
new record that is fine as per the table definition but violates the view definition
****************************************************************************************************/
insert into test_2 values ('abc', 'xyz', 100, 20)

Error Received:

Msg 2601, Level 14, State 1, Line 1
Cannot insert duplicate key row in object 'dbo.test_3' with unique index 'test_3_ind_1'.
The statement has been terminated.
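For comparison, the trigger approach mentioned earlier could look like the following. This is a minimal sketch using the same test_1/test_2 tables (the trigger name and error text are our own): it rolls back any insert or update on test_2 that would produce a duplicate (col2, col4) pair across the join.

```sql
create trigger trg_test_2_unique on dbo.test_2
after insert, update
as
begin
    -- After the DML, the table already contains the new rows; if any
    -- (col2, col4) combination now appears more than once in the join
    -- of test_1 and test_2, undo the change and raise an error.
    if exists (
        select t1.col2, t2.col4
        from dbo.test_1 t1
        inner join dbo.test_2 t2 on t1.col1 = t2.col1
        group by t1.col2, t2.col4
        having count(*) > 1
    )
    begin
        rollback transaction
        raiserror ('Duplicate (col2, col4) combination across test_1/test_2.', 16, 1)
    end
end
```

Unlike the indexed view, the trigger fires per statement and has to re-scan the join, so the indexed view is generally the cleaner of the two options when the write volume permits it.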

Also, please do remember that while creating indexed views works fine for enforcing such constraints, it incurs extra overhead on every insert, delete or update against those tables, since the index has to be kept up to date. So, if you are required to enforce these rules across highly transactional tables, then those rules should live in the application tier. If, however, they are needed on tables that are not heavy on write operations, this method provides an easy way of enforcing data integrity as well as meeting the business requirements.

However, remember what was said before: if there is such a requirement, you need to evaluate the schema design. More than likely, a de-normalized schema is the reason you ended up with such a requirement in the first place.
