Systems Engineering and RDBMS

Self Joins

Posted by decipherinfosys on August 7, 2008

One of the recent questions that was asked by a reader for her particular problem needed to include a self join. So, what is a self join? A self join is a join that is done between the same table by using aliases. Such joins are useful for displaying data when there is a relationship between different columns of the data in the same table OR when you wish to retrieve data in a single row based on comparisons between different rows in the same table. They are pretty useful in certain scenarios. We will cover some of those scenarios in this post to give you an idea on where and how you can make use of that information.

A classic example that is always given whenever one talks about self joins is that of an employee table in which there is a recursive 1:N relationship between the employee and their manager. Here is an example:

set nocount on
go
DECLARE @EMP_MASTER TABLE
(
EMP_NBR INT NOT NULL PRIMARY KEY,
EMP_NAME NVARCHAR(20),
MGR_NBR INT NULL
);

INSERT INTO @EMP_MASTER VALUES (1, ‘DON’, 5);
INSERT INTO @EMP_MASTER VALUES (2, ‘HARI’, 3);
INSERT INTO @EMP_MASTER VALUES (3, ‘RAMESH’, 5);
INSERT INTO @EMP_MASTER VALUES (4, ‘JOE’, 5);
INSERT INTO @EMP_MASTER VALUES (5, ‘DENNIS’, NULL);
INSERT INTO @EMP_MASTER VALUES (6, ‘NIMISH’, 7);
INSERT INTO @EMP_MASTER VALUES (7, ‘JESSIE’, 4);
INSERT INTO @EMP_MASTER VALUES (8, ‘KEN’, 2);
INSERT INTO @EMP_MASTER VALUES (9, ‘AMBER’, 5);
INSERT INTO @EMP_MASTER VALUES (10, ‘JIM’, 5);

select empl.EMP_NAME as employee_name, mgr.EMP_NAME as manager_name
from @EMP_MASTER as empl
inner join @EMP_MASTER as mgr
on empl.MGR_NBR = mgr.EMP_NBR

employee_name        manager_name
-------------------- --------------------
DON                  DENNIS
HARI                 RAMESH
RAMESH               DENNIS
JOE                  DENNIS
NIMISH               JESSIE
JESSIE               JOE
KEN                  HARI
AMBER                DENNIS
JIM                  DENNIS

Another example is that of a product line where you might want to build a report to see which products have the same name but differ in their packaging or aisle location where they are placed in the warehouse or a shop. Example:

declare @product table (product_name nvarchar(30), aisle nvarchar(10));
insert into @product values (‘Starbucks-Coffee’, ‘A’);
insert into @product values (‘Starbucks-Coffee’, ‘B’);
insert into @product values (‘Earl Grey’, ‘A’);

select a.product_name, a.aisle as first_aisle, b.aisle as second_aisle
from @product as a
inner join @product as b
on a.product_name = b.product_name
where a.aisle < b.aisle

product_name                   first_aisle second_aisle
------------------------------ ----------- ------------
Starbucks-Coffee               A           B

The same logic can be applied in a multitude of situations – some of the situations that come to mind (besides the ones mentioned above are):

a) Schedule log – like a train station or an airline departure/arrival running log comparisons,

b) A Web server hit stats,

c) Geological data,

d) Stock values for NASDAQ, NYSE etc.

In all these, relative comparisons based on time are typically made when one does a self join.

One Response to “Self Joins”

  1. […] by decipherinfosys on September 22, 2008 In one of our earlier posts, we had covered self joins. Yesterday, while helping a client with their data processing logic a similar situation arose which […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: