Systems Engineering and RDBMS

HierarchyID Data Type in SQL Server 2008

Posted by decipherinfosys on December 19, 2007

This is another blog entry in the series of SQL Server 2008 posts. In this post, we will cover the basics of one more data-type: hierarchyID. This data-type, as the name suggests, can be used to store hierarchical data structures like organizational trees, file system structures or any other type of hierarchical tree structures. In SQL Server 2008, you can now store that data and very easily query it as well. In this post, we will just look at a basic example of putting in the hierarchy data and querying it. There’s a whole lot more to it including updating and re-arranging the hierarchy, re-ordering of the data, creation of indexes to help optimize the searches on hierarchical data etc.

Let’s look at this briefly using an example:

Let us create an EMP_MASTER table which will house only the Employee data and then two table variables – one to act as the source of the data for the hiearchies and the second where we recursively populate the data in a hierarchical fashion.

/*******************************************************************
Creation of the Employee Master table
********************************************************************/
CREATE TABLE EMP_MASTER
(
EMP_NBR INT NOT NULL PRIMARY KEY,
EMP_NAME NVARCHAR(50),
MGR_NBR INT NULL
)

/*******************************************************************
Populating the table with some sample data
The MGR_NBR column reflects the immediate manager of the
Employee
********************************************************************/
INSERT INTO EMP_MASTER VALUES (1, ‘DON’, 5);
INSERT INTO EMP_MASTER VALUES (2, ‘HARI’, 5);
INSERT INTO EMP_MASTER VALUES (3, ‘RAMESH’, 2);
INSERT INTO EMP_MASTER VALUES (4, ‘JOE’, 5);
INSERT INTO EMP_MASTER VALUES (5, ‘DENNIS’, NULL);
INSERT INTO EMP_MASTER VALUES (6, ‘NIMISH’, 3);
INSERT INTO EMP_MASTER VALUES (7, ‘JESSIE’, 3);
INSERT INTO EMP_MASTER VALUES (8, ‘KEN’, 5);
INSERT INTO EMP_MASTER VALUES (9, ‘AMBER’, 4);
INSERT INTO EMP_MASTER VALUES (10, ‘JIM’, 6);

/*******************************************************************
Declaration of the two table variables
@ORG_HIERARCHY will hold the actual hierarchical data after
the recursion query populates it
@HIERARCHY_TREE will hold the EMP_MASTER data with actual
numbering of the data partitioned by their manager’s number
********************************************************************/
DECLARE @ORG_HIERARCHY TABLE (HIERARCHY_NODE HIERARCHYID, EMP_NBR INT, MGR_NBR INT, EMP_NAME NVARCHAR(50));

DECLARE @HIERARCHY_TREE TABLE (EMP_NBR INT, MGR_NBR INT, NBR_REC_IN_NODE INT)
INSERT INTO @HIERARCHY_TREE
SELECT EMP_NBR, MGR_NBR, ROW_NUMBER() OVER (PARTITION BY MGR_NBR ORDER BY MGR_NBR)
FROM EMP_MASTER;

/*********************************************************************
Recursive Query to take the data and populate the @ORG_HIERARCHY
table variable. This stores the Node and the path information
which we will then query using Hiearchical query methods

First we get the root data (using the GetRoot() method)
and then we recurse through the tree and populate the
hierarchy_Node column. Note that you would need to do a cast
to a string and use the ToString() method to get the hierarchyid.
**********************************************************************/
WITH ORG_PATH (path, EMP_NBR)
AS (
SELECT hierarchyid::GetRoot() AS Node, EMP_NBR
FROM @HIERARCHY_TREE AS A
WHERE MGR_NBR IS NULL

UNION ALL

SELECT
CAST(p.path.ToString() + CAST(NBR_REC_IN_NODE AS nvarchar(50)) + ‘/’ AS hierarchyid),
A.EMP_NBR
FROM @HIERARCHY_TREE AS A
JOIN ORG_PATH AS P
ON A.MGR_NBR = P.EMP_NBR
)
insert into @ORG_HIERARCHY
select PATH, A.EMP_NBR, A.MGR_NBR, A.EMP_NAME
from EMP_MASTER AS A INNER JOIN ORG_PATH AS B ON A.EMP_NBR = B.EMP_NBR;

/*********************************************************************
Now, let’s query the data using the ToString() and GetLevel()
methods and then order by the Level field. Output is shown below
**********************************************************************/
SELECT Hierarchy_Node.ToString() AS Node, Hierarchy_Node.GetLevel() as Level, *
FROM @ORG_HIERARCHY
ORDER BY Level;

Node         Level  HIERARCHY_NODE  EMP_NBR     MGR_NBR     EMP_NAME
------------ ------ --------------- ----------- ----------- --------
/            0      0x              5           NULL        DENNIS
/1/          1      0x58            8           5           KEN
/2/          1      0x68            4           5           JOE
/3/          1      0x78            1           5           DON
/4/          1      0x84            2           5           HARI
/4/1/        2      0x8560          3           2           RAMESH
/2/1/        2      0x6AC0          9           4           AMBER
/4/1/1/      3      0x856B          6           3           NIMISH
/4/1/2/      3      0x856D          7           3           JESSIE
/4/1/1/1/    4      0x856B58        10          6           JIM

As you can see from above, the different levels are defined and the Nodes are defined as well. So, what exactly does this replace from the previous system? i.e. what are the alternate ways of doing this hierarchical structure? One can use the normalized structures of a parent/child relationship, one can even make use of the XML data-type to store un-structured data. We will do some benchmarks and will post the performance results on our blog for the different methods. In our future blog posts, we will cover the indexing schemes, the updates and re-arranging of the data in the hierarchyID data-type columns.

Sorry, the comment form is closed at this time.

 
%d bloggers like this: