Systems Engineering and RDBMS

Archive for the 'DB2 LUW' Category


More on Index design

Posted by decipherinfosys on July 14, 2008

We have blogged a couple of times over the factors that should be considered when designing indexes. In one of such posts, we had also covered indexes over multiple columns. In all of those posts, you will see one common thing that we try to emphasize - do not try to come up with a “rules of thumb“.  Here is a simple example (from one of the e-mails to a question to a reader);

Question: I am trying to see how to tune/optimize a query and I saw that an index on the where condition columns is missing.  Should I just go ahead and create an index and add all the columns that are used in the where clause into that index?

The answer is of course: NO.  First, we requested the reader to go ahead and read the post from above where we have discussed the column order in covered indexes and then also gave a simple example to help illustrate the point:

Say, you have this query:

select col1, col2, col3

from dbo.big_table

where col1 >= @x and col1 <= @y

and col3 = @z

For this, we might want to create a covered index (in this order): col3, col2, col1.  This is so that we can first of all process the data based on the equality operator on col3, then scan through and narrow down the data based on the >= and <= conditions on col1 and then get the value for col2 from the index itself rather than going through the table.  Now, as we have always stated, understanding your data and it’s characteristics is very very important.  Suppose that col3 is unique in this table!  In that case, we would just create a single index on col3 and be done with it.  Why?  It will qualify for a single record and thus we will just do the row ID lookup on the table for that one single record.

So, bottom line is that please spend some time to understand the data characteristics - understand how indexes work, understand how the query is accessing the data and the data distribution - density and selectivity of the column data values as well as how often that particular index is going to be used - if an index is being created to support a report that runs only once every 6 months, is it worth to have it in the schema all year round or should we just create it prior to generating that report?  Once you have that understanding, then only decide how you want to approach the design.

Posted in DB2 LUW, Oracle, SQL Server | No Comments »

Getting the first and the last record together

Posted by decipherinfosys on June 27, 2008

While fixing a performance issue with a customer report yesterday, there was a unique requirement that came up. As the items in the warehouse move through the aisles, their datetime timestamp values get changed. In one of the sub-reports, the requirement was to display both the latest record as well as the oldest record in the system i.e. both the first and the last record based on the timestamp column. Also, another thing to add to this was that if the location for the item was not decided yet, then the those items needed to be ranked lower then those items which have a location assigned to them.

There are a couple of ways to resolve this kind of a scenario using SQL. We will present the solution in this post using T-SQL syntax though the same applies (with brief changes to Oracle and DB2 LUW as well - another thing to note also is that Oracle already has analytic functions: first_value() and last_value() that can be used with window functions to easily get these results).

Even though the actual query had a lot of table joins, for the sake of simplicity and to help explain the concept and protect the IP of the client, we will take up an example of a table variable:
set nocount on
go
declare @inventory table (item_id int not null, location_id int null, create_date_time datetime not null)
insert into @inventory values (1, null, cast(’06/27/2007 12:00:00′ as datetime))
insert into @inventory values (2, 10, cast(’06/27/2007 12:15:00′ as datetime))
insert into @inventory values (3, 20, cast(’06/26/2007′ as datetime))
insert into @inventory values (4, 30, cast(’06/21/2007′ as datetime))
insert into @inventory values (5, null, cast(’06/27/2007 1:00:00′ as datetime))
insert into @inventory values (6, 50, cast(’06/23/2007′ as datetime))
insert into @inventory values (7, 70, cast(’06/24/2007′ as datetime))
insert into @inventory values (8, 80, cast(’06/25/2007′ as datetime))
insert into @inventory values (9, 40, cast(’06/27/2007 3:00:00′ as datetime))
insert into @inventory values (10, null, cast(’06/27/2007 12:30:00′ as datetime))

select ‘***ALL Records***’, * from @inventory

                  item_id     location_id create_date_time
----------------- ----------- ----------- -----------------------
***ALL Records*** 1           NULL        2007-06-27 12:00:00.000
***ALL Records*** 2           10          2007-06-27 12:15:00.000
***ALL Records*** 3           20          2007-06-26 00:00:00.000
***ALL Records*** 4           30          2007-06-21 00:00:00.000
***ALL Records*** 5           NULL        2007-06-27 01:00:00.000
***ALL Records*** 6           50          2007-06-23 00:00:00.000
***ALL Records*** 7           70          2007-06-24 00:00:00.000
***ALL Records*** 8           80          2007-06-25 00:00:00.000
***ALL Records*** 9           40          2007-06-27 03:00:00.000
***ALL Records*** 10          NULL        2007-06-27 12:30:00.000

select top 1 ‘***LAST RECORD***’ as Last_Record, * from @inventory as I order by
case when location_id is null then 1 else 0 end ASC
,create_date_time desc

Last_Record       item_id     location_id create_date_time
----------------- ----------- ----------- -----------------------
***LAST RECORD*** 2           10          2007-06-27 12:15:00.000

select top 1 ‘***FIRST RECORD***’ as First_Record, * from @inventory as I order by
case when location_id is null then 1 else 0 end ASC
,create_date_time asc

First_Record       item_id     location_id create_date_time
------------------ ----------- ----------- -----------------------
***FIRST RECORD*** 4           30          2007-06-21 00:00:00.000

If you look at the output from above, you can see that the last record that is picked up by the logic is Item #2 even though Item #10 has the latest create_date_time timestamp value. The reason for this is that we had the requirement that if an item has not been assigned a location yet, that should rank lower. So, the above 2 SQL statements are just to show how we can get the first and the last records in a SQL statement. Now, in the report, we needed to get these two together and as you know, if one uses the UNION or UNION ALL clause between two statements, one can use only one ORDER BY clause which applies to the whole set. So, it is not as straight forward as just doing a UNION/UNION ALL operation between the two sets and applying an ORDER BY.

One of the solutions is to use derived tables:

select LR.* from
(select top 1 ‘***LAST RECORD***’ as Last_Record, * from @inventory as I order by
case when location_id is null then 1 else 0 end ASC
,create_date_time desc) as LR
UNION ALL
select FR.* from
(select top 1 ‘***FIRST RECORD***’ as First_Record, * from @inventory as I order by
case when location_id is null then 1 else 0 end ASC
,create_date_time asc) as FR

Last_Record        item_id     location_id create_date_time
------------------ ----------- ----------- -----------------------
***LAST RECORD***  2           10          2007-06-27 12:15:00.000
***FIRST RECORD*** 4           30          2007-06-21 00:00:00.000

Another solution is to use CTE (Common Table Expressions):

with results as
(select top 1 ‘***LAST RECORD***’ as Last_Record, * from @inventory as I order by
case when location_id is null then 1 else 0 end ASC
,create_date_time desc
UNION ALL
select top 1 ‘***FIRST RECORD***’ as First_Record, * from @inventory as I order by
case when location_id is null then 1 else 0 end ASC
,create_date_time asc
)
select * from results;

Last_Record        item_id     location_id create_date_time
------------------ ----------- ----------- -----------------------
***LAST RECORD***  2           10          2007-06-27 12:15:00.000
***FIRST RECORD*** 4           30          2007-06-21 00:00:00.000

Yet another solution can be the usage of the row_number() analytic function, for example:

select item_id, location_id, create_date_time
from
(
select
*
, row_number() over (order by case when location_id is null then 1 else 0 end ASC
,create_date_time desc) as RN_1
, row_number() over (order by case when location_id is null then 1 else 0 end ASC
,create_date_time asc) as RN_2
from @inventory) as IV
where RN_1 = 1 or RN_2 = 1

item_id     location_id create_date_time
----------- ----------- -----------------------
4           30          2007-06-21 00:00:00.000
2           10          2007-06-27 12:15:00.000

There are other solutions as well to this problem. One needs to test these out in one’s scenario and make sure that you have meaningful filter criterias and good execution plans. The next challenge in this report was that the project manager wanted these records listed side by side on the same record. That is pretty simple to do as well - using either pivoting or even simple join between derived tables with a match on the row_number() analytic function value. We leave that as an exercise to the reader and if you have questions, we will be more than happy to post the answer to that as well.

Posted in DB2 LUW, Oracle, SQL Server | No Comments »

Common Problem - getting the first in a Group

Posted by decipherinfosys on June 18, 2008

This is one of the very common problems in the SQL world and also one that has many different solutions available. The problem description is this:

You have a table with say this structure (using SQL Server Syntax):

set nocount on
go
declare @test table (store_nbr int, product_name nvarchar(20), product_date datetime, aisle_nbr int)
insert into @test values (1, ‘A’, ‘8-12-2004′, 6)
insert into @test values (1, ‘B’, ‘7-10-2005′, 2)
insert into @test values (1, ‘C’, ‘6-11-2006′, 3)
insert into @test values (1, ‘D’, ‘11-10-2007′, 4)
insert into @test values (2, ‘E’, ‘10-12-2004′, 2)
insert into @test values (2, ‘F’, ‘5-9-2005′, 4)

And what you want to find out is that record in each store which was introduced the last so essentially in each group (i.e. each store), you want to find out the product that was introduced most recently. There are a lot of ways to solve this problem and the solutions vary among RDBMS based on syntax only. One can use analytic functions to resolve this, use a derived table or use a co-related sub-query. Here is one way to solve this issue:

select * from @test A
where A.product_date = (select top 1 B.product_date
from @test as B
where A.store_nbr = B.store_nbr
order by product_date desc)

store_nbr   product_name         product_date            aisle_nbr
----------- -------------------- ----------------------- -----------
1           D                    2007-11-10 00:00:00.000 4
2           F                    2005-05-09 00:00:00.000 4

In this code, we are sorting based on the product date in a descending order in the co-related sub-query and getting the equality match based on the TOP 1 record. We could have also used the ROW_NUMBER() function as well, example:

select store_nbr, product_name, product_date, aisle_nbr
from
(select ROW_NUMBER() over (partition by store_nbr order by product_date desc) as rn, *
from @test A
) as IV
where IV.rn = 1

One can also choose to do a MAX operation in the co-related sub-query, example:

select * from @test as A
where A.product_date = (select MAX(B.product_date)
from @test as B
where A.store_nbr = B.store_nbr)

And in SQL Server, one can also use the CROSS APPLY function in SQL Server 2005.  We will look at the execution plans and the performance characteristics of these different options in an upcoming post.

Posted in DB2 LUW, Oracle, SQL Server | No Comments »

Data Migration Scenario - CTE solution

Posted by decipherinfosys on June 12, 2008

A few days ago, we had posted the problem and the solution pertaining to a data migration issue. We had given some solutions and explained one of those in detail along with the code. A reader asked us to do the same thing using the CTE which was one of the solutions that we had proposed. Using a CTE, there are a couple of ways to get it done. Refer to the tables and the data in that post before you use this code. Here is the CTE code and the explanation follows after that:

with
N1 (n) AS (SELECT 1 UNION ALL SELECT 1),
N2 (n) AS (SELECT 1 FROM N1 AS X, N1 AS Y),
N3 (n) AS (SELECT 1 FROM N2 AS X, N2 AS Y),
N4 (n) AS (SELECT 1 FROM N3 AS X, N3 AS Y),
N5 (n) AS (SELECT ROW_NUMBER() OVER(ORDER BY X.n%10) FROM N4 AS X, N4 AS Y)
select p.ID as ItemID,
row_number() over (partition by p.id order by n) as LineItem,
substring(p.profession, case when n = 1 then 1 else n + 1 end, 10) as Profession
from N5
join dbo.profession as p
on N5.n <= Len(p.Profession)
where n <= 100
and (n = 1 or n%10 = 0)
order by p.id, n

ItemID      LineItem             Profession
----------- -------------------- ----------
1           1                    I am a Pro
1           2                    grammer an
1           3                    d a baseba
1           4                    ll Champio
1           5                    n and a fo
1           6                    otball Pla
1           7                    yer
2           1                    I am a Doc
2           2                    tor
3           1                    I am an Ar
3           2                    tist

The above piece of code first constructs a number table using the CTE (this was introduced by Itzik in his SQL column in the sqlmag) and then it joins with the source table Profession and slices the string up and shows up the data in the format that we needed for the destination table.

Posted in DB2 LUW, Oracle, SQL Server | 1 Comment »

Data Migration Scenario

Posted by decipherinfosys on June 9, 2008

Over the weekend, a colleague had asked a question regarding one of the data migration issues that he was facing in one of the projects. We will mention the problem as well as the solution in this blog post.

Problem: He had a source table with the following structure and data (ID was an integer column with Identity attribute and Description was a string column with a max length of 100):

SOURCE TABLE

ID DESCRIPTION
1 I am a programmer
2 I am a doctor

In the destination database, he had a destination table with the same columns except that the Description column was only 10 characters long. His task was to load the records from the source to the destination table. If the length of the description was more than 10 characters, he was required to insert another row with the next 10 characters into the Destination table with a LineId of next incrementing number as follows.

Destination Table

ID LINEID DESCRIPTION
1 1 I am a pro
1 2 grammer
2 1 i am a doc
2 2 tor

Solution: There are many ways to solve this problem. Since he was using plain simple T-SQL and not SSIS package for the flow, we presented 4 solutions to the problem - before we present them in the blog, a quick glance at the example that he had given indicates that the problem is essentially the same as un-pivoting the data and transposing the column into rows. Since the maximum length of the source column is known before hand - nvarchar(100) … and the destination’s string max length is nvarchar(10), we can at the most get 10 lineID values for each ID record from the source. The problem would have been more complex if we were getting the data from a source where the max length was not known (we have seen such requirements as well) and in that case dynamic un-pivoting needs to be done. So, with that in mind, here are the solutions:

1) Use static union all lists since you already know the source column length (100) that goes into a maximum length of 10…so, it translates to 10 different SQLs with UNION ALL operations between them.

Or…

2) Use a CTE (Common Table Expression) to do recursive operation to break the source description column into separate line items.

Or…

3) Since the maximum length of the source is known, the issue translates to being able to unpivot the data i.e. treat that one single column a the source and un-pivot the column into rows (look at our whitepaper on pivot and unpivot).

Or…

4) The old way of using a cursor (or a while loop), looping around and creating the new records.

Let’s create the dummy tables and some data and then we will use #1 method to show how we can do this in a simple SQL statement (using SQL Server Syntax):

/********************************************************************************
create some dummy tables
Profession is the Source table and ProfessionItem is the destination table
*********************************************************************************/
create table Profession (Id int identity(1,1), Profession nvarchar(100))
create table ProfessionItem (ItemId int, LineItem int,Profession nvarchar(10))

/********************************************************************************
insert some dummy data
*********************************************************************************/
insert into Profession (Profession)
values (N’I am a Programmer and a baseball Champion and a football Player’);

insert into Profession (Profession)
values (N’I am a Doctor’);

insert into Profession (Profession)
values (N’I am an Artist’);

Now, a simple SQL statement using the method from #1 option above will give us the data in the required format:

Select *
from
(
SELECT P.Id as ItemID, 1 as LineItem, SUBSTRING(Profession, 1, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 2 as LineItem, SUBSTRING(Profession, 11, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 3 as LineItem, SUBSTRING(Profession, 21, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 4 as LineItem, SUBSTRING(Profession, 31, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 5 as LineItem, SUBSTRING(Profession, 41, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 6 as LineItem, SUBSTRING(Profession, 51, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 7 as LineItem, SUBSTRING(Profession, 61, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 8 as LineItem, SUBSTRING(Profession, 71, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 9 as LineItem, SUBSTRING(Profession, 81, 10) as Profession
FROM dbo.Profession as P
union all
SELECT P.Id as ItemID, 10 as LineItem, SUBSTRING(Profession, 91, 10) as Profession
FROM dbo.Profession as P
) AS X
Where LEN(X.Profession) > 0
Order by X.ItemID, X.LineItem

ItemID      LineItem    Profession
----------- ----------- ----------
1           1           I am a Pro
1           2           grammer an
1           3           d a baseba
1           4           ll Champio
1           5           n and a fo
1           6           otball Pla
1           7           yer
2           1           I am a Doc
2           2           tor
3           1           I am an Ar
3           2           tist

A point to note is that if the source table is huge, then we can break it up in sets of 10000 based on the ID values and use that criteria in a while loop and put the ID fitler criteria in the WHERE clause of the SQL statement shown above. Since ID will be an indexed column (clustered index), the range scans will be fine and we can quickly get the data from the source to the destination table using this method.

Posted in DB2 LUW, Oracle, SQL Server | 1 Comment »

Extract Day, Month or Year from a Date Field

Posted by decipherinfosys on May 21, 2008

One of the developers at a client site was writing up a report for the yearly sales. Invoice data is stored as per date when the invoice was issued. He needed to find out a way to extract year from the date field and sum up the sales amount on yearly basis. There are many ways of doing it … in this post, we will show one of the most common and easy ways of achieving this.

Oracle uses an EXTRACT(datetime) function to extract the value of a specified datetime field from a datetime expression. Function is very handy to manipulate specific datetime field value like day, month or year. Following values can be extracted from the column having date datatype. Apart from values mentioned below, timezone related values like timezone_region, timezone_hour and timezone_minute can also be obtained using EXTRACT function if underlying datatype is TIMESTAMP WITH TIME ZONE.

* Day, Month, Year, Hour, Minute, Second

Create following table and populate it with the data. If you already have table with the same name, change all the occurrences of the table name with some other name.

CREATE TABLE Invoice
(

INVOICE_NUMBER      NUMBER(9) NOT NULL,
INVOICE_DATE          DATE NOT NULL,
CLIENT_ID                NUMBER(9)     NOT NULL,
INVOICE_AMT           NUMBER(9,2) DEFAULT 0 NOT NULL,
PAID_FLAG               NUMBER(1) DEFAULT 0 NOT NULL, — 0 Not paid/ 1 paid
CONSTRAINT PK_INVOICE PRIMARY KEY(INVOICE_NUMBER)
)
– TABLESPACE Clause
/

CREATE SEQUENCE INVOICE_SEQ
START WITH 1
CACHE 100
/

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate-5,101,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate-10,102,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate-20,103,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate-40,101,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate-38,101,1500.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate-60,102,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate-65,103,1100.00);

Run following SQL from SQL*Plus prompt. It Extracts Year from the Invoice_Date and group it by year to see yearly sales.

SQL> SELECT EXTRACT(Year FROM Invoice_date) as Year,
SUM(Invoice_Amt) as Sales_Amt
FROM Invoice
GROUP BY EXTRACT(Year FROM Invoice_Date);

YEAR  SALES_AMT
—- ———-
2006       4800
2007       3300

MS SQL Server has datepart functions. Using datepart we can get following datetime fields.

*  Day , Dayofyear, Week , Weekday, Month, Quarter, Year
* Hour, Minute, Second, Millisecond

It also has DAY, MONTH and YEAR functions which are equivalent to datepart(dd,date), datepart(mm,date) and datepart(yy,date) respectively. Let us create the table and see how we can do it.

CREATE TABLE dbo.Invoice
(
INVOICE_NUMBER      INT IDENTITY(1,1) NOT NULL,
INVOICE_DATE        DATETIME NOT NULL,
CLIENT_ID           INT    NOT NULL,
INVOICE_AMT         NUMERIC(9,2) DEFAULT 0 NOT NULL,
PAID_FLAG           TINYINT DEFAULT 0 NOT NULL, — 0 Not paid/ 1 paid
CONSTRAINT PK_INVOICE PRIMARY KEY(INVOICE_NUMBER)
)
– FILEGROUP Clause
GO

INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate()-5,101,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate()-10,102,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate()-20,103,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate()-40,101,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate()-38,101,1500.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate()-60,102,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate()-65,103,1100.00);

We can use either Year function or Datepart syntax to get the final results. Both SQLs are shown under. You can either run it from Query Analyzer or Management studio.

SELECT YEAR(Invoice_date) as Year, SUM(Invoice_Amt) as Sales_Amt
FROM Invoice
GROUP BY Year(Invoice_Date)
GO
OR
SELECT DATEPART(YY,Invoice_date) as Year, SUM(Invoice_Amt) as Sales_Amt
FROM Invoice
GROUP BY DATEPART(YY,Invoice_Date)
GO

DB2 LUW has date and timestamp datatype for storing dates. Given a date, time or timestamp value, we can extract following datetime fields.

* Day,  Month, Year (date and timestamp)
*  Hour, Minute, Second, MicroSecond (time and timestamp)

Apart from above functions, it also returns week, quarter, dayofweek  and dayofyear given a date or timestamp.

CREATE TABLE Invoice
(
INVOICE_NUMBER      INTEGER NOT NULL GENERATED BY DEFAULT AS
IDENTITY (START WITH +1, INCREMENT BY +1, CACHE 1000),
INVOICE_DATE        TIMESTAMP NOT NULL,
CLIENT_ID           INTEGER       NOT NULL,
INVOICE_AMT         DECIMAL(9,2) DEFAULT 0 NOT NULL,
PAID_FLAG           SMALLINT DEFAULT 0 NOT NULL,
CONSTRAINT PK_INVOICE PRIMARY KEY(INVOICE_NUMBER)
);

INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp - 5 days,101,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp - 10 days,102,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp - 20 days,103,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp - 40 days,101,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp - 38 days,101,1500.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp - 60 days,102,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp - 65 days,103,1100.00);

Following is the SQL to obtain yearly sales in DB2. You can run it from Command Editor.

SELECT YEAR(Invoice_date) as Year, SUM(Invoice_Amt) as Sales_Amt
FROM Invoice
GROUP BY Year(Invoice_Date);

Posted in DB2 LUW, Oracle, SQL Server | No Comments »

Returning a record set from a stored procedure

Posted by decipherinfosys on May 19, 2008

One of the the database developers at a client site who is responsible for writing DB code across the three RDBMS that they support wanted a simple example of a stored procedure which returns the record set to the calling application for further processing.  In this post, we will show you in brief example, how can we address this in Oracle, MS SQL Server and DB2 LUW.

Oracle:
First let us create a table and populate it with small set of data. If you already have table with the same name, change all the occurrences of the table name with some other name.

CREATE TABLE Invoice
(
INVOICE_NUMBER      NUMBER(9) NOT NULL,
INVOICE_DATE        DATE NOT NULL,
CLIENT_ID           NUMBER(9)     NOT NULL,
INVOICE_AMT         NUMBER(9,2) DEFAULT 0 NOT NULL,
PAID_FLAG           NUMBER(1) DEFAULT 0 NOT NULL, — 0 Not paid/ 1 paid
CONSTRAINT PK_INVOICE PRIMARY KEY(INVOICE_NUMBER)
)
– TABLESPACE Clause
/

CREATE SEQUENCE INVOICE_SEQ
START WITH 1
CACHE 100
/

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate,101,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate,102,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate,103,1100.00);

INSERT INTO INVOICE(Invoice_Number, Invoice_date, client_ID, Invoice_Amt)
VALUES(INVOICE_SEQ.NEXTVAL, sysdate,104,1100.00);

Oracle uses cursor variables to pass query result sets between PL/SQL sub programs and to the client application. A cursor variable has data type REF CURSOR and that is what is normally known as a ref cursor (we will cover more on the ref cursors in one of our future whitepapers). Following is the code snippet to return result set from the stored procedure.

To create a cursor variable first we need to define a REF CURSOR type and then declare the cursor variable of that type.  To avoid declaring the same REF CURSOR type in each program, we will create one global type in a package specification and declare cursor of that type in our procedure.

Run following scripts to create package and procedure from SQL*Plus prompt. Testing script is to test the execution of the procedure.

/* Create package */
CREATE OR REPLACE PACKAGE types
AS
type cursorType is ref cursor;
END;
/

– Here we have declared cursor variable of type cursorType as an output variable.
CREATE OR REPLACE PROCEDURE DEC_RTN_RECORDSET
(
p_InvoiceDate      IN   DATE,
p_ResultSet        OUT  TYPES.cursorType
)
AS
BEGIN
OPEN p_ResultSet FOR
SELECT Invoice_Number, Invoice_Date, Client_ID, Invoice_Amt
FROM Invoice
WHERE Invoice_date <= p_InvoiceDate
ORDER BY Invoice_number;

END DEC_RTN_RECORDSET;
/

– Testing

VARIABLE resultSet  REFCURSOR
EXEC DEC_RTN_RECORDSET(sysdate, :resultSet);
PRINT :resultSet

Following is the output. It is formatted for more readability.

Invoice_Number       Invoice_Date  Client_ID     Invoice_Amt
————–       ————  ———     ———–
1       25-JAN-07           101            1100
2       25-JAN-07           102            1100
3       25-JAN-07           103            1100
4
25-JAN-07           104            1100

MS SQL Server does not require any declaration of variable to return record set back to calling program from the stored procedure. By simply adding a SELECT statement at the end of the procedure for qualified columns, procedure will return data back to calling program since SQL Server handles all this internally which is indeed very convenient for programming purposes.  Following is the code snippet of MS SQL Server stored procedure. It does not require specific open cursor command.

CREATE TABLE dbo.Invoice
(
INVOICE_NUMBER      INT IDENTITY(1,1) NOT NULL,
INVOICE_DATE        DATETIME NOT NULL,
CLIENT_ID           INT    NOT NULL,
INVOICE_AMT         NUMERIC(9,2) DEFAULT 0 NOT NULL,
PAID_FLAG           TINYINT DEFAULT 0 NOT NULL, — 0 Not paid/ 1 paid
CONSTRAINT PK_INVOICE PRIMARY KEY(INVOICE_NUMBER)
)
– FILEGROUP Clause
GO

INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate(),101,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate(),102,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate(),103,1100.00);
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(getDate(),104,1100.00);

Create following stored procedure either from Query analyzer or management studio.

CREATE PROC DEC_RTN_RECORDSET
(

@InvoiceDate DATETIME

)

AS
BEGIN
SET NOCOUNT ON
–Get the data from the Invoice table to return it to client application
SELECT Invoice_Number, Invoice_Date, Client_ID, Invoice_Amt
FROM Invoice
WHERE Invoice_date <= @InvoiceDate
ORDER BY Invoice_number;

SET NOCOUNT OFF
END
GO

– Run followig command to test the execution of the procedure.

DECLARE @CurrentDate datetime
set @CurrentDate = GETDATE()
exec DEC_RTN_RECORDSET @CurrentDate

Following is the output. It is formatted for more readability.

Invoice_Number       Invoice_Date               Client_ID     Invoice_Amt
————–       ————               ———     ———–
1       2007-01-25 15:21:22.300          101             1100
2       2007-01-25 15:21:22.300          102             1100
3       2007-01-25 15:21:22.300          103             1100
4       2007-01-25 15:21:22.300          104             1100

DB2 LUW also does not require any variable declaration to pass back to the calling program. But we need to DECLARE the cursor first and then OPEN the cursor. When we declare a cursor we have to use WITH RETURN TO clause in order to return record set from the procedure. If result set is to be returned to invoker of the procedure then we need to use WITH RETURN TO CALLER in declaration. If result set is to be returned to originating application then we need to use WITH RETURN TO CLIENT in declaration. Here we are assuming that we are returning result set to originating application and hence we are using WITH RETURN TO CLIENT.

Let us create table first and populate it with data. You can create it either via command editor or using command window. Please make sure that statement terminator is “@” or else you may receive an error (or replace with whatever statement terminator you have set in your programming tool).

CREATE TABLE Invoice
(
INVOICE_NUMBER      INTEGER NOT NULL GENERATED BY DEFAULT AS
IDENTITY (START WITH +1, INCREMENT BY +1, CACHE 1000),
INVOICE_DATE        TIMESTAMP NOT NULL,
CLIENT_ID           INTEGER       NOT NULL,
INVOICE_AMT         DECIMAL(9,2) DEFAULT 0 NOT NULL,
PAID_FLAG           SMALLINT DEFAULT 0 NOT NULL,
CONSTRAINT PK_INVOICE PRIMARY KEY(INVOICE_NUMBER)
)
@

INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp,101,1100.00)@
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp,102,1100.00)@
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp,103,1100.00)@
INSERT INTO INVOICE(Invoice_date, client_ID, Invoice_Amt) VALUES(current timestamp,104,1100.00)@

Following is the procedure to return the result set. You can run it from Command Editor or save it in a file and run it from command window using following command.

db2 �td@ -f <filename> (filename in which following code is stored).

CREATE PROCEDURE DEC_RTN_RECORDSET
(
IN  P_InvoiceDate   TIMESTAMP
)
DYNAMIC RESULT SETS 1
LANGUAGE SQL
BEGIN
–Get the data from the table to display it back to client application
BEGIN
DECLARE c2 CURSOR WITH RETURN TO CLIENT FOR
SELECT Invoice_Number, Invoice_Date, Client_ID, Invoice_Amt
FROM Invoice
WHERE Invoice_date <= p_InvoiceDate
ORDER BY Invoice_number;

OPEN c2;
END;
END
@

– To test the procedure execute following command.
call DEC_RTN_RECORDSET(current timestamp)@

Following is the output. It is formatted for more readability.

Invoice_Number       Invoice_Date               Client_ID     Invoice_Amt
————–       ————               ———     ———–
1       2007-01-25-15.45.33.991000       101             1100
2       2007-01-25-15.45.34.054000       102             1100
3       2007-01-25-15.45.34.132000       103             1100
4       2007-01-25-15.45.34.210000       104             1100

Posted in DB2 LUW, Oracle, SQL Server | No Comments »

Traversing the hierarchy - Part II

Posted by decipherinfosys on April 26, 2008

Yesterday, we had covered how to traverse a hierarchy using CTEs. We had used the schema sample from a pharmaceutical industry as an example. In today’s post, let’s make use of a more generic example. This one pertains to a recursive 1:N relationship within the same table - and this scenario exists in almost all of the HR applications out there. An employee has a manager and the top dog in the company does not have anyone to report to (well, they do have to report to the board and the shareholders of the company but they do not have those set up as employees in the day to day activities of the organization :-)).

So, let’s take a look at the table and the data that we will use to illustrate this: The hierarchy relationship example can also be applied to the employees of a company. Let’s say that the table name is EMP_MASTER with this structure:

CREATE TABLE EMP_MASTER
(
EMP_NBR INT NOT NULL PRIMARY KEY,
EMP_NAME VARCHAR(20) NOT NULL,
MGR_NBR INT NULL
)
GO

/******************************************
Now, let us insert some data into this
This uses the SQL Server syntax
*******************************************/

SET NOCOUNT ON

BEGIN TRY
BEGIN TRANSACTION
INSERT INTO EMP_MASTER VALUES (1, ‘DON’, 5);
INSERT INTO EMP_MASTER VALUES (2, ‘HARI’, 1);
INSERT INTO EMP_MASTER VALUES (3, ‘RAMESH’, 2);
INSERT INTO EMP_MASTER VALUES (4, ‘JOE’, 2);
INSERT INTO EMP_MASTER VALUES (5, ‘DENNIS’, NULL);
INSERT INTO EMP_MASTER VALUES (6, ‘NIMISH’, 5);
INSERT INTO EMP_MASTER VALUES (7, ‘JESSIE’, 6);
INSERT INTO EMP_MASTER VALUES (8, ‘KEN’, 7);
INSERT INTO EMP_MASTER VALUES (9, ‘AMBER’, 8);
INSERT INTO EMP_MASTER VALUES (10, ‘JIM’, 6);
INSERT INTO EMP_MASTER VALUES (11, ‘JOHN’, 5);
INSERT INTO EMP_MASTER VALUES (12, ‘LUTHER’, 11);
INSERT INTO EMP_MASTER VALUES (13, ‘GARY’, 11);
INSERT INTO EMP_MASTER VALUES (14, ‘ROBERTO’, 12);
INSERT INTO EMP_MASTER VALUES (15, ‘IRA’, 13);
COMMIT TRANSACTION
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK TRANSACTION
SELECT
ERROR_NUMBER() AS ErrorNumber,
ERROR_SEVERITY() AS ErrorSeverity,
ERROR_STATE() AS ErrorState,
ERROR_PROCEDURE() AS ErrorProcedure,
ERROR_LINE() AS ErrorLine,
ERROR_MESSAGE() AS ErrorMessage;
END CATCH
GO

/******************************************
Let’s do a simple select to look
at the data
*******************************************/

SELECT * FROM dbo.EMP_MASTER

EMP_NBR     EMP_NAME             MGR_NBR

----------- -------------------- -----------

1           DON                  5

2           HARI                 1

3           RAMESH               2

4           JOE                  2

5           DENNIS               NULL

6           NIMISH               5

7           JESSIE               6

8           KEN                  7

9           AMBER                8

10          JIM                  6

11          JOHN                 5

12          LUTHER               11

13          GARY                 11

14          ROBERTO              12

15          IRA                  13

/******************************************
Let’s create a view which will use a
CTE in order to traverse through
the hierarchy
*******************************************/

IF EXISTS
(
SELECT 1
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_NAME = ‘emp_hierarchy’
AND TABLE_SCHEMA = ‘dbo’
)
BEGIN
PRINT ‘View emp_hierarchy already exists…Dropping it and recreating’
DROP VIEW emp_hierarchy
END
ELSE
BEGIN
PRINT ‘Creating View emp_hierarchy’
END
GO
create view dbo.emp_hierarchy
as
with emp (traverse_path, emp_id, mgr_id, emp_name, level, lineage)
as (
select CONVERT(VARBINARY(MAX), emp_name) AS traverse_path,
emp_nbr,
Null as mgr_id,
emp_name,
1,
cast((’/’ + str(emp_nbr,6,0) + ‘/’) as varchar(255))
from dbo.emp_master
where mgr_nbr is null
/*– the highest level in the organization*/

union all

select e.traverse_path + CONVERT(VARBINARY(MAX), em.emp_name) as traverse_path,
emp_nbr,
mgr_nbr,
em.emp_name,
e.level + 1,
cast((e.lineage + str(em.emp_nbr,6,0) + ‘/’) as varchar(255))
from dbo.emp_master as em
inner join emp e
on e.emp_id = em.mgr_nbr
)
select * from emp;
GO

If you look at the above view, it is only slightly different than the example that we had shown yesterday. The reason for the difference is that in yesterday’s example, we were handling a M:N recusrive relationship between the same entity and thus there were two tables to deal with while in today’s example, we are dealing with a single table and a 1:N recursive relationship. The logic remains exactly the same as before. We get the top dog in the company as the first record and then we do a UNION ALL operation to the recursive tree and use the traverse path as an ordering mechanism so that we can see the actual tree. The trick is to cast the employee name as a VARBINARY(MAX) and keep on appending it so that we can later on order by on it to see the complete tree structure for the organization. That helps us to see from the CEO - down to the SVP and their individual department immediately down from there till the deepest level is reached. And that is then followed by the next SVP and their hierarchy in the same fashion. And this continues that way to show a complete tree representation. The lineage column is used to just show the lineage of the tree.

You can see it by executing the following command:

select * from emp_hierarchy order by traverse_path
GO

Since the output is rather large, we will leave it up to you to execute these in your environment to see it. This is a quick and fast way to recursively build up hierarchy trees in SQL Server 2005 or Oracle and DB2 (with slight syntax differences). SQL Server 2008 also introduces a new data-type for hierarchy which we will cover in a later post.

Posted in DB2 LUW, Oracle, SQL Server | No Comments »

Traversing the hierarchy - Part I

Posted by decipherinfosys on April 25, 2008

Traversing the hierarchy in the user data is a common requirement in most of the software products. You must have faced this requirement when you are:

a) building up a tree for navigation, or
b) whether you are recursively trying to get the employee hierarchy in a HR application module, or
c) whether you are establishing the tree structure for drugs in pharmaceutical inudstry, or
d) you are traversing the hierarchy of the different parts in a manufacturing industry.

Above are just some of the examples. There are many such places where one is required to write code to help with traversing the hierarchy from the top level down to the bottom. If you look at our previous post which explains how to design a M:N recursive relationship within a schema - that example pertains to the pharmaceutical industry. It shows how to model a recursive M:N relationship within an entity into physical tables. We will use that as an example to write up T-SQL (SQL Server) code to show how to traverse through that hierarchy. We will make use of CTEs (Common Table Expressions) which are also present in Oracle and DB2 LUW (with slight differences). In Oracle, one can also make use of the CONNECT BY clause in order to traverse the hierarchy - we will cover that in a later post.

Let us create the view which will have the CTE and the logic for recursion:

IF EXISTS
(
SELECT 1
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_NAME = ‘drug_hierarchy’
AND TABLE_SCHEMA = ‘dbo’
)
BEGIN
PRINT ‘View drug_hierarchy already exists…Dropping it and recreating’
DROP VIEW drug_hierarchy
END
ELSE
BEGIN
PRINT ‘Creating View drug_hierarchy’
END
GO
create view dbo.drug_hierarchy
as
with drug (traverse_path, drug_id, parent_drug_id, drug_name, level, lineage)
as (
select CONVERT(VARBINARY(MAX), drug_name) AS traverse_path,
drug_master_id,
Null as parent_drug_id,
drug_name,
1,
cast((’/’ + str(drug_master_id,6,0) + ‘/’) as varchar(255))
from dbo.drug_master
where drug_master_id not in (select child_drug_id from dbo.drug_rltn)
/* the highest level in the tree*/

union all

select d.traverse_path + CONVERT(VARBINARY(MAX), dm2.drug_name) as traverse_path,
dr.child_drug_id,
dr.parent_drug_id,
dm2.drug_name,
d.level + 1,
cast((d.lineage + str(dm.drug_master_id,6,0) + ‘/’) as varchar(255))
from dbo.drug_master as dm
inner join dbo.drug_rltn as dr
on dm.drug_master_id = dr.parent_drug_id
inner join dbo.drug_master as dm2
on dr.child_drug_id = dm2.drug_master_id
inner join drug as d
on d.drug_id = dm.drug_master_id
)
select * from drug;
GO

The logic of the view is that we first get the top record in the DRUG_MASTER table as the first record (this one is not a child of any drug - if such a record does not exist, just create a dummy one in your table) and then we do a UNION ALL operation to the recursive tree and use the traverse path as an ordering mechanism so that we can see the actual tree. The trick is to cast the drug name as a VARBINARY(MAX) and keep on appending it so that we can later on order by on it to see the complete tree structure for the drug hierarchy. That helps us to see from the top drug - down to the next one in alphabetical order and their individual drug components immediately down from there till the deepest level is reached. And that is then followed by the next drug under the parent drug and their hierarchy in the same fashion. And this continues that way to show a complete tree representation. The lineage column is used to just show the lineage of the tree.

You can see it by executing the following command:

select * from drug_hierarchy order by traverse_path
GO

Since the output is rather large, we will leave it up to you to execute these in your environment to see it. This is a quick and fast way to recursively build up hierarchy trees in SQL Server 2005 or Oracle and DB2 (with slight syntax differences). SQL Server 2008 also introduces a new data-type for hierarchy which we will cover in a later post.

Tomorrow, we will cover the second part of this post which will go into how to recursively traverse a tree hierarchy by using a 1:N recursive relationship in a table like the Employee structure in an organization.

Posted in DB2 LUW, Oracle, SQL Server | 1 Comment »

Myths or so called “Rules of Thumb”

Posted by decipherinfosys on April 19, 2008

Those who have worked for some time in the IT industry know that there are many exceptions to the so called “Rules of Thumb”. That is why in many scenarios, if you ask an IT expert a question, the answer will be: It Depends :-) and no, the person is not trying to beat around the bush, he/she is actually trying to explain how the different situations effect that scenario and why the answer might be a different one depending upon a particular client situation. Yesterday, while having a discussion with one of the senior resources, he suggested that we should write something about such “Rules of Thumb” that exist in the DB world. We will list some of them in this post and then in subsequent posts, pick them one by one and show examples where the rules of thumb get violated. In addition, there are a lot of Myths out there which even senior consultants seem to propagate at client sites…here are some of them:

1) FTS (Full Table Scans) are always bad and Index usage is always good.

2) Usage of dynamic SQL within the stored procedure code is always bad even for search procedures.

3) Empty Space in an index that gets created due to the DML operations do not get used.

4) Indexes should be rebuilt at regular intervals.

5) Indexes and statistics are the same thing. Also, histograms are needed only on indexed columns.

6) Usage of cursors is always bad so avoid them like the plague.

7) Truncate command cannot be rolled back because it is a non-logged operation.

8 ) Table variables in SQL Server are always only memory resident.

9) Column order in a covered index does not matter.

10) In the case of SQL Server, one can separate the clustered index from the table.

11) Only committed data gets written to the disk.

12) Logical IOs (LIO) are not a cause of concern, only Physical IO (PIO) are.

13) Count(1) is better performing than count(*).

14) Issue frequent commits in the application to make the transaction faster and also improve concurrency.

15) Views are evil evil DB Objects that always slow down performance.

There are many many more myths that are out there. We just put down some of them over here and will start writing about each one in subsequent blogs giving examples and code samples to show the exceptions to the so called “Rules of Thumb” and the myths. One should always question the “experts” - everything should be proved using sample pieces of code if it sounds fishy. It is more of a science rather than an art guys and using code samples and understanding the internals and the boundary conditions is as important as understanding the context in which the recommendations are being applied.

Posted in DB2 LUW, Oracle, SQL Server | 1 Comment »