Systems Engineering and RDBMS

Archive for August, 2007

TOP (N) clause in SQL Server 2005

Posted by decipherinfosys on August 31, 2007

In SQL Server 2005, Microsoft enhanced the TOP clause so that it can be used in DML statements as well. Before 2005, TOP was allowed only in SELECT statements to restrict the output to the number of rows specified by the TOP operator. Now we can apply the TOP operator to INSERT, UPDATE or DELETE statements too.

Connect to the database using Query Analyzer or Management Studio and create the following table.

CREATE TABLE dbo.TEST
(
TEST_ID   INT IDENTITY(1,1) PRIMARY KEY,
TEST_NAME VARCHAR(50),
TEST_DATE DATETIME
)
GO

We would like to insert only 5 records into the table. In SQL Server 2000, we have to use the following syntax.

INSERT INTO DBO.TEST(TEST_NAME)
SELECT TOP 5 TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
ORDER BY TABLE_NAME
GO

But in SQL Server 2005, we can use the following syntax. We can even parameterize the value, as shown below in a separate query.

INSERT TOP (5) INTO DBO.TEST(TEST_NAME)
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
ORDER BY TABLE_NAME
GO

Or, even better (since we can use a parameter and change the value at execution time):

DECLARE @I INT
SET @I = 5

INSERT TOP (@I) INTO DBO.TEST(TEST_NAME)
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
GO

Pay close attention to the syntax. In the INSERT statement, the value of TOP is enclosed in parentheses. If you leave out the parentheses, you will get an error. Microsoft suggests using parentheses for SELECT as well; the older ‘TOP n’ syntax is supported only for backward compatibility. A parameterized TOP value gives a lot of flexibility during DML operations.

Another thing worth noticing is the ORDER BY clause of the SELECT statement. When TOP is used in a SELECT statement with an ORDER BY clause, the query returns the TOP (n) rows from the ordered result set. In our very first example, the first 5 rows from the ordered result set will be inserted into the TEST table. In the 2nd and 3rd examples, where TOP (n) is applied to the INSERT statement itself, 5 rows are picked arbitrarily from the result set and inserted; there is no guarantee that the rows inserted will be the first 5 rows as per the ORDER BY clause, even though the SELECT is ordered. If we have to make sure that the TOP (n) rows are inserted in the correct sort order, then TOP (n) should always be applied to the SELECT statement.
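For example, a minimal sketch of the guaranteed-order variant, combining the parameter with TOP on the SELECT (same table and columns as above):

DECLARE @I INT
SET @I = 5

INSERT INTO dbo.TEST (TEST_NAME)
SELECT TOP (@I) TABLE_NAME   -- TOP on the SELECT honors the ORDER BY
FROM INFORMATION_SCHEMA.TABLES
ORDER BY TABLE_NAME
GO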

Now we will update three records using TOP with an UPDATE statement.

UPDATE TOP (3) TEST
SET TEST_DATE = GETDATE()
GO

The above query will update three records in the table. We can also use TOP with the PERCENT clause. Following is a delete statement which deletes 20 percent of the matching records from the table.

DELETE TOP (20) PERCENT t
FROM dbo.TEST t
INNER JOIN INFORMATION_SCHEMA.TABLES ist
ON ist.TABLE_NAME = t.TEST_NAME
GO

In SQL Server 2000, we can achieve the same results using the SET ROWCOUNT syntax. The update statement shown above can be written as follows in SQL Server 2000.

SET ROWCOUNT 3
UPDATE TEST
SET TEST_DATE = GETDATE()
SET ROWCOUNT 0
GO

One major pitfall of the above approach is that the row count setting remains in effect until it is specifically set back to 0. If we forget to set it back to 0, all SQL statements executed after the non-zero ROWCOUNT value was set will be affected, which can produce unwanted results in the application. In such a scenario, TOP (n) is very useful.

Another very important use of TOP (n) in DML statements is to limit the batch size when we are updating or deleting a large number of rows, say more than a million. Updating or deleting a million records in one statement can cause lock timeout/lock escalation issues, eventually resulting in blocking, and it also fills up the transaction log rapidly. By limiting each batch to 10,000 or 20,000 records, we not only reduce the locking contention but also speed up each statement, since it works with a smaller subset of the data.
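A minimal sketch of such a batched delete, assuming we want to purge all rows older than a given date from the TEST table (the filter column and cut-off value are just for illustration):

DECLARE @BatchSize INT
SET @BatchSize = 10000

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize) FROM dbo.TEST
    WHERE TEST_DATE < '20070101'

    -- Stop once a batch deletes fewer rows than the batch size
    IF @@ROWCOUNT < @BatchSize
        BREAK
END
GO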

Posted in SQL Server | 1 Comment »

Checking for the existence of a column in a table

Posted by decipherinfosys on August 30, 2007

As DBAs or database developers, when we write scripts that need to be rolled out from the development environment to QA or production, it is always a good idea to make the DDL scripts non-destructive. One such scenario is the addition of a column to an existing table: we want to add a new column, but we first want to make sure that the column does not already exist in the table rather than getting an error when the script runs.

In this blog post, we will show how to check for the existence of a column before adding it to the table, using a small code snippet for both Oracle and SQL Server. Let us first create a table. The table creation syntax below is for Oracle; please make the appropriate changes if you want to run it on SQL Server.

CREATE TABLE TEST
(
TEST_ID   NUMBER(9) PRIMARY KEY,
TEST_NAME VARCHAR2(50)
)
/

We want to add a new column TEST_DATE with the DATE data type, but we want to do it in a non-destructive way so that even if somebody runs the script more than once, it does not give any error. Following is the code snippet for Oracle.

DECLARE
v_count NUMBER(1) := 0;

BEGIN
SELECT 1
INTO v_count
FROM USER_TAB_COLUMNS
WHERE table_name = 'TEST'
AND column_name = 'TEST_DATE';
EXCEPTION
WHEN NO_DATA_FOUND THEN
EXECUTE IMMEDIATE 'ALTER TABLE TEST ADD TEST_DATE DATE';
END;
/

In the above PL/SQL block, we check for the existence of the column in the USER_TAB_COLUMNS view. If we get a NO_DATA_FOUND exception, we go ahead and create the column. This way the same script can be run multiple times without running into any error.

The syntax for SQL Server is a little simpler since we do not need a local variable. Following is the command in SQL Server. Please create the table in SQL Server before running the script.

IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'TEST' AND COLUMN_NAME = 'TEST_DATE')
BEGIN
ALTER TABLE TEST ADD TEST_DATE DATETIME
END
GO

For SQL Server, we go against the INFORMATION_SCHEMA.COLUMNS view to check for the existence of the column. SQL Server supports IF EXISTS and IF NOT EXISTS constructs, which make the existence-check logic easy to implement. INFORMATION_SCHEMA views return information about the objects the current user has permissions on; the COLUMNS view contains one row for each column in the database that the user has access to. We can also check against the system catalog views (sys.objects, sys.columns etc.), but Microsoft recommends using the INFORMATION_SCHEMA views for metadata information.

Posted in Oracle, SQL Server | 1 Comment »

Bind Variables Usage (Parameterized Queries in SQL Server)

Posted by decipherinfosys on August 29, 2007

We cannot emphasize enough how important it is to use parameterized queries (SQL Server lingo) – these are also called queries that use bind variables (Oracle lingo). They are critical in dynamic SQL. The reason you do not always have to use bind variables in static SQL is that static SQL is, by definition, fixed at compile time. Every time you run such a program, the static SQL will be the same, and it will be 100-percent shareable. So, using static SQL, a query such as:

select * from foo where col1 = 'some value'

will always be

select * from foo where col1 = 'some value'

No matter how many times you run that program, that same old query will be executed (reused execution plan).

Using dynamic SQL, you are able to construct statements like the following:

select * from foo where col1 = 'a'
select * from foo where col1 = 'b'
select * from foo where col1 = 'c'

and so on. Each statement is brand new and unique with the only difference being the value that is being used against the column. An example of a parameterized query for the above scenario will be:

select * from foo where col1 = :x;

The parameter x can have different values (a, b, c) at execution time. If you are not using bind variables, then each of these queries will have to be parsed, qualified (names resolved), security checked, optimized and so on. In short, it will be compiled. Every unique statement you execute will have to be compiled every time. Using parameters or parameter markers in the code increases the ability of the relational engine to match new SQL Statements with existing execution plans in the cache thus increasing performance since the hard-parse and compilation steps are now avoided and only the execution step of the SQL occurs.

Given the excessively high amount of latching that not using bind variables creates, scaling up without them is hard in any OLTP system. As you add more and more users without bind variables, the runtime difference compared with using them skyrockets. By using bind variables, the same statement gets parsed and compiled once and executed many times because of plan re-use. If you are not using parameterized queries, you will also start noticing procedure cache (SQL Server) and shared pool (Oracle) issues. Oracle at least allows us to configure the shared pool; SQL Server has no configuration for putting a cap on the procedure cache.

Another advantage (besides the big performance advantage) of using bind variables in the code is security: using parameters or parameter markers to hold values supplied by end users is more secure than concatenating those values into a string that is then executed using either a data-access API method or the sp_executesql stored procedure (SQL Server). Not using parameterized queries also exacerbates locking issues in the system, because response times suffer from repeating the parsing and compilation steps of SQL statement processing.
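As a minimal sketch of a parameterized dynamic SQL statement in SQL Server (the foo table and col1 column are just the ones from the example above):

DECLARE @SQL NVARCHAR(200)
DECLARE @Value VARCHAR(10)

SET @SQL = N'SELECT * FROM foo WHERE col1 = @x'
SET @Value = 'a'

-- The parameter marker @x lets the cached plan be re-used for 'a', 'b', 'c', ...
EXEC sp_executesql @SQL, N'@x VARCHAR(10)', @x = @Value
GO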

There are scenarios, though, where bind variable usage can lead to issues and where you might consider using a constant – see this post for an example of such a scenario.

Posted in DB2 LUW, Oracle, SQL Server | 1 Comment »

Indexes on Computed Columns

Posted by decipherinfosys on August 28, 2007

In one of our previous blog posts, we covered computed columns in detail. Though all three major databases (Oracle, MS SQL Server and DB2) support computed columns, the implementation is different in each one of them. For details on computed columns, look at our previous blog post.

In Oracle, since the behavior of a computed column is achieved via the use of a trigger, there are no special limitations if we need to create an index on the computed column. But this is not the case with SQL Server. In order to create an index on a computed column, the computed expression must be deterministic, meaning that for a given input the computed column always returns the same result. If that is not the case, the computed column is considered non-deterministic and an index cannot be created on it. Let us illustrate this with the help of an example. Connect to the database using Query Analyzer or Management Studio.

CREATE TABLE dbo.TEST
(
TEST_ID INT IDENTITY(1,1) PRIMARY KEY,
TEST_NAME VARCHAR(10),
CODE_NAME AS REVERSE(TEST_NAME),
TEST_DATE AS GETDATE()
)
GO

In the above table, we are creating two computed columns. A computed column does not specify a data type; instead it is defined with the AS keyword followed by the column expression. CODE_NAME will store the reversed value of the TEST_NAME column and TEST_DATE will store the date and time at the time of record creation. Using the following SQL, we will check the column property of all the columns.

SELECT COLUMNPROPERTY(OBJECT_ID('TEST'),'CODE_NAME','IsDeterministic')
UNION ALL
SELECT COLUMNPROPERTY(OBJECT_ID('TEST'),'TEST_DATE','IsDeterministic')
UNION ALL
SELECT COLUMNPROPERTY(OBJECT_ID('TEST'),'TEST_NAME','IsDeterministic')
GO

The output of the above SQL will be 3 rows with the values 1, 0 and NULL respectively. A value of 1 indicates that the column is deterministic (the value will be exactly the same for the same input at any given time), 0 indicates that the column is non-deterministic (the value will not be the same at any given point of time), and NULL indicates that the column is not a computed column and hence the property cannot be determined. There is also an 'IsComputed' column property which indicates whether a column is a computed column or not. So in short, the IsDeterministic property returns 1/0 only for computed columns. As we mentioned earlier, an index can be created on a computed column only if it is deterministic. Let us check that by creating indexes on the table.

CREATE INDEX TEST_IND_1 ON dbo.TEST(CODE_NAME)
GO

Since CODE_NAME is deterministic, the index will be created successfully on the column. Next we will create an index on the TEST_DATE column.

CREATE INDEX TEST_IND_2 ON dbo.TEST(TEST_DATE)
GO

Since TEST_DATE is non-deterministic, SQL Server will not allow the creation of this index and will throw the following error.

Msg 2729, Level 16, State 1, Line 1
Column 'TEST_DATE' in table 'dbo.TEST' cannot be used in an index or statistics or as a partition key because it is non-deterministic.

An index on a computed column can be a real performance enhancer. If a query's filter criteria require a function to be applied to a column, an index on that column will not be used. It is better to create an extra computed column on the table that holds the function expression, create an index on that computed column, and then use the computed column in the filter criteria. That way a proper index will be used for the retrieval of the data. An index on a computed column is similar to a function based index (FBI) in Oracle. To learn more about FBI, visit our previous blog post.
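A minimal sketch of that approach, assuming we often filter TEST_NAME in a case-insensitive way (the UPPER_TEST_NAME column, index name and 'SHARPIE' value are just for illustration):

-- A query like this cannot use a plain index on TEST_NAME:
--   SELECT TEST_ID FROM dbo.TEST WHERE UPPER(TEST_NAME) = 'SHARPIE'

ALTER TABLE dbo.TEST ADD UPPER_TEST_NAME AS UPPER(TEST_NAME)
GO
CREATE INDEX TEST_IND_3 ON dbo.TEST (UPPER_TEST_NAME)
GO

-- Filtering on the computed column lets the new index be used
SELECT TEST_ID
FROM dbo.TEST
WHERE UPPER_TEST_NAME = 'SHARPIE'
GO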

Posted in SQL Server | Leave a Comment »

Cross-tab reports: Changing data from row to column or column to row

Posted by decipherinfosys on August 27, 2007

We had covered this topic in the past using the PIVOT and UNPIVOT operators in SQL Server 2005. You can access that whitepaper on our web-site here. We have received some e-mails from our readers asking how to do this in prior versions of SQL Server or in other RDBMS. We will cover that in this blog post.

As usual, let us demonstrate this using an example. We will create two tables and populate them with some data. The tables are not normalized; they are created just to represent the scenario.

CREATE TABLE PRODUCT_COL
(
PRODUCT_ID INT IDENTITY(1,1) NOT NULL,
PRODUCT_DESC VARCHAR(50),
Q1_SALES_AMT NUMERIC(13,2) DEFAULT 0 NOT NULL,
Q2_SALES_AMT NUMERIC(13,2) DEFAULT 0 NOT NULL,
Q3_SALES_AMT NUMERIC(13,2) DEFAULT 0 NOT NULL,
Q4_SALES_AMT NUMERIC(13,2) DEFAULT 0 NOT NULL,
CONSTRAINT PK_PRODUCT_COL PRIMARY KEY(PRODUCT_ID)
)
GO

CREATE TABLE PRODUCT_ROW
(
PRODUCT_ROW_ID INT IDENTITY(1,1) NOT NULL,
PRODUCT_ID INT NOT NULL,
PRODUCT_DESC VARCHAR(50),
Quarter_ID TINYINT DEFAULT 0 NOT NULL,
Quarter_Amt NUMERIC(13,2) DEFAULT 0 NOT NULL,
CONSTRAINT PK_PRODUCT_ROW PRIMARY KEY(PRODUCT_ROW_ID)
)
GO

-- Insert into PRODUCT_COL
INSERT INTO PRODUCT_COL(PRODUCT_DESC,Q1_SALES_AMT,Q2_SALES_AMT,Q3_SALES_AMT,Q4_SALES_AMT)
VALUES('Sharpie',130,215,190,300)
GO
INSERT INTO PRODUCT_COL(PRODUCT_DESC,Q1_SALES_AMT,Q2_SALES_AMT,Q3_SALES_AMT,Q4_SALES_AMT)
VALUES('Pencils',2100,918,3280,1315)
GO

-- Insert into PRODUCT_ROW
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(1,'Sharpie',1,130)
GO
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(1,'Sharpie',2,215)
GO
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(1,'Sharpie',3,190)
GO
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(1,'Sharpie',4,300)
GO
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(2,'Pencils',1,2100)
GO
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(2,'Pencils',2,918)
GO
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(2,'Pencils',3,3280)
GO
INSERT INTO PRODUCT_ROW(PRODUCT_ID,PRODUCT_DESC,Quarter_ID,Quarter_Amt)
VALUES(2,'Pencils',4,1315)
GO

Now we will work with the queries. The queries we are going to show will work on any of the databases. We will start with the PRODUCT_COL table, where the quarterly sales for a given product are stored in individual columns (in a single row), and we want to show the sales for each quarter on an individual row. The following query will do that.

SELECT PRODUCT_ID, PRODUCT_DESC, 1 AS Quarter, Q1_SALES_AMT as Quarterly_Sales
FROM PRODUCT_COL
UNION ALL
SELECT PRODUCT_ID, PRODUCT_DESC, 2 AS Quarter, Q2_SALES_AMT as Quarterly_Sales
FROM PRODUCT_COL
UNION ALL
SELECT PRODUCT_ID, PRODUCT_DESC, 3 AS Quarter, Q3_SALES_AMT as Quarterly_Sales
FROM PRODUCT_COL
UNION ALL
SELECT PRODUCT_ID, PRODUCT_DESC, 4 AS Quarter, Q4_SALES_AMT as Quarterly_Sales
FROM PRODUCT_COL
ORDER BY Product_ID
GO

Here is the result.

PRODUCT_ID  PRODUCT_DESC  Quarter  Quarterly_Sales
----------  ------------  -------  ---------------
1           Sharpie       1        130.00
1           Sharpie       2        215.00
1           Sharpie       3        190.00
1           Sharpie       4        300.00
2           Pencils       1        2100.00
2           Pencils       2        918.00
2           Pencils       3        3280.00
2           Pencils       4        1315.00

In the above query, we transposed columns to rows using the simple UNION ALL operator. To know what to keep in mind when using UNION/UNION ALL, please refer to our previous blog post. Now let us see how we can transform rows into columns. This gets a bit trickier than the previous one.

SELECT Product_ID, Product_Desc,
MAX(CASE WHEN Quarter_ID = 1 THEN Quarter_Amt ELSE 0 END) AS Q1_Sales,
MAX(CASE WHEN Quarter_ID = 2 THEN Quarter_Amt ELSE 0 END) AS Q2_Sales,
MAX(CASE WHEN Quarter_ID = 3 THEN Quarter_Amt ELSE 0 END) AS Q3_Sales,
MAX(CASE WHEN Quarter_ID = 4 THEN Quarter_Amt ELSE 0 END) AS Q4_Sales
FROM PRODUCT_ROW
GROUP BY Product_ID, Product_Desc
ORDER BY Product_ID
GO

Results are shown as under.

Product_ID  Product_Desc  Q1_Sales  Q2_Sales  Q3_Sales  Q4_Sales
----------  ------------  --------  --------  --------  --------
1           Sharpie       130.00    215.00    190.00    300.00
2           Pencils       2100.00   918.00    3280.00   1315.00

In the above query, since we need only one row for each product, we group by the product_id and product_desc columns. Grouping requires an aggregate function, and hence MAX is used; you could use another aggregate function like SUM or MIN in this scenario, but be careful, as the choice of aggregate can give you the wrong result depending on your data. The CASE expression provides the condition for each column: for the first column we want to pick up only the quarter_id = 1 value, and if quarter_id is not 1 we ignore the value (it contributes 0). Similar CASE conditions handle the rest of the quarters. The same result can be achieved using the DECODE function in Oracle, but that makes the query specific to Oracle.
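For reference, a sketch of the Oracle-only DECODE variant of the same pivot (same tables and columns as above):

SELECT Product_ID, Product_Desc,
MAX(DECODE(Quarter_ID, 1, Quarter_Amt, 0)) AS Q1_Sales,
MAX(DECODE(Quarter_ID, 2, Quarter_Amt, 0)) AS Q2_Sales,
MAX(DECODE(Quarter_ID, 3, Quarter_Amt, 0)) AS Q3_Sales,
MAX(DECODE(Quarter_ID, 4, Quarter_Amt, 0)) AS Q4_Sales
FROM PRODUCT_ROW
GROUP BY Product_ID, Product_Desc
ORDER BY Product_ID;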

Posted in DB2 LUW, Oracle, SQL Server | 2 Comments »

Referential Integrity Logic

Posted by decipherinfosys on August 26, 2007

Ran into another one of those scenarios at a client site. In one of their applications, the developers had put the foreign key logic in the application, i.e. they had not declared foreign keys in the schema itself; instead the application code was doing that work. Needless to say, there were a lot of issues with data corruption and lost data integrity. The primary purpose of using any RDBMS is to maintain data integrity – data should be pristine, with all the relationships and constraints enforced at the DB tier.

To give you an example of how easy it is to run into data integrity issues, here is an example:

When trying to delete data from the parent table, the application was doing a "select count(*) from child_table where fk_col = :pk_col_value;" and, if the count was 0 for all the child tables, it went on its merry way and deleted the parent record. If in the meantime, i.e. while doing this count logic, some session inserted a child record for the same parent, that could very easily lead to orphaned child table records. Had a FK been used, a delete against the parent table would take locks on the child table to enforce the data integrity. A good index on the FK column(s) would also ensure good performance.
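A minimal sketch of declaring that relationship in the schema (SQL Server syntax; the PARENT/CHILD table and column names are hypothetical):

CREATE TABLE dbo.PARENT
(
PARENT_ID INT NOT NULL PRIMARY KEY,
PARENT_NAME VARCHAR(50)
)
GO

CREATE TABLE dbo.CHILD
(
CHILD_ID INT NOT NULL PRIMARY KEY,
PARENT_ID INT NOT NULL,
CONSTRAINT FK_CHILD_PARENT FOREIGN KEY (PARENT_ID) REFERENCES dbo.PARENT (PARENT_ID)
)
GO

-- Index the FK column so parent deletes and child look-ups perform well
CREATE INDEX CHILD_IND_1 ON dbo.CHILD (PARENT_ID)
GO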

Not only that, doing constraint validation in the application layer makes the application slower and harder to manage. There is a reason why RDBMS have referential integrity built in: implementing your own RI code in the application layer is not only unnecessary, it also does not perform well. The same is true of the approach some DBAs take of enforcing FKs via triggers, which is also extremely inefficient and causes massive serialization issues (use of full table locks to ensure that no two sessions work on the same data set).

Posted in DB2 LUW, Oracle, SQL Server | Leave a Comment »

New Data types in SQL Server 2008 – I

Posted by decipherinfosys on August 25, 2007

SQL Server 2008 has quite a few new data-types. We will cover those in 2 parts as we start playing with those in our test environments.

1) DATE and TIME data types: Currently, SQL Server supports datetime and smalldatetime, but there is no data type that holds just the DATE part or just the TIME part; some data modelers consider that de-normalized since the same field represents both the date and the time piece. In addition, in situations where one needs to record different times related to a given date (for example, trucks take off from the warehouse at 8:00 A.M., 10:00 A.M., 12:00 P.M. and so on for a given day), one had to store the date repeatedly just to capture this time-specific information. That is no longer the case with the introduction of these 2 new data types in SQL Server 2008. In prior versions, one also needed to apply the CONVERT() function in order to retrieve only the part one needed; that won't be necessary anymore.

The TIME data type supports a precision of up to 100 nanoseconds.

2) DATETIME2 data type: The name is funny; MSFT could have come up with a better one (it does remind me of the Oracle VARCHAR and VARCHAR2 data types :-)). This data type extends the granularity of the time portion. As you might already know, the granularity of the time in the datetime data type is 1/300th of a second, i.e. you can store only .000, .003 or .007 seconds. This new data type extends that to 100 nanoseconds, just like the TIME data type. The fractional precision can be specified by declaring the variable or column as datetime2(n), where n is the precision with valid values from 0 to 7.

3) In order to support time zones in a better way, MSFT has introduced the DATETIMEOFFSET(n) data type. Oracle provides this support using its TIMESTAMP WITH TIME ZONE family of data types. In SQL Server 2008, one can use it to provide better support for applications that are time zone sensitive. Example: a warehouse in the Eastern time zone that has its data center in the Central time zone and people entering data from the Pacific time zone, where the data is time zone sensitive because of shipping dates and times; likewise a voting system with cut-off times for the voters, etc.

There are more new data types and enhancements: better user defined type support, HIERARCHYID (for better support for hierarchical data), spatial data types (Geometry, Geography), support of TYPES (this feature has been available in Oracle for ages now and will help make array processing and nesting of procedures a lot easier), FILESTREAM (for better binary data storage), and there have been changes to the NTEXT, TEXT and IMAGE data types as well, which we will cover in the next part.
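A small sketch of how declarations of the new date and time types might look, assuming a SQL Server 2008 instance (the variable names and literal values are just for illustration):

DECLARE @d DATE                    -- date only
DECLARE @t TIME(7)                 -- time only, up to 100 ns precision
DECLARE @dt2 DATETIME2(7)          -- extended-precision date and time
DECLARE @dto DATETIMEOFFSET(7)     -- date and time plus time zone offset

SET @d   = '2008-08-25'
SET @t   = '08:00:00.1234567'
SET @dt2 = '2008-08-25 08:00:00.1234567'
SET @dto = '2008-08-25 08:00:00.1234567 -05:00'

SELECT @d AS DateOnly, @t AS TimeOnly, @dt2 AS DateTime2Value, @dto AS DateTimeOffsetValue
GO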

Posted in SQL Server | 1 Comment »

Attaching and Detaching databases in SQL Server

Posted by decipherinfosys on August 24, 2007

There are a lot of options available to the database developer/DBA for moving databases from one instance to another. Backup and restore is one of them, copying them over using DTS or SSIS is another, and having your DDL and DML copy scripts is yet another way to do it. One of the popular ways to move databases to a different instance on the same server or to a different server is using the sp_detach_db and sp_attach_db system stored procedures. In this blog post, we will go through the steps involved in detaching and attaching databases.

To start with, we will create a test database. Please log on using Query Analyzer or Management Studio with proper authorization and issue the following command to create the database. Just make sure that you have the proper privileges (CREATE DATABASE or CREATE ANY DATABASE) assigned, or that the login you are using is a member of the sysadmin or dbcreator fixed server role. Let's keep it simple and create a sample TEST database:

USE MASTER
GO
CREATE DATABASE TEST
GO

Now run the sp_helpfile system stored procedure to check how many files were created and the location of the files.

USE TEST
GO
SP_HELPFILE

Here is the abbreviated output.

TEST        C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TEST.mdf
TEST_log    C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TEST_log.LDF

The output indicates that the data file and log file were created in the default location. Suppose that once we created the database, we realized that there will not be enough space on the c:\ drive, so we need to move the database files to the d:\ drive. We will achieve this using sp_detach_db and sp_attach_db.

sp_detach_db takes two or three arguments depending upon the SQL Server version. In SQL Server 2000, it takes only two arguments and in 2005 it takes three.

/* SQLServer 2000 */

sp_detach_db @dbname = 'dbname', @skipchecks = 'skipchecks'

/* SQLServer 2005*/

sp_detach_db @dbname = 'database_name', @skipchecks = 'skipchecks',
@keepfulltextindexfile = 'KeepFulltextIndexFile'

•    @dbname is the name of the database we want to detach.
•    @skipchecks indicates whether we want to skip UPDATE STATISTICS during the detach process. It is an optional parameter with a default value of NULL. If it is set to 'true', UPDATE STATISTICS is skipped; if it is set to 'false', UPDATE STATISTICS is run.
•    @keepfulltextindexfile is also an optional parameter, with a default value of 'true'. If it is set to 'false', it will drop the full-text index files and the metadata associated with them; it will keep them if the option is NULL or set to 'true'.

The major restriction on detaching a database is that the database should not be in use; you cannot detach a database which is being actively used. In order to detach the database, you need to kill all the connections and set the database to single-user mode so that further connections are prohibited. For the complete list of restrictions, please refer to BOL. Following is an example of detaching the database in SQL Server 2005.

USE MASTER
GO
exec sp_detach_db 'TEST','true','true'

Now let us move the files to a different location on the d: drive (D:\SQLServer\Data) and then use the sp_attach_db procedure to attach the database back. Only users who are members of the sysadmin or dbcreator fixed server roles can execute this procedure.

sp_attach_db @dbname = 'dbname', @filename1 = 'filename_n' [, ...16]

sp_attach_db can take up to 17 arguments: the database name and up to 16 file names. If there are more than 16 data files, one has to use the CREATE DATABASE command with the FOR ATTACH clause to attach the database (please note that going forward, CREATE DATABASE … FOR ATTACH is the preferred way, as BOL states that sp_attach_db will be deprecated). Execute the following procedure to attach the database back.

EXEC sp_attach_db @dbname = N'TEST',
@filename1 = N'd:\SQLServer\Data\TEST.mdf',
@filename2 = N'd:\SQLServer\Data\TEST_log.LDF'
GO
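
As noted above, the same attach can also be done with CREATE DATABASE … FOR ATTACH, which is the preferred route going forward. A minimal sketch using the same file locations:

CREATE DATABASE TEST
ON (FILENAME = 'd:\SQLServer\Data\TEST.mdf'),
   (FILENAME = 'd:\SQLServer\Data\TEST_log.LDF')
FOR ATTACH
GO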

Once the database is attached and usable, you can remove the files from the previous location (if you copied rather than moved them) to avoid confusion. sp_helpfile will indicate the new location of the data files. You can also run the DBCC CHECKDB command to verify the integrity of the attached database.

If the database is replicated, then one should take a little extra care during the detach and attach process to make sure that you don't run into any of the restriction scenarios. For replication considerations, please refer to BOL. This is a much easier and faster way to move a database to a different location compared to backup and restore.

A couple of things to keep in mind:

1)    If you are moving databases from one instance to another, you have to take care of the orphaned users in the database, i.e. you will need to re-run a script to align the logins with the users in the database (see the sketch after this list).
2)    You should update the statistics with a FULL SCAN once the attach is done.
3)    Beware of different collation settings between the source and the destination instance.
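
A minimal sketch of fixing such orphaned users, assuming a database user named 'app_user' whose login of the same name already exists on the destination instance:

USE TEST
GO
-- List the orphaned users in the newly attached database
EXEC sp_change_users_login 'Report'
GO
-- Re-map an orphaned user to the login of the same name
EXEC sp_change_users_login 'Auto_Fix', 'app_user'
GO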

Posted in SQL Server | Leave a Comment »

DBMS_APPLICATION_INFO

Posted by decipherinfosys on August 23, 2007

In one of our previous blog posts, we discussed getting the 'Module' and 'Action' attributes using the SYS_CONTEXT function. In this blog post, we will discuss setting the values for these attributes using the DBMS_APPLICATION_INFO package. Once we set the values, we can use them later to retrieve information from V$SESSION and V$SQLAREA.

The DBMS_APPLICATION_INFO package contains several subprograms which are used to set and retrieve these values. You can obtain the list of subprograms in the package using the following command.

SQL> desc dbms_Application_info

This will list all the subprograms of the package along with their arguments. Following is the list of procedures in the package.

• READ_CLIENT_INFO
• SET_CLIENT_INFO
• READ_MODULE
• SET_ACTION
• SET_MODULE
• SET_SESSION_LONGOPS

Let us test this using an example. We will start by creating a table and a test procedure, and we will set the module and action in the procedure.

CREATE TABLE TEST
(
TEST_ID NUMBER(9),
TEST_DESC VARCHAR2(30),
TEST_DATE DATE,
UserID VARCHAR(15)
);

CREATE OR REPLACE PROCEDURE TEST_PROC
AS
BEGIN
/* Registering Module and Action both */
DBMS_APPLICATION_INFO.SET_MODULE
(
module_name => 'TEST_PROC',
action_name => 'Adding records'
);

INSERT INTO TEST(TEST_ID,TEST_DESC,TEST_DATE)
SELECT ROWNUM, TABLE_NAME, SYSDATE
FROM USER_TABLES
WHERE ROWNUM < 10;

DBMS_APPLICATION_INFO.SET_MODULE(null,null);

/* Registering only Action */
DBMS_APPLICATION_INFO.SET_ACTION(action_name => 'Update Records');

UPDATE TEST
SET UserID = USER
WHERE TEST_ID < 10;

DBMS_APPLICATION_INFO.SET_ACTION(null);

END;
/

Once the procedure is created successfully, execute it from the SQL*Plus prompt.

SQL> exec test_proc

In the above procedure, we set the values for the Module and Action attributes using the SET_MODULE and SET_ACTION procedures of the package. We can retrieve these values either by using the READ_MODULE procedure or by querying the V$SQLAREA view. Following is an example of retrieving the data; the query is followed by its result.

SQL> SELECT Action, Fetches, Executions, Rows_PRocessed
2 FROM v$Sqlarea
3 WHERE Module = 'TEST_PROC'
4 OR Action = 'Update Records'
5 /

ACTION          FETCHES  EXECUTIONS  ROWS_PROCESSED
--------------  -------  ----------  --------------
Update Records  0        1           4
Adding records  0        1           4

We can retrieve any value we are interested in from V$SQLAREA for the module and action we set. Similarly, we can set and retrieve the client_info value using the SET_CLIENT_INFO and READ_CLIENT_INFO procedures; the value is exposed in the CLIENT_INFO column of the V$SESSION view.
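For instance, while the procedure is running, a sketch of checking these values for active sessions from another session (the columns are standard V$SESSION columns):

SELECT sid, username, module, action, client_info
FROM v$session
WHERE module = 'TEST_PROC';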

This becomes very helpful during performance tuning. By going against V$SQLAREA using module and action, we know exactly which procedure or PL/SQL block needs attention. It also helps us track resources like CPU time, elapsed time etc. for a specific module.

Posted in Oracle | Leave a Comment »

Blocking on un-committed unique keys

Posted by decipherinfosys on August 22, 2007

This is one of the common issues that application developers and database developers need to keep in mind. Before I start describing the two issues (blocking on un-committed primary keys and blocking on un-committed unique keys), here is a sample table that we are going to use for this post (we will be using SQL Server in this case, though the same concept applies to Oracle and DB2 LUW as well):

/*******************************************************************
Create a sample table with an auto-incremental Primary Key
and a unique key column COL2
*******************************************************************/

CREATE TABLE TABLEA (COL1 INT NOT NULL IDENTITY PRIMARY KEY, COL2 VARCHAR(10))
GO
CREATE UNIQUE INDEX TABLEA_IND_1 ON TABLEA (COL2)
/*FILEGROUP AND FILLFACTOR AND OTHER SETTINGS*/

/*******************************************************************
–Now, start a transaction
–in Session #1 and insert a value in it
–and do not commit or rollback it yet
*******************************************************************/
BEGIN TRAN
INSERT INTO TABLEA (COL2) VALUES ('ABC')

/*ROLLBACK*/

/*******************************************************************
–In Session #2, go ahead and set the lock time out value to 1 and kick off a transaction that tries to insert the same unique key value in the same table
*******************************************************************/

SET LOCK_TIMEOUT 1
GO

--And now, start the transaction and try to insert the same string again
BEGIN TRAN
INSERT INTO TABLEA (COL2) VALUES ('ABC')

--This will get blocked and you will get the lock time out message:

Msg 1222, Level 16, State 47, Line 4
Lock request time out period exceeded.
The statement has been terminated.

Let us set the lock timeout to a higher value for the second session, execute the command from above again, and evaluate the lock structures that are causing the blocking:

spid  dbid  ObjId       IndId  Type  Resource         Mode  Status
----  ----  ----------  -----  ----  ---------------  ----  ------
53    11    1620200822  2      PAG   1:143            IX    GRANT
53    11    1115151018  0      TAB                    IS    GRANT

53    11    1620200822  2      KEY   (c600b7fc7c5c)   X     GRANT

53    11    1620200822  1      PAG   1:141            IX    GRANT
53    11    1620200822  0      TAB                    IX    GRANT
53    11    1620200822  1      KEY   (010086470766)   X     GRANT
54    11    1620200822  1      KEY   (03000d8f0ecc)   X     GRANT
54    11    1620200822  1      KEY   (0400b4b7d951)   X     GRANT

54    11    1620200822  2      KEY   (c600b7fc7c5c)   X     WAIT

54    11    1620200822  1      KEY   (020068e8b274)   X     GRANT
54    11    1620200822  0      TAB                    IX    GRANT
54    11    1620200822  1      PAG   1:141            IX    GRANT
54    11    1620200822  2      PAG   1:143            IX    GRANT

As you can see from above, the wait has developed because of the uncommitted key from the first session. The reason for the wait is that the database engine needs to ensure that uniqueness is maintained; as a result, it prevents another session from inserting/updating the same value, which would violate that constraint.

Another situation where we have seen this issue created in an application is when the application does not take advantage of the surrogate auto-incremental key from the database. The logic implemented by some developers goes like this in that case:

a) Get the MAX() value for the PK column (thus resulting in a full table scan – or, even if they do a select top 1 col1 from tablea order by col1 desc, it is bad enough since it will be a scan nevertheless).
b) Add 1 to it in the application layer.
c) Do the insert.

Under such a scenario, since the database is not the one generating the key, this can not only lead to primary key violation issues under concurrent load, it can once again lead to blocking situations related to un-committed keys. The creation of surrogate keys should be left to the DB – via identity property columns in SQL Server and DB2 LUW, and sequences in Oracle/DB2 LUW.
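A minimal sketch of letting the database generate the key and reading it back after the insert (using the TABLEA definition from above; SCOPE_IDENTITY() is the SQL Server way, Oracle/DB2 LUW would use a sequence):

-- Let the IDENTITY column generate COL1; never compute MAX(COL1) + 1 in the application
INSERT INTO TABLEA (COL2) VALUES ('XYZ')

-- Retrieve the key that was just generated in this scope
SELECT SCOPE_IDENTITY() AS NewCol1Value
GO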

Posted in DB2 LUW, Oracle, SQL Server | Leave a Comment »