Systems Engineering and RDBMS

Archive for July, 2008

SQL Server 2008: Converting Binary and Character Values

Posted by decipherinfosys on July 31, 2008

In one of our previous posts, we had covered how to convert a hex string to a varbinary and vice-versa in SQL Server 2005. In SQL Server 2005, if you use the CONVERT function to convert a character string to a varbinary, you get the ASCII codes of the characters, and likewise, if you convert a varbinary to a string, you get the characters represented by the ASCII codes in the binary value. That is why one needed a different approach like the one we showed in that previous post. In SQL Server 2008, one can simply use the CONVERT function to do the same thing.

The CONVERT function now has new styles that make this possible. The binary styles can have one of three possible values: 0, 1 or 2. When using 0, the behavior is the same as in SQL Server 2005 and prior releases, i.e. it translates ASCII characters to binary bytes or binary bytes to ASCII characters.

When you use style 1 to convert a string to a binary, the input string needs to have the "0x" prefix; with style 2, the input string must not have the "0x" prefix. When converting a binary to a string, style 1 includes the "0x" prefix in the resulting string while style 2 does not. Here is an example to help make sense of the above statements:

SELECT
CONVERT(NVARCHAR(34), 0xc23eed6b65c93e44a41a2818e274194f, 1) AS BINARY_TO_STRING_STYLE1,
CONVERT(VARBINARY, '0xc23eed6b65c93e44a41a2818e274194f', 1) AS STRING_TO_BINARY_STYLE1,
CONVERT(NVARCHAR(34), 0xc23eed6b65c93e44a41a2818e274194f, 2) AS BINARY_TO_STRING_STYLE2,
CONVERT(VARBINARY, 'c23eed6b65c93e44a41a2818e274194f', 2) AS STRING_TO_BINARY_STYLE2

And here are the results:

BINARY_TO_STRING_STYLE1            STRING_TO_BINARY_STYLE1              BINARY_TO_STRING_STYLE2            STRING_TO_BINARY_STYLE2
---------------------------------- ------------------------------------ ---------------------------------- ----------------------------------
0xC23EED6B65C93E44A41A2818E274194F 0xC23EED6B65C93E44A41A2818E274194F   C23EED6B65C93E44A41A2818E274194F   0xC23EED6B65C93E44A41A2818E274194F
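
For completeness, here is a quick sketch contrasting the above with style 0 (the default), which is the old SQL Server 2005 behavior of translating characters to their underlying byte codes and back. The literal 'Test' and the 0x54657374 value are just made-up values for illustration:

SELECT
CONVERT(VARBINARY(20), 'Test', 0)   AS STRING_TO_BINARY_STYLE0,  -- 0x54657374 (the ASCII codes of T, e, s, t)
CONVERT(VARCHAR(20), 0x54657374, 0) AS BINARY_TO_STRING_STYLE0   -- Test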

Posted in SQL Server | 2 Comments »

SQL Server 2008: The Debugger is back

Posted by decipherinfosys on July 31, 2008

If you have worked in SQL Server 2000 and were used to the T-SQL debugger, you might have lamented its absence in SSMS in SQL Server 2005. In SQL Server 2005, one can debug stored procedures using Visual Studio, and that works really well. In SQL Server 2008, MSFT has re-introduced the debugger into SSMS. In today's post, we will pick up a stored procedure in the AdventureWorks database and go through the features of the debugger – it is no different from other debuggers you have used, but it is nice to know that folks who rely only on SSMS for their stored procedure work no longer need to use VS in order to debug effectively.

In the BOL, all of the information pertaining to the debugger is under Transact-SQL Debugger. If you have both the client tools and the server running on the same machine, then there is no configuration needed to use the debugger. However, if you are using the client tools from one machine to connect to the database server on another machine, then you need to enable port and program exceptions by using the Windows Firewall Control Panel application on both the client and the server.  We will look into those and other security related issues pertaining to the debugger in an upcoming post.  In this post, we just want to go through an example of the debugger.

So, let’s prepare an execution script for executing a stored procedure in the Adventure Works database:

I am using a very simple example just to illustrate the feature:

-- prepare the check date parameter and call the AdventureWorks stored procedure
declare @x datetime
select @x = GETDATE()
exec [uspGetBillOfMaterials] 799, @x

And now, press the debug button as shown in the image below – this will launch the debugger:

Once the debugger launches, you will see the different debug options at the top – Step Into, Step Over, Step Out, breakpoints etc. and at the bottom, you will see two windows – one for the local variables/Watch and the other one a call stack/Breakpoints/Command Window and the Output – these are tabbed interfaces.  Here is the image that shows that:

And now, let’s step into the call to the procedure and you will see the local variable values and in the call stack, you will see the actual call.

This particular procedure only has a CTE which executes and returns the records, so it will exit as we step through it. Using the debugger, troubleshooting issues becomes easier, especially in the case of larger procedures and nested procedures which call other procedures, views or functions.

Posted in SQL Server | 6 Comments »

Checking the .Net Framework Version on a Machine

Posted by decipherinfosys on July 30, 2008

This came up in a discussion at a client site while working with them on integrating their application with their client’s application.  How would you determine what version of the .Net Framework is on your machine or the server?

There are a couple of ways to do that. The registry is one of them. Go to Start/Run, type regedit and press Enter. Then navigate to:

HKLM\SOFTWARE\Microsoft\NET Framework Setup\NDP

and you will see child keys that show the .Net Framework versions installed on that machine. If you have more than one, like I do on my machine, then you will see multiple key entries like this:

And there you can see the other details like the Install Path and the exact version.
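
If you would rather check this from within SQL Server itself, the undocumented extended stored procedure xp_regread can read the same registry key. This is just a sketch; the v2.0.50727 sub-key is an assumption, so substitute whichever version key exists on your machine:

-- reads the Version value for the .Net 2.0 registry key (requires appropriate permissions)
EXEC master.dbo.xp_regread
    @rootkey    = 'HKEY_LOCAL_MACHINE',
    @key        = 'SOFTWARE\Microsoft\NET Framework Setup\NDP\v2.0.50727',
    @value_name = 'Version';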

Posted in .Net Development | 1 Comment »

Having vs Where Clause

Posted by decipherinfosys on July 29, 2008

Got another question from one of the readers: "Is it OK to substitute the WHERE clause with the HAVING clause, since in the absence of a WHERE clause and a grouping condition, HAVING essentially acts as a WHERE clause? What is the difference between using these two clauses?"

One thing to understand first is the basic function of the WHERE and HAVING clauses – WHERE is used to apply filter conditions on the table columns, and HAVING is used to apply filters after the aggregations are done. It is also important to understand the order of evaluation in the execution plan: in the presence of both a WHERE and a HAVING clause, the WHERE condition is evaluated before the HAVING condition. Let's take an example (using the AdventureWorks sample user database in SQL Server 2005):

select Color, SUM(StandardCost) as SUM_STD_COST
from Production.Product
where Color IN ('Blue', 'Black')
group by Color

And one can also re-write this to use HAVING instead of the WHERE clause:

select Color, SUM(StandardCost) as SUM_STD_COST
from Production.Product
group by Color
having Color IN ('Blue', 'Black')

The second one evaluates the filter after the grouping, while the first one applies the filter first and then does the grouping, which is more efficient. For this simple SQL statement, if you look at the execution plan, you will see that the optimizer is smart enough to rewrite the second SQL to filter before the grouping, since in the absence of a WHERE clause the HAVING predicate references only the grouping column and can safely be pushed down.

From the first SQL:

-------------------------------------------------------------------------
|--Stream Aggregate(GROUP BY:([AdventureWorks].[Production].[Product].[Color]) DEFINE:([Expr1003]=SUM([AdventureWorks].[Production].[Product].[StandardCost])))
     |--Sort(ORDER BY:([AdventureWorks].[Production].[Product].[Color] ASC))
          |--Clustered Index Scan(OBJECT:([AdventureWorks].[Production].[Product].[PK_Product_ProductID]), WHERE:([AdventureWorks].[Production].[Product].[Color]=N'Black' OR [AdventureWorks].[Production].[Product].[Color]=N'Blue'))

And for the second SQL (after flushing the cache):

-------------------------------------------------------------------------
|--Stream Aggregate(GROUP BY:([AdventureWorks].[Production].[Product].[Color]) DEFINE:([Expr1003]=SUM([AdventureWorks].[Production].[Product].[StandardCost])))
     |--Sort(ORDER BY:([AdventureWorks].[Production].[Product].[Color] ASC))
          |--Clustered Index Scan(OBJECT:([AdventureWorks].[Production].[Product].[PK_Product_ProductID]), WHERE:([AdventureWorks].[Production].[Product].[Color]=N'Black' OR [AdventureWorks].[Production].[Product].[Color]=N'Blue'))

So, the optimizer does the right thing in this case. However, for more complex queries it may not always be able to make that transformation. So, use the WHERE clause to filter records (on indexed columns where possible for good performance) and use HAVING only for applying filters on top of the aggregations.
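
As an aside, here is a sketch of the kind of filter that genuinely belongs in the HAVING clause: a predicate on the aggregate itself, which a WHERE clause cannot express (the 100 threshold is an arbitrary value for illustration):

select Color, SUM(StandardCost) as SUM_STD_COST
from Production.Product
where Color IS NOT NULL
group by Color
having SUM(StandardCost) > 100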

Posted in DB2 LUW, Oracle, SQL Server | Leave a Comment »

SSIS: Importing data from a text file using a package

Posted by decipherinfosys on July 28, 2008

Last week, we had covered how to create a SSIS package to export data to a text file from SQL Server. In today’s post, we are going to cover creating a SSIS package to import the data from a text file into the database schema. We have covered the export/import techniques from SQL Server, Oracle and DB2 LUW in several of our posts – this one concentrates on building up a SSIS package to achieve this task.

As described in the earlier posts, the SSIS designer contains four design panes: the Control Flow pane, the Data Flow pane, the Event Handlers pane and the Package Explorer. We have also discussed that the Control Flow pane is used to incorporate the programming logic into the SSIS package while the Data Flow pane is used to handle the flow of data inside the package. Since our current task is to import data from a text file source into SQL Server, we are going to make use of a Data Flow Task in this example.

For the purpose of our example, I am creating a text file, Applicants.txt which holds the name and address information of the Applicants of Credit Cards. The text file attachment can be found at the end of the post.

Our task is to create a new table in the database and load the data into it: if the table already exists, we truncate it; if it does not exist, we create it. For this purpose we require one Execute SQL Task (configuration of this task is explained in the earlier posts) to create or truncate the table. After successful completion of this task, we have to load the data into the table. To achieve this we require a Data Flow Task since we are dealing with the flow of data in our package.

The downside of the Data Flow Task is that the tables which we use in the task need to exist in the database. To meet this requirement, we initially create the table 'Applications' in the AdventureWorks database. Later on, we are going to drop this table for our testing and use a workaround to test our scenario of creating the table if it does not exist and loading it with the data from the text file.

Execute the following script in the AdventureWorks database to create the table 'Applications':

create table Applications
(
[S.No.] int identity(1,1),
ApplicationID uniqueidentifier default newid(),
FirstName varchar(100),
MiddleName char(2),
LastName varchar(100),
AddressLine1 varchar(100),
AddressLine2 varchar(100),
City varchar(100),
State varchar(100),
Country varchar(100)
)
GO

Creation of the SSIS Package:

Open up the SSIS designer by navigating to

Start –> All Programs –> Microsoft Sql Server 2005 –> SQL Server Business Intelligence Development Studio.

This will open up the SSIS designer to create the SSIS package. Create the new project in the SSIS designer.

1. Creation of a new variable:

To create a new variable right click on the control flow pane and click on the Variables option. This will show the Variables window on the left. Create a new variable ‘TableName’ at the package level. We are going to make use of this variable to create the new table and also to specify the table name to be imported, inside the data flow task. The image below shows the creation of the variable:

2. Creating the Table if it does not exist:

The next step is to configure the Execute Sql Task to create the table ‘Applications’ if it does not exist or else truncate the table if exists. To do this drag the Execute Sql Task from the Control Flow Tool box and drop it in the Control Flow design pane. The configuration of the Execute Sql task was discussed in the earlier posts. Configure the Execute Sql Task to point to the AdventureWorks database. This will create a new connection manager which is renamed to DecipherDestination in this package.

Since we are providing the name of the table through the variable 'TableName' and creating the table at run time, we cannot directly write the create table script in the SqlStatement editor of the Execute SQL Task. Instead, we have to make use of a feature of SSIS called the Expressions page. Every control flow task has an Expressions page, which is used to dynamically alter the task properties at run time.

To make use of this page, open up the Execute SQL Task Editor and navigate to the Expressions page. Click on the little '+' sign to open up the expressions editor. In the Property column of the expression editor, select the property SqlStatementSource. This property is used to generate a SQL statement dynamically, and since we are creating the table at run time, this is the property we need. Open up the property's expression builder by clicking on the '…' button, then copy and paste the following script into the Expression window:

"if not exists (select 1 from sys.objects where name = 'Applications' and type = 'U')
BEGIN
create table " + @[User::TableName] + " ([S.No.] int identity(1,1), ApplicationID uniqueidentifier default newid(), FirstName varchar(100), MiddleName char(2), LastName varchar(100), AddressLine1 varchar(100), AddressLine2 varchar(100), City varchar(100), State varchar(100), Country varchar(100))
END
ELSE
BEGIN
TRUNCATE TABLE " + @[User::TableName] + "
END"
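
For clarity, with the TableName variable set to 'Applications', the expression above should evaluate at run time to a statement roughly like the following (shown here only to illustrate what the Execute SQL Task ends up running):

if not exists (select 1 from sys.objects where name = 'Applications' and type = 'U')
BEGIN
    create table Applications ([S.No.] int identity(1,1), ApplicationID uniqueidentifier default newid(),
        FirstName varchar(100), MiddleName char(2), LastName varchar(100), AddressLine1 varchar(100),
        AddressLine2 varchar(100), City varchar(100), State varchar(100), Country varchar(100))
END
ELSE
BEGIN
    TRUNCATE TABLE Applications
END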

The image below shows the navigation to the expression window on the “Execute SQL” Task:

Click OK and return to the SSIS Control Flow designer.

3. Importing the Data into the Table using Data Flow Task:

Our next task is to define the data flow task which imports the data from the text file into the physical table we already created. Drag and drop the Data Flow Task from the Control Flow Tool Box into the Control Flow Designer. Double clicking on it will automatically take us to the Data Flow Tab of the SSIS Designer.

3.1 Configuring the Source in the Data Flow task:

On the Data Flow tab, drag and drop the “Flat File Source” from the Data Flow Sources in the Data Flow Tool Box. Double click on the task which opens up the editor for the Flat File Source.

Create the new flat file source by clicking on “New” button on the Flat File Source Editor. The image below shows the configuration of the flat file source connection manager editor:

Specify the name for the Connection Manager and hit the "Browse" button to navigate to the file system location where the text file is stored, then click OK. If the text file contains the column names in the first data row, check the box "Column names in the first data row". Since the text file we are using has the column names in its first data row, this option is checked in this example. The image below shows the General page of the Flat File Connection Manager Editor:

Next, click on the Columns page of the editor. On this page, the row and column delimiters in the text file should be specified. In our text file, the row delimiter is the new line and the column delimiter is the tab. The image below shows the configuration of the Columns page:

We can navigate to the Preview page if we wish to preview the data in the flat file. The configuration of any Data Flow Source in the Data Flow design is almost the same. Click on OK to go back to the Data Flow design pane.

3.2 Configuring the Destination in the Data Flow task:

The next step in our task is to configure the destination for the data to be imported. We can make use of the same connection manager that we created to create/truncate the table, since we are importing the data into the same connection (same server and database).

On the Data Flow tab, drag and drop the "OLE DB Destination" from the Data Flow Destinations in the Data Flow Tool Box. Drag the path (green arrow) from the Flat File Source and join it to the OLE DB Destination. Double click on the task, which opens up the editor for the OLE DB Destination.

As discussed, we can make use of the same connection manager which we created earlier for the Execute SQL Task. If we wish to use a different connection, we can also create a new connection manager, a process similar to the creation of the connection manager in the Execute SQL Task. Since we are using a variable to store the table name, set the data access mode to "Table name or view name variable". In the Variable property, specify the variable which holds the destination table name; in our example it is the 'TableName' variable. The image below shows the configuration of the OLE DB Destination:

Navigate to the Mappings page of the OLE DB Destination Editor and map the source columns to the destination columns. If the column names of the source and destination match, SSIS maps them by default; otherwise we have to map them manually. Since the first two columns are auto-generated and the remaining column names match in our example, SSIS automatically mapped the source and destination columns. The image below shows the source and destination column mappings:

The image below shows the Data Flow task after the configuration of the source and the destination:

The image below shows the entire package for importing the text file into a physical table inside the database:

Execute the package by hitting F5. The executables will turn green indicating the successful execution of the package. We can go to the database and select the records from the Applications table to verify that the data was indeed imported. What if the table did not exist when we ran this package? Drop the Applications table from the AdventureWorks database and run the SSIS package again. This time it throws an error, as shown in the screen shot below, stating that the table does not exist.

Even though our package instructs the SSIS runtime engine to create the table if it does not exist, the package still throws the error stating that the table does not exist. This happens because the validations are performed before the package is run; since the table does not exist at compile time, the SSIS engine throws this error.

One way to get around this problem is to delay the validation of the package, i.e. we do not validate the package until run time. Each SSIS task (including the SSIS package itself) has its own properties. The properties window of a particular task can be viewed by right clicking on the task and clicking on the 'Properties' option in the menu. This will display the properties of the executable. So, in our example, bring up the properties window of the Data Flow Task and change the DelayValidation property from False to True. The image below shows how to do this:

This will delay the validation of the Data Flow Task until run time. This means that even though the table does not exist at compile time, the SSIS engine will not throw the error since the package is no longer validated at compile time. Now execute the package by hitting F5. This time the table is created and the data is populated in the Applications table of the AdventureWorks database. The image below shows the successful execution of the SSIS package. Note that we changed the names of the executables to meaningful names rather than leaving them at the default names that SSIS gives.
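
As a quick sanity check after the package runs, a simple query against the table should show the imported rows (the column list below follows the Applications table definition from the script earlier in this post):

SELECT TOP (10) [S.No.], ApplicationID, FirstName, LastName, City, State, Country
FROM dbo.Applications;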

That's it folks! In this post, we discussed the usage of the Data Flow Task and also looked at the configuration of the Flat File Source and the OLE DB Destination. We then imported the data from the text file into a physical table inside the database, creating the table if it did not exist.

Posted in SQL Server | 25 Comments »

A reporting query

Posted by decipherinfosys on July 24, 2008

This was one of the questions that we got from a reader:

I am working on a HR application and need to prepare a report in which I need to show the data in this format:

DEPT_NBR MAX_HIRE_DATE    MIN(DT)    HIRE_DATE
-------- -------------    ---------  --------
0312     01-JUN-07        12-DEC-02  12-DEC-02
                                     31-MAR-04
                                     30-APR-05
                                     31-MAY-06
                                     01-JUN-07
0313     15-MAY-08        01-NOV-06  12-NOV-06
                                     31-MAR-07
                                     01-NOV-06
                                     31-MAY-07
                                     15-MAY-08

i.e. I want to show per Department, the max hire date, the min hire date and then the hire dates and other employee related information. Problem is that when I use aggregate functions, I cannot have other pieces of information that I need and if I use sub-queries (sub-selects), then the performance is very bad. Is there any way to do this in SQL? I know that this should probably be done on the reporting side rather than the database side since in reporting (SSRS, Crystal or others), one can just put a function to do the max/min over a group but that would also have an overhead so if I can do this in SQL itself, that will be good. Also, I need to be able to have the same SQL for both Oracle and SQL Server since we are a vendor company and our product is used by clients who could be running on either of these platforms.

The answer to this question is – Use Analytics and you can easily do this in SQL – and it can be done for both Oracle and SQL Server (version 2005 and above though). Here is the solution – let’s create the data first (Note to the readers – whenever possible, please do provide the scripts to re-create your scenarios – it helps us save time and get back to you faster with a solution):

SET NOCOUNT ON;
GO
CREATE TABLE dbo.EMP_MASTER (DEPT_NBR NVARCHAR(10), HIRE_DATE DATETIME, FIRST_NAME NVARCHAR(30), LAST_NAME NVARCHAR(30));
INSERT INTO dbo.EMP_MASTER VALUES ('0312', '12-DEC-02', 'Joe', 'Snyder');
INSERT INTO dbo.EMP_MASTER VALUES ('0312', '31-MAR-04', 'David', 'Gilo');
INSERT INTO dbo.EMP_MASTER VALUES ('0312', '30-APR-05', 'Smitha', 'Reilly');
INSERT INTO dbo.EMP_MASTER VALUES ('0312', '31-MAY-06', 'Naveen', 'Andrews');
INSERT INTO dbo.EMP_MASTER VALUES ('0312', '01-JUN-07', 'Matthew', 'Winkle');
INSERT INTO dbo.EMP_MASTER VALUES ('0313', '12-NOV-06', 'Kathy', 'Konnor');
INSERT INTO dbo.EMP_MASTER VALUES ('0313', '31-MAR-07', 'Vikesh', 'Gupta');
INSERT INTO dbo.EMP_MASTER VALUES ('0313', '01-NOV-06', 'Martha', 'Stewart');
INSERT INTO dbo.EMP_MASTER VALUES ('0313', '31-MAY-07', 'Jim', 'Diego');
INSERT INTO dbo.EMP_MASTER VALUES ('0313', '15-MAY-08', 'Arthur', 'Doyle');

And here is the code to get the data in the required format:

select
row_number() over (partition by e.dept_nbr order by e.hire_date) as RN,
case when row_number() over (partition by e.dept_nbr order by e.hire_date) = 1 then e.dept_nbr else '' end as dept_nbr,
case when row_number() over (partition by e.dept_nbr order by e.hire_date) = 1 then convert(nvarchar(10), min(e.hire_date) over (partition by e.dept_nbr), 101)
else ''
end as min_hire_date,
case when row_number() over (partition by e.dept_nbr order by e.hire_date) = 1 then convert(nvarchar(10), max(e.hire_date) over (partition by e.dept_nbr), 101)
else ''
end as max_hire_date,
e.hire_date
from dbo.emp_master as e
order by e.dept_nbr, e.hire_date;

RN                   dept_nbr   min_hire_date max_hire_date hire_date
-------------------- ---------- ------------- ------------- -----------------------
1                    0312       12/12/2002    06/01/2007    2002-12-12 00:00:00.000
2                                                           2004-03-31 00:00:00.000
3                                                           2005-04-30 00:00:00.000
4                                                           2006-05-31 00:00:00.000
5                                                           2007-06-01 00:00:00.000
1                    0313       11/01/2006    05/15/2008    2006-11-01 00:00:00.000
2                                                           2006-11-12 00:00:00.000
3                                                           2007-03-31 00:00:00.000
4                                                           2007-05-31 00:00:00.000
5                                                           2008-05-15 00:00:00.000

If you look at the code above, you will see that we are using analytic functions: we partition the data by department and then order it by the hire date within that department. We just look for the very first record in each group and then do a min or a max in order to retrieve our values for that group. There are a couple of other ways to achieve the same result as well; the above just demonstrates how analytic functions make this pretty simple.
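
And since the reader needs the same logic to work on Oracle as well, here is a hedged sketch of an Oracle equivalent, assuming an equivalent EMP_MASTER table exists on that side. The CONVERT calls are swapped for TO_CHAR and a couple of T-SQL-isms (the dbo prefix, AS on the table alias) are dropped; the analytic functions carry over as-is:

select
row_number() over (partition by e.dept_nbr order by e.hire_date) as RN,
case when row_number() over (partition by e.dept_nbr order by e.hire_date) = 1 then e.dept_nbr else null end as dept_nbr,
case when row_number() over (partition by e.dept_nbr order by e.hire_date) = 1
     then to_char(min(e.hire_date) over (partition by e.dept_nbr), 'MM/DD/YYYY') else null end as min_hire_date,
case when row_number() over (partition by e.dept_nbr order by e.hire_date) = 1
     then to_char(max(e.hire_date) over (partition by e.dept_nbr), 'MM/DD/YYYY') else null end as max_hire_date,
e.hire_date
from emp_master e
order by e.dept_nbr, e.hire_date;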

Posted in Oracle, SQL Server | Leave a Comment »

Converting Physical Servers to Virtual Machines

Posted by decipherinfosys on July 24, 2008

In almost any virtualization project, you will run into the need to convert physical machines to virtual machines. Both VMware and Microsoft provide solutions for that. Here is the link that gives the details of the VMware Converter, here is the link for the Microsoft Virtual Server Migration Toolkit, and another one that details VMM (Virtual Machine Manager) over here.

We have used the VMware one quite a few times and it is pretty easy to use and very robust as well. It does not require any downtime either. We haven't had any experience with the MSFT toolkit yet, so we will post on it once we play with it in our labs.

Posted in Virtual Server, VMWare | Leave a Comment »

Transparent Data Encryption in SQL Server 2008

Posted by decipherinfosys on July 24, 2008

Transparent Data Encryption (TDE) is another new feature in SQL Server 2008 which implements database-level encryption. SQL Server 2005 already had column-level encryption; with TDE, we can now prevent scenarios where the backups or the physical media (e.g. a tape drive) containing sensitive data are stolen and someone reads the data by restoring the backups or re-attaching the database files. This encryption is done at the page level: data is decrypted when it is read off the disk and encrypted when it is written back to disk. The encryption is done using a database encryption key (DEK) which is protected by a certificate stored in the master database of the instance.

One thing to note (and something that we had not realized when initially working with TDE) is that if any user database on an instance uses TDE, then the tempdb system database is also encrypted. We have not done benchmarks yet to measure the performance penalty of this, but wanted to point it out so that you are aware of it. The column-level encryption introduced in SQL Server 2005 offers more granularity, but TDE can be useful as well: besides protecting your backups, it leaves searches and range scans unaffected since the data is decrypted transparently when read into memory.

Let’s create a new database and then a table in it that has credit card information, SSN information etc.:

create database decipher_test
go
use decipher_test
go
create table dbo.test_encryption
(
first_name nvarchar(10),
last_name nvarchar(10),
ssn nvarchar(10),
credit_card_nbr nvarchar(16)
);

set nocount on
go
insert into dbo.test_encryption values ('Joe', 'Snyder', '234423456', '1234567890123456');

Now, let’s take a backup of this database. After taking the back-up, if we open it up using Notepad, we can easily read this data – see the image below:
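
For reference, the backup itself is just a plain native backup along these lines (the disk path and file name here are ours for illustration):

BACKUP DATABASE decipher_test TO DISK = 'E:\decipher_test.bak';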

Please do note that we searched for "J o e" since the data is Unicode. The whole point of this exercise was to show that native backups are readable. One can also easily restore this database on another instance and take all of the SSN and credit card related information. There are third party tools out there that allow encryption of the backups as well, but that is a discussion for another post. When this functionality is already built into SQL Server 2008, why should we invest in another third party tool?

So, how do we go about using TDE? We first need to create a database master key:

USE MASTER
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'supersecret!Nahhhh';
GO

Next, we will create a certificate to protect the DEK:

CREATE CERTIFICATE DCTEST WITH SUBJECT = 'DEK DCTEST CERTIFICATE';
GO

And now, we can set the encryption for the DECIPHER_TEST database by creating the DEK:

USE DECIPHER_TEST
GO
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_128
ENCRYPTION BY SERVER CERTIFICATE DCTEST;
GO

We used 128 bit AES as our encryption algorithm but you can choose from others as well. Those are listed in BOL and you can also access them in the online version over here.

The last step is to bring it on! (Sorry – turn it on!):

ALTER DATABASE DECIPHER_TEST SET ENCRYPTION ON;
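
Enabling TDE kicks off a background scan that encrypts the existing pages. A quick way to check the state of that scan (and to confirm the tempdb behavior mentioned earlier, since tempdb shows up in this DMV once any user database uses TDE) is a sketch like this:

SELECT DB_NAME(database_id) AS database_name,
       encryption_state,        -- 2 = encryption in progress, 3 = encrypted
       percent_complete,
       key_algorithm,
       key_length
FROM sys.dm_database_encryption_keys;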

And now, if you try to take a backup and open it up with Notepad, you will see that the data is not readable. This time, let us try to restore this encrypted database. If I try to restore it on another instance on which the certificate does not exist, it will fail with the error message:

“Cannot find server certificate with thumbprint…”

In order to be able to restore that backup on another instance, we will need to back up the certificate as well:

USE MASTER
GO
BACKUP CERTIFICATE DCTEST
TO FILE = 'E:\DCTEST.cer'
WITH PRIVATE KEY (FILE = 'E:\DCTEST_Key.pvk' ,
ENCRYPTION BY PASSWORD = 'supersecret!Nahhhh' )
GO

The next step is to copy the certificate to the other instance and create the master encryption key just like we did above (with a different password):

USE MASTER
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'gotit!Nahhhh';
GO

And now, we can simply import the certificate that we copied over:

USE MASTER
GO
CREATE CERTIFICATE DCTEST
FROM FILE = 'E:\DCTEST.cer'
WITH PRIVATE KEY (FILE = 'E:\DCTEST_Key.pvk' ,
DECRYPTION BY PASSWORD = 'supersecret!Nahhhh' );

Instead of "TO FILE", we used "FROM FILE", and instead of "ENCRYPTION", we used "DECRYPTION" in order for the import to go through. After the certificate has been created on the other instance, the restore will go through fine and we can look at the data the same way as on the source system.
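
For completeness, once the certificate exists on the target instance, the restore itself is a regular RESTORE DATABASE. Here is a minimal sketch, assuming a backup file and data file locations of our own choosing (the logical file names below are the defaults for a database created as above):

RESTORE DATABASE DECIPHER_TEST
FROM DISK = 'E:\decipher_test.bak'
WITH MOVE 'decipher_test' TO 'E:\Data\decipher_test.mdf',
     MOVE 'decipher_test_log' TO 'E:\Data\decipher_test_log.ldf';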

There is one more feature along the same lines that needs to be mentioned here. Besides using a certificate in the master database to protect the DEK, we can also use what is called EKM (Extensible Key Management). EKM allows us to store the keys used to encrypt the data separately from the data we are protecting. This is made possible by exposing the encryption functionality to hardware vendors that address EKM using Hardware Security Modules (HSM). More and more acronyms, eh! We will discuss EKM and HSM in a future post.

Posted in SQL Server | 1 Comment »

SSIS: Exporting data to a text file using a package

Posted by decipherinfosys on July 23, 2008

On this blog, we have covered how to export data out from the database and different ways to import data into the database tables from different sources. We have covered that for all three major RDBMS that we cover on this blog: Oracle, DB2 LUW and MS SQL Server. Sometimes these tasks are adhoc in nature and sometimes they need to be more repeatable. In this post, we will cover how to create a SSIS package for exporting data out to a text file in the case of SQL Server. If you are new to SSIS, see our basics sections here and here. The details of the Execute SQL task were covered in a series of posts that you can access here, here and here.

In order to make the process repeatable, one should create a SSIS package for the data exports that need to be done frequently. The task we are going to accomplish in this example is to send out the information pertaining to customers and the orders placed by those customers. For this purpose we are going to make use of the AdventureWorks database in SQL Server 2005. First we are going to write a query to retrieve the required information from the database. Typically, this query can also be replaced by a database view; that way, even if the view logic changes, the code in the SSIS package does not need to change. And if there is a need for procedural logic, you can use a stored procedure or a UDF. Here is the query which we are going to use to retrieve the required information and export it to a text file. Typically, there will be filter criteria and other conditions associated with such queries; the query in this post is just to illustrate the concept.

SELECT
    c.CustomerID, Title, FirstName, MiddleName, LastName, c.AccountNumber, c.CustomerType, pa.AddressLine1,
    City, StateProvinceCode, CountryRegionCode, PostalCode, SalesOrderNumber, so.AccountNumber as SalesAccountNo,
    OrderDate, DueDate, ShipDate, SubTotal, TaxAmt, Freight, TotalDue
FROM Sales.Customer c
    INNER JOIN Sales.CustomerAddress a ON a.CustomerID = c.CustomerID
    INNER JOIN Person.Contact pc ON pc.ContactID = c.CustomerID
    INNER JOIN Person.Address pa ON pa.AddressID = a.AddressID
    INNER JOIN Person.StateProvince sp ON sp.StateProvinceID = pa.StateProvinceID
    INNER JOIN Sales.SalesOrderHeader so ON so.CustomerID = c.CustomerID
ORDER BY c.CustomerID
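
As mentioned above, this query could be wrapped in a view so that the package stays insulated from changes to the join logic. A minimal sketch follows (the view name vw_CustomerOrders is our own and not part of AdventureWorks; the ORDER BY stays in the package query since a view cannot carry one):

CREATE VIEW dbo.vw_CustomerOrders
AS
SELECT
    c.CustomerID, Title, FirstName, MiddleName, LastName, c.AccountNumber, c.CustomerType, pa.AddressLine1,
    City, StateProvinceCode, CountryRegionCode, PostalCode, SalesOrderNumber, so.AccountNumber as SalesAccountNo,
    OrderDate, DueDate, ShipDate, SubTotal, TaxAmt, Freight, TotalDue
FROM Sales.Customer c
    INNER JOIN Sales.CustomerAddress a ON a.CustomerID = c.CustomerID
    INNER JOIN Person.Contact pc ON pc.ContactID = c.CustomerID
    INNER JOIN Person.Address pa ON pa.AddressID = a.AddressID
    INNER JOIN Person.StateProvince sp ON sp.StateProvinceID = pa.StateProvinceID
    INNER JOIN Sales.SalesOrderHeader so ON so.CustomerID = c.CustomerID
GO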

The ideal scenario would be to send the order and customer information for the orders placed on the current day. For this post's example, the entire data set is exported to a text file, and the text file is named 'CustomerOrders_<mmddyyyy_HHMMSS>'. Since we are dealing with the flow of data in this example, our package requires a Data Flow Task.

Creation of the SSIS Package:

Open up the SSIS designer by navigating to

Start => All Programs => Microsoft Sql Server 2005 => SQL Server Business Intelligence Development Studio.

This will open up the SSIS designer to create the SSIS package. Create the new project in the SSIS designer.

1. Creation of a new variable:

To create a new variable, right click on the control flow pane and click on the Variables option. This will show the Variables window on the left. Create a new variable 'FileLocation' at the package level. We are going to make use of this variable to specify the location of the destination text file; in our example, that location is C:\SSIS. The following image shows the creation of the variable.

2. Exporting the Data into the text file using a Data Flow Task:

Our next task is to define the data flow task which exports the data from the database into the text file pointed to by the location in the variable FileLocation. Drag and drop the Data Flow Task from the Control Flow Tool Box into the Control Flow Designer. Double clicking on it will automatically take us to the Data Flow tab of the SSIS Designer.

2.1 Configuring the Source in the Data Flow task:

On the Data Flow tab, drag and drop the “OLE DB Source” from the Data Flow Sources in the Data Flow Tool Box. Double click on the task which opens up the editor for the OLE DB Source. Create the new OLE DB source connection manager by clicking on “New” button on the OLE DB Source Editor. The creation of the connection managers has been dealt with in detail in the earlier posts. The data access mode option in the OLE DB source editor gives us various options of retrieving the data from the database. The image below shows the various options available to retrieve the data from the database.

The various options available are:

1. Table or View:

This option allows us to choose a database table from which we are going to export the data. This option is useful only when we are retrieving the data from a single table or view.

2. Table Name or View Name Variable:

This option enables us to pass the table or view name as a variable into the SSIS package. Again, this option is useful only when we are retrieving the data from a single table or view.

3. Sql Command:

This option allows us to specify the SQL Command (even a stored procedure) to retrieve the data from the database. This is the most viable option for us to use. Since we are retrieving the data from multiple tables inside the database, we are going to make use of this option.

4. Sql Command from Variable:

This option is very similar to the above one except that we can pass the sql command through a variable.

Choose the SQL Command option for the Data Access Mode in the OLE DB Source Editor. It will display the space to write the SQL command; copy the SQL command provided earlier in this post and paste it there. One can also use the Build Query option, which opens a GUI to define the tables to retrieve the data from. The image below shows the configured OLE DB Source transform in SSIS. Click OK to return to the data flow designer pane.

2.2 Configuring the Destination in the Data Flow task:

The next step in our task is to configure the destination for the data to be exported. On the Data Flow tab, drag and drop the "Flat File Destination" from the Data Flow Destinations in the Data Flow Tool Box. Drag the path (green arrow) from the OLE DB Source and join it to the Flat File Destination. Double click on the task, which opens up the editor for the Flat File Destination.

Click on New to create the new Flat File Destination connection manager. It will open up a dialog which asks us to select whether the destination data is delimited or of fixed field length. In our example, we will choose the Delimited option.

The image below shows the configuration of connection manager of Flat File Destination.

Click on OK and SSIS will open up the GUI for configuring the connection manager for the Flat File Destination. Rename the connection manager if required, and for the File Name option, click Browse and give the destination file name. For the purpose of our example, we are using the file C:\SSIS\orders.txt. The following image shows the configuration of the Flat File Connection Manager.

Click on Open to go back to the connection Manager.

We have an option to choose the delimiter for the header row on this General page. Choose the new line ({CR}{LF}) option for our example. Check 'Column names in the first data row' if the column names are required in the first row; in this example we are checking this option since we require the column names in the output flat file.

The following image shows the General Page of the Flat File Connection Manager.

Open up the Columns Page on the Left hand side. Here we can choose the row and the column delimiter for the text file. For our example, choose the New Line as the row delimiter and the Tab{T} as the column delimiter. Click OK to go back to the Flat File Destination.

The following image shows the Columns page of the Flat File Connection Manager.

Navigate to the Mappings page on the Flat File Destination Editor and map the source columns to the destination columns. The following image shows the Source and Destination Column Mappings.


The following image shows the Data Flow task after the configuration of the source and the destination.

The following image shows the entire package for exporting the data in the database to the text file.

Execute the package by hitting F5. The executables will turn green indicating the successful execution of the package. We can go to the location C:\SSIS and see that the flat file orders.txt is indeed created with the data retrieved from the database.

Now we have fulfilled part of our task of exporting the data from the database to the text file. But the issue here is that the name of the text file is static: every time we run this package, it will overwrite the destination text file, which is not what we want. Ideally, we would like to create the text file with the name CustomerOrders_<mmddyyyy_HHMMSS> or something similar, which means we have to create the file name dynamically.

To do this, let us go back to the control flow and add a Script Task to our package. The configuration of the Script Task has been discussed in the earlier posts. In the Script Task editor, on the Script page, specify the ReadWriteVariables as FileLocation. We are going to make use of this variable to generate the full file name of our destination file. Open up the script editor by clicking on the Design Script button on the Script Task Editor. This will open up Visual Studio, where we write the code to generate the file name. Copy and paste the following code to generate the new file name for our destination file based on the format described earlier:

Dts.Variables("User::FileLocation").Value = Dts.Variables("User::FileLocation").Value.ToString + "\CustomerOrders_" + CStr(Format(Now(), "MMddyyyy_hhmmss"))

This code appends a file name generated from the current date and time to the file location pointed to by our variable FileLocation. In this way we can also control the location in which the destination file is stored by just changing the variable value. Variables are a very powerful piece of the SSIS architecture; they are used to alter the properties of the package at run time.

Our next step is to tell the flat file connection manager to point to this file which we generated in the earlier step. As discussed in the earlier post, in order to dynamically alter the package at run time we have to make use of the expressions page of the SSIS executable.

In our example, we have to open up the Expressions editor for the Flat File Connection Manager. To do this, right click on the Flat File Connection Manager and click on the Properties option in the menu. This will open up the properties window for the connection manager. In the properties window, click on the ellipsis (…) next to Expressions. This will open the Expression editor for the connection manager. The following image shows this process.

Select the ConnectionString option in the Property column of the editor and click on the ellipsis for that property. This will open up the Expression Builder to edit the property value. In the Expression Builder, drag and drop the variable FileLocation. Since the FileLocation variable contains the entire file path and file name of the destination file, the connection manager will now point to that location and file. The image below shows the expression builder for the connection manager.

Click OK and go back to the Control Flow pane. This completes our SSIS package, which will export the data from the AdventureWorks database to a text file named according to our naming convention. Now execute the package by hitting F5, and we see that the flat file with the required name is generated and populated with the data from our AdventureWorks database. The following screen shot shows the successful execution of the SSIS package. Note that we changed the names of the executables to meaningful names rather than leaving them at the default names that SSIS gives.

In this post, we discussed the need for the data flow task and also looked at the configuration of the Flat File Destination and the OLE DB Source. We also discussed exporting data from the database to a text file while naming the text file dynamically based on the current date and time. In the upcoming posts, we are going to look at exporting data from an OLE DB source to other destinations like Excel files.

Posted in SQL Server | 44 Comments »

Code Names for SQL Server Releases

Posted by decipherinfosys on July 22, 2008

It was an interesting trip down memory lane last night when we had a company meeting and were discussing code names for one of our products. The code names for SQL Server releases came up and we thought we would post them on the blog to see how many people recognize them:

MSFT SQL Server Release       Project Code Name
SQL Server 2008               Katmai
SQL Server 2005               Yukon
SQL Server 2000 (64-bit)      Liberty
SQL Server 2000 (32-bit)      Shiloh
SQL Server 7.0                Sphinx
SQL Server 6.5                Hydra
SQL Server 6.0                SQL95

Posted in SQL Server | 4 Comments »