Systems Engineering and RDBMS

SSIS: Importing data from a text file using a package

Posted by decipherinfosys on July 28, 2008

Last week, we had covered how to create a SSIS package to export data to a text file from SQL Server. In today’s post, we are going to cover creating a SSIS package to import the data from a text file into the database schema. We have covered the export/import techniques from SQL Server, Oracle and DB2 LUW in several of our posts – this one concentrates on building up a SSIS package to achieve this task.

As described in the earlier posts, SSIS designer contains four design panes, the Control Flow pane, The Dataflow Pane, Event Handler and the Package Explorer. It has also been discussed that the Control Flow pane is used to incorporate the programming logic into the SSIS package while the Data Flow Pane is used to handle the dataflow inside the package. Since our current task is to import the data from a text file source into Sql Server, we are going to make use of Data Flow task in this current example.

For the purpose of our example, I am creating a text file, Applicants.txt which holds the name and address information of the Applicants of Credit Cards. The text file attachment can be found at the end of the post.

Our task is to create a new table in the database and load that data into that table. If the table already exists then our task is to truncate the table and if the table does not exist, we have to create the table. For this purpose we require one Execute Sql Task (configuration of this task is explained in the earlier posts) to create the table if it does not exist and to truncate the table if it exists. After successful completion of this task, we have to load the data into the table. To achieve this we require a data flow task since we are dealing with the data in our package.

The downside of the dataflow task is that we need to have the tables in the database which we use in this task. So meet this requirement, we initially create the table ‘Applications’ in the database AdventureWorks. Later on we are going to drop this table for our testing and do some work around to test our scenario to create the table if it does not exist and load the table with the data in the text file.

Execute the following script in the AdventureWorks database to create the table ‘Applications’

create table Applications
(
[S.No.] int identity(1,1),
ApplicationID uniqueidentifier default newid(),
FirstName varchar(100),
MiddleName char(2),
LastName varchar(100),
AddressLine1 varchar(100),
AddressLine2 varchar(100),
City varchar(100),
State varchar(100),
Country varchar(100)
)
GO

Creation of the SSIS Package:

Open up the SSIS designer by navigating to

Start –> All Programs –> Microsoft Sql Server 2005 –> SQL Server Business Intelligence Development Studio.

This will open up the SSIS designer to create the SSIS package. Create the new project in the SSIS designer.

1. Creation of a new variable:

To create a new variable right click on the control flow pane and click on the Variables option. This will show the Variables window on the left. Create a new variable ‘TableName’ at the package level. We are going to make use of this variable to create the new table and also to specify the table name to be imported, inside the data flow task. The image below shows the creation of the variable:

2. Creating the Table if it does not exist:

The next step is to configure the Execute Sql Task to create the table ‘Applications’ if it does not exist or else truncate the table if exists. To do this drag the Execute Sql Task from the Control Flow Tool box and drop it in the Control Flow design pane. The configuration of the Execute Sql task was discussed in the earlier posts. Configure the Execute Sql Task to point to the AdventureWorks database. This will create a new connection manager which is renamed to DecipherDestination in this package.

Since we are providing the name of the table through the variable ‘TableName’ and creating the table at run time, we cannot directly write the create table script in the SqlStatement editor of the Execute Sql Task. Instead we have to make use of this new feature of SSIS called ‘Expressions Page’. Every control flow task has the Expressions Page, which is used to dynamically alter the task properties at run time.

To make use of this page, open up the Execute Sql Task Editor and Navigate to the Expressions Page. Click on the little ‘+’ sign and open up the expressions editor. In the property value of the expression editor, select the property SqlStatementSource. This property is made use of to generate any Sql statement dynamically. Since we are creating the table at run time we are making use of this property. Open up the property editor by clicking on the ‘…’. Copy and paste the following Sql script in the Expression window.

“if not exists (select 1 from sys.objects where name = ‘Applications’ and type = ‘U’)

BEGIN

create table ” + @[User::TableName] +” ([S.No.] int identity(1,1), ApplicationID uniqueidentifier default newid(), FirstName varchar(100), MiddleName char(2), LastName varchar(100), AddressLine1 varchar(100), AddressLine2 varchar(100),City varchar(100), State varchar(100), Country varchar(100))

END

ELSE

BEGIN

TRUNCATE TABLE “+ @[User::TableName] +”

END”

The image below shows the navigation to the expression window on the “Execute SQL” Task:

Click OK and return to the SSIS Control Flow designer.

3. Importing the Data into the Table using Data Flow Task:

Our next task is to define the data flow task which imports the data from the text file into the physical table we already created. Drag and drop the Data Flow Task from the Control Flow Tool Box into the Control Flow Designer. Double clicking on it will automatically take us to the Data Flow Tab of the SSIS Designer.

3.1 Configuring the Source in the Data Flow task:

On the Data Flow tab, drag and drop the “Flat File Source” from the Data Flow Sources in the Data Flow Tool Box. Double click on the task which opens up the editor for the Flat File Source.

Create the new flat file source by clicking on “New” button on the Flat File Source Editor. The image below shows the configuration of the flat file source connection manager editor:

Specify the name for the Connection Manager and hit the “Browse” button to navigate to the file system location where the text file is stored and click OK. If the text file contains the column names in the first data row then check the box “Column Names in the first data row”. Since in the text file we are using has the column names in the first data row this option is being checked in this example. The image below shows the General Page of the Flat File Connection Manager Editor:

Next click on the Columns Page of the Editor. On this page the row and column delimiters in the text file should be specified. In our text file, the row delimiter is the new line and the column delimiter in the tab space. The image below shows the configuration of the columns page:

We can navigate to the preview page if we wish to preview the data in the flat file. The configuration of any Data Flow Source in the Data Flow design is almost the same. Click on OK to go back to the Data Flow Design Pane

3.2 Configuring the Destination in the Data Flow task:

The next step in our task is to configure the destination for the data to be imported. We can make use of the same connection manager what we created to create/truncate the table since we are importing the data into the same connection (Same Sever, Database).

On the Data Flow tab, drag and drop the “OLE DB Destination” from the Data Flow Sources in the Data Flow Tool Box. Drag the path (green arrow) from the Flat File source and join to the OLE DB Destination. Double click on the task which opens up the editor for the OLE DB Destination.

As discussed, we can make use of the same connection manager which we created earlier for the Execute Sql Task. If we wish to make a different connection we can also create a new connection manager, a process similar to creation of connection manger in the Execute Sql Task. Since we are making use of the variable to store the table name, specify the data access mode to be “Table Name of View Name variable”. And in the Variable property, specify the variable name with holds the destination table name in our example it is ‘TableName’ variable. The image below shows the configuration of the OLEDB Destination:

Navigate to the Mappings page on the OLE DB Destination Editor and map the source columns to the destination columns. If the column names of the source and destination match, by default SQL Server maps them or else we have to map them manually. Since the first two columns are auto generated and the remaining column names matched in our example SQL Server automatically mapped the source and destination columns. The image below shows the Source and Destination Column Mappings:

The image below shows the Data Flow task after the configuration of the source and the destination:

The image below shows the entire package for importing the text file into a physical table inside the database:

Execute the package by hitting F5. The executables will turn green indicating the successful execution of the package. We can go to the database and select the records from the Applications table to verify that the data is imported indeed. What if the table did not exist when we ran this package? Drop the Applications table from the AdventureWorks database and run the SSIS package again. This time is throws an error as shown in the below screen shot stating that the “Table Does Not Exist’.

Even though our package instructs the SSIS Runtime engine to create the table if it does not exist, the package still throws the error back stating that the table does not exist. This happens because the validations are made before the package is run. Since the table does not exist at the compile time the SSIS Engine throws this error.

One way to get around this problem is to delay the validation of the package. That means we are not validating the package until the run time. Each SSIS task (including the SSIS package) has its own properties. The properties window of a particular task can be viewed by right clicking on the task and click on the ‘Properties’ option in the menu. This will display the properties of the executable. So in our example bring up the properties window of the Data Flow Task. In the properties window, and change the DelayValidation property from False to True. The image below shows how to do this:

This will delay the validation of the Data Flow task until run time. This means that even though the table does not exist at compile time the SSIS Engine will not through the error since the package in now validated at compile time. Now execute the package by hitting F5. Now we see that the table is created and the data is populated in the Application table of AdventureWorks database. The image below shows the successful execution of the SSIS package. Note that we changed the name of the Executables to some meaningful names rather than leaving them at default name what SSIS gives.

That’s it folks! In this post we discussed the usage of the data flow task and also looked at the configuration of the Flat File Source and OLE DB destinations. We then discussed the import of the data in the text file into the physical table inside the database by creating the table if it did not exist.

About these ads

24 Responses to “SSIS: Importing data from a text file using a package”

  1. It’s indeed a great post …..
    From the past 2 days, I’m trying to dynamically creating a temporary table and then after the data transferring the data, I wanted to drop that temp table…

    This post hepled me in doing my job with few minutes…

    Thanx again … Cheers… Keep it up…. :)

  2. [...] From Development to Staging/QA. In this post we will be making use of one of the old packages, “Importing data from a text file using a package” which we created in one of the previous [...]

  3. Kishan said

    Thanks dude. nice post.

  4. Anonymous said

    Very good Step By Step Procedure. Thank you

  5. Laxmikant Sharma said

    Excellent post. Really very helpful for beginners. Thank you so much.

  6. anu said

    great post..i would like to know how to overwrite the table each time the package is run…

    im having a job schedual 2 run the package where it grabs data from txt file 2 the db everyday

    say in todays file i have
    book name price
    sql book $4.00
    vb book $5.00

    tomorows file i have
    book name price
    sql book $4.10
    vb book $4.50

    thus when the package has been run for the second time,tomorow
    i want the db to have
    sql book $4.10
    vb book $4.50

    but now i have
    sql book $4.00
    vb book $5.00
    sql book $4.10
    vb book $4.50

    pls help..how do i archive dis…how do i overwrite a column by checking on another main column,can i have a update query anyway in ssis?can this be done?
    any help is greatly appreciated.
    urgently need help..
    thanx in advance

    anu

  7. Sameer said

    Hi Anu,

    Guess you would have got an answer by now. But the simple way is to add Execute Sql Task to your package;

    SQLStatement: delete from TABLE_NAME

    So that every time you run the package it truncates the table before importing the data to db table.

  8. Andy said

    Deleting all the content of table and adding again is a good work around.
    But is it possible to update the item in database instead of deleting and then edit.
    I want this beacuse after the 1st extract, I am storing some information wrt each entry.
    So I cannot delete the item now. So I have to only update it and not delete and add it.
    Can any1 please have some idea’s how do we do this.

  9. Sri said

    This has been very helpful, Thank you

  10. Abhishek M said

    Great one..

    I am having a problem while mapping to the OLE DB Destination. After mapping, I am getting the error as “Columns and cannot convert between unicode and non-unicode string data types.”.
    Pls Help.

    Thanks..

  11. Ron said

    I would like to see an example of stored procedure with a date Parameter specified & the output of the stored procedure exported to a excel CSV file? Can you plese give some guidance on this?

    I tried with Data Flow task & I was stuggling to incorporate date parameters to the ‘Exec MyStoreProcname @MyDateParam1,@MyDateParam2′ within its properties windows
    Thanks for any help.

  12. neo said

    It’s a nice tutorial, but I can’t find the Applicants.txt file. Does anyone have it?

  13. daisy bopanna said

    thanks

  14. Arvind said

    Hi,

    I need to move data from a Excel File to the Staging Layer to the Integration layer. My question is should I have 2 separate tasks one to move data from Source to Staging (SRCtoSTG.dtsx) and the second to move data from Staging to Integration layer (STGtoINTG.dtsx) OR should I have one task to move data from Source to Staging to Integration layers (SRCtoINTG.dtsx).

    Which would be the best Approach?

    Regards,
    Arvind

  15. Alka Garha said

    Really Nice help me alot..

  16. Venkatesh said

    Good Article

  17. SQL Lion said

    To get the workaround and Step by Step description for developing SSIS package in order to overcome the issue with SSIS while importing text files with Flat File Connection Manager and Flat File Source where the “Row Delimiter” property does not work properly for rows having NULL or empty values, follow the below link:

    http://www.sqllion.com/2010/04/ssis-vs-text-file-importing-1/
    Thanks,
    SQL Lion

  18. Gourav said

    Its a nice post for beginners .What i wanna know is how to do the same from two files to two tables
    thanks

  19. Oleg said

    I only got this to work when BypassPrepare is set to false on the execute task

  20. CurlyNP said

    This is very much helpful.

    How to handle errors if the package fails? I want to send Email to concern people that the load is failed, how can I do it?

    Also , I have to translate the data from the text file beofre it is loaded. E.g I have an incoming data like 5A456 means it is 50456, how can I do that?

  21. Anonymous said

    Awesome post!.. keep it up dude ! :)

  22. Anonymous said

    A very explanatory post.I am new to SSIS and have a question regarding this.
    Is it necessary for my text file to have column name on it or have a delimiter.Suppose I have a text file with no column delimiter but a document defines that first 10 characters are id ,next 5 are name etc etc…..
    How do I import such a text file into my database or is it always necessary for the text file to have a column delimiter or a column name?

  23. HR Dept. said

    How do I append a date time column dynamcally while exporting the text file into table ? note that date time column is not in the file.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 78 other followers

%d bloggers like this: