Systems Engineering and RDBMS

Reading Text Files using Linked Servers

Posted by decipherinfosys on July 17, 2007

There are multiple ways in SQL Server in which you are interact with text files. One of the common methods is to load up a text file into an interface table using bulk insert or bcp or DTS/SSIS and then operate upon that data set using T-SQL. Another method is to read the text files using linked servers.  This requires the text file to be present either on the database server or a mapped drive that can be used locally on the DB server.  Ideally, we prefer the first route (the SSIS or bcp or bulk insert) since using it, one can get the data into the interface tables faster and then do processing on that data using the T-SQL language.  It also provides the benefits of looking at the failed records later on, doing reports on those, correcting that data and then loading it up into the transactional system.  Moreover, SSIS/bcp/bulk insert are very fast operations with the bulk insert option being the fastest since it is a minimally logged operation.  However, at one of our client sites, the creation of an interface set of tables was not an option and they wanted to automate this task where the file is delivered to a certain folder on the database server on a nightly basis and then processing needs to happen against that file.  We played around with a couple of options and since the file-sizes were that big, we went with the linked server option.

In this blog, we will see how can, we read the data from text file using linked server. Save the following text under C:\ in city.txt file on database server.

Atlanta,GA,USA
Knoxville,TN,USA
MArietta,GA,USA
Edmonton,AB,CANADA
Tempa,FL,USA
Huntsville,AL,USA
Vancouver,BC,CANADA

You can refer to BOL or the previous blog post for complete syntax of all the commands used here. We will just go through steps to create linked server and access the text file using linked server.

Step 1:
Very first step in this case is to create a linked server using sp_addlinkedserver stored procedure. In order to access text file, we need to use Microsoft.Jet.OLEDB.4.0 provider.

EXEC sp_addlinkedserver TEXT_LS,            /** Local Server Name **/
‘Jet 4.0’,          /** server product **/
‘Microsoft.Jet.OLEDB.4.0’,    /** Provider **/
‘C:\’,      /** data source only path name **/
NULL,       /** location **/
‘Text’       /** provider string **/
GO

Step 2:
Create login mapping between logins on the local instance and remote login on the linked server.

EXEC sp_addlinkedsrvlogin
TEXT_LS, /** remote server name **/
false,   /** True or False **/
NULL,    /** local login Name **/
NULL,    /** Remote userName **/
NULL     /** Remote password **/
GO

•    First parameter is Linked server name to which login mapping applies.
•    Second Parameter indicates whether we want to use local login to connect to linked sever or not. We are setting this value false, because we don’t want to use local login.
•    Third parameter is null, it means all local logins can connect to remote server.
•    Last two parameters are remote user name and remote password to connect to remote server. We want to connect to text linked server, so we are keeping these values null.

Execute following procedure to verify whether we see our tables (city.txt) file or not.

EXEC sp_tables_ex TEXT_LS
GO

It will display all the files with .txt extension which resides in data source path name i.e. C:\

TABLE_CAT  TABLE_SCHEM   TABLE_NAME          TABLE_TYPE REMARKS
———- ————- ——————- ———- ——–
NULL       NULL          city#txt            TABLE      NULL
NULL       NULL          flashback_query#txt TABLE      NULL
NULL       NULL          test#txt            TABLE      NULL
NULL       NULL          test_case_srl#txt   TABLE      NULL

Similarly there is another stored procedure sp_columns_ex to verify column information for a specific linked server tables. You can check BOL for explanation and syntax of these procedures.
Now let us issue the query to get data from flat file. There are different ways to execute the query but most commonly approaches are

Using four part naming convention (linked_server_name.catalog.schema.object_name). As we are accessing text file, we don’t have catalog and schema and that is the reason of “…” in the following query.

SELECT * FROM TEXT_LS…city#txt

Or by using OPENQUERY function as shown below.

SELECT * FROM OPENQUERY(TEXT_LS,’SELECT * FROM CITY#TXT’)

Both of these queries will return data of the text file. Practically for all purposes flat file is now treated as table and we can perform any sorting, grouping operations. If file is a fixed format text file than we can use string functions like substring and others to extract out the column values.  One thing to keep in mind that, if first row is not blank or does not contain any column heading, values in the first row will be treated as column heading. To avoid this, make sure that first row is empty.

Using this approach, the nightly run was successfully implemented.  The errors were still logged into the logging tables and reporting was done on those.  However, the bulk insert + interface tables approach is the truly scalable way to build this interface.

Sorry, the comment form is closed at this time.

 
%d bloggers like this: