Systems Engineering and RDBMS

Extracting large XML data using bcp

Posted by decipherinfosys on October 4, 2009

Ran into an issue today at the client site.  There was a stored procedure in the outbound interface of the client application which was responsible for collecting event based data, convert it into an XML and then the SSIS package would take that XML document, encrypt it using pgp encryption and upload it to the client’s partner companies.  Everything was working like a charm for the past couple of months till last night, out of memory exceptions started coming.

The reason was that the XML that was generated was close to 200MB that night – this was because of a large amount of event activity in the system which resulted into a generation of a very large XML.  When the SSIS package was trying to get that data out to encrypt it, it started running into the out of memory exceptions.  The fix that we made was pretty straight forward.  We used bcp to extract out the large XML document.  Here are the steps:

1) First, generate the format file.  A very simple way of doing this is to let the bcp command do it for you.  Here is a sample example:

bcp deciphertest.dbo.large_xml_tbl format nul -T -c -x -f “D:\format_files\large_xml_tbl.xml”

This command takes the table large_xml_tbl from the deciphertest database andd dumps the format file “large_xml_tbl.xml” on the D:\format_files folder.  This is just to illustrate the usage.  In real life, one would be using a query to extract the XML document out rather than the table itself.

2) Once the format file has been generated, you can then use the bcp command to extract out the large XML:

bcp deciphertest.dbo.large_xml_tbl out “D:\XML_Data\large_xml_tbl_1004090815.xml” -SDISprod -T -f “D:\format_files\Test.xml”

This took about 4-5 seconds for the 200MB file.  The encryption was done then and file was uploaded using the regular process.

Resources:

  • BCP XML format files post on mssqltips – here.
  • BCP documentation on MSDN – here.

7 Responses to “Extracting large XML data using bcp”

  1. Polprav said

    Hello from Russia!
    Can I quote a post in your blog with the link to you?

  2. anon said

    you may also want to check out vtd-xml, the latest and most advanced xml processing model

    vtd-xml

  3. Gergely said

    Is it possible to write back the exported xml file into the MSSQL table after changing some values in it?

    • Yes, you can. Depending upon your SQL Server version, there are quite a few options available. Seems like you want to make the changes in the exported file and then do an update? Since you want to update the record with a modified XML, you can bulkload into a temp table and then do an update or you can use one of the XML methods available in SQL Server 2005 and above and do an update that way by passing it in as a parameter. In SQL Server 2000, you can use OpenXML.

      Another option is to just fire off an update statement and update the specific attributes in XML data type column in the DB itself rather than exporting it, making the changes and then loading it back in. If you know the updated attributes, this is the best way. Here is a post that explains the usage of modify() method:
      https://decipherinfosys.wordpress.com/2009/04/14/modifying-xml-data-in-sql-server/

      • Gergely Gazda said

        Hey, thanks for the detailed answer.

        I have 2008 Server.

        What I actually already did is like:
        bcp dbo.Customers “SELECT * FOR XML ROOT(‘root’) PATH(”)” queryout “D:\customers.xml” -T

        And result:
        ‹root›‹Id›1223‹/Id›‹name›Smith, John‹/name›…‹/root›

        Thats half of my process. But what I was curious if there is any very simple way to make it reverse (like a one line bcp command) and importing this xml back to MSSQL. Lets imagine the table is originally empty.

      • There are quite a few ways including passing the XML in and using a simple INSERT INTO … SELECT …. FROM @XML.nodes() SQL statement – you can search the blog for nodes and you will see quite a few examples. Using XML methods gives you a lot of flexibility to play with the data.

        Other ways of bulk importing are also there – take a look at this MSDN article: http://msdn.microsoft.com/en-us/library/ms191184.aspx

        Hth

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: