Systems Engineering and RDBMS

Archive for June 2nd, 2008

Some more XML enhancements in SQL Server 2008

Posted by decipherinfosys on June 2, 2008

We had covered one of the XML enhancements in SQL Server 2008 in a blog post before. And in case you missed it, you can download all our blog posts on SQL Server 2008 (till 04/30/08 ) from here. The other XML enhancements that we have played with so far are enhancements to XQuery as well as some improved schema validations. In this post, we will cover the improvements in the schema validations.

SQL Server 2005 had introduced the XML Schema Collection which were used for validating the XML data by enforcing compliance with one or more XSD schemas. If look at the blog post mentioned above for XML Schema collection, in SQL Server 2005 itself we were able to do a lot of XML Schema validations. In SQL Server 2008, that support has been extended to add:

a) Support for datetime, date and time validation including timezone information,

b) Better support for union and list types,

c) Support for lax validation.

For (a), date and time data is expressed in this format: 2008-05-31T10:20:00:000Z which stands for 31st of May 2008 at 10:20 in the morning in the UTC (as indicated by Z). If we want to represent say the US EST time, then it would be represented as: 2008-05-31T10:20:00:000-5:00 since EST is 5 hours behind UTC. Also, unlike SQL Server 2005 in which one had to provide timezone information and it always used to normalize it to UTC (so, our EST time: 2008-05-31T10:20:00:000-5:00 would be normalized in SQL Server 2005 to be 2008-05-31T13:20:00:000Z), in SQL Server 2008 one can omit the timezone information and if you do provide it, it is preserved.

For (b), if you recall in SQL Server 2005, one can use the XML schemas to define the data types for your XML data which allows you to assign a set of values to attributes or multi-valued elements. As an example if you are creating a XML schema for an apparel industry database, you can define a type (say availablesizeListType) that restricts the possible values to: XS, S, M, L, XL, XXL. And this can then be used with an element like say PossibleSizeTypes. So, in SQL Server 2005, one would create a schema definition like:

<xs:sizeType name=”availablesizeListType”>
<xs:list>
<xs:sizeType>
<xs:restriction base=”xs:string”>
<xs:enumeration value=”XS”/>
<xs:enumeration value=”S”/>
<xs:enumeration value=”M”/>
<xs:enumeration value=”L”/>
<xs:enumeration value=”XL”/>
<xs:enumeration value=”XXL”/>
</xs:restriction>
</xs:sizeType>
</xs:list>
</xs:sizeType>

And then, using this schema definition, we can create an element that lists all the different sizes:

<PossibleSizeTypes>XS S M L XL XXL</PossibleSizeTypes>

This works perfectly fine but if you are doing this for a business (say a Wal Mart or a Costco etc.) which sells clothes as well as say jewellery or exercise equipment or other items which do not denote the sizes the same way as apparels, one would need to create multiple list types and multiple schema definitions in SQL Server 2005. In SQL Server 2008, one can use union types that contain list types which can be used to merge multiple list type definitions and restrictions into a single type. Here is an example:

CREATE XML SCHEMA COLLECTION ProductSizeSchema AS
N'<?xml version=”1.0″ encoding=”UTF-16″?>
<xs:schema xmlns:xs=”http://www.w3.org/2001/XMLSchema”&gt;
<xs:sizeType name=”availablesizeListType”>
<xs:union>
<xs:sizeType>
<xs:list>
<xs:sizeType>
<xs:restriction base=”xs:integer”>
<xs:enumeration value=”2″/>
<xs:enumeration value=”4″/>
<xs:enumeration value=”6″/>
<xs:enumeration value=”8″/>
<xs:enumeration value=”10″/>
<xs:enumeration value=”12″/>
</xs:restriction>
</xs:sizeType>
</xs:list>
</xs:sizeType>
<xs:sizeType>
<xs:list>
<xs:sizeType>
<xs:restriction base=”xs:string”>
<xs:enumeration value=”XS”/>
<xs:enumeration value=”S”/>
<xs:enumeration value=”M”/>
<xs:enumeration value=”L”/>
<xs:enumeration value=”XL”/>
<xs:enumeration value=”XXL”/>
</xs:restriction>
</xs:sizeType>
</xs:list>
</xs:sizeType>
</xs:union>
</xs:sizeType>
</xs:schema>’

It declares a type availablesizeListType in the schema definition ProductSizeSchema which has a union type that contains the list types – one based off numbers and the other one based off the apparel sizes. This can then be used in the XML with any elements that are based off the availablesizeListType type to contain any kind of list, example:

<Inventory>
<Item>
<ItemName>Spare Part</ItemName>
<PossibleSizeTypes>4 6 8</PossibleSizeTypes>
</Item>
<Item>
<ItemName>Teenager Shirt</ItemName>
<PossibleSizeTypes>XS S M L XL XXL</PossibleSizeTypes>
</Item>
</Inventory>

For (c), in SQL Server 2008, XML Schemas support wildcard sections in XML documents using the any, anyType and anyAttribute declarations. Lax validation is used to validate elements the contents of elements of type anyType. You can read more on this in the online BOL here.

Posted in SQL Server | 7 Comments »

 
Follow

Get every new post delivered to your Inbox.

Join 77 other followers