In this post you will see the techniques to generate XML data from relational tables. You will be exploring FOR XML clause offered by SQL server. We are not going to concentrate on understanding the syntax of FOR XML rather we will see different modes offered and their usages with practical examples.
Prerequisites:
- Knowledge on XML, XPATH
- Knowledge on SQL Basics
Topic Coverage Level:
Where the below concept can be used:
- To extract relational data in the
form on XML trees
The
concepts describing below thus have a single goal: CONVERTING COLUMNS AND ROWS THAT MAKE UP SETS DERIVED FROM ANY SELECT
STATEMENT IN TO XML TREES.
SQL server
provides statements which help us to fetch data in xml format from relational tables.
On querying a table using select statement without XML techniques would result
the structure data in sets, we can convert the same sets data into xml trees using the below statements
provided by SQL server.
SQL server offers FOR XML clause and you
specify one of these modes:
Let’s see what msdn explains about these modes
when used with FOR XML clause. Please do not panic if you are not able to infer
what actually the below statements explain as we will see all these modes with
practical examples later in this post.
The RAW mode generates a single <row>
element per row in the row set that is returned by the SELECT statement. You
can generate XML hierarchy by writing nested FOR XML queries.
The AUTO mode generates nesting in the
resulting XML by using heuristics based on the way the SELECT statement is
specified. You have minimal control over the shape of the XML generated. The
nested FOR XML queries can be written to generate XML hierarchy beyond the XML
shape that is generated by AUTO mode heuristics.
The EXPLICIT mode allows more control over
the shape of the XML. You can mix attributes and elements at will in deciding
the shape of the XML. It requires a specific format for the resulting rowset
that is generated because of query execution. This rowset format is then mapped
into XML shape. The power of EXPLICIT mode is to mix attributes and elements at
will, create wrappers and nested complex properties, create space-separated
values (for example, Gender attribute may have a list of values like MALE,FEMALE,UNKNOWN),
and mixed contents.
The PATH mode together with the nested FOR
XML query capability provides the flexibility of the EXPLICIT mode in a simpler
manner.
Let’s
start making our hands dirty with playing around by queries. I am running all
my quires against Adventure Works database for all examples below.
Using
RAW Mode with FOR XML
FOR XML RAW
mode transforms each row in the query result set into an XML element that has
the generic identifier <row>, or the optionally provided element name. By
default, each column value in the row set that is not NULL is mapped to an
attribute of the <row> element.
In the
below screen shot we can see the select query that fetches top 5 records from
the Production.Product table. All the columns are holding the values but Size
column has NULL for all the rows( Why are we highlighting NULL here? we will know that shortly)
Let’s
generate the XML tree for the same query.
Query:
SELECT TOP
5 ProductID,Name,ProductNumber,ListPrice,Size,SafetyStockLevel
FROM Production.Product
FOR XML RAW
Ouput:
<row ProductID="1" Name="Adjustable Race" ProductNumber="AR-5381" ListPrice="0.0000" SafetyStockLevel="1000" />
<row ProductID="2" Name="Bearing Ball" ProductNumber="BA-8327" ListPrice="0.0000" SafetyStockLevel="1000" />
<row ProductID="3" Name="BB Ball Bearing" ProductNumber="BE-2349" ListPrice="0.0000" SafetyStockLevel="800" />
<row ProductID="4" Name="Headset Ball Bearings" ProductNumber="BE-2908" ListPrice="0.0000" SafetyStockLevel="800" />
<row ProductID="316" Name="Blade" ProductNumber="BL-2036" ListPrice="0.0000" SafetyStockLevel="800" />
From the above we can see that 5 records are
extracted into 5 xml elements named as row with
columns mapped as attributes. If we want the columns to be represented as xml
elements rather attributes we can add ELEMENTS directive to the FOR XML clause,
each column value will be mapped to a sub element of the <row> element.
Query:
SELECT TOP
5 ProductID,Name,ProductNumber,ListPrice,Size,SafetyStockLevel
FROM Production.Product
FOR XML RAW, ELEMENTS
Ouput:
<row>
<ProductID>1</ProductID>
<Name>Adjustable Race</Name>
<ProductNumber>AR-5381</ProductNumber>
<ListPrice>0.0000</ListPrice>
<SafetyStockLevel>1000</SafetyStockLevel>
</row>
<row>
<ProductID>2</ProductID>
<Name>Bearing Ball</Name>
<ProductNumber>BA-8327</ProductNumber>
<ListPrice>0.0000</ListPrice>
<SafetyStockLevel>1000</SafetyStockLevel>
</row>
<row>
<ProductID>3</ProductID>
<Name>BB Ball Bearing</Name>
<ProductNumber>BE-2349</ProductNumber>
<ListPrice>0.0000</ListPrice>
<SafetyStockLevel>800</SafetyStockLevel>
</row>
<row>
<ProductID>4</ProductID>
<Name>Headset Ball Bearings</Name>
<ProductNumber>BE-2908</ProductNumber>
<ListPrice>0.0000</ListPrice>
<SafetyStockLevel>800</SafetyStockLevel>
</row>
<row>
<ProductID>316</ProductID>
<Name>Blade</Name>
<ProductNumber>BL-2036</ProductNumber>
<ListPrice>0.0000</ListPrice>
<SafetyStockLevel>800</SafetyStockLevel>
</row>
We can see
that column Size is missing in the above xml
output since the value of this column is null for the above records hence they
do not appear in the xml output until unless we explicitly ask for it (this is the reason we explicitly highlighted the NULL earlier).
You can
optionally specify the XSINIL option to map NULL column values in the result
set to an element that has the attribute, xsi:nil="true".
Query:
SELECT TOP
5 ProductID,Name,ProductNumber,ListPrice,Size,SafetyStockLevel
FROM Production.Product
FOR XML RAW, ELEMENTS XSINIL
XML Output:
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>1</ProductID>
<Name>Adjustable Race</Name>
<ProductNumber>AR-5381</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>1000</SafetyStockLevel>
</row>
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>2</ProductID>
<Name>Bearing Ball</Name>
<ProductNumber>BA-8327</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>1000</SafetyStockLevel>
</row>
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>3</ProductID>
<Name>BB Ball Bearing</Name>
<ProductNumber>BE-2349</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>800</SafetyStockLevel>
</row>
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>4</ProductID>
<Name>Headset Ball Bearings</Name>
<ProductNumber>BE-2908</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>800</SafetyStockLevel>
</row>
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>316</ProductID>
<Name>Blade</Name>
<ProductNumber>BL-2036</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>800</SafetyStockLevel>
</row>
Now in the output we can see size columns
with NULL are appeared as elements with attribute xsi:nil=”true”.
Now we have a problem with the XML we have generated so far as there
is no single root for the output delivered and also the node element with name row do not really looks good.
Can we make the output more meaningful? The answer is YES and that is what we will discuss below.
Let’s
say we need to capture the output with ROOT ELEMENT <Products> and every record
must have a element name as <Product> instead of <row>, the same can be
achieved with the below query
Query:
SELECT TOP
2 ProductID,Name,ProductNumber,ListPrice,Size,SafetyStockLevel
FROM Production.Product
FOR XML RAW('Product'), ELEMENTS XSINIL, ROOT('Products')
Output:
<Products xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Product>
<ProductID>1</ProductID>
<Name>Adjustable Race</Name>
<ProductNumber>AR-5381</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>1000</SafetyStockLevel>
</Product>
<Product>
<ProductID>2</ProductID>
<Name>Bearing Ball</Name>
<ProductNumber>BA-8327</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>1000</SafetyStockLevel>
</Product>
</Products>
Till now we are generating only xml data
but some applications might need the XSD (XML Schema Definition) or XDR (XML-Data
Reduced Schema) to perform structural validation. Can we generate XSD or XDR along with xml instance data? The
answer for this question is keywords XMLSCHEMA or XMLDATA.
Let’s generate the
XSD for the above xml instance data.
Query:
SELECT TOP
2 ProductID,Name,ProductNumber,ListPrice,Size,SafetyStockLevel
FROM Production.Product
FOR XML RAW('Product'), ELEMENTS XSINIL, ROOT('Products'), XMLSCHEMA
XML Output with XSD:
<Products xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsd:schema targetNamespace="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified">
<xsd:import
namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" />
<xsd:element
name="Product">
<xsd:complexType>
<xsd:sequence>
<xsd:element
name="ProductID" type="sqltypes:int" nillable="1" />
<xsd:element
name="Name" nillable="1">
<xsd:simpleType
sqltypes:sqlTypeAlias="[AdventureWorks].[dbo].[Name]">
<xsd:restriction
base="sqltypes:nvarchar" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase
IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
<xsd:maxLength
value="50" />
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="ProductNumber" nillable="1">
<xsd:simpleType>
<xsd:restriction
base="sqltypes:nvarchar" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase
IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
<xsd:maxLength
value="25" />
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="ListPrice" type="sqltypes:money" nillable="1" />
<xsd:element
name="Size" nillable="1">
<xsd:simpleType>
<xsd:restriction
base="sqltypes:nvarchar" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase
IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
<xsd:maxLength
value="5" />
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="SafetyStockLevel" type="sqltypes:smallint" nillable="1" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
<Product xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
<ProductID>1</ProductID>
<Name>Adjustable Race</Name>
<ProductNumber>AR-5381</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>1000</SafetyStockLevel>
</Product>
<Product xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
<ProductID>2</ProductID>
<Name>Bearing Ball</Name>
<ProductNumber>BA-8327</ProductNumber>
<ListPrice>0.0000</ListPrice>
<Size xsi:nil="true" />
<SafetyStockLevel>1000</SafetyStockLevel>
</Product>
</Products>
We can
see that above output
has XSD embedded into XML output (node xsd:schema is within
Products). When fetched the XSD along with XML you can see that <Product> is now qualified with xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1".
Try playing around the XMLDATA keyword as we are not discussing that here :-)
Oooph.. till now we have seen only RAW
mode.
STAY TUNED FOR OTHER modes.. Will be posted
shortly...
Signing Off
Bala Krishna Ragala