Querying XML Data with XQuery

Let’s face it, one of the primary tasks we, as Web developers, are faced with is querying data from some data store and allowing users to view and/or manipulate the information via a Web interface.

Typically, the data stores that we query from are traditional relational databases, like Microsoft SQL Server or Oracle. With relational databases, the de facto means for querying data is SQL. However, with the ever-continuing rise in the popularity of Web services, and the need for a platform-independent, Internet-transferable data representation format, XML data stores are becoming more and more popular. SQL was never designed for querying semi-structured data stores, and therefore is not suitable for querying XML data stores. (For more information on creating and using Web services in ASP.NET be sure to read: Creating and Consuming a Web Service. For more information on XML be sure to read the XML FAQ Category on ASPFAQs.com.)

So, how does one query an XML data store and retrieve results from such a query? Most developers currently use XSLT and XPath to accomplish this task. XPath is a syntax for selecting only those parts of an XML document that meet certain criteria; XSLT is a technology that transforms an XML document from one form to another.

While XSLT and XPath have been in use for a while now, there is a new kid on the block: XML Query, or XQuery for short. XQuery is a query language designed specifically to work with XML data stores using a SQL-like syntax. As of July 2003, XQuery 1.0 is still under development by the W3C standards body. However, the core features and syntax of XQuery are solid, so now is as good a time as any to learn about XQuery, especially since Yukon, the next version of MS SQL Server, will have built-in XQuery support.

In this article we will examine the XQuery syntax and examine some use cases. Following this we’ll examine Microsoft’s XQuery classes, which are currently available. With these classes, you can start using XQuery in your ASP.NET Web applications today!

XQuery Basics

XQuery is used to query an XML document, so, first things first, we need an XML document to talk about while examining our various queries. For this article, let’s use the following XML document which describes the structure of a file system:

<?xml version="1.0" encoding="utf-8" ?>
<filesystem>
  <drive letter="C">
    <folder name="My Programs" />
    <folder name="Games">
      <folder name="Quake">
        <folder name="SavedGames">
          <file>game1.sav</file>
          <file>game2.sav</file>
        </folder>
        <file>quake.exe</file>
        <file>README.txt</file>
        <file>EULA.txt</file>
      </folder>
    </folder>
    <folder name="Windows">
      <folder name="System32" />
      <file>win.exe</file>
    </folder>
  </drive>
  <drive letter="D">
    <folder name="Backup">
      <file>2003-06-01.bak</file>
      <file>2003-06-07.bak</file>
    </folder>
  </drive>
</filesystem>

The root element of this XML document is <filesystem>, which contains an arbitrary number of <drive> elements. Each <drive> element, in turn, contains an arbitrary number of <folder> elements, and each <folder> element contains an arbitrary number of <folder> elements and <file> elements.

In its simplest form, an XQuery query can simply be an XPath expression. (If you’re unfamiliar with XPath, I would strongly encourage you to work through this XPath tutorial before continuing.) For example, if we wanted to get a list of all of the files in the C drive, we could use the following XPath expression as our XQuery query:

document("FileSystem.xml")/filesystem/drive[@letter="C"]//file

The document("FileSystem.xml") part indicates the XML data store: an XML file named FileSystem.xml. The output of this query, given the FileSystem.xml file above, would be:

<file>game1.sav</file>
<file>game2.sav</file>
<file>quake.exe</file>
<file>README.txt</file>
<file>EULA.txt</file>
<file>win.exe</file>

The output of an XQuery statement is a collection of XML elements. In the above example, it’s a collection of <file> elements. You can add static XML elements by just inserting them in the XQuery query. For example, in the above example, perhaps we want all of the <file> elements to appear within an XML root element titled MyFiles. This could be accomplished with the following XQuery expression:

<MyFiles>
{ document("FileSystem.xml")/filesystem/drive[@letter="C"]//file }
</MyFiles>

With this addition, the output would be:

<MyFiles>
  <file>game1.sav</file>
  <file>game2.sav</file>
  <file>quake.exe</file>
  <file>README.txt</file>
  <file>EULA.txt</file>
  <file>win.exe</file>
</MyFiles>

Note that in our query we used braces ({ … }) around the XPath expression within the <MyFiles> element. The braces denote that the content within the braces is an XQuery expression, and not literal content. For example, had we omitted the braces and used the query:

<MyFiles>
document("FileSystem.xml")/filesystem/drive[@letter="C"]//file
</MyFiles>

The output would have been:

<MyFiles>
  document("FileSystem.xml")/filesystem/drive[@letter="C"]//file
</MyFiles>
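
Literal content and enclosed expressions can also be mixed inside a single element. As a small sketch in the same document() syntax used above (the <Summary> element name is made up for illustration), the query below keeps the text before the braces as-is and evaluates only the count() call inside them. The (: ... :) line is an XQuery comment.

```xquery
(: "Files on C: " is literal text; only the part in braces is evaluated :)
<Summary>Files on C: { count(document("FileSystem.xml")/filesystem/drive[@letter="C"]//file) }</Summary>
```

Given the sample FileSystem.xml, which holds six files on the C drive, the result should be <Summary>Files on C: 6</Summary>.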

FLWR Expressions

While simple XPath expressions are fine and good, the real power of XQuery shines through with FLWR expressions. FLWR stands for For-Let-Where-Return, and is pronounced “flower”. The FLWR expression is akin to SQL’s SELECT query; it allows for XML data to be queried with conditional statements, and then returns a set of XML elements as a result.

Take a moment to think about a SQL SELECT statement. The main ingredients there are the SELECT, FROM, and WHERE clauses. The FROM clause specifies the table(s) to query over. Then, for each row in the table(s) named in the FROM clause, the WHERE clause is evaluated. For those rows that pass the evaluation, the fields specified in the SELECT clause are output. FLWR expressions are strikingly similar, as we’ll see in a moment.

FLWR, as the acronym implies, has four parts, or clauses, to it:

  • for – the for clause specifies the XML node list to iterate over, and is akin to the SELECT statement’s FROM clause. The list of XML nodes is specified via an XPath expression. For example, if we wanted to iterate over all of the <folder> elements, we’d use the XPath expression document("FileSystem.xml")//folder.
  • where – the where clause contains an expression that evaluates to a Boolean, just like the WHERE clause in a SQL SELECT statement. Each XML node in the XML node list in the for clause is evaluated by the where clause expression; those that evaluate to True move on, those that don’t are passed over.
  • return – the return clause specifies the content that is returned from the FLWR expression.

You may have noticed that I have omitted the let clause. The examples we’ll be looking at in this article will not use the let clause.

Now that we have briefly examined the three essential parts of FLWR expressions, let’s see some examples! Here’s a relatively straightforward example, showing how to get all <folder> elements whose name attribute equals “Quake”:

for $myNode in document("FileSystem.xml")//folder
where $myNode/@name="Quake"
return $myNode

Notice that the for clause has the following form:

for variableName in nodeList

In XQuery, variable names are prefixed with $ (e.g., $myNode). The for clause enumerates the node list and, for each node in the list, binds the variable $myNode to that node. Then, in the where and return clauses, $myNode can be used to reference the current node being evaluated. So, the for clause iterates through all of the <folder> elements, and for each element, the where clause asks, “Is this element’s name attribute equal to Quake?”, and if it is, then the return clause outputs the <folder> element.
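
To see the same for/where/return pattern with a different condition, here is a sketch that walks the <drive> elements instead of the folders and keeps only the D drive, returning every file beneath it. The variable name $drive is arbitrary, and the query follows this article's document() syntax.

```xquery
(: Bind $drive to each <drive> element in turn :)
for $drive in document("FileSystem.xml")/filesystem/drive
(: Keep only the drive whose letter attribute is "D" :)
where $drive/@letter = "D"
(: Return all <file> elements at any depth below that drive :)
return $drive//file
```

With the sample document, this should yield the two <file> elements in the Backup folder: 2003-06-01.bak and 2003-06-07.bak.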

The return clause can be more involved. In fact, the return clause can return any XQuery expression. For example, we might want our output to look like the following:

<folder>
  <name>Quake</name>
  <files>
    <file>quake.exe</file>
    <file>README.txt</file>
    <file>EULA.txt</file>
  </files>
</folder>

We could accomplish this using the following XQuery expression:

for $myNode in document("FileSystem.xml")//folder
where $myNode/@name="Quake"
return
  <folder>
    <name>{string($myNode/@name)}</name>
    <files>
      {$myNode/file}
    </files>
  </folder>
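
Enclosed expressions are not limited to element content. The XQuery drafts also allow them inside attribute values, so a variation on the Quake query could report counts instead of listing the files. This is only a sketch: the files and allFiles attribute names are invented for illustration, and it assumes your XQuery processor supports braces in attribute values.

```xquery
for $myNode in document("FileSystem.xml")//folder
where $myNode/@name = "Quake"
return
  (: $myNode/file counts direct children only; $myNode//file also counts files in subfolders :)
  <folder name="{string($myNode/@name)}"
          files="{count($myNode/file)}"
          allFiles="{count($myNode//file)}" />
```

For the sample document, the Quake folder has three files of its own and two more in SavedGames, so the result should be a single <folder> element with name="Quake", files="3", and allFiles="5".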

Keep in mind that FLWR expressions are comparable in power to SQL SELECT queries: like SELECT queries, they are capable of joins, subqueries, and set-based operations.
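
As one illustration of a subquery, a FLWR expression can be nested inside the return clause of another. The sketch below, which uses made-up <drives> and <drive> wrapper elements, loops over each drive and, for each one, runs an inner FLWR that lists the top-level folders containing at least one file at any depth.

```xquery
<drives>
{
  (: Outer FLWR: one <drive> result element per drive :)
  for $d in document("FileSystem.xml")/filesystem/drive
  return
    <drive letter="{string($d/@letter)}">
    {
      (: Inner FLWR: top-level folders of this drive that hold any files :)
      for $f in $d/folder
      where count($f//file) > 0
      return <folder>{string($f/@name)}</folder>
    }
    </drive>
}
</drives>
```

Against the sample document, the C drive should list Games and Windows (My Programs is empty, so it is filtered out), and the D drive should list Backup.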

Now that we’ve quickly looked at the XQuery syntax, let’s turn our attention to using XQuery in .NET! In Part 2 we’ll see how to get Microsoft’s XQuery classes and how to start using them in an ASP.NET Web application!

*Originally published at 4 Guys From Rolla.com

Scott Mitchell, author of five ASP/ASP.NET books and founder of 4GuysFromRolla.com, has been working with Microsoft Web technologies for the past five years. An active member in the ASP and ASP.NET community, Scott is passionate about ASP and ASP.NET and enjoys helping others learn more about these exciting technologies. For more on the DataGrid, DataList, and Repeater controls, check out Scott’s book ASP.NET Data Web Controls Kick Start (ISBN: 0672325012). Read his blog at http://scottonwriting.net
