XPath 101 — SitePoint

As an addition to my latest article, this entry will show you how to harness the power of XPath by example.

XPath is a query language for XML, akin to SQL for relational databases (OK, loosely akin!), and is used to select nodes from an XML document.

Let’s take a look at some examples:

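Here’s our XML file to parse:

<catalog>
  <cd>
    <title>Be Here Now</title>
    <artist>Oasis</artist>
    <price>$9.99</price>
  </cd>
  <cd>
    <title>Heathen Chemistry</title>
    <artist>Oasis</artist>
    <price>$13.99</price>
  </cd>
  <cd>
    <title>Let It Be</title>
    <artist>The Beatles</artist>
    <price>$12.99</price>
  </cd>
</catalog>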

We can use XPath queries through the XmlDocument.SelectNodes method to return a set of nodes matching our query. Let’s set this up:


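// Requires: using System.Xml;
string fileName = Server.MapPath("catalog.xml"); // resolve the path inside the web application
XmlDocument doc = new XmlDocument();
doc.Load(fileName); // parse the file into an in-memory document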

Our XmlDocument is now ready to query. Rather than explain the specifics of the query language, I think it’s better to show by example. The full details of XPath can be found here.

OK, let’s select all the CDs in our catalog:


XmlNodeList cdNodes = doc.SelectNodes("catalog/cd");

Easy eh? Notice, we just write out the “path” of where our nodes are found in the XML file, using / to signify a level of hierarchy.

Let’s get a little more complicated. The following XPath expression will select all CDs which are by the artist Oasis:


XmlNodeList cdNodes = doc.SelectNodes("//cd[artist='Oasis']");

Notice the double slash at the start of this expression. It tells XPath to match cd elements at any depth in the document, regardless of where exactly they sit within the hierarchy. In practice, the double slash saves us from writing out the whole hierarchy path (if you did, it would be “catalog/cd[artist='Oasis']”).

The second difference with this expression is that we’re asking only for nodes which have an artist subelement equal to Oasis. The square brackets hold a predicate, a condition that each candidate node must satisfy to be selected. We can combine conditions inside a predicate using regular “and”s and “or”s.
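Here is a minimal sketch of combining conditions, assuming a trimmed inline copy of the catalog (artist and price only) loaded with LoadXml instead of from a file. The PredicateDemo class and its Count helper are names made up for this illustration.

```csharp
using System;
using System.Xml;

public static class PredicateDemo
{
    // Trimmed inline copy of the catalog (artist and price only).
    public const string CatalogXml = @"<catalog>
  <cd><artist>Oasis</artist><price>$9.99</price></cd>
  <cd><artist>Oasis</artist><price>$13.99</price></cd>
  <cd><artist>The Beatles</artist><price>$12.99</price></cd>
</catalog>";

    // Hypothetical helper: returns how many nodes the expression matches.
    public static int Count(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(CatalogXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        // "and": both conditions must hold for the same cd.
        Console.WriteLine(Count("//cd[artist='Oasis' and price='$9.99']"));        // 1
        // "or": either condition is enough.
        Console.WriteLine(Count("//cd[artist='Oasis' or artist='The Beatles']"));  // 3
    }
}
```

Note that the price comparison here is a plain string comparison, since the values carry a dollar sign.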

Lastly, I’ll show how to grab a particular node from an element. The following query will return the price of all Beatles CDs in our catalog:


XmlNodeList priceNodes = doc.SelectNodes("//cd[artist='The Beatles']/price");
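SelectNodes hands back an XmlNodeList, so to get at the actual values you loop over it and read each node's InnerText. A rough sketch, again using a trimmed inline copy of the catalog; PriceDemo and Values are hypothetical names for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class PriceDemo
{
    // Trimmed inline copy of the catalog (artist and price only).
    public const string CatalogXml = @"<catalog>
  <cd><artist>Oasis</artist><price>$9.99</price></cd>
  <cd><artist>Oasis</artist><price>$13.99</price></cd>
  <cd><artist>The Beatles</artist><price>$12.99</price></cd>
</catalog>";

    // Hypothetical helper: runs the query and collects the text of each match.
    public static List<string> Values(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(CatalogXml);
        List<string> values = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            values.Add(node.InnerText);
        }
        return values;
    }

    public static void Main()
    {
        foreach (string price in Values("//cd[artist='The Beatles']/price"))
        {
            Console.WriteLine(price); // $12.99
        }
    }
}
```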

So ends a quick guide to XPath :)

Frequently Asked Questions (FAQs) about XPath

What is the difference between absolute and relative XPath?

An absolute XPath starts with a single forward slash (/) and spells out the complete path from the root of the document to the desired element, for example /catalog/cd/price. A relative XPath has no leading slash and is evaluated from the current (context) node, for example cd/price. In web-testing circles, expressions that start with a double forward slash (//) are often called relative as well: // matches the element at any depth in the document, so the expression does not depend on the full hierarchy. Fully spelled-out absolute paths are usually avoided for locating elements on a webpage because they are likely to break on slight changes to the page structure.
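To make the distinction concrete, here is a small sketch against an XmlDocument, assuming a trimmed inline copy of a CD catalog. PathDemo and Load are hypothetical names. The thing to watch is which node the expression is evaluated from.

```csharp
using System;
using System.Xml;

public static class PathDemo
{
    // Trimmed inline CD catalog (artist and price only).
    public const string CatalogXml = @"<catalog>
  <cd><artist>Oasis</artist><price>$9.99</price></cd>
  <cd><artist>Oasis</artist><price>$13.99</price></cd>
  <cd><artist>The Beatles</artist><price>$12.99</price></cd>
</catalog>";

    // Hypothetical helper: parses the inline catalog.
    public static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(CatalogXml);
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = Load();

        // Absolute: the full path from the document root.
        Console.WriteLine(doc.SelectNodes("/catalog/cd").Count);  // 3

        // Use the first cd as the context node for the next queries.
        XmlNode firstCd = doc.SelectSingleNode("/catalog/cd");

        // Relative: no leading slash, so it is evaluated from firstCd.
        Console.WriteLine(firstCd.SelectNodes("artist").Count);   // 1

        // A leading // still starts at the document root, even on firstCd.
        Console.WriteLine(firstCd.SelectNodes("//artist").Count); // 3

        // .// searches descendants of the context node only.
        Console.WriteLine(firstCd.SelectNodes(".//artist").Count); // 1
    }
}
```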

How can I use XPath with attributes?

XPath can be used with attributes by using the @ symbol. For example, if you want to select a button with a specific name attribute, you can use the following syntax: //button[@name='button_name']. This will select every button element whose name attribute equals ‘button_name’.
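A quick sketch of attribute queries with XmlDocument. The form markup, the AttributeDemo class and the Values helper are all invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class AttributeDemo
{
    // Hypothetical markup with two named buttons.
    public const string FormXml =
        "<form><button name='save'>Save</button><button name='cancel'>Cancel</button></form>";

    // Hypothetical helper: returns the text of each matching node.
    // For an attribute node, InnerText is the attribute's value.
    public static List<string> Values(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(FormXml);
        List<string> values = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            values.Add(node.InnerText);
        }
        return values;
    }

    public static void Main()
    {
        // Filter elements by attribute value.
        Console.WriteLine(string.Join(",", Values("//button[@name='save']"))); // Save
        // Select the attribute nodes themselves.
        Console.WriteLine(string.Join(",", Values("//button/@name")));         // save,cancel
    }
}
```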

What are XPath Axes?

XPath axes define the direction in which a step searches relative to the current node, written as axis::node-test (for example, following-sibling::cd). XPath 1.0 defines thirteen axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, and self.
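A few axes in action, as a sketch against a trimmed inline CD catalog. AxisDemo and First are hypothetical names. Each query starts from one node and walks sideways or upwards from it.

```csharp
using System;
using System.Xml;

public static class AxisDemo
{
    // Trimmed inline CD catalog (artist and price only).
    public const string CatalogXml = @"<catalog>
  <cd><artist>Oasis</artist><price>$9.99</price></cd>
  <cd><artist>Oasis</artist><price>$13.99</price></cd>
  <cd><artist>The Beatles</artist><price>$12.99</price></cd>
</catalog>";

    // Hypothetical helper: text of the first match, or null if nothing matches.
    public static string First(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(CatalogXml);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        // The cd that comes after the $13.99 one.
        Console.WriteLine(First("//cd[price='$13.99']/following-sibling::cd/artist")); // The Beatles
        // The cd that comes before it.
        Console.WriteLine(First("//cd[price='$13.99']/preceding-sibling::cd/price"));  // $9.99
        // From a price up to its parent cd, then down to the artist.
        Console.WriteLine(First("//price[.='$12.99']/parent::cd/artist"));             // The Beatles
    }
}
```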

How can I select text with XPath?

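To select text with XPath, you can use the text() node test, which matches the text nodes that are children of an element. For example, //p[text()='This is a paragraph.'] selects every p element that has a text node exactly equal to ‘This is a paragraph.’, while //p/text() returns the text nodes themselves.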

What is the difference between node() and text() in XPath?

Both are node tests rather than functions. node() matches a node of any type, so as a path step it selects all child nodes of the specified node: elements, text, comments, and processing instructions. text() matches only the text nodes among them.
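The difference is easiest to see on mixed content. A minimal sketch, assuming a made-up paragraph fragment; NodeTestDemo and Count are hypothetical names.

```csharp
using System;
using System.Xml;

public static class NodeTestDemo
{
    // Hypothetical fragment with mixed content: text, an element, then more text.
    public const string Xml = "<p>Hello <b>world</b>!</p>";

    // Hypothetical helper: returns how many nodes the expression matches.
    public static int Count(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        Console.WriteLine(Count("/p/node()")); // 3: text, the b element, text
        Console.WriteLine(Count("/p/text()")); // 2: only the two text nodes
        Console.WriteLine(Count("/p/*"));      // 1: only the b element
    }
}
```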

How can I use XPath with namespaces?

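To use XPath with namespaces, qualify element names with a prefix, and bind that prefix to the namespace URI in the environment that evaluates the expression; the binding is not part of the expression itself. For example, //ns:element selects every ‘element’ in the namespace bound to the prefix ‘ns’. With XmlDocument, prefixes are registered through an XmlNamespaceManager passed to SelectNodes. An unprefixed name matches only elements that are in no namespace.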
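With XmlDocument, the prefix binding goes through an XmlNamespaceManager. A sketch under stated assumptions: the namespace URI http://example.com/music, the prefix m, and the NamespaceDemo class are all invented for this example.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Hypothetical document whose elements live in a default namespace.
    public const string Xml =
        "<catalog xmlns='http://example.com/music'><cd><artist>Oasis</artist></cd></catalog>";

    // Counts matches with the prefix "m" bound to the document's namespace.
    public static int CountWithPrefix(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("m", "http://example.com/music");
        return doc.SelectNodes(xpath, ns).Count;
    }

    // Counts matches without any namespace bindings.
    public static int CountPlain(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithPrefix("//m:cd")); // 1
        Console.WriteLine(CountPlain("//cd"));        // 0: unprefixed means "no namespace"
    }
}
```

The prefix in the query does not have to match any prefix used in the document; only the namespace URI has to match.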

What are XPath Operators?

XPath operators are symbols and keywords that define a comparison, a logical test, or an arithmetic operation: +, -, *, div, mod, =, !=, <, >, <=, >=, and, or. There is also | for the union of two node sets.
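A sketch of a few operators against a trimmed inline CD catalog. Because the prices carry a dollar sign, the numeric comparison first strips it with substring-after and number. OperatorDemo, Count and Evaluate are hypothetical names; XPathNavigator.Evaluate is used for expressions that return a number rather than a node set.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public static class OperatorDemo
{
    // Trimmed inline CD catalog (artist and price only).
    public const string CatalogXml = @"<catalog>
  <cd><artist>Oasis</artist><price>$9.99</price></cd>
  <cd><artist>Oasis</artist><price>$13.99</price></cd>
  <cd><artist>The Beatles</artist><price>$12.99</price></cd>
</catalog>";

    private static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(CatalogXml);
        return doc;
    }

    // Hypothetical helper: number of nodes matched by a node-set expression.
    public static int Count(string xpath)
    {
        return Load().SelectNodes(xpath).Count;
    }

    // Hypothetical helper: evaluates an expression that yields a number.
    public static double Evaluate(string xpath)
    {
        XPathNavigator nav = Load().CreateNavigator();
        return (double)nav.Evaluate(xpath);
    }

    public static void Main()
    {
        // Comparison: CDs that cost more than 10 once the $ is stripped.
        Console.WriteLine(Count("//cd[number(substring-after(price,'$')) > 10]")); // 2
        // Inequality: CDs by anyone other than Oasis.
        Console.WriteLine(Count("//cd[artist != 'Oasis']"));                        // 1
        // mod: every odd-positioned cd (the 1st and 3rd).
        Console.WriteLine(Count("//cd[position() mod 2 = 1]"));                     // 2
        // Union: all artist nodes plus all price nodes.
        Console.WriteLine(Count("//artist | //price"));                             // 6
        // Arithmetic on a count.
        Console.WriteLine(Evaluate("count(//cd) * 2"));                             // 6
    }
}
```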

How can I use XPath with XML?

XPath can be used with XML to navigate through elements and attributes. You can use XPath expressions to select nodes or node sets in an XML document.

What is the use of the contains() function in XPath?

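The contains() function tests whether one string contains another. The syntax is contains(haystack, needle): it returns true if the first argument string contains the second argument string, and false otherwise. The comparison is case-sensitive. Used inside a predicate, it selects nodes whose value contains the given substring, for example //cd[contains(artist, 'Beat')].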
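A short sketch of contains() inside a predicate, using a trimmed inline CD catalog. ContainsDemo and Count are hypothetical names for this example.

```csharp
using System;
using System.Xml;

public static class ContainsDemo
{
    // Trimmed inline CD catalog (artist and price only).
    public const string CatalogXml = @"<catalog>
  <cd><artist>Oasis</artist><price>$9.99</price></cd>
  <cd><artist>Oasis</artist><price>$13.99</price></cd>
  <cd><artist>The Beatles</artist><price>$12.99</price></cd>
</catalog>";

    // Hypothetical helper: returns how many nodes the expression matches.
    public static int Count(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(CatalogXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        // Substring match on the artist name.
        Console.WriteLine(Count("//cd[contains(artist,'Beat')]"));  // 1
        // Matching is case-sensitive, so lowercase "oasis" finds nothing.
        Console.WriteLine(Count("//cd[contains(artist,'oasis')]")); // 0
        // Works on any string value, here the price text.
        Console.WriteLine(Count("//cd[contains(price,'.99')]"));    // 3
    }
}
```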

How can I select the nth child using XPath?

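To select the nth child using XPath, add a numeric predicate: //element[n]. Positions start at 1, and the predicate counts among siblings, so //p[1] selects every ‘p’ that is the first ‘p’ child of its parent, which can be more than one node. To get only the first ‘p’ in the whole document, wrap the path in parentheses: (//p)[1].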
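Positional predicates are counted per parent, which is easy to trip over. A sketch with made-up markup (two div elements holding p elements); PositionDemo and Values are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class PositionDemo
{
    // Hypothetical markup: two parents, each with its own p children.
    public const string Xml =
        "<root><div><p>a</p><p>b</p></div><div><p>c</p></div></root>";

    // Hypothetical helper: returns the text of each matching node.
    public static List<string> Values(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        List<string> values = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            values.Add(node.InnerText);
        }
        return values;
    }

    public static void Main()
    {
        // The first p within each parent.
        Console.WriteLine(string.Join(",", Values("//p[1]")));        // a,c
        // The first p in the whole document.
        Console.WriteLine(string.Join(",", Values("(//p)[1]")));      // a
        // The second p inside the first div.
        Console.WriteLine(string.Join(",", Values("//div[1]/p[2]"))); // b
        // The last p within each parent.
        Console.WriteLine(string.Join(",", Values("//p[last()]")));   // b,c
    }
}
```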