#PSTip Using XPath in PowerShell, Part 2

XPath can be used to apply the ‘filter left’ philosophy to XML documents. For example, we can find any h1 element whose id is ‘title’ using the following syntax:

Select-Xml -Path .\TestPage.xhtml -XPath "//h1[@id = 'title']"

Note: You can find the code and input file here.
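If the sample file is not at hand, the same query can be tried against an inline document. This is only a minimal sketch: the markup below is a hypothetical stand-in for TestPage.xhtml (it deliberately declares no namespace), and it relies on the -Content parameter of Select-Xml to read XML from a string instead of a file.

```powershell
# Hypothetical stand-in for TestPage.xhtml; note that no namespace is declared.
$content = @'
<html>
  <body>
    <h1 id="title">First heading</h1>
    <h1 id="other">Second heading</h1>
  </body>
</html>
'@

# "//h1" selects h1 elements anywhere in the document;
# "[@id = 'title']" keeps only those whose id attribute equals 'title'.
$result = Select-Xml -Content $content -XPath "//h1[@id = 'title']"

# Each result wraps the matching node in its Node property.
$result.Node.InnerText
```

Only the first heading should come back, because the filter is evaluated by the XPath engine rather than by a later Where-Object.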

As you can see, the XPath syntax has two parts: the actual path (//h1) and a filter (the expression in square brackets). This is sufficient for simple documents. Unfortunately, most XML documents used in real-world scenarios contain XML namespaces; at a minimum, a default namespace is defined. A perfect example is a valid XHTML document, which should have a proper namespace declaration on the html node:

<html xmlns="http://www.w3.org/1999/xhtml">
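The effect of that declaration on the earlier query can be seen with a small experiment. The following is a sketch using a hypothetical inline document (not the author's TestPage.xhtml); the only difference from a namespace-free page is the xmlns attribute on the html node.

```powershell
# Hypothetical inline XHTML fragment with the default namespace declared.
$content = @'
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1 id="title">First heading</h1>
  </body>
</html>
'@

# An unprefixed name in XPath means "h1 in no namespace",
# but every element here lives in the XHTML namespace, so nothing matches.
$noMatch = @(Select-Xml -Content $content -XPath "//h1[@id = 'title']")

# The array wrapper lets us count the results even when there are none.
$noMatch.Count
```

No error is raised in this case; the query simply finds no nodes, which makes the problem easy to overlook.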

XML documents that define namespaces (including a default namespace) require that we do the same for the Select-Xml cmdlet; otherwise our XPath queries will return nothing, because an unprefixed element name in XPath matches only elements that are in no namespace. We pass namespace definitions to Select-Xml using the -Namespace parameter. This parameter accepts a hash table: each key is a prefix that we will put before element names in our XPath queries, and each value is the namespace URI, that is, the value of the corresponding xmlns attribute:

Select-Xml -Path .\TestPage.xhtml -XPath "//x:h1[@id = 'title']" -Namespace @{
    x = 'http://www.w3.org/1999/xhtml'
} | ForEach-Object { $_.Node } | Format-Table -AutoSize
 
id    #text                                 
--    -----                                 
title This is first H1 directly under 'body'

It is important to note that the prefix used in XPath is owned by us: it does not have to match any prefix used in the document. The only requirement is that it has to be unique for each namespace.
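To illustrate, here is a sketch with a hypothetical inline document that mixes two namespaces (XHTML and an embedded SVG element). The prefixes page and pic are names made up for this example; neither appears anywhere in the document itself.

```powershell
# Hypothetical document: XHTML with an embedded SVG element in its own namespace.
$content = @'
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1 id="title">First heading</h1>
    <svg xmlns="http://www.w3.org/2000/svg"><title>Logo</title></svg>
  </body>
</html>
'@

# Prefixes are our own choice; each one maps to exactly one namespace URI.
$ns = @{
    page = 'http://www.w3.org/1999/xhtml'
    pic  = 'http://www.w3.org/2000/svg'
}

# The same hash table serves queries that touch one namespace or both.
$heading  = Select-Xml -Content $content -XPath "//page:h1[@id = 'title']" -Namespace $ns
$svgTitle = Select-Xml -Content $content -XPath "//page:body/pic:svg/pic:title" -Namespace $ns

$heading.Node.InnerText
$svgTitle.Node.InnerText
```

Renaming page to x (or anything else) in both the hash table and the query should give the same results, since only the namespace URI is compared against the document.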

About the author: Bartek Bielawski

Bartek is a busy IT Admin working for an international company, Optiver. He loves PowerShell and automation. That love got him the honors of a Microsoft MVP. He shares his knowledge on his blog. You can also find him on Twitter: @bielawb.
