#PSTip Using XPath in PowerShell, Part 2

XPath lets us apply the ‘filter left’ philosophy to XML documents. For example, we can find any h1 element whose id attribute is ‘title’ using the following syntax:

Select-Xml -Path .\TestPage.xhtml -XPath "//h1[@id = 'title']"

Note: You can find the code and input file here.

As you can see, the XPath expression has two parts: the actual path (//h1) and a filter (the expression in square brackets). This is sufficient for simple documents. Unfortunately, most XML documents used in real-world scenarios contain XML namespaces; at a minimum, a default namespace is defined. A perfect example is a valid XHTML document, which should have a proper namespace declaration on the html node:

<html xmlns="http://www.w3.org/1999/xhtml">
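
To see the problem in isolation, here is a minimal sketch. The inline document is a hypothetical stand-in for a namespaced page, not the article's TestPage.xhtml, and the variable names are made up for illustration. It shows that an unprefixed query finds nothing once a default namespace is declared:

```powershell
# Hypothetical inline document standing in for a namespaced XHTML page.
$xhtml = @'
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1 id="title">Sample heading</h1>
  </body>
</html>
'@

# In XPath 1.0 an unprefixed name matches only elements in no namespace,
# so this query returns nothing for a document with a default namespace.
$plain = @(Select-Xml -Content $xhtml -XPath "//h1[@id = 'title']")

# The default namespace is inherited by every unprefixed element in the document.
$rootNamespace = ([xml]$xhtml).DocumentElement.NamespaceURI

"Unprefixed query matches: $($plain.Count)"
"Namespace of the html element: $rootNamespace"
```

The query itself is well formed; it simply asks for an h1 element in no namespace, and the document does not contain one.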

XML documents that define namespaces (including a default namespace) require that we declare the same namespaces for the Select-Xml cmdlet; otherwise our XPath queries won’t return any results. We pass the namespace definitions to Select-Xml using the -Namespace parameter. This parameter accepts a hash table whose keys are the prefixes we will use for element names in our XPath queries and whose values are the namespace URIs, that is, the values of the corresponding xmlns attributes:

Select-Xml -Path .\TestPage.xhtml -XPath "//x:h1[@id = 'title']" -Namespace @{
    x = 'http://www.w3.org/1999/xhtml'
} | ForEach-Object { $_.Node } | Format-Table -AutoSize

id    #text
--    -----
title This is first H1 directly under 'body'

It is important to note that the prefix used in XPath is chosen by us and does not have to match any prefix used in the document; the only requirement is that it has to be unique for each namespace.
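
As a rough illustration of that point, the sketch below queries a document that uses two namespaces. The inline document, the prefixes page and draw, and the variable names are all assumptions made for this example; the document itself uses no prefix for XHTML and the prefix svg for SVG:

```powershell
# Hypothetical inline document: XHTML is the default namespace and an
# embedded SVG element is prefixed with 'svg' by the document itself.
$doc = @'
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg">
  <body>
    <h1 id="title">Sample heading</h1>
    <svg:circle r="5" />
  </body>
</html>
'@

# The prefixes are ours to choose; only the namespace URIs have to match
# the ones declared in the document. Each namespace gets its own prefix.
$ns = @{
    page = 'http://www.w3.org/1999/xhtml'
    draw = 'http://www.w3.org/2000/svg'
}

$heading = Select-Xml -Content $doc -XPath "//page:h1[@id = 'title']" -Namespace $ns
$circle  = Select-Xml -Content $doc -XPath "//page:body/draw:circle" -Namespace $ns

$heading.Node.InnerText
$circle.Node.GetAttribute('r')
```

Both queries match even though neither page nor draw appears anywhere in the document, because Select-Xml resolves each prefix to its namespace URI before comparing element names.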


© 2017 PowerShell Magazine. All rights reserved.