#PSTip Explore TechEd NA 2014 sessions with PowerShell

Note: This tip requires PowerShell 3.0 or above.

TechEd North America 2014 has a nice Content Catalog page. It provides a few useful search criteria to filter the content and find the sessions you are most interested in. That’s great, but I wanted to search the content catalog using PowerShell. While we wait for the TechEd team to make the content catalog available as an OData web service, we can try to scrape the needed session information from the Content Catalog page.

At first, I tried to fetch the page with the Invoke-WebRequest cmdlet and use the ParsedHtml property and its GetElementsByTagName() method, but that was painfully slow. Then I remembered that I have friends, Tobias and Jakub, who are very good with regular expressions. Regular expressions are an esoteric art and difficult to get right but, luckily, these two guys know their RegEx.

Another challenge was the way the authors of the Content Catalog page handle paging. By default, you get results in chunks of 10 per result page. With some help from the browser’s developer tools, Jakub found out that we can modify the URL using the “take” parameter and ask for more (there are 600+ sessions, so we used “take=1000” to get them all).
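To make the paging trick concrete, here is a minimal sketch of how such a URL could be assembled. The helper name Get-CatalogUri and its -Take parameter are hypothetical and exist only for this illustration; escaping the keyword with [uri]::EscapeDataString() is also an addition, so that keywords containing spaces do not break the query string.

```powershell
# Hypothetical helper (not part of the original tip): builds the catalog URL
# from a keyword and a page size.
function Get-CatalogUri {
    param(
        [string]$Keyword = '',
        [int]$Take = 1000
    )

    $base = 'http://tena2014.eventpoint.com/Topic/List'

    # Assumption: the keyword should be URL-escaped before it goes into the query string
    $query = 'format=html&take={0}&keyword={1}' -f $Take, [uri]::EscapeDataString($Keyword)

    '{0}?{1}' -f $base, $query
}

Get-CatalogUri -Keyword 'windows azure' -Take 1000
# http://tena2014.eventpoint.com/Topic/List?format=html&take=1000&keyword=windows%20azure
```

Any "take" value larger than the number of sessions should return everything in one request, which is why 1000 is enough here.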

In the end, we output a custom PowerShell object with each session’s details and wrap everything in a reusable Search-TechEdNA2014ContentCatalog function:

function Search-TechEdNA2014ContentCatalog {

param
(
# by default you'll get all sessions from a content catalog
[string]$Keyword = ''
)

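# take=1000 returns all sessions in one page; the keyword is escaped so spaces don't break the URL
$Uri = 'http://tena2014.eventpoint.com/Topic/List?format=html&take=1000&keyword=' + [uri]::EscapeDataString($Keyword)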
$Results = Invoke-WebRequest -Uri $Uri

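# Collapse whitespace, trim it around tags, then split the page into one chunk per session
# (the first chunk is everything before the first session, so it is skipped)
$Results.RawContent -replace "\n"," " -replace "\s+"," " -replace "(?<=\>)\s+" -replace "\s+(?=\<)" -split 'class="topic"' |
 Select-Object -Skip 1 |
 ForEach-Object {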
        $Speaker     = if ( $_ -match 'Speaker\(s\):.*?</div>' ) { $matches[0] -split "," -replace ".*a href[^>].*?>" -replace "</a.*"  | foreach { $_.Trim() } }
        $Title       = if ( $_ -match 'Class="title".*?href.*?>(.*?)<' ) { $Matches[1].Trim() }
        $Track       = if ( $_ -match "Track:.*?>(.*?)<" ) { $Matches[1].Trim() }
        $SessionType = if ( $_ -match "Session Type:.*?>(.*?)<" ) { $Matches[1].Trim() }
        $Date        = if ( $_ -match 'class="session">(.*?)<' ) { $Matches[1].Trim() }
        $Description = if ( $_ -match 'class="description">(.*?)<' ) { $Matches[1].Trim() }
        
        [pscustomobject]@{
            Date = $Date
            Track = $Track
            SessionType = $SessionType
            Speaker = $speaker
            Title = $Title
            Description = $Description
        }
    }
}
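
Since the function depends on the live Content Catalog page, here is a small offline sketch of the same split-and-match technique. The HTML in $sample is made up and only imitates the shape the regular expressions above expect; the real page markup may differ.

```powershell
# Made-up markup (an assumption) that imitates two session entries
$sample = @'
<div class="topic"><div class="title"><a href="/t/1">Intro to DSC</a></div>
<div>Track: <span>Datacenter</span></div>
<div class="description">What DSC is.</div></div>
<div class="topic"><div class="title"><a href="/t/2">Azure Automation</a></div>
<div>Track: <span>Cloud</span></div>
<div class="description">Runbooks in practice.</div></div>
'@

# Flatten whitespace, split into one chunk per session, drop the leading chunk
$sessions = $sample -replace '\s+', ' ' -split 'class="topic"' |
    Select-Object -Skip 1 |
    ForEach-Object {
        # Each -match fills $Matches; group 1 is the text between the tags
        [pscustomobject]@{
            Title       = if ($_ -match 'class="title".*?href.*?>(.*?)<') { $Matches[1].Trim() }
            Track       = if ($_ -match 'Track:.*?>(.*?)<') { $Matches[1].Trim() }
            Description = if ($_ -match 'class="description">(.*?)<') { $Matches[1].Trim() }
        }
    }

$sessions
```

This should produce two objects, one per sample entry. The lazy quantifier (.*?) is what keeps each capture from running past the first closing tag.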

What is the best way to use this function? Pipe its output to a grid view window, do some additional filtering if you like and output the results to the console:

Search-TechEdNA2014ContentCatalog -Keyword powershell |
Out-GridView -PassThru

[Image: TechEdNA2014_Content_Catalog]

Or, export the results to a CSV file and open it in Excel:

Search-TechEdNA2014ContentCatalog -Keyword azure |
Out-GridView -PassThru |
Select Date, @{n='Speaker(s)';e={$_.Speaker -join ', '}}, Title |
Export-Csv $env:temp\sessions.csv -NoTypeInformation

Invoke-Item $env:temp\sessions.csv
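
Why the calculated 'Speaker(s)' property? A session can have several speakers, so Speaker may be an array, and CSV export writes an array as its type name rather than its contents. The sketch below shows the difference on a sample object with made-up values, using ConvertTo-Csv so that nothing is written to disk.

```powershell
# Sample object with made-up values; Speaker is an array, as the function can return
$session = [pscustomobject]@{
    Date    = 'Monday'
    Speaker = @('Jane Doe', 'John Roe')
    Title   = 'Sample session'
}

# Without the calculated property, the array is written as its type name
$raw = $session | ConvertTo-Csv -NoTypeInformation

# With it, the names are joined into a single readable field
$flat = $session |
    Select-Object Date, @{n='Speaker(s)'; e={ $_.Speaker -join ', ' }}, Title |
    ConvertTo-Csv -NoTypeInformation

$raw[1]    # the Speaker column contains System.Object[]
$flat[1]   # the Speaker(s) column contains Jane Doe, John Roe
```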