Picking file names out of a website to download in PowerShell

Submitted by 岁酱吖の on 2019-12-02 09:18:42

If you are on PowerShell v3 or later, the Invoke-WebRequest cmdlet may be of help.

To get an object representing the website:

Invoke-WebRequest "http://stackoverflow.com/search?tab=newest&q=powershell"

To get all the links in that website:

Invoke-WebRequest "http://stackoverflow.com/search?tab=newest&q=powershell" | select -ExpandProperty Links

And to get just the href value of each link:

Invoke-WebRequest "http://stackoverflow.com/search?tab=newest&q=powershell" | select -ExpandProperty Links | select href
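To go from that list of hrefs to actual downloads, a minimal sketch might look like the following. The function names Resolve-Href and Save-LinkedFiles, the parameter names, and the idea of filtering by a regular expression are assumptions made for illustration; they are not part of the original question or of any built-in module.

```powershell
# Hypothetical helper: turn a (possibly relative) href into an absolute URL.
function Resolve-Href([string]$BaseUrl, [string]$Href) {
    $base = New-Object System.Uri $BaseUrl
    (New-Object System.Uri $base, $Href).AbsoluteUri
}

# Hypothetical helper: download every linked file whose href matches $Pattern.
# $Destination is assumed to be an existing folder.
function Save-LinkedFiles([string]$PageUrl, [string]$Pattern, [string]$Destination) {
    $links = Invoke-WebRequest $PageUrl | Select-Object -ExpandProperty Links
    $hrefs = $links | Where-Object { $_.href -match $Pattern } | ForEach-Object { $_.href }
    foreach ($href in $hrefs) {
        $url  = Resolve-Href $PageUrl $href
        # Use the last segment of the URL path as the local file name
        $name = [System.IO.Path]::GetFileName(([Uri]$url).LocalPath)
        Invoke-WebRequest $url -OutFile (Join-Path $Destination $name)
    }
}
```

A call such as Save-LinkedFiles 'http://example.com/files/' '\.zip$' 'C:\Temp' (URL and folder are placeholders) would then save every linked .zip file into that folder.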

If you are on PowerShell v2 or earlier you'll have to create an InternetExplorer.Application COM object and use that to navigate the page:

$ie = New-Object -ComObject "InternetExplorer.Application"
$ie.Navigate("http://stackoverflow.com/search?tab=newest&q=powershell")
# wait until IE has finished loading the page (a fixed one-second sleep may be too short)
while ($ie.Busy) { Start-Sleep -Milliseconds 100 }
$ie.Document.Links | select IHTMLAnchorElement_href
# quit IE
$ie.Quit()

Thanks to this blog post where I learnt about Invoke-WebRequest.

Update: One could also download the website source like you posted and then extract the links from the source. Something like this:

# $webclient and $source are assumed to be defined as in the question
$webclient.DownloadString($source) -split "<a\s+" | %{ if ($_ -match "^href=[`'`"]([^`'`">\s]*)") { $matches[1] } }

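The -split part cuts the source at every occurrence of <a followed by one or more whitespace characters, so each resulting piece (apart from the first) begins with the attributes of an anchor tag. The pieces form an array, which I then pipe through a ForEach-Object block. There, each piece is matched against the regular expression, whose capture group holds the link target, and that value is output.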
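To see the split-and-match step in isolation, here is a small sketch that runs it on a made-up HTML string instead of a downloaded page. The sample markup and the variable names $html and $hrefs are assumptions for illustration only. Note that the pattern only picks up anchors where href is the first attribute after <a.

```powershell
# Sample markup standing in for the downloaded page source (made up for illustration)
$html = '<p>intro</p><a href="a.zip">A</a> <a   href=''b.zip''>B</a>'

# Split at every "<a" + whitespace, then keep the captured href of each piece that matches
$hrefs = $html -split "<a\s+" | ForEach-Object {
    if ($_ -match "^href=[`'`"]([^`'`">\s]*)") { $matches[1] }
}

$hrefs   # outputs a.zip and b.zip
```

The first piece (the text before the first anchor) does not match and is skipped, which is why the match result is tested before $matches is read.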

If you want to do more with the output you can pipe it further through another block which does something with it.
