PowerShell: Download or save the source code of a whole IE page

ⅰ亾dé卋堺 posted on 2019-11-29 15:48:47
tkrn

After you navigate, check the ReadyState again in a loop instead of using a fixed sleep. The rest of the code you had will work as it is.

From running the code, it appears that a fixed sleep may not be long enough when the site is slow to load. Poll until the ready state is 4 (complete):

while ($ie.ReadyState -ne 4) { Start-Sleep -Milliseconds 100 }
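
One thing the loop above does not handle is a page that never reaches ready state 4, in which case it spins forever. As a rough sketch, the same polling can be wrapped with a timeout. Wait-Until and its parameter names are hypothetical, made up for this example, and the 30 second default is an arbitrary assumption:

```powershell
# Hypothetical helper (not part of PowerShell): polls a condition until it
# returns true or the timeout expires. Returns $true on success, $false on timeout.
function Wait-Until {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 30,     # assumed default, adjust for slow sites
        [int] $PollMilliseconds = 100
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while (-not (& $Condition)) {
        if ((Get-Date) -ge $deadline) {
            return $false   # gave up: the condition never became true
        }
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $true
}
```

With the $ie object from the question it would be called roughly like this:

if (-not (Wait-Until { $ie.ReadyState -eq 4 })) { throw 'Page did not finish loading in time.' }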

There is also another post about this issue, "innerHTML converts CDATA to comments". Someone on that page wrote a function that cleans the output up. Once you have that function declared in your code, the call would look something like this:

htmlWithCDATASectionsToHtmlWithout($ie.Document.body.outerHTML) | Out-File -FilePath c:\sourcecode.txt
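
I have not reproduced the function from that post here. As a rough illustration of the kind of cleanup involved, the sketch below replaces each CDATA section in an HTML string with its HTML-escaped text. Convert-CDataToText is a hypothetical name, and this is only my assumption of what such a cleanup might do, not the code from the other post:

```powershell
# Hypothetical helper, NOT the function from the linked post: replaces every
# <![CDATA[ ... ]]> section with its content, HTML-escaped.
function Convert-CDataToText {
    param(
        [Parameter(Mandatory)] [string] $Html
    )
    # (?s) lets "." match newlines so multi-line CDATA sections are handled
    $pattern = '(?s)<!\[CDATA\[(.*?)\]\]>'
    $evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
        param($m)
        # Escape the raw CDATA content so it is safe as ordinary HTML text
        [System.Net.WebUtility]::HtmlEncode($m.Groups[1].Value)
    }
    return [regex]::Replace($Html, $pattern, $evaluator)
}
```

It would be used the same way as the call above:

Convert-CDataToText $ie.Document.body.outerHTML | Out-File -FilePath c:\sourcecode.txt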

I agree with @tkrn about using a while loop to wait for the IE document to be ready. For that, I recommend sleeping for at least 2 seconds inside the loop.

while ($ie.ReadyState -ne 4) { Start-Sleep -Seconds 2 }

I did, however, find an easier way to get the whole HTML source of the page at the URL. Here it is:

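# Serialize the whole document to a string and keep it in a JavaScript global variable.
$ie.Document.parentWindow.execScript("var JSIEVariable = new XMLSerializer().serializeToString(document);", "javascript")
# Read the variable back via reflection; 4096 is [System.Reflection.BindingFlags]::GetProperty.
$obj = $ie.Document.parentWindow.GetType().InvokeMember("JSIEVariable", 4096, $null, $ie.Document.parentWindow, $null)
$HTMLDoc = $obj.ToString()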

Now $HTMLDoc holds the whole HTML source of the page intact, and you can save it as an HTML file.
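
For the saving step, something like the following should do. It is a minimal sketch: Save-HtmlSource is a hypothetical helper name, and writing UTF-8 without a byte order mark is my assumption about the encoding you want:

```powershell
# Hypothetical helper: writes an HTML string to a file as UTF-8 without a BOM.
function Save-HtmlSource {
    param(
        [Parameter(Mandatory)] [string] $Html,
        [Parameter(Mandatory)] [string] $Path   # use a full path; .NET resolves relative
                                                # paths against the process directory
    )
    # An explicit encoding object avoids differences between PowerShell versions
    $utf8NoBom = New-Object System.Text.UTF8Encoding($false)
    [System.IO.File]::WriteAllText($Path, $Html, $utf8NoBom)
}
```

With the variable from above:

Save-HtmlSource -Html $HTMLDoc -Path 'C:\sourcecode.html'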
