How to download an HTML webpage using Selenium with Python?

Submitted by 删除回忆录丶 on 2020-01-01 06:47:13

Question


I want to download a webpage using Selenium with Python, using the following code:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument('--save-page-as-mhtml')
# Pass the options to the driver; otherwise the argument above is never applied
driver = webdriver.Chrome(options=chromeOptions)

driver.get("http://www.yahoo.com")

# Try to send Ctrl+S to bring up the "Save As" dialog
saveas = ActionChains(driver).key_down(Keys.CONTROL)\
         .send_keys('s').key_up(Keys.CONTROL)
saveas.perform()
print("done")

However, the above code isn't working. I am using Windows 7. Is there any way I can bring up the "Save As" dialog box?

Thanks, Karan


Answer 1:


You can use the code below to download the page HTML:

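from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.yahoo.com")
# Write the rendered HTML as UTF-8 so non-ASCII characters do not fail on Windows
with open("/path/to/page_source.html", "w", encoding="utf-8") as f:
    f.write(driver.page_source)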

Just replace "/path/to/page_source.html" with the desired file path and file name.
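
As a slightly more defensive sketch of the same idea, the helper below writes the source as UTF-8, creates the target folder if it is missing, and always closes the browser. The names save_html and download_page are hypothetical and not part of Selenium; the sketch assumes Chrome and a matching driver are available.

```python
from pathlib import Path


def save_html(html, path):
    """Write an HTML string to `path` as UTF-8 and return the path."""
    target = Path(path)
    # Create missing parent folders so the write does not fail
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target


def download_page(url, path):
    """Open `url` in Chrome and save the rendered HTML to `path`."""
    # Imported here so save_html can be used without Selenium installed
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return save_html(driver.page_source, path)
    finally:
        # Close the browser even if loading or saving fails
        driver.quit()
```

Calling download_page("http://www.yahoo.com", "/path/to/page_source.html") would then replace the snippet above. Note that page_source only contains the HTML document itself, not the linked CSS, JS, or images.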

Update

If you need to get the complete page source (including CSS, JS, ...), you can use the following solution:

pip install pyahk # from command line

Python code:

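from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
import ahk

# Path to the Firefox executable; adjust for your installation
firefox = FirefoxBinary("C:\\Program Files (x86)\\Mozilla Firefox\\firefox.exe")

driver = webdriver.Firefox(firefox_binary=firefox)
driver.get("http://www.yahoo.com")

# Start the AutoHotkey interpreter and wait until it is ready
ahk.start()
ahk.ready()
# Press Ctrl+S to open the "Save As" dialog
ahk.execute("Send,^s")
# Wait up to 2 seconds for the dialog, then make sure it has focus
ahk.execute("WinWaitActive, Save As,,2")
ahk.execute("WinActivate, Save As")
# Type the target path and confirm
ahk.execute("Send, C:\\path\\to\\file.htm")
ahk.execute("Send, {Enter}")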
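
As an alternative to driving the "Save As" dialog, and not the method used above, a single-file MHTML snapshot can be requested through the Chrome DevTools Protocol. This is only a sketch under assumptions: it needs a Chromium-based driver (execute_cdp_cmd is not available for Firefox), Page.captureSnapshot is marked experimental in the protocol, and the function names save_page_as_mhtml and capture_url are hypothetical.

```python
def save_page_as_mhtml(driver, path):
    """Save the page currently loaded in `driver` as one MHTML file."""
    # Ask Chrome for a snapshot of the page with its resources inlined
    snapshot = driver.execute_cdp_cmd("Page.captureSnapshot", {"format": "mhtml"})
    # newline="" keeps the CRLF line endings of the MHTML data unchanged
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(snapshot["data"])
    return path


def capture_url(url, path):
    """Open `url` in Chrome and save it to `path` as MHTML."""
    # Imported here so the helper above can be exercised without a browser
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return save_page_as_mhtml(driver, path)
    finally:
        driver.quit()
```

Calling capture_url("http://www.yahoo.com", "C:\\path\\to\\file.mhtml") would write the snapshot without any keyboard automation, so it does not depend on the browser window having focus.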


Source: https://stackoverflow.com/questions/42900214/how-to-download-a-html-webpage-using-selenium-with-python
