Downloading a PDF using Selenium, Chrome and Python

坚强是说给别人听的谎言 提交于 2020-05-09 21:43:52

问题


I tried following previous posts on this topic such as these (post 1, post 2), but I'm still stuck.

My script has to log into a site using a set of credentials, then navigate through some drop down menus to select a report. Once the report is selected, a new window pops up where parameters must be adjusted to generate the report. Once the parameters are set, the same pop up window refreshes with the generated report in PDF format and is displayed using Chrome's built in PDF viewer. I was under the impression that passing certain options to the webdriver would disable this PDF viewer and simply download the file, but the PDF viewer is still being displayed and nothing is automatically downloaded. Surely I'm missing something or I wrote something incorrectly. Here's the jist of my code:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_experimental_option('prefs',  {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "plugins.plugins_disabled": ["Chrome PDF Viewer"]
    }
)

browser = webdriver.Chrome(options = chrome_options)

driver = webdriver.Chrome()
driver.get(url)

#In between here are a bunch of steps here that navigates through drop down menus

#This step may not be necessary, but I figured I'd include it to address when the pop up window refreshes and displays the report in PDF format through Chrome's PDF viewer
driver.switch_to.window(driver.window_handles[1])

So, at this point, Chrome still displays the PDF viewer even though I disabled it earlier. Nothing is downloaded, so I'm wondering if I need to provide another line of code or perhaps something else.

Using Selenium version 3.141.0, Python 3.6.4, Chrome webdriver 2.45 on Windows 10.


回答1:


You need to replace "plugins.plugins_disabled": ["Chrome PDF Viewer"]

With:

"plugins.always_open_pdf_externally": True

Hope this helps you!




回答2:


I had a similar problem, which I have solved with the firefox driver in Java. Here is my code:

ffprofile.setPreference("browser.helperApps.neverAsk.saveToDisk","application/pdf");
ffprofile.setPreference("browser.download.folderList", 2);
ffprofile.setPreference("browser.download.manager.showWhenStarting", false);
ffprofile.setPreference("browser.download.dir", "path/to/directory");
ffprofile.setPreference("plugin.scan.plid.all",false);
ffprofile.setPreference("plugin.scan.Acrobat","99.0");
ffprofile.setPreference("pdfjs.disabled",true);

Maybe for you it is an option to use Firefox and the Java->Python translation should be simple.



来源:https://stackoverflow.com/questions/53998690/downloading-a-pdf-using-selenium-chrome-and-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!