Missing elements when using selenium chrome driver to automatically 'Save as PDF'

Submitted by 限于喜欢 on 2019-12-21 03:32:51

Question


I am trying to automatically save, as a PDF, a page created with pdf2htmlEX (https://github.com/coolwanglu/pdf2htmlEX) using the Selenium (Chrome) webdriver.

It almost works, except that figure captions and sometimes even parts of the figures are missing.

Manually saved:

Automatically saved using selenium & chrome webdriver:

Here is my code (you need ChromeDriver (http://chromedriver.chromium.org/downloads) in the same folder as this script):

import json
from selenium import webdriver

# print settings: save as pdf, 'letter' formatting
appState = """{
    "recentDestinations": [
        {
            "id": "Save as PDF",
            "origin": "local"
        }
    ],
    "mediaSize": {
        "height_microns": 279400,
        "name": "NA_LETTER",
        "width_microns": 215900,
        "custom_display_name": "Letter"
    },
    "selectedDestinationId": "Save as PDF",
    "version": 2
}"""

appState = json.loads(appState)
profile = {"printing.print_preview_sticky_settings.appState": json.dumps(appState)}
chrome_options = webdriver.ChromeOptions()
chrome_options.add_experimental_option('prefs', profile)
# Enable automatically pressing the print button in print preview
# https://peter.sh/experiments/chromium-command-line-switches/
chrome_options.add_argument('--kiosk-printing')

driver = webdriver.Chrome('./chromedriver', options=chrome_options)
driver.get('http://www.deeplearningbook.org/contents/intro.html')
driver.execute_script('window.print();')
driver.quit()

This sometimes happens when I print manually, too. But if I then change any of the print options, the preview reloads and the image captions reappear, and they stay there no matter which options I enable or disable afterwards.

What I tried so far:

  • different Chrome webdriver versions (71, 72, 73) from this site: http://chromedriver.chromium.org/downloads
  • enable background graphics by adding '"isCssBackgroundEnabled": true' to the appState

Answer 1:


So, through fiddling around, I came across the solution by accident. I don't really understand why, but enabling 'PrintBrowser mode' ("Enables PrintBrowser mode, in which everything renders as though printed.") solves the issue. This may or may not have to do with CSS loading properly.

I just need to add chrome_options.add_argument('--enable-print-browser') and all elements are there!
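For reference, here is a minimal sketch of how the print preferences and both command-line switches from this thread could be assembled in one place. The helper names build_print_config and make_options are hypothetical (they are not part of Selenium or of the original script), and whether '--enable-print-browser' behaves the same way on current Chrome versions is an assumption I have not verified.

```python
import json


def build_print_config(background_graphics=False):
    """Return (prefs, flags) for unattended 'Save as PDF' printing.

    Hypothetical helper: it only builds plain Python data, so it can be
    inspected without starting a browser.
    """
    # Same print-preview settings as in the question: Save as PDF, Letter size.
    app_state = {
        "recentDestinations": [{"id": "Save as PDF", "origin": "local"}],
        "mediaSize": {
            "height_microns": 279400,
            "name": "NA_LETTER",
            "width_microns": 215900,
            "custom_display_name": "Letter",
        },
        "selectedDestinationId": "Save as PDF",
        "version": 2,
    }
    if background_graphics:
        # Optional setting the question mentions trying.
        app_state["isCssBackgroundEnabled"] = True

    # Chrome expects the appState preference as a JSON string.
    prefs = {
        "printing.print_preview_sticky_settings.appState": json.dumps(app_state)
    }
    flags = [
        "--kiosk-printing",        # press the print button automatically
        "--enable-print-browser",  # the switch reported as the fix in this answer
    ]
    return prefs, flags


def make_options():
    """Build ChromeOptions from the config above (requires selenium)."""
    # Imported here so build_print_config() works without selenium installed.
    from selenium import webdriver

    prefs, flags = build_print_config()
    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", prefs)
    for flag in flags:
        options.add_argument(flag)
    return options
```

The object returned by make_options() would be passed as options= when creating the driver, exactly as chrome_options is in the question's script.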



Source: https://stackoverflow.com/questions/54943980/missing-elements-when-using-selenium-chrome-driver-to-automatically-save-as-pdf
