How do I use Headless Chrome in Chrome 60 on Windows 10?

一生所求 2020-12-29 00:56

I've been looking at the following article about Headless Chrome:
https://developers.google.com/web/updates/2017/04/headless-chrome

I just upgraded Chrome on Wi

6 answers
  • 2020-12-29 01:42

    With Chrome 61.0.3163.79, if I add --enable-logging then --dump-dom produces output:

    > "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --enable-logging --headless --disable-gpu --dump-dom https://www.chromestatus.com
    <body class="loading" data-path="/features">
    <app-drawer-layout fullbleed="">
    ...
    </script>
    </body>
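
    If you would rather capture that output from a script than from the cmd window, the same command can be wrapped with Python's subprocess module. This is only a sketch: the function names and the timeout value are my own, the chrome.exe path is the one from the command above, and whether stdout is actually captured may depend on your Chrome version.

```python
import subprocess

# Path taken from the command above; adjust it for your installation.
CHROME = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"


def dump_dom_command(url, chrome=CHROME):
    """Build the argument list for the --dump-dom invocation shown above."""
    return [chrome, "--enable-logging", "--headless", "--disable-gpu",
            "--dump-dom", url]


def dump_dom(url, chrome=CHROME, timeout=60):
    """Run headless Chrome and return whatever it wrote to stdout as text."""
    result = subprocess.run(
        dump_dom_command(url, chrome),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # discard anything Chrome logs to stderr
        timeout=timeout,            # illustrative safety limit, not a Chrome flag
        check=True,                 # raise if Chrome exits with an error code
    )
    return result.stdout.decode("utf-8", errors="replace")
```

    Calling dump_dom("https://www.chromestatus.com") should then return the same markup that the command prints.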
    

    If you want to programmatically control headless Chrome, here's one way to do it with Python3 and Selenium:

    In an Admin cmd window, install Selenium for Python:

    C:\Users\Mark> pip install -U selenium
    

    Download ChromeDriver v2.32 and extract it. I put the chromedriver.exe in C:\Users\Mark, which is where I put this headless.py Python script:

    from selenium import webdriver
    
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # remove this line if you want to see the browser popup
    driver = webdriver.Chrome(options=options)  # older Selenium 3.x releases used chrome_options=options
    driver.get('https://www.google.com/')
    print(driver.page_source)
    driver.quit()  # don't miss this, or chromedriver.exe will keep running!
    

    Run it in a normal cmd window:

    C:\Users\Mark> python headless.py
    <!DOCTYPE html><html xmlns="http://www.w3.org/1999/xhtml" ...
    ...  lots and lots of stuff here ...
    ...</body></html>
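
    Since a forgotten driver.quit() leaves chromedriver.exe running, it may be worth wrapping the script in try/finally. A minimal sketch, assuming a Selenium release that accepts the options= keyword; chrome_arguments and fetch_page_source are names I made up for illustration:

```python
def chrome_arguments(headless=True):
    """Return the Chrome command-line switches to hand to Selenium."""
    args = ["--disable-gpu"]
    if headless:
        args.append("--headless")
    return args


def fetch_page_source(url, headless=True):
    """Load url in Chrome via Selenium and return the page source."""
    # Imported here so chrome_arguments() can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Runs even if get() raises, so chromedriver.exe is not left behind.
        driver.quit()
```

    With this, print(fetch_page_source('https://www.google.com/')) does the same job as headless.py above.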
    
  • This works for me:

    start chrome --enable-logging --headless --disable-gpu --print-to-pdf=c:\misc\output.pdf https://www.google.com/
    

    ... but only when launched via "start chrome", with "--enable-logging", and with an explicit path for the PDF, and only if the folder "misc" already exists on the C: drive.

    Addition: the PDF path ("c:\misc" above) can of course be replaced with any other folder.
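
    Because the target folder has to exist before Chrome writes the PDF, a script can create it first. A rough sketch in Python; the function name is hypothetical, and "chrome" is a placeholder that on Windows will probably need to be the full path to chrome.exe, since "start" is a cmd built-in and not available to subprocess:

```python
import os


def print_to_pdf_command(url, pdf_path, chrome="chrome"):
    """Create the PDF's folder if needed and build the Chrome arguments."""
    pdf_path = os.path.abspath(pdf_path)
    # Chrome does not create missing folders, so make sure the parent exists.
    os.makedirs(os.path.dirname(pdf_path), exist_ok=True)
    return [chrome, "--enable-logging", "--headless", "--disable-gpu",
            "--print-to-pdf=" + pdf_path, url]
```

    The returned list can be passed to subprocess.run(), for example subprocess.run(print_to_pdf_command("https://www.google.com/", r"c:\misc\output.pdf")).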

  • 2020-12-29 01:44

    If you want to sidestep the problem altogether and have a service do the work for you, I'm the author/founder of browserless, which attempts to tackle running headless Chrome in a service-like fashion. Otherwise, it's pretty tough to keep up with the changes and to make sure all the appropriate packages and resources are installed to get Chrome running, but it's definitely doable.

  • 2020-12-29 01:49

    Current versions (68-70) seem to require --no-sandbox in order to run. Without it, they do absolutely nothing and hang in the background.

    The full commands I use are:

    chrome --headless --user-data-dir=tmp --no-sandbox --enable-logging --dump-dom https://www.google.com/ > file.html
    chrome --headless --user-data-dir=tmp --no-sandbox --print-to-pdf=whatever.pdf https://www.google.com/
    

    Using --no-sandbox is a pretty bad idea and you should use this only for websites you trust, but sadly it's the only way of making it work at all.

    --user-data-dir=... uses the specified directory instead of the default one, which is likely already in use by your regular browser.

    However, if you're trying to make a PDF from HTML, then this is fairly useless, since you can't remove header and footer (containing text like file:///...) and the only viable solution is to use Puppeteer.
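
    If you script the --dump-dom command above, a throwaway profile directory and a timeout help with the two problems mentioned: the default profile being in use and Chrome hanging in the background. This is a sketch under assumptions: the function names and the 30-second limit are mine, "chrome" is a placeholder for the real executable path, and any extra processes Chrome spawned may outlive the timeout.

```python
import subprocess
import tempfile


def headless_args(url, profile_dir, chrome="chrome"):
    """Arguments mirroring the --dump-dom command above, with a given profile."""
    return [chrome, "--headless", "--user-data-dir=" + profile_dir,
            "--no-sandbox", "--enable-logging", "--dump-dom", url]


def run_with_timeout(args, timeout=30):
    """Run a command and return its stdout, or None if it exceeds timeout."""
    try:
        done = subprocess.run(args, stdout=subprocess.PIPE,
                              stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the process it started before raising.
        return None
    return done.stdout.decode("utf-8", errors="replace")


def dump_dom_isolated(url, chrome="chrome", timeout=30):
    """Dump the DOM using a temporary profile that is deleted afterwards."""
    with tempfile.TemporaryDirectory() as profile_dir:
        return run_with_timeout(headless_args(url, profile_dir, chrome), timeout)
```

    A return value of None from dump_dom_isolated() means Chrome did not finish within the time limit.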

  • 2020-12-29 01:50

    I know this question is for Windows, but since Google gives this post as the first search result, here's what works on Mac:

    Mac OS X

    /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --headless --dump-dom 'http://www.google.com'
    

    Note that you MUST include the http:// prefix in the URL, or it won't work.
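
    If the URLs come from user input, a small helper can add the missing prefix before they reach Chrome. A minimal sketch in Python; ensure_scheme is a made-up name, and it only recognises prefixes of the form "scheme://":

```python
import re

# Matches a leading URL scheme such as "http://" or "file://".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def ensure_scheme(url, default="http"):
    """Prefix url with default:// when it has no scheme of its own."""
    if _SCHEME.match(url):
        return url
    return default + "://" + url


print(ensure_scheme("www.google.com"))  # prints http://www.google.com
```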

    Further tips

    To indent the html (which is highly desirable in real pages that are bloated), use tidy:

    /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --headless --dump-dom 'http://www.google.com' | tidy
    

    You can get tidy with:

    brew install tidy
    
  • 2020-12-29 01:53

    It should already be working. Check the Chrome version directory:

    C:\Program Files (x86)\Google\Chrome\Application\60.0.3112.78
    

    For the command

    chrome --headless --disable-gpu --print-to-pdf https://www.google.com/

    the PDF ends up at:

    C:\Program Files (x86)\Google\Chrome\Application\60.0.3112.78\output.pdf
    

    Edit: You still need to run the command from the directory that contains the Chrome executable, in this case:

     C:\Program Files (x86)\Google\Chrome\Application\
    