How to deal with ContentNotFoundError when using wkhtmltopdf?

Posted by 心已入冬 on 2019-12-06 21:55:20

Question


Can someone tell me how to resolve the following issues?

  1. wkhtmltopdf has no option to pass proxy information (-p or --proxy), unlike previous versions, and it does not use the system $http_proxy and $https_proxy environment variables either.

  2. wkhtmltopdf does not work with HTTPS/SSL, even though I set LD_LIBRARY_PATH for libssl.so and libcrypto.so.

    [deploy@localhost ~]$ wkhtmltopdf https://www.google.co.in google.pdf
    loaded the Generic plugin 
    Loading page (1/2)
    Error: Failed loading page https://www.google.co.in (sometimes it will work just to ignore this error with --load-error-handling ignore)
    Exit with code 1 due to network error: UnknownNetworkError
    

    and

    [deploy@localhost ~]$ wkhtmltoimage https://www.google.co.in sample.jpg
    loaded the Generic plugin 
    Loading page (1/2)
    Error: Failed loading page https://www.google.co.in (sometimes it will work just to ignore this error with --load-error-handling ignore)
    Exit with code 1 due to network error: UnknownNetworkError
    
  3. wkhtmltopdf works only partially with HTTP. The output PDF files are missing some content, backgrounds, and positioning.

    [deploy@localhost ~]$ wkhtmltopdf http://localhost:8880/ sample.pdf
    loaded the Generic plugin 
    Loading page (1/2)
    Printing pages (2/2)                                               
    Done                                                           
    Exit with code 1 due to network error: ContentNotFoundError
    
    [deploy@localhost ~]$ wkhtmltoimage http://localhost:8880/ sample.jpg
    loaded the Generic plugin 
    Loading page (1/2)
    Rendering (2/2)                                                    
    Done                                                               
    Exit with code 1 due to network error: ContentNotFoundError
    

Note: I'm using wkhtmltopdf-0.12.1-1.fc20.x86_64 and qt-4.8.6-10.fc20.x86_64.


Answer 1:


Unfortunately, wkhtmltopdf does not cope well with complex websites, because it uses the Qt/QtWebKit library, which seems to have some issues.

One problem is that wkhtmltopdf does not resolve relative addresses correctly (GitHub: #1634, #1886, #2359, QTBUG-46240), such as:

<img src="/images/filetypes/txt.png">
<script src="//cdn.optimizely.com/js/653710485.js">

Instead, it loads them as local files. One workaround I have found is to correct the HTML file in place with the ex editor:

ex -V1 page.html <<-EOF
  %s,'//,'http://,ge 
  %s,"//,"http://,ge 
  %s,'/,'http://www.example.com/,ge
  %s,"/,"http://www.example.com/,ge
  wq " Update changes and quit.
EOF

However, this does not work for remote pages that contain these kinds of URLs, because they cannot be edited in place.
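The same rewrite can also be sketched as a stream filter, so it can be applied to HTML that has been downloaded first. This is only a minimal sketch under some assumptions: the function name absolutize_urls is hypothetical, the base URL must not contain `#` or `&`, and only src and href attributes are touched (so quoted slashes elsewhere, for example in inline scripts, are left alone).

```shell
# Hypothetical helper: read HTML on stdin, write it to stdout with
# protocol-relative and root-relative src/href values made absolute.
# $1 is the site's base URL, e.g. http://www.example.com
absolutize_urls() (
  base="${1%/}"   # drop one trailing slash, if present
  sed -E \
    -e "s#(src|href)=([\"'])//#\1=\2http://#g" \
    -e "s#(src|href)=([\"'])/#\1=\2${base}/#g"
)
```

A possible use is `curl -s http://www.example.com/ | absolutize_urls http://www.example.com > page.html`, followed by `wkhtmltopdf page.html page.pdf`. Relative URLs inside remotely hosted stylesheets are not covered by this.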

Another problem is that it does not handle missing resources gracefully. You can try specifying --load-error-handling ignore, but in most cases it does not work (see #2051), so this issue is still outstanding. The workaround is simply to remove the invalid resources before conversion.
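To find out which resources to remove, something like the following could list the candidates for a locally saved page. It is a rough sketch, not part of wkhtmltopdf: the function name list_missing_resources is hypothetical, it only inspects quoted src attributes, and it assumes that root-relative paths are looked up on the local filesystem, as described above.

```shell
# Hypothetical helper: print every src="..." value in an HTML file that does
# not resolve to an existing local file. Remote and data: URLs are skipped.
list_missing_resources() (
  page="$1"
  dir=$(dirname "$page")
  grep -oE "src=[\"'][^\"']+[\"']" "$page" |
    sed -E "s/^src=[\"']//; s/[\"']\$//" |
    while IFS= read -r ref; do
      case "$ref" in
        http://*|https://*|//*|data:*) continue ;;   # not a local file
      esac
      case "$ref" in
        /*) path="$ref" ;;          # root-relative: checked as a local absolute path
        *)  path="$dir/$ref" ;;     # relative to the directory of the page
      esac
      [ -e "$path" ] || printf '%s\n' "$ref"
    done
)
```

Running `list_missing_resources page.html` prints one unresolved reference per line; those are the tags to delete or fix before calling wkhtmltopdf.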

As alternatives to wkhtmltopdf, you can use htmldoc, or PhantomJS with an additional script such as rasterize.js:

phantomjs rasterize.js http://example.com/ example.pdf

or dompdf (an HTML to PDF converter for PHP, installable via Composer), with the sample code below:

<?php
// somewhere early in your project's loading, require the Composer autoloader
// see: http://getcomposer.org/doc/00-intro.md
$HOMEDIR = "/Users/foo";
require $HOMEDIR . '/.composer/vendor/autoload.php';

// disable DOMPDF's internal autoloader if you are using Composer
define('DOMPDF_ENABLE_AUTOLOAD', FALSE);
define('DOMPDF_ENABLE_REMOTE', TRUE);

// include DOMPDF's default configuration
require_once $HOMEDIR . '/.composer/vendor/dompdf/dompdf/dompdf_config.inc.php';

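// fetch the HTML to convert; the source must be an HTML document, not a PDF
$htmlString = file_get_contents("https://example.com/foo.html");

// legacy dompdf 0.6.x API: load the markup, lay it out, then send the PDF to the browser
$dompdf = new DOMPDF();
$dompdf->load_html($htmlString);
$dompdf->render();
$dompdf->stream("sample.pdf");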



Answer 2:


My problem was solved by removing @font-face from the CSS.
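As an illustration only, the @font-face blocks could be stripped from a stylesheet with a small awk filter. The function name strip_font_face is hypothetical, and the sketch assumes that each @font-face block starts on its own line and that nothing else follows its closing brace on the same line.

```shell
# Hypothetical helper: copy CSS from stdin (or the given files) to stdout,
# dropping every @font-face { ... } block.
strip_font_face() {
  awk '
    # A line mentioning @font-face starts a block to skip.
    /@font-face/ { skipping = 1 }
    skipping {
      # The block ends on the first line that contains a closing brace.
      if (index($0, "}")) skipping = 0
      next
    }
    { print }
  ' "$@"
}
```

For example, `strip_font_face style.css > style.nofonts.css` writes a copy without the font declarations, which can then be referenced from the page being converted.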



Source: https://stackoverflow.com/questions/25894251/how-to-deal-with-contentnotfounderror-when-using-wkhtmltopdf
