trouble using xhtml2pdf with unicode

蹲街弑〆低调 提交于 2019-12-04 19:34:27

Inserting following code into html helped me

<style>
@page {
size: a4;
margin: 0.5cm;
}

@font-face {
font-family: "Verdana";
src: url("verdana.ttf");
}

html {
font-family: Verdana;
font-size: 11pt;
}

</style>

in url instead of "verdana.ttf" you should put absolute path to font in your os

OrPo

If anyone in the future tries, like me, to figure out how to PROPERLY create a PDF file that contains Hebrew using xhtml2pdf, here's what worked for me:

  1. First thing: including the fonts settings as described here by @eviltrue in my HTML. This can be any font as long as it supports Hebrew characters, otherwise any Hebrew characters in the input HTML would simply appear as black rectangles in the PDF.

  2. At the time of writing this answer, while it is possible to output Hebrew characters to PDF in xhtml2pdf, Hebrew characters are outputted in revers order, i.e. שלום כיתה א
    would be א התיכ םולש.

At this point I was stuck, but then I stumbled upon this SO asnwer: https://stackoverflow.com/a/15449145/1918837

After installing the python-bidi package, here is an example of a complete solution (used in a python app):

from bidi import algorithm as bidialg
from xhtml2pdf import pisa

HTMLINPUT = """
            <!DOCTYPE html>
            <html>
            <head>
               <meta http-equiv="content-type" content="text/html; charset=utf-8">
               <style>
                  @page {
                      size: a4;
                      margin: 1cm;
                  }

                  @font-face {
                      font-family: DejaVu;
                      src: url(my_fonts_dir/DejaVuSans.ttf);
                  }

                  html {
                      font-family: DejaVu;
                      font-size: 11pt;
                  }
               </style>
            </head>
            <body>
               <div>Something in English - משהו בעברית</div>
            </body>
            </html>
            """

pdf = pisa.CreatePDF(bidialg.get_display(HTMLINPUT, base_dir="L"), outpufile)

# I'm using base_dir="L" so that "< >" signs in HTML tags wouldn't be
flipped by the bidi algorithm

The nice thing about the bidi algorithm is that you can have mixed RTL and LTR languages in the same line (like in the HTML example above) and still have a correctly formatted result.

EDIT: The best way to go now is definitely using wkhtmltopdf

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!