How to change font encoding when converting docx -> pdf with docx4j?

Submitted by 情到浓时终转凉″ on 2019-12-06 13:39:13

My problem was that the proper TrueType fonts were missing on the Linux server. Default fonts were substituted instead, and they did not cover my code pages.

I solved the problem by installing the default Microsoft Windows fonts via ttf-mscorefonts-installer.

On Debian:

    apt-get install ttf-mscorefonts-installer
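To confirm that the installation took effect, one option is to ask the JVM which font families it can see. The following is only a minimal sketch: the class name `FontCheck`, the helper `missingFonts`, and the list of required fonts are made up for illustration, and the JVM's view of installed fonts may differ from what the PDF converter itself discovers.

```java
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

public class FontCheck {

    /** Returns the required families that are absent from the available ones (case-insensitive). */
    static List<String> missingFonts(Collection<String> required, Collection<String> available) {
        List<String> lower = new ArrayList<>();
        for (String name : available) {
            lower.add(name.toLowerCase(Locale.ROOT));
        }
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!lower.contains(name.toLowerCase(Locale.ROOT))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Font families the JVM can see on this machine.
        String[] installed = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        // Hypothetical list: replace with the fonts your .docx actually uses.
        List<String> required = Arrays.asList("Arial", "Times New Roman", "Courier New");
        System.out.println("Missing fonts: " + missingFonts(required, Arrays.asList(installed)));
    }
}
```

If the fonts used by the document show up as missing here, substitution by a default font is the likely cause of the broken characters.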

I had the same problem and found that, as you mentioned yourself, it is a font problem. The font on the system needs to support the characters of your encoding.

For example, in documents using the "Arial" font, German umlauts were shown as "?".

I found another solution: override the PDF font encoding as follows:

    // The classes below come from XDocReport (fr.opensagres.xdocreport).

    // Read the template from the working directory.
    File docxFile = new File(System.getProperty("user.dir"), "Test.docx");

    // try-with-resources closes both streams even if the conversion fails.
    try (InputStream in = new FileInputStream(docxFile);
         OutputStream out = new FileOutputStream(new File(docxFile.getPath() + ".pdf"))) {

        // Prepare the document context.
        IXDocReport report = XDocReportRegistry.getRegistry().loadReport(in, TemplateEngineKind.Velocity);
        IContext context = report.createContext();
        context.put("name", "Michael Küfner");

        // Generate PDF output with an explicit font encoding.
        Options options = Options.getTo(ConverterTypeTo.PDF).via(ConverterTypeVia.XWPF);
        PdfOptions pdfOptions = PdfOptions.create();
        pdfOptions.fontEncoding("iso-8859-15");
        options.subOptions(pdfOptions);

        report.convert(context, options, out);
    }

Try setting the value passed to pdfOptions.fontEncoding ("iso-8859-15" in my case) to suit your needs.

Setting this to "UTF-8", which seems to be the default, resulted in the same problem with special characters.
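When choosing an encoding value, it can help to first check whether the charset can represent your text at all. The sketch below uses only the JDK; the class `EncodingCheck` and its helpers are made up for illustration, and it assumes the name you pass is also a valid Java charset name. It says nothing about whether the font has the glyphs or how the PDF library treats the encoding, so treat it as a quick sanity check only.

```java
import java.nio.charset.Charset;

public class EncodingCheck {

    /** True if every character of text can be represented in the named charset. */
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    /** Round-trips text through the charset; unmappable characters come back as '?'. */
    static String roundTrip(String charsetName, String text) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // \u00FC is the German umlaut "ü", escaped so the source file encoding does not matter.
        String name = "Michael K\u00FCfner";
        for (String cs : new String[] {"US-ASCII", "ISO-8859-15", "UTF-8"}) {
            System.out.println(cs + ": canEncode=" + canEncode(cs, name)
                    + ", roundTrip=" + roundTrip(cs, name));
        }
    }
}
```

With US-ASCII the umlaut cannot be encoded and the round trip yields "Michael K?fner", the same kind of "?" substitution described above. ISO-8859-15 and UTF-8 both preserve the name, so a "?" that remains with those values points at the font rather than at the charset.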

Another thing I found:

Using the "Calibri" font, which is default for Word 2007/2010, the problem did not occur, even when using UTF-8 encoding. Maybe the embedded Type-1 Arial Font in iText, which is used for generating PDFs, does not support UTF-8 encoding.
