Printing Chinese characters in pdfbox

Posted by 只谈情不闲聊 on 2021-01-27 07:30:58

Question


I'm using the following set-up:

  • Java 11.0.1

  • pdfbox 2.0.15

Objective: Rendering a pdf that contains Chinese characters

Problem: java.lang.IllegalArgumentException: U+674E is not available in this font's encoding: WinAnsiEncoding
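The exception names WinAnsiEncoding, a single-byte encoding defined in the PDF specification and closely related to Windows code page 1252, so it has no slot for a CJK character such as U+674E. The limitation can be illustrated with the JDK alone. This is only a sketch: it assumes the `windows-1252` charset as a stand-in for WinAnsiEncoding (the two are similar but not identical), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class WinAnsiCheck {
    // Assumption: windows-1252 approximates PDF's WinAnsiEncoding;
    // the two differ in a few code points.
    static boolean fitsSingleByteEncoding(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(fitsSingleByteEncoding("Li Zhu"));       // true
        System.out.println(fitsSingleByteEncoding("\u674E\u67F1")); // false (U+674E U+67F1, the value from the question)
    }
}
```

A `false` result here only shows that the text cannot fit a single-byte Western encoding. It says nothing about which font PDFBox actually picked when building the field appearance.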

I already tried:

  • Using different fonts for Chinese character support. The latest one is NotoSansCJKtc-Regular.ttf

  • Setting the font to Unicode as described in "Java: Write national characters to PDF using PDFBox"; however, the loadTTF method used there is deprecated.

  • Using Arial-Unicode-MS_4302.ttf

My code looks like this (shortened a bit):

try (InputStream pdfIn = inputStream;
     PDDocument pdfDocument = PDDocument.load(pdfIn)) {

  PDFont formFont;
  // Check if Chinese characters are present
  if (!Util.containsHanScript(queryString)) {
    formFont = PDType0Font.load(pdfDocument,
        PdfReportGenerator.class.getResourceAsStream("LiberationSans-Regular.ttf"),
        false);
  } else {
    formFont = PDType0Font.load(pdfDocument,
        PdfReportGenerator.class.getResourceAsStream("NotoSansCJKtc-Regular.ttf"),
        false);
  }

  // Elided in the shortened snippet; assumed to come from the document catalog
  PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
  List<PDField> fields = acroForm.getFields();

  // Load fields into Map
  Map<String, PDField> pdfFields = new HashMap<>();
  for (PDField field : fields) {
    String key = field.getPartialName();
    pdfFields.put(key, field);
  }

  PDField currentField = pdfFields.get("someFieldID");
  PDVariableText pdfield = (PDVariableText) currentField;

  // Register the font in the form's default resources and use it in the default appearance
  PDResources res = acroForm.getDefaultResources();
  String fontName = res.add(formFont).getName();
  String defaultAppearanceString = "/" + fontName + " 10 Tf 0 g";

  pdfield.setDefaultAppearance(defaultAppearanceString);
  pdfield.setValue("李柱");

  acroForm.flatten(fields, true);

  ByteArrayOutputStream pdfOut = new ByteArrayOutputStream();
  pdfDocument.save(pdfOut);
}
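The snippet calls Util.containsHanScript, which the question does not show. One plausible shape for such a helper, offered purely as an assumption about what it does, is a scan for any code point in the Unicode Han script. The class name HanScriptCheck is hypothetical.

```java
class HanScriptCheck {
    // Hypothetical stand-in for the Util.containsHanScript helper, which is not shown in the question.
    // Returns true if at least one code point of the text belongs to the Unicode Han script.
    static boolean containsHanScript(String text) {
        if (text == null) {
            return false;
        }
        return text.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        System.out.println(containsHanScript("\u674E\u67F1")); // true (U+674E U+67F1)
        System.out.println(containsHanScript("Li Zhu"));       // false
    }
}
```

A check like this only detects Han ideographs. Text in other scripts that the Latin font may also lack, such as Hangul or kana, would still be routed to LiberationSans-Regular.ttf.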

Expected result: Chinese characters on pdf.

Actual result: java.lang.IllegalArgumentException: U+674E is not available in this font's encoding: WinAnsiEncoding

So my question is about how to best support rendering of Chinese characters with pdfbox. Any help is appreciated.


Answer 1:


The following code works for me; it uses the file attached to PDFBOX-4629:

// Load the form template attached to PDFBOX-4629
PDDocument doc = PDDocument.load(new URL("https://issues.apache.org/jira/secure/attachment/12977270/Report_Template_DE.pdf").openStream());
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
PDVariableText field = (PDVariableText) acroForm.getField("search_query");
List<PDField> fields = acroForm.getFields();
// Load a font with CJK glyphs as a Type 0 font; the last argument (embedSubset) is false, so the full font is embedded
PDFont font = PDType0Font.load(doc, new FileInputStream("c:/windows/fonts/arialuni.ttf"), false);

// Register the font in the form's default resources and reference it in the default appearance
PDResources res = acroForm.getDefaultResources();
String fontName = res.add(font).getName();
String defaultAppearanceString = "/" + fontName + " 10 Tf 0 g";

field.setDefaultAppearance(defaultAppearanceString);
field.setValue("李柱");

// Flatten the form and write the result
acroForm.flatten(fields, true);
doc.save("saved.pdf");
doc.close();


Source: https://stackoverflow.com/questions/57450039/printing-chinese-characters-in-pdfbox
