Chinese characters rendered as squares when using Apache POI to convert PPT to image

Submitted by 限于喜欢 on 2019-12-04 17:25:49

This seems to be a bug in Apache POI. I have filed it in Bugzilla:

https://issues.apache.org/bugzilla/show_bug.cgi?id=54880

Wu Huajie

The problem is not on the POI side but in the JVM font settings.

You need to set the text font to one that the JVM can actually find (for example a font installed under /usr/lib/jvm/jdk1.8.0_20/jre/lib/fonts or similar), such as SimSun (simsun.ttc).

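// Assumes slide[i] is an XSLFSlide and logger is defined in the surrounding code.
XSLFTextShape[] phs = slide[i].getPlaceholders();
for (XSLFTextShape ts : phs) {
  java.util.List<XSLFTextParagraph> tpl = ts.getTextParagraphs();
  for (XSLFTextParagraph tp : tpl) {
    java.util.List<XSLFTextRun> trs = tp.getTextRuns();
    for (XSLFTextRun tr : trs) {
      // Log the font the run currently uses, then override it with a
      // font that has Chinese glyphs and is visible to the JVM.
      logger.info(tr.getFontFamily());
      tr.setFontFamily("SimSun");
    }
  }
}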
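
Before hard-coding "SimSun", it may help to check which fonts the JVM actually sees and whether any of them can display the text. The following is a minimal sketch using only java.awt, not part of the original answer; the class name FontProbe and its helper methods are hypothetical names chosen for illustration.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name and methods are illustrative only.
class FontProbe {

    /** Returns true if the text contains at least one Han (Chinese) character. */
    static boolean containsHan(String text) {
        if (text == null) {
            return false;
        }
        return text.codePoints()
                   .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    /** Lists the font families visible to the JVM that can display every character of the sample. */
    static List<String> familiesThatCanDisplay(String sample) {
        List<String> result = new ArrayList<>();
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                                               .getAvailableFontFamilyNames();
        for (String family : families) {
            Font font = new Font(family, Font.PLAIN, 12);
            // canDisplayUpTo returns -1 when the font can display the whole string.
            if (font.canDisplayUpTo(sample) == -1) {
                result.add(family);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "\u4e2d\u6587"; // the two characters of "Chinese (language)"
        System.out.println("Contains Han characters: " + containsHan(sample));
        for (String family : familiesThatCanDisplay(sample)) {
            System.out.println("Can display sample: " + family);
        }
    }
}
```

If the list printed by main is empty, no font visible to the JVM covers the sample, which would be consistent with the squares in the rendered image. A check like containsHan could also be used to override the font only on runs that actually contain Chinese text.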

The issue is the use of FileOutputStream, which will always write data to the file in the default system encoding, most probably ISO-8859-1 on Windows. This encoding does not support Chinese characters. You need to create a stream that writes using UTF-8 encoding, which requires creating a reader. I looked at the API but did not find any methods that take a reader as an argument. Check whether ImageOutputStream can help you.
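
For reference, javax.imageio.ImageIO.write accepts a File, an OutputStream, or an ImageOutputStream as its target and writes the encoded image bytes to it. The following is a minimal sketch of that output step, assuming the slide has already been drawn into a BufferedImage; the class name PngWriteSketch, the helper methods, and the blank canvas standing in for a rendered slide are all hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class; the name and methods are illustrative only.
class PngWriteSketch {

    /** Creates a white canvas that stands in for a rendered slide. */
    static BufferedImage blankCanvas(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
        } finally {
            g.dispose(); // release the graphics context
        }
        return img;
    }

    /** Writes the image as PNG and fails loudly if no PNG writer is registered. */
    static File writePng(BufferedImage img, File target) throws IOException {
        if (!ImageIO.write(img, "png", target)) {
            throw new IOException("No PNG writer available");
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        File out = File.createTempFile("slide", ".png");
        writePng(blankCanvas(320, 240), out);
        System.out.println("Wrote " + out.length() + " bytes to " + out);
    }
}
```

In a real conversion, the canvas would be filled by drawing the slide into the Graphics2D object before the image is written.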
