Question
I have been trying to edit a PDF document to pre-fill form entries, and I've got it working (sort of). The text I'm adding goes in fine. However, other text that was already on the page has been replaced with symbols like "&%£!£!". I've worked out that it has something to do with the "contentStream" section in the code below, specifically the "setFont" line. If I remove it, the existing page content remains OK... except that the "HELLO RICHARD!!" text is no longer displayed!
Help please!
package pdfboxtest;

import java.awt.Color;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

public class PDFFormFiller {
    private static final String R40_NEW_FORM_PATH = "c:\\temp\\hmrc-r40.pdf";
    private static final String R40_COMPLETED_FORM_PATH = "c:\\temp\\hmrc-r40-complete.pdf";

    public static void main(String[] args) throws Exception {
        PDDocument doc = PDDocument.load(R40_NEW_FORM_PATH);
        addTextToPage(doc);
        doc.save(R40_COMPLETED_FORM_PATH);
        doc.close();
    }

    private static void addTextToPage(PDDocument doc) throws Exception {
        List pages = doc.getDocumentCatalog().getAllPages();
        PDPage firstPage = (PDPage) pages.get(0);
        // Append to the existing page content (third argument) instead of replacing it.
        PDPageContentStream contentStream = new PDPageContentStream(doc, firstPage, true, true);
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 24); // the line that triggers the problem
        contentStream.beginText();
        contentStream.setNonStrokingColor(Color.BLACK);
        contentStream.moveTextPositionByAmount(100, 200);
        contentStream.drawString("HELLO RICHARD!!");
        contentStream.endText();
        contentStream.close();
    }
}
Answer 1:
As already suspected in a comment, this is caused by a PDFBox issue for which I described a workaround in this answer. The issue is still present in PDFBox 1.8.2 but has since been fixed for versions 1.8.3 and 2.0.0; cf. PDFBOX-1753.
In your case the workaround changes the addTextToPage method like this:
// Requires: import java.io.IOException;
private static void addTextToPage(PDDocument doc) throws IOException {
    List pages = doc.getDocumentCatalog().getAllPages();
    PDPage firstPage = (PDPage) pages.get(0);
    PDPageContentStream contentStream = new PDPageContentStream(doc, firstPage, true, true);
    firstPage.getResources().getFonts(); // <<<<<< added: forces the page's font resources to be initialized
    contentStream.setFont(PDType1Font.HELVETICA_BOLD, 24);
    contentStream.beginText();
    contentStream.setNonStrokingColor(Color.BLACK);
    contentStream.moveTextPositionByAmount(100, 200);
    contentStream.drawString("HELLO RICHARD!!");
    contentStream.endText();
    contentStream.close();
}
The added line enforces an initialization which new PDPageContentStream forgets but which setFont counts on having been done. You can find details in the answer referenced above. You might want to inform PDFBox development.
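To make the idea of a "forgotten initialization" more concrete, here is a toy model in plain Java. It is not PDFBox code: ToyResources, FontKeyCollisionDemo, the "F0"/"F1" key scheme and the font names are all hypothetical, and the real internals may differ (see PDFBOX-1753 for the actual details). It only sketches one plausible way a lazily loaded font map can cause a newly added font to take over a key that existing page content already uses, which would make that content render with the wrong font.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: hypothetical classes, NOT the PDFBox implementation.
class ToyResources {
    // Fonts as stored in the page's resource dictionary (key -> font name).
    private final Map<String, String> stored = new LinkedHashMap<>();
    // In-memory view of the fonts, loaded lazily; null until getFonts() is called.
    private Map<String, String> fonts;

    ToyResources(Map<String, String> stored) {
        this.stored.putAll(stored);
    }

    // Loads the stored fonts on first use.
    Map<String, String> getFonts() {
        if (fonts == null) {
            fonts = new LinkedHashMap<>(stored);
        }
        return fonts;
    }

    // Picks the next free key by looking only at the in-memory view.
    // If that view has not been loaded yet, every key looks free.
    String addFont(String fontName) {
        Map<String, String> known = (fonts == null) ? new LinkedHashMap<>() : fonts;
        int i = 0;
        while (known.containsKey("F" + i)) {
            i++;
        }
        String key = "F" + i;
        getFonts().put(key, fontName); // may overwrite an existing entry
        return key;
    }
}

class FontKeyCollisionDemo {
    // Returns the font map after adding one font to a page that already uses "F0".
    static Map<String, String> run(boolean initializeFirst) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("F0", "FormBodyFont"); // hypothetical font used by the existing text
        ToyResources resources = new ToyResources(existing);
        if (initializeFirst) {
            resources.getFonts(); // the analogue of the workaround line
        }
        resources.addFont("Helvetica-Bold");
        return resources.getFonts();
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // {F0=Helvetica-Bold}
        System.out.println(run(true));  // {F0=FormBodyFont, F1=Helvetica-Bold}
    }
}
```

In the first run the existing "F0" entry is replaced, so anything on the page that refers to "F0" would now be drawn with the new font. In the second run the map is loaded before the new font is added, so the new font gets its own key and the existing entry survives.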
Source: https://stackoverflow.com/questions/19702671/pdfbox-scrambling-the-text