pdfbox

How to insert invisible text into a PDF?

二次信任 posted on 2019-12-18 08:59:21
Question: I want to insert invisible text into an existing PDF file to make it searchable. What library should I use? I would appreciate links to the specific API methods to use. It should be free, ideally open source. Thanks a lot! (For the curious: I want to automatically OCR incoming scanned papers and make them searchable in an Alfresco repository.) Answer 1: Three options. Text render mode 3: "No stroke, no fill". myPdfContentByte.setTextRenderMode(PdfContentByte.TEXT_RENDER_MODE_INVISIBLE); Draw the text behind

How to add a border to a checkbox and make it always visible

此生再无相见时 posted on 2019-12-18 08:59:16
Question: When I create my PDF, the checkbox at first doesn't have any appearance. After I click on it (onBlur), some kind of shade is visible, and when it is focused it isn't visible anymore. How can I make it always visible? And how can I add some kind of border (without doing it manually with a rectangle class)? public class CheckBoxWriter { public static void main(String[] args) throws IOException { PDDocument document = new PDDocument(); PDPage page = new PDPage(); document.addPage(page); //

Radiobutton display problems with PDFBox

烂漫一生 posted on 2019-12-18 07:21:17
Question: I used the code from the answer to this question to create my radio buttons: How to Create a Radio Button Group with PDFBox 2.0. After I created my PDF and tried to read the (programmatically) selected value from it, this code worked fine: PDDocumentCatalog catalog = doc.getDocumentCatalog(); PDAcroForm form = catalog.getAcroForm(); List<PDField> fields = form.getFields(); for(PDField field: fields) { Object value = field.getValueAsString(); String name = field.getFullyQualifiedName(); if

Adding Header to existing PDF File using PDFBox

微笑、不失礼 posted on 2019-12-18 06:57:50
Question: I am trying to add a header to an existing PDF file. It works, but the table headers in the existing PDF are messed up by the change in the font. If I remove the line that sets the font, the header doesn't show up. Here is my code: // the document PDDocument doc = null; try { doc = PDDocument.load( file ); List allPages = doc.getDocumentCatalog().getAllPages(); //PDFont font = PDType1Font.HELVETICA_BOLD; for( int i=0; i<allPages.size(); i++ ) { PDPage page = (PDPage)allPages.get( i ); PDRectangle

How to Create a Radio Button Group with PDFBox 2.0

流过昼夜 posted on 2019-12-18 06:39:36
Question: I want to create a radio button group using PDFBox 2.0. I am able to create 3 radio buttons, but I can't figure out how to group them (PDFBox 1.8 used PDRadioCollection, but 2.0 doesn't). How do you create a radio button group with PDFBox 2.0? Here is my current code: PDDocument document = new PDDocument(); PDPage page = new PDPage(PDRectangle.A4); document.addPage(page); PDAcroForm acroForm = new PDAcroForm(document); acroForm.setNeedAppearances(true); document.getDocumentCatalog()

Using PdfBox, how do I retrieve contents of PDDocument as a byte array?

若如初见. posted on 2019-12-18 04:34:13
Question: I am currently using PDFBox as the driver for a PDF file editor application. I need the contents of the PDFBox representation of a PDF file (PDDocument) as a byte array. Does anyone know how to do this? Answer 1: I hope it's not too late... ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(); document.save(byteArrayOutputStream); document.close(); InputStream inputStream = new ByteArrayInputStream(byteArrayOutputStream.toByteArray()); And voila! You've got both input streams!
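The save-to-memory step in the answer above can be wrapped in a small helper. This is a minimal sketch, not part of the PDFBox API: `SaveToBytes`, `StreamWriter`, and `toBytes` are hypothetical names introduced here for illustration, and the code uses only `java.io`, so it does not depend on a particular PDFBox version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper (not a PDFBox class): collects whatever a
// "save to stream" operation writes and returns it as a byte array.
final class SaveToBytes {

    // Hypothetical callback type: anything that can write itself to a stream.
    @FunctionalInterface
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    static byte[] toBytes(StreamWriter writer) throws IOException {
        // ByteArrayOutputStream grows as needed and keeps everything in memory.
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            writer.writeTo(buffer);
            // toByteArray() returns a copy of the bytes written so far.
            return buffer.toByteArray();
        }
    }
}
```

Assuming `document` is a `PDDocument` and a PDFBox version in which `save(OutputStream)` declares only `IOException` (as in 2.x), the call would look like `byte[] pdfBytes = SaveToBytes.toBytes(document::save);`, followed by `document.close()`. The array is the result the question asks for; wrapping it in a `ByteArrayInputStream`, as the answer does, is only needed when a stream is required downstream.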

Comparison of two pdf files

狂风中的少年 posted on 2019-12-18 04:07:25
Question: I need to compare the contents of two nearly identical files and highlight the differing portions in the corresponding PDF file. I am using PDFBox. Please help me at least with the logic. Answer 1: If you prefer a tool with a GUI, you could try this one: diffpdf. It's by Mark Summerfield, and since it's written with Qt, it should be available (or should be buildable) on all platforms where Qt runs. Answer 2: You can do the same thing with a shell script on Linux. The script wraps 3
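For the "logic" part of the question, one simple starting point is a page-by-page text comparison. The sketch below assumes the text of each page has already been extracted into a list of strings (for example with PDFBox's `PDFTextStripper`). It does not cover highlighting, and `PageTextDiff` and `differingPages` are hypothetical names introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: compares two documents given as per-page text.
final class PageTextDiff {

    // Returns the 1-based numbers of pages whose text differs.
    // A page that exists in only one document counts as different.
    static List<Integer> differingPages(List<String> left, List<String> right) {
        List<Integer> result = new ArrayList<>();
        int pages = Math.max(left.size(), right.size());
        for (int i = 0; i < pages; i++) {
            String a = i < left.size() ? left.get(i) : null;
            String b = i < right.size() ? right.get(i) : null;
            // a == null means the page is missing on the left side;
            // a.equals(null) is false, which covers a missing right page.
            if (a == null || !a.equals(b)) {
                result.add(i + 1);
            }
        }
        return result;
    }
}
```

The returned page numbers only narrow down where to look. Locating and marking the exact differing regions would need position information from the text extraction, which this sketch deliberately leaves out.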

PDFBox API: How to change font to handle Cyrillic values in an AcroForm field

房东的猫 posted on 2019-12-17 20:56:37
Question: I need help with adding a Cyrillic value to a field using the PDFBox API. Here is what I have so far: PDDocument document = PDDocument.load(file); PDDocumentCatalog dc = document.getDocumentCatalog(); PDAcroForm acroForm = dc.getAcroForm(); PDField naziv = acroForm.getField("naziv"); naziv.setValue("Наслов"); // this part right here naziv.setValue("Naslov"); // it works like this It works perfectly when my input is in the Latin alphabet, but I need to handle Cyrillic input as well. How can I do it?

Tagged PDF with PDFBox

こ雲淡風輕ζ posted on 2019-12-17 20:47:23
Question: Is it possible to create a tagged PDF (PDF/UA) with PDFBox? It looks like PDFBox has an API for that (package org.apache.pdfbox.pdmodel.documentinterchange.taggedpdf), but I can't find any tutorials or code examples. Using the code below, I generated a PDF file containing an image, and the screen reader NVDA (in my case) recognizes it and reads '... graphic Alternate Description'. However, the accessibility checker PAC 2 shows an error: 'Image object not tagged'. PDDocument doc = new PDDocument

Converting PDF to image (with proper formatting)

﹥>﹥吖頭↗ posted on 2019-12-17 20:38:59
Question: I have a PDF file (attached). My objective is to convert the PDF to an image using PDFBox exactly as it is (the same as using the Snipping Tool in Windows). The PDF has all kinds of shapes and text. I am using the following code: PDDocument doc = PDDocument.load("Hello World.pdf"); PDPage firstPage = (PDPage) doc.getDocumentCatalog().getAllPages().get(67); BufferedImage bufferedImage = firstPage.convertToImage(imageType,screenResolution); ImageIO.write(bufferedImage, "png",new File("out.png")); When I use the