pdfbox | 易学教程

PDFBox Form fill - saveIncremental does not work


I have a PDF file with some form fields that I want to fill from Java. Right now I'm trying to fill just one field, which I find by its name. My code looks like this: File file = new File("c:/Testy/luxmed/Skierowanie3.pdf"); PDDocument document = PDDocument.load(file); PDDocumentCatalog doc = document.getDocumentCatalog(); PDAcroForm Form = doc.getAcroForm(); String formName = "topmostSubform[0].Page1[0].pana_pania[0]"; PDField f = Form.getField(formName); setField(document, formName, "Artur"); System.out.println("New value 2nd: " + f.getValueAsString()); document.saveIncremental(new

PDFBox - Line / Rectangle extraction


I am trying to extract text coordinates and line (or rectangle) coordinates from a PDF. The TextPosition class has getXDirAdj() and getYDirAdj() methods which transform coordinates according to the direction of the text piece the respective TextPosition object represents (Corrected based on comment from @mkl) The final output is consistent, irrespective of the page rotation. The coordinates needed on the output are X0,Y0 (TOP LEFT CORNER OF THE PAGE) This is a slight modification from the solution by @Tilman Hausherr. The y coordinates are inverted (height - y) to keep it consistent with the
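The inversion mentioned in the excerpt above (height - y) can be sketched in plain Java without PDFBox. This is only an illustration: the class and method names are hypothetical, and it assumes an unrotated page whose lower-left corner sits at the origin of PDF user space.

```java
/** Hypothetical helper for illustration; not part of PDFBox. */
final class TopLeftCoords {

    /** Converts a y value measured from the bottom edge into one measured from the top edge. */
    static float flipY(float pageHeight, float yFromBottom) {
        return pageHeight - yFromBottom;
    }

    /**
     * Returns the top-left corner {x, y} of a rectangle, measured from the
     * top-left corner of the page. The rectangle is given by its lower-left
     * corner (x, y), width and height in bottom-left-origin coordinates.
     */
    static float[] topLeftOfRect(float pageHeight, float x, float y, float width, float height) {
        // The upper edge of the rectangle is at y + height from the bottom.
        return new float[] { x, flipY(pageHeight, y + height) };
    }

    public static void main(String[] args) {
        float pageHeight = 792f; // assumed page height in points (US Letter)
        float[] corner = topLeftOfRect(pageHeight, 72f, 700f, 100f, 20f);
        System.out.println(corner[0] + ", " + corner[1]);
    }
}
```

For rotated pages or pages whose crop box does not start at the origin, the transformation needs more than a single subtraction, which is presumably why the excerpt discusses the direction-adjusted getters.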

Splitting a PDF results in very large PDF documents with PDFBox 2.0.2


I want to use the command java -jar pdfbox-app-2.y.z.jar PDFSplit [OPTIONS] <PDF file> to split one PDF into many other PDFs. But I found a problem: the PDF being split is "ActiveMQ In Action(Manning-2011).pdf", and it's 14.1MB. But when I run java -jar pdfbox-app-2.0.2.jar PDFSplit -split 5 -startPage 21 -endPage 40 -outputPrefix abc "ActiveMQ In Action(Manning-2011).pdf" every resulting PDF is larger than 79MB! How can I prevent this? This is a known bug in PDFBox 2.0.2. Splitting works fine in 2.0.1, and will work fine again in 2.0.3. The "bad" code has already been reverted. The reasons for the

How can I get Images coordinates in pdf into JSONfile?


I have written code that creates an HTML page, including images, from a page of a PDF document. Using the PDFBox library, I managed to extract the images from the PDF and add them to the HTML page, but I could not get the image coordinates. So I searched for how to extract image coordinates from a PDF and tried to do it with PDFBox. The code is below: public static void main(String[] args) throws Exception { try { PDDocument document = PDDocument.load( "/Users/tmdtjq/Downloads/PDFTest/test.pdf" ); PrintImageLocations printer = new

How to get the content of PDF form text fields using pdfbox?


I'm using this to get the text of a PDF file with org.apache.pdfbox: File f = new File(fileName); if (!f.isFile()) { System.out.println("File " + fileName + " does not exist."); return null; } try { parser = new PDFParser(new FileInputStream(f)); } catch (Exception e) { System.out.println("Unable to open PDF Parser."); return null; } try { parser.parse(); cosDoc = parser.getDocument(); pdfStripper = new PDFTextStripper(); pdDoc = new PDDocument(cosDoc); parsedText = pdfStripper.getText(pdDoc); } catch (Exception e) { e.printStackTrace(); } It works great for the PDFs I've used it on so far.

Finding javascript code in PDF using Apache PDFBox


My goal is to extract and process any JavaScript code that a PDF document might contain. By opening a PDF in an editor I can see objects like this: 402 0 obj <</S/JavaScript/JS(\n\r\n /* Set day 25 */\r\n FormRouter_SetCurrentDate\("25"\);\r)>> endobj I am trying to use Apache PDFBox to accomplish this but so far with no luck. This line returns an empty list: jsObj = doc.getObjectsByType(COSName.JAVA_SCRIPT); Can anyone give me some direction? This tool is based on the PrintFields example in PDFBox. It will show the JavaScript fields in forms. I wrote it last year for a guy who had problems
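The raw object quoted in the excerpt above stores the script as a PDF literal string, in which backslash sequences such as \n, \r, \( and \) are escapes. A PDF library decodes these while parsing, so the following is only a rough plain-Java sketch of what the raw text stands for. The class and method names are hypothetical, and line continuations (a backslash followed by an end-of-line) are deliberately left out.

```java
/** Hypothetical illustration of PDF literal-string escapes; not PDFBox code. */
final class PdfLiteralString {

    /** Decodes the backslash escapes found between the parentheses of a literal string. */
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '\\' || i + 1 >= raw.length()) {
                out.append(c); // ordinary character (or a trailing lone backslash)
                continue;
            }
            char next = raw.charAt(++i);
            switch (next) {
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 't': out.append('\t'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                default:
                    if (next >= '0' && next <= '7') {
                        // Octal escape: one to three octal digits.
                        int value = next - '0';
                        int digits = 1;
                        while (digits < 3 && i + 1 < raw.length()
                                && raw.charAt(i + 1) >= '0' && raw.charAt(i + 1) <= '7') {
                            value = value * 8 + (raw.charAt(++i) - '0');
                            digits++;
                        }
                        out.append((char) (value & 0xFF));
                    } else {
                        // Covers \( \) and \\ ; line continuations are not handled here.
                        out.append(next);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "FormRouter_SetCurrentDate\\(\"25\"\\);\\r";
        System.out.println(unescape(raw).trim());
    }
}
```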

java pdfbox printerjob wrong scaling / page format


I'm trying to print an existing PDF file with PDFBox. Currently I'm using PDFBox 2.0.0 RC3 through Maven. This is my current code: PDDocument document = PDDocument.load(new File(myPdfFile)); PrinterJob job = PrinterJob.getPrinterJob(); if (job.printDialog()) { job.setPageable(new PDFPageable(document)); job.print(); } document.close(); For testing I printed a test PDF with Adobe Acrobat and the same PDF with these few lines of code. Everything works fine except for the borders. All borders (header, footer, left & right side) are too small and the footer is way too small. Is there a magic method

Java PDFBox - Reading and modifying a pdf with special characters (diacritics)


Question: I'm trying to modify a PDF using this method (first code block - using PDFStreamParser and iterating through PDFOperator, then updating COSString when needed): http://www.coderanch.com/t/556009/open-source/PdfBox-Replace-String-double-pdf I'm having an issue with some UTF-8 characters (diacritics): when I print the text that I want to update, it shows like "Societ? ?ii Na?ionale" (where '?' is a code like 0002 or 0004). The funny things are: when I write the updated PDF file, the characters are

Using pdfbox in java to overlay text onto previously created pdf document


I already have several PDF documents that have been created, and I am trying to work on them with PDFBox. I need to put text into several places on these documents, but I do NOT want to modify the text that is within those areas. For instance, there may be a section as follows - NAME: ______________________________ I will put text into that area, but I need the underline to remain the same length. I believe the best solution would be to just create a textbox or similar that goes above the area so the line remains the same length. In other words, I do not want to edit the text inline

Java close PDF error


Question: I have this Java code: try { PDFTextStripper pdfs = new PDFTextStripper(); String textOfPDF = pdfs.getText(PDDocument.load("doc")); doc.add(new Field(campo.getDestino(), textOfPDF, Field.Store.NO, Field.Index.ANALYZED)); } catch (Exception exep) { System.out.println(exep); System.out.println("PDF fail"); } It throws this: 11:45:07,017 WARN [COSDocument] Warning: You did not close a PDF Document I don't know why, but it is thrown 1, 2, 3, or more times. I found that COSDocument is a class and have
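In the snippet above, the document returned by PDDocument.load is passed straight into getText and never closed, which is the likely trigger for the warning. A common way to guarantee the close call is try-with-resources. The sketch below shows the pattern with a stand-in class so that it runs without PDFBox: FakeDocument and CloseDemo are hypothetical names, and with the real library the resource in the try header would be the loaded PDDocument.

```java
import java.io.Closeable;

/** Illustration of try-with-resources using a stand-in; not PDFBox code. */
final class CloseDemo {

    /** Stand-in for a document type that must be closed after use. */
    static final class FakeDocument implements Closeable {
        boolean closed = false;

        String text() {
            return "extracted text";
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Kept only so the demo can show that close() was called. */
    static FakeDocument last;

    static String readText() {
        // The resource is closed automatically when the block exits,
        // whether it returns normally or throws.
        try (FakeDocument document = new FakeDocument()) {
            last = document;
            return document.text();
        }
    }

    public static void main(String[] args) {
        String text = readText();
        System.out.println(text + " / closed: " + last.closed);
    }
}
```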