pdfbox

PDFBox Form fill - saveIncremental does not work

孤街醉人 posted on 2019-12-06 09:54:48
I have a PDF file with some form fields that I want to fill from Java. Right now I'm trying to fill just one field, which I look up by its name. My code looks like this: File file = new File("c:/Testy/luxmed/Skierowanie3.pdf"); PDDocument document = PDDocument.load(file); PDDocumentCatalog doc = document.getDocumentCatalog(); PDAcroForm Form = doc.getAcroForm(); String formName = "topmostSubform[0].Page1[0].pana_pania[0]"; PDField f = Form.getField(formName); setField(document, formName, "Artur"); System.out.println("New value 2nd: " + f.getValueAsString()); document.saveIncremental(new

PDFBox - Line / Rectangle extraction

人盡茶涼 posted on 2019-12-06 09:34:34
I am trying to extract text coordinates and line (or rectangle) coordinates from a PDF. The TextPosition class has getXDirAdj() and getYDirAdj() methods, which transform coordinates according to the direction of the piece of text that the respective TextPosition object represents (corrected based on a comment from @mkl). The final output is consistent, irrespective of the page rotation. The coordinates needed in the output are relative to X0,Y0 (the top-left corner of the page). This is a slight modification of the solution by @Tilman Hausherr. The y coordinates are inverted (height - y) to keep it consistent with the
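
The (height - y) inversion mentioned in the excerpt above can be sketched in plain Java. This is only an illustration, assuming an unrotated page whose height is known in points and input coordinates measured from the bottom-left corner; the class and method names (TopLeftCoords, flipY, flipRect) are hypothetical and are not part of PDFBox.

```java
import java.awt.geom.Rectangle2D;

// Hypothetical helper: converts bottom-left-origin coordinates (the PDF default)
// into top-left-origin coordinates for an unrotated page.
final class TopLeftCoords {

    // Distance of a point from the top edge of the page.
    static double flipY(double pageHeight, double y) {
        return pageHeight - y;
    }

    // For a rectangle, the new y is the distance from the page top to the
    // rectangle's upper edge, which sits at (y + height) in PDF space.
    static Rectangle2D.Double flipRect(double pageHeight, Rectangle2D r) {
        return new Rectangle2D.Double(
                r.getX(),
                pageHeight - r.getMaxY(),
                r.getWidth(),
                r.getHeight());
    }

    public static void main(String[] args) {
        double pageHeight = 792.0; // assumed US Letter height in points
        Rectangle2D.Double box = new Rectangle2D.Double(10, 100, 50, 20);
        System.out.println(flipY(pageHeight, 92.0));
        System.out.println(flipRect(pageHeight, box));
    }
}
```

Flipping a rectangle by its upper edge rather than its lower-left corner keeps text and rectangle coordinates comparable, since both then describe a distance from the top of the page.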

Splitting a PDF results in very large PDF documents with PDFBox 2.0.2

放肆的年华 posted on 2019-12-06 08:13:04
I want to use the command java -jar pdfbox-app-2.y.z.jar PDFSplit [OPTIONS] <PDF file> to split one PDF into many other PDFs. But I found a problem: the PDF being split is "ActiveMQ In Action(Manning-2011).pdf", which is 14.1 MB. But when I run java -jar pdfbox-app-2.0.2.jar PDFSplit -split 5 -startPage 21 -endPage 40 -outputPrefix abc "ActiveMQ In Action(Manning-2011).pdf", every resulting PDF is larger than 79 MB! How can I prevent this? This is a known bug in PDFBox 2.0.2. Splitting works fine in 2.0.1, and will work fine again in 2.0.3. The "bad" code has already been reverted. The reasons for the

How can I get image coordinates in a PDF into a JSON file?

吃可爱长大的小学妹 posted on 2019-12-06 06:30:43
I have written code that creates an HTML page containing the images extracted from a page of a PDF document. Using the PDFBox library, I succeeded in extracting the images from the PDF and applying them to the HTML page, but I could not get the image coordinates for the HTML page. So I searched for how to extract image coordinates from a PDF and tried to do it with the PDFBox library. Code below: public static void main(String[] args) throws Exception { try { PDDocument document = PDDocument.load( "/Users/tmdtjq/Downloads/PDFTest/test.pdf" ); PrintImageLocations printer = new

How to get the content of PDF form text fields using pdfbox?

牧云@^-^@ posted on 2019-12-06 06:09:19
I'm using this to get the text of a PDF file with org.apache.pdfbox: File f = new File(fileName); if (!f.isFile()) { System.out.println("File " + fileName + " does not exist."); return null; } try { parser = new PDFParser(new FileInputStream(f)); } catch (Exception e) { System.out.println("Unable to open PDF Parser."); return null; } try { parser.parse(); cosDoc = parser.getDocument(); pdfStripper = new PDFTextStripper(); pdDoc = new PDDocument(cosDoc); parsedText = pdfStripper.getText(pdDoc); } catch (Exception e) { e.printStackTrace(); } It works great for the PDFs I've used it on so far.

Finding javascript code in PDF using Apache PDFBox

北城以北 posted on 2019-12-06 04:06:07
My goal is to extract and process any JavaScript code that a PDF document might contain. By opening a PDF in an editor I can see objects like this: 402 0 obj <</S/JavaScript/JS(\n\r\n /* Set day 25 */\r\n FormRouter_SetCurrentDate\("25"\);\r)>> endobj I am trying to use Apache PDFBox to accomplish this, but so far with no luck. This line returns an empty list: jsObj = doc.getObjectsByType(COSName.JAVA_SCRIPT); Can anyone give me some direction? This tool is based on the PrintFields example in PDFBox. It will show the JavaScript fields in forms. I wrote it last year for a guy who had problems

java pdfbox printerjob wrong scaling / page format

别说谁变了你拦得住时间么 posted on 2019-12-06 03:48:20
I'm trying to print an existing PDF file with PDFBox. Currently I'm using PDFBox 2.0.0 RC3 through Maven. This is my current code: PDDocument document = PDDocument.load(new File(myPdfFile)); PrinterJob job = PrinterJob.getPrinterJob(); if (job.printDialog()) { job.setPageable(new PDFPageable(document)); job.print(); } document.close(); For testing, I printed a test PDF with Adobe Acrobat and the same PDF with these few lines of code. Everything works fine except for the borders. All borders (header, footer, left and right side) are too small, and the footer is way too small. Is there a magic method
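
One possible cause, which the excerpt above does not confirm, is the page format handed to the printer: a default java.awt.print.Paper is letter sized with one-inch margins, so content rendered into its imageable area ends up shifted or scaled. Below is a minimal sketch that builds a PageFormat whose imageable area covers the whole sheet. The class and method names (FullPageFormat, fullPage) are hypothetical, and the A4 size in points is an assumption for the demo.

```java
import java.awt.print.PageFormat;
import java.awt.print.Paper;

// Hypothetical helper: a PageFormat without software margins.
final class FullPageFormat {

    // widthPt and heightPt are the paper dimensions in points (1/72 inch).
    static PageFormat fullPage(double widthPt, double heightPt) {
        Paper paper = new Paper();
        paper.setSize(widthPt, heightPt);
        // Make the printable (imageable) area span the entire sheet.
        paper.setImageableArea(0, 0, widthPt, heightPt);

        PageFormat format = new PageFormat();
        format.setPaper(paper); // PageFormat keeps its own copy of the Paper
        return format;
    }

    public static void main(String[] args) {
        PageFormat a4 = fullPage(595, 842); // assumed A4 size in points
        System.out.println(a4.getImageableWidth() + " x " + a4.getImageableHeight());
    }
}
```

Such a format could then be passed to the print job, for example through java.awt.print.Book.append(Printable, PageFormat) followed by job.setPageable(book). Whether that matches the fix for this particular question is an assumption, and a physical printer may still clip content at its hardware margins.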

Java PDFBox - Reading and modifying a pdf with special characters (diacritics)

情到浓时终转凉″ posted on 2019-12-06 00:20:33
Question: I'm trying to modify a PDF using this method (first code block: using PDFStreamParser and iterating through PDFOperator, then updating COSString when needed): http://www.coderanch.com/t/556009/open-source/PdfBox-Replace-String-double-pdf I'm having an issue with some UTF-8 characters (diacritics): when I print the text that I want to update, it shows as "Societ? ?ii Na?ionale" (where '?' is a code like 0002 or 0004). The funny things are: when I write the updated PDF file, the characters are

Using pdfbox in java to overlay text onto previously created pdf document

流过昼夜 posted on 2019-12-05 22:51:37
I already have several PDF documents that have been created. Using PDFBox, I need to put text into several places on these documents, but I do NOT want to modify the text that is already within those areas. For instance, there may be a section as follows - NAME: ______________________________ I will put text into that area, but I need the underline to remain the same length. I believe the best solution would be to just create a text box or similar that sits above the area so the line remains the same length. In other words, I do not want to edit the text inline

Java close PDF error

丶灬走出姿态 posted on 2019-12-05 22:35:32
Question: I have this Java code: try { PDFTextStripper pdfs = new PDFTextStripper(); String textOfPDF = pdfs.getText(PDDocument.load("doc")); doc.add(new Field(campo.getDestino(), textOfPDF, Field.Store.NO, Field.Index.ANALYZED)); } catch (Exception exep) { System.out.println(exep); System.out.println("PDF fail"); } It logs this: 11:45:07,017 WARN [COSDocument] Warning: You did not close a PDF Document I don't know why, but the warning appears 1, 2, 3, or more times. I found that COSDocument is a class and have