pdfbox | 易学教程

How to open and replace data in a PDF stream with the Apache PDFBox library in Java?

Question: I use Apache PDFBox 2.0.0 in my Java code (Java 1.6). I'm trying to figure out how I can get, replace, and save back to my PDF the data from <stream> data here... <endstream>. My PDF file looks like: 596 0 obj << /Filter /FlateDecode /Length 3739 >> stream xњ[ЫnЬF}џoШ8эІАђhЮ/‰`@С%Hvќd-н“іXPJГ ... endstream endobj I've found a way to decode this stream: I used the "WriteDecodedDoc" command from the pdfbox-app-1.8.10.jar API. So now I have two variants of the file but I have NO
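As a rough illustration of what a /FlateDecode stream involves, the sketch below inflates zlib-compressed bytes, replaces some text, and deflates the result again using only java.util.zip. It does not use PDFBox; the class name FlateRoundTrip, the helper names, and the sample content are hypothetical. In a real PDF the /Length entry must also match the new compressed size, which a PDF library normally takes care of when it saves the document.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of PDFBox: shows the decode/edit/encode cycle only.
class FlateRoundTrip {

    // Inflate zlib-wrapped data, the encoding that /FlateDecode refers to.
    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break; // truncated or unsupported input: stop rather than loop forever
            }
            out.write(buffer, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    // Compress raw bytes back into zlib format.
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decode, replace a literal string, and re-encode.
    // ISO-8859-1 maps every byte to one char, so unrelated bytes survive the round trip.
    static byte[] replaceInStream(byte[] compressed, String target, String replacement)
            throws DataFormatException {
        String decoded = new String(inflate(compressed), StandardCharsets.ISO_8859_1);
        String edited = decoded.replace(target, replacement);
        return deflate(edited.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) throws DataFormatException {
        // Made-up content stream fragment used only for the demonstration.
        byte[] original = deflate("BT (Hello) Tj ET".getBytes(StandardCharsets.ISO_8859_1));
        byte[] updated = replaceInStream(original, "Hello", "World");
        System.out.println(new String(inflate(updated), StandardCharsets.ISO_8859_1));
    }
}
```

A plain string replacement like this only works when the text appears literally in the decoded stream; content that uses font-specific encodings or splits text across operators would need more careful handling.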

How to read the current page number of the pdf document using pdfbox

Question: Page numbers in a PDF come in different variations: some PDFs number their initial pages with Roman numerals (i, ii, ...) and the later pages 1, 2, ... . I found a function in PDFBox to get the desired page, page.get(pagenumber). The problem is that when I write get(1), it returns the first page of the document (which may be numbered as ii and not the page with page number 2). Is there any way to obtain the page whose page number in the PDF is say 2 and not the
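The distinction between a printed page label and a zero-based page index can be sketched without any PDF library. The example below assumes a hypothetical document whose first few pages are labelled i, ii, ... and whose remaining pages are labelled 1, 2, ...; the class PageLabelLookup and its methods are illustrative names, not PDFBox API. PDFBox 2.x appears to offer similar lookups through its PDPageLabels class, but check the Javadoc for your version before relying on that.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps printed page labels to zero-based page indices.
class PageLabelLookup {

    // Convert 1..3999 to lower-case Roman numerals.
    static String toRoman(int n) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    // Build label -> page index for a document whose first romanPages pages
    // are labelled i, ii, ... and whose remaining pages are labelled 1, 2, ...
    static Map<String, Integer> buildLabelIndex(int romanPages, int totalPages) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (int page = 0; page < totalPages; page++) {
            String label = page < romanPages
                    ? toRoman(page + 1)
                    : String.valueOf(page - romanPages + 1);
            index.put(label, page);
        }
        return index;
    }

    public static void main(String[] args) {
        // Assumed layout: 4 front-matter pages, 10 pages in total.
        Map<String, Integer> index = buildLabelIndex(4, 10);
        System.out.println("Label ii is at page index " + index.get("ii"));
        System.out.println("Label 2 is at page index " + index.get("2"));
    }
}
```

With such a map, the page labelled 2 is fetched by its looked-up index rather than by passing 2 directly to the page list.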

Disable Printing Issue with PDF Box

Question: I am using this sample PDFBox code to encrypt a PDF file and disable printing. Encryption succeeds, but printing is not disabled. What could be the issue? Here's the dependencies section of my pom.xml: <dependencies> <dependency> <groupId>org.apache.pdfbox</groupId> <artifactId>pdfbox</artifactId> <version>2.0.6</version> </dependency> <dependency> <groupId>org.bouncycastle</groupId> <artifactId>bcprov-jdk15</artifactId> <version>1.46</version> </dependency> </dependencies> and

Read PDF in selenium: The constructor PDFParser(BufferedInputStream) is undefined

Question: I am getting the error "The constructor PDFParser(BufferedInputStream) is undefined" while trying to read PDF contents using Selenium. WebDriver driver=new FirefoxDriver(); driver.get("http://www.axmag.com/download/pdfurl-guide.pdf"); URL TestURL = new URL("http://www.axmag.com/download/pdfurl-guide.pdf"); BufferedInputStream TestFile = new BufferedInputStream(TestURL.openStream()); PDFParser TestPDF = new PDFParser(TestFile); TestPDF.parse(); String TestText = new PDFTextStripper().getText(TestPDF

How to disable PDFBox warn logging

Question: I have a simple Java console application that uses PDFBox to extract text from PDF files. But messages are continuously printed to the console: Nov 29, 2017 9:28:27 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode WARNING: No Unicode mapping for 14 (145) in font GGNHDZ+SimSun Nov 29, 2017 9:28:27 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode WARNING: No Unicode mapping for 28 (249) in font LNKLJH+SimSun Nov 29, 2017 9:28:27 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode I
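The format of the log lines above suggests they come from java.util.logging, so one possible approach is to raise the level of the org.apache.pdfbox logger. The sketch below assumes that PDFBox's logging is in fact routed to java.util.logging in this application; if another backend such as Log4j is on the classpath, it would have to be configured there instead. The class name QuietPdfBoxLogging is hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: raises the log threshold for everything under org.apache.pdfbox.
class QuietPdfBoxLogging {

    // Keep a strong reference: java.util.logging holds loggers weakly, so a
    // garbage-collected logger could silently lose its configured level.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    static void silenceWarnings() {
        // Child loggers without their own level inherit this one.
        PDFBOX_LOGGER.setLevel(Level.SEVERE);
    }

    public static void main(String[] args) {
        silenceWarnings();
        Logger fontLogger = Logger.getLogger("org.apache.pdfbox.pdmodel.font.PDSimpleFont");
        fontLogger.warning("sample warning");      // filtered out by the SEVERE threshold
        fontLogger.severe("sample severe message"); // still passes the threshold
    }
}
```

Calling silenceWarnings() once at startup, before any PDF is loaded, should be enough under these assumptions. Using Level.OFF instead of Level.SEVERE would also hide errors, which is rarely what you want.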

PDFBox 2.0: Overcoming dictionary key encoding

Question: I am extracting text from PDF forms with Apache PDFBox 2.0.1, reading the details of AcroForm fields. From a radio button field I dig up the appearance dictionary. I'm interested in the /N and /D entries (normal and "down" appearance). Like this (interactive BeanShell): field = form.getField(fieldName); widgets = field.getWidgets(); print("Field Name: " + field.getPartialName() + " (" + widgets.size() + ")"); for (annot : widgets) { ap = annot.getAppearance(); keys = ap.getCOSObject()

PDFBox - convert image to PDF, PDF resolution

Question: I am using PDFBox v2 to convert JPG images to PDF. The JPG image is already on the filesystem, so I just pick it up and convert it to PDF. Below is my code: public void convertImgToPDF(String imagePath, String fileName, String destDir) throws IOException { PDDocument document = new PDDocument(); InputStream in = new FileInputStream(imagePath); BufferedImage bimg = ImageIO.read(in); float width = bimg.getWidth(); float height = bimg.getHeight(); PDPage page = new PDPage(new PDRectangle(width,
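The rest of the question is cut off, but the code above sizes the page with the image's pixel dimensions, which treats one pixel as one point (1/72 inch, the default PDF user space unit). A minimal sketch of the unit conversion follows; the class ImagePageSize, the method pixelsToPoints, and the 300 DPI figure are assumptions for illustration, not values taken from the question.

```java
// Hypothetical helper: converts pixel dimensions to PDF points for a given resolution.
class ImagePageSize {

    // Default PDF user space has 72 units (points) per inch.
    static final float POINTS_PER_INCH = 72f;

    static float pixelsToPoints(int pixels, float dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        return pixels * POINTS_PER_INCH / dpi;
    }

    public static void main(String[] args) {
        // Assumed example: a 2480 x 3508 pixel scan meant to print at 300 DPI.
        float widthPt = pixelsToPoints(2480, 300f);
        float heightPt = pixelsToPoints(3508, 300f);
        System.out.printf("Page size: %.1f x %.1f pt%n", widthPt, heightPt);
    }
}
```

Under these assumptions the page comes out at roughly A4 size, and the full-resolution image would be drawn scaled to fit that rectangle instead of producing an oversized page.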

Placeholders for a text in a pdf Java-PDFBox?

Question: Can we make placeholders for text in a PDF and mark them with an id (similar to HTML tags) and just fill that placeholder with our text, of whichever length, in Java, using PDFBox? Answer 1: "Can we make placeholders for a text in a pdf and mark them with an id (similar to html tags) and just fill that placeholder with our text, of whichever length" No, at least not without a great deal of coding around it. The reason is that PDF is a format for documents with a finished layout. If you fill that

PDFbox Could not find font: /Helv

Question: I am trying to add form fields to an existing PDF file, but the following error appears: PDFbox Could not find font: /Helv My Java code looks like this: PDDocument pdf = PDDocument.load(inputStream); PDDocumentCatalog docCatalog = pdf.getDocumentCatalog(); PDAcroForm acroForm = docCatalog.getAcroForm(); PDPage page = pdf.getPage(0); PDTextField textBox = new PDTextField(acroForm); textBox.setPartialName("SampleField"); acroForm.getFields().add(textBox); PDAnnotationWidget widget = textBox

How to get field page in PDFBox API 2?

Question: I'm trying to get the page of a field in my project, and I don't know how to get the page number for each field. I have this code: String formTemplate = "Template.pdf"; String filledForm = "filledForm.pdf"; PDDocument pdfDocument = PDDocument.load(new File(formTemplate)); PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm(); if (acroForm != null) { PDField field = acroForm.getField( "name" ); field.getAcroForm().setNeedAppearances(true); field.setValue("my name"); acroForm