pdfbox

How to open and replace data in a PDF stream with the Apache PDFBox library in Java?

☆樱花仙子☆ posted on 2019-12-11 08:34:03
Question: I use Apache PDFBox 2.0.0 in my Java code (Java 1.6). I'm trying to figure out how I can get, replace, and save back to my PDF the data in <stream> data here... <endstream>. My PDF file looks like: 596 0 obj << /Filter /FlateDecode /Length 3739 >> stream [compressed binary data] ... endstream endobj I've found a way to decode this stream: I used the "WriteDecodedDoc" command from pdfbox-app-1.8.10.jar. So now I have two variants of the file but I have NO

How to read the current page number of the pdf document using pdfbox

我与影子孤独终老i posted on 2019-12-11 08:33:28
Question: Page numbers in a PDF come in different variations: some PDFs number their initial pages with Roman numerals such as i, ii, and only later switch to 1, 2, ... . I found a function in PDFBox to get the desired page, page.get(pagenumber) . The problem is that get(1) returns the first page of the document (which may be numbered ii) and not the page whose page number is 2. Is there any way to obtain the page whose page number in the PDF is say 2 and not the
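The distinction the question draws is between a page's position in the document (its index) and the label printed on it. PDFBox models such labels with its PDPageLabels class, reachable from the document catalog; the sketch below deliberately does not call PDFBox and instead uses a hand-written label list to show the lookup idea. The class name, method name, and sample labels are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class PageLabelLookup {

    /**
     * Returns the 0-based index of the first page carrying the given label,
     * or -1 if no page has that label.
     */
    public static int indexOfLabel(List<String> labelsByPageIndex, String label) {
        return labelsByPageIndex.indexOf(label);
    }

    public static void main(String[] args) {
        // Hypothetical document: two front-matter pages labelled i and ii,
        // followed by body pages labelled 1, 2, 3.
        List<String> labels = Arrays.asList("i", "ii", "1", "2", "3");

        // The page labelled "2" is the fourth page, not the second one.
        System.out.println("Label 2 is at index " + indexOfLabel(labels, "2"));
    }
}
```

With a real document, the label list would come from the PDF's page-label dictionary rather than from a literal, and the resulting index would then be used to fetch the page.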

Disable Printing Issue with PDF Box

旧巷老猫 posted on 2019-12-11 08:05:26
Question: I am using this sample PDFBox code to encrypt a PDF file and disable printing. Encryption succeeds, but printing is not disabled. What could be the issue? Here's the dependencies section of my pom.xml: <dependencies> <dependency> <groupId>org.apache.pdfbox</groupId> <artifactId>pdfbox</artifactId> <version>2.0.6</version> </dependency> <dependency> <groupId>org.bouncycastle</groupId> <artifactId>bcprov-jdk15</artifactId> <version>1.46</version> </dependency> </dependencies> and

Read PDF in selenium: The constructor PDFParser(BufferedInputStream) is undefined

China☆狼群 posted on 2019-12-11 06:41:56
Question: I am getting the error "The constructor PDFParser(BufferedInputStream) is undefined" while trying to read PDF contents using Selenium. WebDriver driver=new FirefoxDriver(); driver.get("http://www.axmag.com/download/pdfurl-guide.pdf"); URL TestURL = new URL("http://www.axmag.com/download/pdfurl-guide.pdf"); BufferedInputStream TestFile = new BufferedInputStream(TestURL.openStream()); PDFParser TestPDF = new PDFParser(TestFile); TestPDF.parse(); String TestText = new PDFTextStripper().getText(TestPDF

How to disable PDFBox warn logging

寵の児 posted on 2019-12-11 06:16:28
Question: I have a simple Java console application that uses PDFBox to extract text from PDF files. But warnings are continuously printed to the console: Nov 29, 2017 9:28:27 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode WARNING: No Unicode mapping for 14 (145) in font GGNHDZ+SimSun Nov 29, 2017 9:28:27 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode WARNING: No Unicode mapping for 28 (249) in font LNKLJH+SimSun Nov 29, 2017 9:28:27 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode I
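The timestamp-then-level layout of the messages above looks like the default java.util.logging console format, so one possible approach is to raise the level of the org.apache.pdfbox logger through the JDK logging API. The following is a minimal sketch under that assumption; if the application routes logging through another backend (for example Log4j), that backend's configuration would need changing instead. The class and method names are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietPdfBoxLogging {

    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a configured logger with no other reference may be garbage collected.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    /** Suppresses everything below SEVERE for all org.apache.pdfbox.* loggers. */
    public static void silenceWarnings() {
        PDFBOX_LOGGER.setLevel(Level.SEVERE);
    }

    public static void main(String[] args) {
        silenceWarnings();

        // Child loggers without their own level inherit it from the parent.
        Logger fontLogger = Logger.getLogger("org.apache.pdfbox.pdmodel.font.PDSimpleFont");
        System.out.println("WARNING enabled: " + fontLogger.isLoggable(Level.WARNING));
    }
}
```

Calling silenceWarnings() once, before any PDF is loaded, should be enough; using Level.OFF instead of Level.SEVERE would hide errors as well.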

PDFBox 2.0: Overcoming dictionary key encoding

女生的网名这么多〃 posted on 2019-12-11 05:56:46
Question: I am extracting text from PDF forms with Apache PDFBox 2.0.1, reading the details of AcroForm fields. From a radio button field I dig up the appearance dictionary. I'm interested in the /N and /D entries (normal and "down" appearance). Like this (interactive BeanShell): field = form.getField(fieldName); widgets = field.getWidgets(); print("Field Name: " + field.getPartialName() + " (" + widgets.size() + ")"); for (annot : widgets) { ap = annot.getAppearance(); keys = ap.getCOSObject()

PDFBox - convert image to PDF, PDF resolution

爷,独闯天下 posted on 2019-12-11 05:06:02
Question: I am using PDFBox v2 to convert JPG images to PDF. The JPG image is already on the filesystem, so I just pick it up and convert it to PDF. Below is my code: public void convertImgToPDF(String imagePath, String fileName, String destDir) throws IOException { PDDocument document = new PDDocument(); InputStream in = new FileInputStream(imagePath); BufferedImage bimg = ImageIO.read(in); float width = bimg.getWidth(); float height = bimg.getHeight(); PDPage page = new PDPage(new PDRectangle(width,
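PDF page dimensions are measured in points (1/72 inch), so a page whose rectangle is built directly from pixel counts, as the excerpt appears to do, treats the image as if it were 72 dpi. A minimal sketch of the pixel-to-point conversion follows; the class name, method name, and the 300 dpi figure are illustrative assumptions, not taken from the question.

```java
public class PixelToPoint {

    // Default PDF user space: 72 points per inch.
    private static final float POINTS_PER_INCH = 72f;

    /** Converts a pixel length to PDF points for an image of the given resolution. */
    public static float toPoints(int pixels, float dpi) {
        if (dpi <= 0f) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        return pixels * POINTS_PER_INCH / dpi;
    }

    public static void main(String[] args) {
        // Hypothetical scan: 2480 x 3508 pixels at an assumed 300 dpi.
        float widthPt = toPoints(2480, 300f);
        float heightPt = toPoints(3508, 300f);
        System.out.println(widthPt + " x " + heightPt + " pt");
    }
}
```

The converted width and height would then be used for both the page rectangle and the size at which the image is drawn, so the image fills the page at its intended physical size.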

Placeholders for a text in a pdf Java-PDFBox?

喜你入骨 posted on 2019-12-11 05:02:53
Question: Can we make placeholders for text in a PDF, mark them with an id (similar to HTML tags), and just fill each placeholder with our text, of whatever length, in Java using PDFBox? Answer 1: "Can we make placeholders for a text in a pdf and mark them with an id (similar to html tags) and just fill that placeholder with our text, of whichever length" No, at least not without a great deal of coding around it. The reason is that PDF is a format for documents with a finished layout. If you fill that

PDFbox Could not find font: /Helv

那年仲夏 posted on 2019-12-11 04:10:47
Question: I am trying to add form fields to an existing PDF file, but the following error appears: PDFbox Could not find font: /Helv My Java code looks like this: PDDocument pdf = PDDocument.load(inputStream); PDDocumentCatalog docCatalog = pdf.getDocumentCatalog(); PDAcroForm acroForm = docCatalog.getAcroForm(); PDPage page = pdf.getPage(0); PDTextField textBox = new PDTextField(acroForm); textBox.setPartialName("SampleField"); acroForm.getFields().add(textBox); PDAnnotationWidget widget = textBox

How to get a field's page in PDFBox API 2?

廉价感情. posted on 2019-12-11 04:03:46
Question: I'm trying to get the page of a field in my project, and I don't know how to get the page number for each field. I have this code: String formTemplate = "Template.pdf"; String filledForm = "filledForm.pdf"; PDDocument pdfDocument = PDDocument.load(new File(formTemplate)); PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm(); if (acroForm != null) { PDField field = acroForm.getField( "name" ); field.getAcroForm().setNeedAppearances(true); field.setValue("my name"); acroForm