pdfbox | 易学教程

Is there a way to create Bookmarks for pdf documents in PDFBOX?

Question: I am currently generating a large document from a database schema, and for bigger databases the document quickly exceeds 1000 pages... I use PDFBox to create the document, and I am wondering whether PDFBox supports creating bookmarks that are displayed on the left side of Acrobat when viewing the document. I haven't found anything helpful on the website or in the related documentation so far... Thanks in advance! Source: https://stackoverflow.com/questions/24954281/is

Reading text of a pdf using PDFBOX occasionally returns \r\n

Question: I'm currently using PDFBox to read the text of a set of PDFs that I've inherited. I'm only interested in reading the text, not in making any changes to the files. The code that works for most of the files is:

    File pdfFile = myPath.toFile();
    PDDocument document = PDDocument.load(pdfFile);
    Writer sw = new StringWriter();
    PDFTextStripper stripper = new PDFTextStripper();
    stripper.setStartPage(1);
    stripper.writeText(document, sw);
    String documentText = sw.toString();

For most files, I wind up

How to underlay a content stream with using PDPageContentStream?

Question: I am trying to create a watermark using PDPageContentStream. This is what I have right now:

    PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true);
    contentStream.beginText();
    contentStream.setFont(font, 40);
    contentStream.setTextRotation(Math.PI/4, page.getMediaBox().getWidth()/4, page.getMediaBox().getHeight()/4);
    contentStream.setNonStrokingColor(210, 210, 210); // light grey
    contentStream.drawString(_text);
    contentStream.endText();
    contentStream.close();

What

Java- Does pdfBox have an option to open file instead of loading it?

Question: I am using PDFBox in Java to extract text from a PDF file. This is how I load the file: PDDocument document = PDDocument.load(new File(path1)); As you can see, it opens the file and loads everything inside it. This can cause problems when the file is huge, say 10 million words of text: it throws an OutOfMemoryError: Java heap space. I actually tested this and it does throw the error, and the culprit was the line above. Is there a way to open the

pdfBox - contentStream.concatenate2CTM full documentation parameters

Question: JSF 2.1 / PDFBox. I'm trying to generate a landscape PDF with PDFBox and draw some strings into its content, but I couldn't find a full specification of the concatenate2CTM method. Does anyone have complete information about the concatenate2CTM parameters? All I have is this, which does not help me because I don't know what values I must enter. What do the a...f operands mean?

Answer 1: This directly adds a cm operation to the content stream in question. Thus, you find those values a..f specified in the PDF
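As a rough illustration of what the six cm operands mean, the sketch below maps them onto java.awt.geom.AffineTransform, whose six-argument constructor takes its values in the same order. It uses only the JDK, not PDFBox itself. The class and method names (CmOperandsSketch, fromCmOperands, landscapeOperands, apply) and the page width of 595 are assumptions made for this example, and the rotate-and-shift values are one common way to get a landscape coordinate system, not necessarily what the asker ended up using.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class CmOperandsSketch {

    // The cm operands a..f describe the mapping
    //   x' = a*x + c*y + e
    //   y' = b*x + d*y + f
    // AffineTransform(m00, m10, m01, m11, m02, m12) uses the same order.
    public static AffineTransform fromCmOperands(double a, double b, double c,
                                                 double d, double e, double f) {
        return new AffineTransform(a, b, c, d, e, f);
    }

    // Operands for a 90-degree counter-clockwise rotation (a=cos, b=sin,
    // c=-sin, d=cos) followed by a shift of pageWidth to the right, so the
    // rotated coordinate system ends up back on the page.
    public static double[] landscapeOperands(double pageWidth) {
        return new double[] {0, 1, -1, 0, pageWidth, 0};
    }

    // Applies the operands in m to the point (x, y).
    public static Point2D apply(double[] m, double x, double y) {
        AffineTransform t = fromCmOperands(m[0], m[1], m[2], m[3], m[4], m[5]);
        return t.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        double[] m = landscapeOperands(595); // hypothetical portrait page width in points
        System.out.println(apply(m, 0, 0));   // where the new origin lands
        System.out.println(apply(m, 100, 0)); // where a step along the new x axis lands
    }
}
```

With these values the new origin sits at the bottom-right corner of the portrait page and the new x axis runs up the page, so text drawn afterwards reads sideways. The same six numbers would be the ones passed as a..f.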

PDFBox 1.8 PrintTextLocations wrong TextPosition height for a multi page pdf

Question: I am running the example provided with PDFBox to get the width and height of each TextPosition. When I pass a one-page PDF, it gives me accurate results, but with a multi-page PDF I get incorrect heights. This is the experiment I did: I took a 5-page PDF and passed it in as the argument (wrong height for each TextPosition). Next, I split the same PDF into 5 single-page PDFs using Mac OS X Preview and passed each page one by one (correct height). package printtextlocations; import java.io

Extract footer data of PDF in java

Question: I am able to get the data from PDF pages as a string, but the footer data is extracted along with it. I want to remove the footer from all pages of the PDF. How can I remove it? I used Rectangle2D, but the coordinates I tried do not return any data.

Answer 1: In a comment the OP indicated that he used this code:

    PDDocument doc = PDDocument.load("xyz.pdf");
    PDPage page = (PDPage) doc.getDocumentCatalog().getAllPages().get(1);
    Rectangle2D region = new Rectangle2D.Double(10, 10, 10, 10);
    String regionName = "region";

Some glyph ID's missing while trying to extract glyph ID from pdf

Question: Because the mapping of Devanagari glyphs to Unicode characters is not correct, I used the following code to extract the glyph IDs and built my own map from IDs to the proper Unicode characters.

    public class ExtractCharacterCodes {
        public static void testExtractFromSingNepChar() throws IOException {
            PDDocument document = PDDocument.load(new File("C:/PageSeparator/pattern3.pdf"));
            PDFTextStripper stripper = new PDFTextStripper() {
                @Override
                protected void writeString(String text, List<TextPosition>

PDFBox 2.x detect document changed after signing

Question: I'm trying to figure out how to detect whether a document has been changed after it was signed. I can't seem to find a good solution for this. Does anyone know about this? EDIT: I did some additional testing using only "ShowSignature.java". Here is what I have found so far. If I change the document through PDFBox, both Adobe Reader and PDFBox detect the broken signature. If I change the document with an Adobe product (Adobe Illustrator in this case), Adobe reports the signature as broken, "

Opening a content stream blanks saved content?

Question: I am trying to modify an existing PDF by adding some text to the header of each page. But even the simple sample code below ends up generating a blank PDF as output:

    document = PDDocument.load(new File("c:/tmp/pdfbox_test_in.pdf"));
    PDPage page = (PDPage) document.getDocumentCatalog().getAllPages().get(0);
    PDPageContentStream contentStream = new PDPageContentStream(document, page);
    /* contentStream.beginText();
    contentStream.setFont(font, 12);
    contentStream.moveTextPositionByAmount