pdfbox

pdfbox - sign landscape file error

霸气de小男生 submitted on 2019-12-21 22:03:36
Question: I am using pdfbox-1.8.8 to sign PDF files. It works well with PDF files in portrait mode, but with a landscape file I have an issue: it looks like the coordinates are wrong. Does anyone know what is wrong with the file? Here is the link to the PDF file. Here is the code I used to sign: public void signDetached(String inputFilePath, String outputFilePath, String signatureImagePath, Sign signProperties) { OutputStream outputStream = null; InputStream

PDFBox - Building the latest version for .NET using IKVM

99封情书 submitted on 2019-12-21 21:31:29
Question: I would like to build the latest version of PDFBox (http://pdfbox.apache.org/userguide/dot_net.html) for use within my .NET project. I have no experience with Java whatsoever, but I am following the steps defined here: http://www.ikvm.net/userguide/tutorial.html I am using the following versions: IKVM (0.42.0.6) and the PDFBox (1.2.1) JAR file. The problem is that when I try to create the DLL, a series of error messages is displayed, such as "java.lang.NoClassDefFoundError". I am facing the same problem

Extract unselectable content from PDF

与世无争的帅哥 submitted on 2019-12-21 20:04:33
Question: I'm using Apache PDFBox to extract pages from PDF files, and I can't find a way to extract content that is unselectable (either text or images). Content that is selectable within the PDF files causes no problem. Note that the PDFs in question don't have any restrictions on copying content, at least according to each file's "Document Restrictions Summary": they all have "Content Copying" and "Content Copying for Accessibility" allowed! On the same PDF file there is

How to find pdf is portrait or landscape using PDFBOX Library in Java

痴心易碎 submitted on 2019-12-21 14:52:31
Question: I am doing a project in Java using the PDFBOX-1.8.6 library (it is compulsory to use). My question is: how can I check whether an input PDF file has portrait or landscape orientation? How can I detect the orientation of each page when the page dimensions are the same? For example, both are standard A4 size. The picture below should make this clearer: my Landscape - Portrait problem. I just want to check whether the content is rotated or not. How can I cope with this problem? Answer 1: Assuming
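The orientation check asked about above can be sketched as follows. This is a minimal illustration, not the truncated answer: it assumes the page box and rotation have already been read, for example with `page.findMediaBox()` and `page.findRotation()` in PDFBox 1.8.x. The class `PageOrientation` and its method are hypothetical names, not part of PDFBox, and take plain numbers so the logic runs without the library. It only considers the page box and the page rotation entry; content rotated inside the content stream itself would not be detected this way.

```java
public class PageOrientation {

    /**
     * Hypothetical helper (not a PDFBox API): decides whether a page is
     * displayed in landscape, given its media box size and its rotation
     * in degrees (expected to be a multiple of 90).
     */
    public static boolean isLandscape(float boxWidth, float boxHeight, int rotation) {
        // Normalize the rotation into the range 0..359, also for negative values.
        int normalized = ((rotation % 360) + 360) % 360;

        // A quarter turn swaps the displayed width and height.
        boolean quarterTurn = normalized == 90 || normalized == 270;
        float displayedWidth = quarterTurn ? boxHeight : boxWidth;
        float displayedHeight = quarterTurn ? boxWidth : boxHeight;

        return displayedWidth > displayedHeight;
    }

    public static void main(String[] args) {
        // A4 in points is roughly 595 x 842.
        System.out.println(isLandscape(595f, 842f, 0));   // false: upright A4
        System.out.println(isLandscape(595f, 842f, 90));  // true: same box, rotated
        System.out.println(isLandscape(842f, 595f, 0));   // true: wide box, no rotation
    }
}
```

Two A4 pages with identical box dimensions can therefore still differ in orientation, because the rotation value is what separates them.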

Scaled image blurry in PDFBox

牧云@^-^@ submitted on 2019-12-21 09:34:47
Question: I'm trying to scale an image of size 2496 x 3512 into a PDF document. I'm using PDFBox to generate it, but the scaled image ends up blurred. Here are some snippets: PDF page size (A4) returned by page.findMediaBox().createDimension(): java.awt.Dimension[width=612,height=792]. Then I calculate the scaled dimension based on the page size vs. the image size, which returns: java.awt.Dimension[width=562,height=792]. I use the code below to calculate the scaled dimension: public static
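The scaling arithmetic described above can be sketched as follows. The truncated method from the question is not reproduced here; `ImageFit.scaleToFit` is a hypothetical helper that fits one `java.awt.Dimension` inside another while preserving the aspect ratio, using integer arithmetic that rounds down. With the sizes quoted in the question it yields the same 562 x 792 result.

```java
import java.awt.Dimension;

public class ImageFit {

    /**
     * Hypothetical helper: returns the largest size with the image's aspect
     * ratio that fits inside the given bounds. Fractions are rounded down.
     */
    public static Dimension scaleToFit(Dimension image, Dimension bounds) {
        // Compare aspect ratios by cross-multiplication to avoid floating point.
        long imageWide = (long) image.width * bounds.height;
        long boundsWide = (long) bounds.width * image.height;

        if (imageWide > boundsWide) {
            // Image is relatively wider than the bounds: width is the limit.
            int height = (int) ((long) image.height * bounds.width / image.width);
            return new Dimension(bounds.width, height);
        }
        // Otherwise height is the limit.
        int width = (int) ((long) image.width * bounds.height / image.height);
        return new Dimension(width, bounds.height);
    }

    public static void main(String[] args) {
        Dimension scaled = scaleToFit(new Dimension(2496, 3512), new Dimension(612, 792));
        System.out.println(scaled); // java.awt.Dimension[width=562,height=792]
    }
}
```

One possible cause of blur, which may or may not apply to this question, is resampling the pixels down to the computed size before embedding. Under that assumption, the computed size would be used only as the drawing size on the page, while the image is embedded at its full resolution.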

How to add .png images to pdf using Apache PDFBox

孤街浪徒 submitted on 2019-12-21 08:17:10
Question: When I try to draw PNG images using PDFBox, the pages remain blank. Is there any way to insert PNG images using PDFBox? public void createPDFFromImage( String inputFile, String image, String outputFile ) throws IOException, COSVisitorException { // the document PDDocument doc = null; try { doc = PDDocument.load( inputFile ); //we will add the image to the first page. PDPage page = (PDPage)doc.getDocumentCatalog().getAllPages().get( 0 ); PDXObjectImage ximage = null; if( image.toLowerCase()

Copy+pasting text from PDF results in garbage

二次信任 submitted on 2019-12-21 07:35:13
Question: I am writing a Master's thesis on an NLP system. One of its components is an extractor, which extracts plain text from PDF files. There are a few PDF files that cannot be extracted correctly. The extractor (PDFBox library) returns a string like this: "┤xDn║if|d├gDF"Ti&cD╬lh d FÁhis~n ╗xd f«"d┤ffih »h" or "10a61a91a22a25a3a27a17a23a20a8a13a14a61a25a17". I checked each file that causes this extraction problem, and the text of all these files also cannot be copy-pasted from a PDF reader (Adobe Reader and

Text extraction from PDF using PDFBox 2.0

那年仲夏 submitted on 2019-12-21 06:47:34
Question: I'm trying to use PDFBox 2.0 for text extraction. I would like to get the font size of specific characters and the position rectangle of each such character on the page. I've implemented this in PDFBox 1.6 using a PDFTextStripper: PDFParser parser = new PDFParser(is); try{ parser.parse(); }catch(IOException e){ } COSDocument cosDoc = parser.getDocument(); PDDocument pdd = new PDDocument(cosDoc); final StringBuffer extractedText = new StringBuffer(); PDFTextStripper textStripper =
