pdfbox

PDFBox 2.0.7 ExtractText not working but 1.8.13 does and PDFReader as well

妖精的绣舞 submitted on 2019-12-01 12:33:05
Question: Hopefully you have an idea of what is going wrong when extracting text from a PDF using PDFBox 2.0.7. The result is very strange. Using 1.8.13, the command java -jar pdfbox-app-1.8.13.jar ExtractText -sort -nonSeq test.pdf leads to Deutsche Bank Privat- und Geschäftskunden AG Bruttoertrag 43,80 USD 37,15 EUR Kapitalertragsteuer (KESt) - 5,36 USD - 4,55 EUR Solidaritätszuschlag auf KESt - 0,29 USD - 0,25 EUR Umrechnungskurs USD zu EUR 1,1791000000 Gutschrift mit Wert 15.08.2017 32,35 EUR Using

Can't insert Tabs and Spaces into PDFBox PDF document

杀马特。学长 韩版系。学妹 submitted on 2019-12-01 11:37:06
I want to print this in a PDF created by PDFBox. It won't let me insert tabs and spaces because the font does not support them. Why is this a problem, and more importantly, how can I fix it? StudentData student = listOfDebtors.get(j); contentStream.beginText(); contentStream.setFont(font, 8); contentStream.newLineAtOffset(xPosition, yPosition); contentStream.showText("Member #:"+ student.getMembershipNumber() + "\t" + "Grade:" + getStudentGradeInSchool(student.getYearGraduate()) + "\t" + "Year Joined" + student.getYearJoined() + "\n" + "Name:" + student.getFirstName() + " " + student
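One possible workaround, sketched under assumptions: showText in PDFBox 2.x rejects control characters such as tab and line feed when the font has no glyph for them, so the string can be pre-processed into tab-free lines before drawing. The helper below is hypothetical (the names TextLines and toDrawableLines and the tab width are not from the question) and uses only the standard library. Each returned line would then be passed to showText on its own, with a newLineAtOffset call between lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class TextLines {
    /**
     * Splits text on '\n' and replaces each '\t' with spaces up to the
     * next tab stop, so no control characters remain in any line.
     */
    static List<String> toDrawableLines(String text, int tabWidth) {
        List<String> lines = new ArrayList<>();
        for (String raw : text.split("\n", -1)) {
            StringBuilder sb = new StringBuilder();
            for (char c : raw.toCharArray()) {
                if (c == '\t') {
                    // Pad to the next multiple of tabWidth (always at least one space).
                    int pad = tabWidth - (sb.length() % tabWidth);
                    for (int i = 0; i < pad; i++) {
                        sb.append(' ');
                    }
                } else {
                    sb.append(c);
                }
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

Space-based tab stops only line up visually in a monospaced font. With a proportional font, positioning each column with its own text offset is likely the more reliable approach.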

Saved Text Field value is not displayed properly in PDF generated using PDFBOX

瘦欲@ submitted on 2019-12-01 11:28:30
import java.io.IOException; import javax.swing.text.BadLocationException; import org.apache.pdfbox.cos.COSArray; import org.apache.pdfbox.cos.COSDictionary; import org.apache.pdfbox.cos.COSFloat; import org.apache.pdfbox.cos.COSName; import org.apache.pdfbox.cos.COSString; import org.apache.pdfbox.exceptions.COSVisitorException; import org.apache.pdfbox.pdmodel.PDDocument; import org.apache.pdfbox.pdmodel.PDPage; import org.apache.pdfbox.pdmodel.interactive.action.PDAnnotationAdditionalActions; import org.apache.pdfbox.pdmodel.interactive.action.type.PDActionJavaScript; import org.apache

How to Check PDF is Reader enabled or not using C#?

让人想犯罪 __ submitted on 2019-12-01 10:44:50
My only requirement is to find out whether a selected PDF in a folder is Reader enabled or not, more specifically whether usage rights are defined in a way that allows people to add annotations (e.g. comments). I am doing this in a Windows application. If I click a button, an event is triggered that searches a folder for PDF files. This event needs to check whether or not the PDFs in the folder are Reader enabled for comments. If they are, I need to remove the comment usage rights or revert the PDF back to its original version. My code can only find PDF files in the folder. I don't know how to check if the selected

Calculate correct width of a text

孤街醉人 submitted on 2019-12-01 10:43:38
I need to read a plan exported by AutoCAD to PDF and place some markers with text on it with PDFBox. Everything works fine, except the calculation of the width of the text that is written next to the markers. I skimmed through the whole PDF specification and read in detail the parts that deal with graphics and text, but to no avail. As far as I understand, the glyph coordinate space is set up in 1/1000 of the user coordinate space. Hence the width needs to be scaled up by 1000, but it's still a fraction of the real width. This is what I am doing to position the text: float textWidth
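As a rough sketch of the unit conversion involved: a width reported in glyph space (thousandths of a text-space unit) is usually converted by dividing by 1000 and multiplying by the font size. In PDFBox the glyph-space value would typically come from PDFont.getStringWidth; here it is passed in as a plain number so the example needs only the standard library. The class and method names are hypothetical, and the sketch deliberately ignores any scaling in the page's current transformation matrix, which an AutoCAD export may well apply and which would change the effective width.

```java
// Hypothetical helper, not part of PDFBox.
final class TextWidth {
    /**
     * Converts a string width in glyph-space units (1/1000 of a
     * text-space unit) to text-space units at the given font size.
     * Does not account for the current transformation matrix.
     */
    static float toTextSpace(float glyphSpaceWidth, float fontSize) {
        return glyphSpaceWidth / 1000f * fontSize;
    }
}
```

For example, a glyph-space width of 2500 at font size 8 gives 2500 / 1000 * 8 = 20 units.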

Splitting a large Pdf file with PDFBox gets large result files

試著忘記壹切 submitted on 2019-12-01 10:39:51
I am processing some large PDF files (up to 100MB and about 2000 pages) with PDFBox. Some of the pages contain a QR code, and I want to split those files into smaller ones containing the pages from one QR code to the next. I got this working, but the resulting files are the same size as the source file. That is, if I cut a 100MB PDF file into ten files, I get ten files of 100MB each. This is the code: PDDocument documentoPdf = PDDocument.loadNonSeq(new File("myFile.pdf"), new RandomAccessFile(new File("./tmp/temp"), "rw")); int numPages = documentoPdf.getNumberOfPages(); List pages = documentoPdf
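The grouping step described above (pages from one QR code to the next) can be sketched independently of PDFBox. This is a minimal illustration, assuming the 0-based indices of the QR pages have already been detected and are sorted in ascending order. The class and method names are hypothetical, and the sketch does not address the file-size problem itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class PageRanges {
    /**
     * Returns {start, endExclusive} pairs. Each range starts at a QR
     * page and runs up to (not including) the next QR page, or to the
     * end of the document. Pages before the first QR page are skipped.
     */
    static List<int[]> split(List<Integer> qrPages, int numPages) {
        List<int[]> ranges = new ArrayList<>();
        for (int i = 0; i < qrPages.size(); i++) {
            int start = qrPages.get(i);
            int end = (i + 1 < qrPages.size()) ? qrPages.get(i + 1) : numPages;
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }
}
```

With QR codes on pages 0, 3 and 7 of a 10-page file, this yields the ranges [0, 3), [3, 7) and [7, 10), each of which would become one output document.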

How to draw a string at a specific position on a pdf page in java using pdfbox?

一世执手 submitted on 2019-12-01 10:16:24
Question: I have a PDF coordinate (x, y) as input. I need to draw a string at the given input coordinate, e.g. (x, y) = (200, 250). I am using PDFBox. When I use the method moveTextPositionByAmount below, I do not get the exact position. I have even tried moveTo(). Please help me draw the string at an exact position. PDPageContentStream contentStream = new PDPageContentStream(document, page, true, true); contentStream.beginText(); contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);

Calculate correct width of a text

烈酒焚心 submitted on 2019-12-01 08:16:08
Question: I need to read a plan exported by AutoCAD to PDF and place some markers with text on it with PDFBox. Everything works fine, except the calculation of the width of the text that is written next to the markers. I skimmed through the whole PDF specification and read in detail the parts that deal with graphics and text, but to no avail. As far as I understand, the glyph coordinate space is set up in 1/1000 of the user coordinate space. Hence the width needs to be scaled up by 1000, but

How to Check PDF is Reader enabled or not using C#?

天涯浪子 submitted on 2019-12-01 08:02:26
Question: My only requirement is to find out whether a selected PDF in a folder is Reader enabled or not, more specifically whether usage rights are defined in a way that allows people to add annotations (e.g. comments). I am doing this in a Windows application. If I click a button, an event is triggered that searches a folder for PDF files. This event needs to check whether or not the PDFs in the folder are Reader enabled for comments. If they are, I need to remove the comment usage rights or revert the PDF back to its

Saved Text Field value is not displayed properly in PDF generated using PDFBOX

一世执手 submitted on 2019-12-01 08:00:41
Question: import java.io.IOException; import javax.swing.text.BadLocationException; import org.apache.pdfbox.cos.COSArray; import org.apache.pdfbox.cos.COSDictionary; import org.apache.pdfbox.cos.COSFloat; import org.apache.pdfbox.cos.COSName; import org.apache.pdfbox.cos.COSString; import org.apache.pdfbox.exceptions.COSVisitorException; import org.apache.pdfbox.pdmodel.PDDocument; import org.apache.pdfbox.pdmodel.PDPage; import org.apache.pdfbox.pdmodel.interactive.action