pdf | 易学教程

Python3: Download PDF to memory and convert first page to image

Question: I am trying to do the following:

- Download a PDF file to memory
- Convert the first page to an image
- Use that image with tweepy

I tried the following code, but ran into an error.

    from PIL import Image
    from pdf2image import convert_from_path
    from urllib.request import urlopen
    from io import StringIO, BytesIO

    url = 'http://somedomain.com/assets/applets/internet.pdf'
    scrape = urlopen(url)  # for external files
    pdfFile = BytesIO(scrape.read())
    pdfFile.seek(0)
    pages = convert_from_path(pdfFile,last_page=1,
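One likely cause of the error, stated as an assumption because the traceback is not shown: `convert_from_path` expects a filesystem path, not a `BytesIO` object. pdf2image also provides `convert_from_bytes`, which takes the raw PDF bytes directly. The following is a minimal sketch, assuming pdf2image and its poppler dependency are installed; the helper names `fetch_bytes` and `first_page_as_png` are hypothetical.

```python
from io import BytesIO
from urllib.request import urlopen


def fetch_bytes(url: str) -> bytes:
    """Download a resource fully into memory and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def first_page_as_png(pdf_bytes: bytes) -> BytesIO:
    """Render page 1 of an in-memory PDF into an in-memory PNG buffer."""
    # Imported lazily so fetch_bytes stays usable without pdf2image/poppler.
    from pdf2image import convert_from_bytes

    # first_page and last_page are 1-based; only the first page is rendered.
    pages = convert_from_bytes(pdf_bytes, first_page=1, last_page=1)
    buffer = BytesIO()
    pages[0].save(buffer, format="PNG")  # each page is a PIL image
    buffer.seek(0)
    return buffer
```

The returned buffer is a file-like object, so nothing is written to disk. How it is handed to tweepy depends on the tweepy version in use, so check its media upload documentation.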

PDF-A1a document not valid after signing with VisualRepresentation using IText

Question: I digitally sign a PDF-A1a document using IText 7.15.0. In addition to the digital signature, I also add a visual representation (an image) to the document.

    PdfSignatureAppearance appearance = signer.GetSignatureAppearance();
    appearance.SetPageNumber(1);
    Rectangle pr = new Rectangle(10 + ImageOffset, 10 + ImageOffset, 100, 100);
    appearance.SetPageRect(pr);
    byte[] image = System.IO.File.ReadAllBytes(VisualAppearance);
    appearance.SetRenderingMode(PdfSignatureAppearance.RenderingMode.GRAPHIC);

Extract PDF Form Data Using JavaScript and write to CSV File

Question: I have been given a PDF file with a form. The form is not formatted as a table. I need to extract the form field values and write them to a CSV file that can be imported into Excel. I have tried the automated "Merge data files to Spreadsheet" menu item in Acrobat Pro, but the output includes both the labels and the form field values. I am mostly interested in just the form field values. I would like to use JavaScript to extract the form data, and instruct JavaScript how to
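The question asks for JavaScript inside Acrobat; as an alternative outside Acrobat, the same idea can be sketched in Python. This is a minimal sketch, assuming the third-party pypdf package is installed and that the form uses ordinary AcroForm fields; `read_form_fields` and `fields_to_csv` are hypothetical helper names. Only the field names and values are written, not the visible labels.

```python
import csv
import io


def read_form_fields(pdf_path: str) -> dict:
    """Return {field name: value} for the AcroForm fields of a PDF."""
    # Imported lazily so the CSV helper works without pypdf installed.
    from pypdf import PdfReader

    fields = PdfReader(pdf_path).get_fields() or {}  # None when no form exists
    # "/V" is the PDF dictionary key that holds a field's current value.
    return {name: str(field.get("/V", "")) for name, field in fields.items()}


def fields_to_csv(fields: dict) -> str:
    """Serialize one form as CSV text: a header row, then a row of values."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(fields.keys())
    writer.writerow(fields.values())
    return out.getvalue()
```

Writing one row per form keeps the output easy to append to when several filled-in copies of the same form need to land in a single spreadsheet.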

How to search a PDF (1.4) byte array for a target string?

Question: I know this is probably a bit unusual, but I'd like to find out whether a PDF document (a byte array) contains a particular piece of text. I create the documents myself in Java using the iText library v2.1.7, which produces documents compliant with the PDF 1.4 spec. My initial naive attempt was something like this:

    byte[] target = "the target text".getBytes("UTF-8");
    int index = Bytes.indexOf(pdfBytes, target); // Guava lib
    System.out.println( index ); // always -1 (not found)

I just don't understand enough
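A common reason a raw byte search returns -1 is that PDF content streams are usually Flate-compressed, so the text never appears verbatim in the file. The idea is sketched here in Python rather than Java. It is a minimal illustration, not a PDF parser: it assumes streams are either uncompressed or use a single FlateDecode filter, and that the text is stored as one literal string in a simple encoding. Real documents may split text across operators or use font-specific encodings, in which case a text-extraction library is the safer route. `pdf_streams_contain` is a hypothetical helper name.

```python
import re
import zlib

# Captures the raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def pdf_streams_contain(pdf_bytes: bytes, target: bytes) -> bool:
    """Return True if target occurs in any (decompressed) stream."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes after the data.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            data = raw  # not Flate-compressed; search the bytes as they are
        if target in data:
            return True
    return False


# Build a tiny stand-in for a PDF with one compressed content stream.
content = b"BT /F1 12 Tf 72 720 Td (the target text) Tj ET\n" * 3
compressed = zlib.compress(content)
fake_pdf = (
    b"%PDF-1.4\n1 0 obj\n<< /Length " + str(len(compressed)).encode()
    + b" /Filter /FlateDecode >>\nstream\n" + compressed
    + b"\nendstream\nendobj\n%%EOF\n"
)

raw_index = fake_pdf.find(b"the target text")  # search of the raw bytes
found = pdf_streams_contain(fake_pdf, b"the target text")
```

Here the raw search fails while the stream-aware search succeeds, which mirrors the behaviour described in the question.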

How to remove headers and footers from PDF file using iText in Java

Question: I am using the iText PDF library to convert a PDF to text. Below is my Java code that converts a PDF to a text file.

    public class PdfConverter {

        /** The original PDF that will be parsed. */
        public static final String pdfFileName = "jdbc_tutorial.pdf";
        /** The resulting text file. */
        public static final String RESULT = "preface.txt";

        /**
         * Parses a PDF to a plain text file.
         * @param pdf the original PDF
         * @param txt the resulting text
         * @throws IOException
         */
        public void parsePdf(String pdf,

Open blob files in Firefox not working

Question: I would like to open files sent from the server in Firefox. It currently works in IE. Here's how I proceed.

    openFile(path, fileName) {
      this.creditPoliciesService
        .openFile(path)
        .toPromise()
        .then(data => {
          var blob = new Blob([data.body], { type: "application/pdf" });
          if (window.navigator && window.navigator.msSaveOrOpenBlob) {
            // if navigator is IE
            window.navigator.msSaveOrOpenBlob(blob, fileName);
          } else {
            // Mozilla case
            var fileURL = URL.createObjectURL(blob);
            //URL.createObjectURL takes only one

Expo - reduce PDF filesize on iOS devices

Question: I'm currently creating an A4 PDF within my Expo app, using the "expo-print" API (printToFileAsync). The PDF includes images (photos taken on the device) and some text. I've set the PDF size to 595 wide by 842 high (A4 dimensions). Unfortunately, the resulting PDF is too large for my requirements (1.9 MB with only one image). I was able to reduce the PDF size on Android by decreasing the image size, but that does not work on iOS. I suspect that on iOS Expo is simply "making

EOF marker not found while use PyPDF2 merge pdf file in python

Question: When I use the following code

    from PyPDF2 import PdfFileMerger

    merge = PdfFileMerger()
    for newFile in nlst:
        merge.append(newFile)
    merge.write('newFile.pdf')

I get the following error:

    raise utils.PdfReadError("EOF marker not found")
    PyPDF2.utils.PdfReadError: EOF marker not found

Could anybody tell me what happened? Thanks.

Answer 1: PDF is a file format in which a parser normally starts by reading some global information located at the end of the file. At the very end of
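Because a parser starts from the end of the file, a quick pre-check can flag inputs that would trigger this error before they reach the merger. This is a minimal sketch, assuming the failure comes from an input that is not a complete PDF, for example a truncated download or an HTML error page saved with a .pdf extension. `has_eof_marker`, `split_by_validity`, and the 1024-byte window are hypothetical choices, not part of PyPDF2.

```python
def has_eof_marker(data: bytes, tail_size: int = 1024) -> bool:
    """Return True if the PDF end-of-file marker appears near the end of data."""
    return b"%%EOF" in data[-tail_size:]


def split_by_validity(paths):
    """Partition file paths into (looks_complete, missing_marker) lists."""
    good, bad = [], []
    for path in paths:
        with open(path, "rb") as handle:
            data = handle.read()
        (good if has_eof_marker(data) else bad).append(path)
    return good, bad
```

Only the paths in the first list would then be passed to `merge.append`; the second list names the files worth opening by hand to see what they actually contain.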