pdf

Python3: Download PDF to memory and convert first page to image

别等时光非礼了梦想. submitted on 2021-02-10 05:51:28
Question: I am trying to do the following:

- Download a PDF file to memory
- Convert the first page to an image
- Use that image with tweepy

I tried the following code, but ran into an error.

    from PIL import Image
    from pdf2image import convert_from_path
    from urllib.request import urlopen
    from io import StringIO, BytesIO

    url = 'http://somedomain.com/assets/applets/internet.pdf'
    scrape = urlopen(url)  # for external files
    pdfFile = BytesIO(scrape.read())
    pdfFile.seek(0)
    pages = convert_from_path(pdfFile,last_page=1,
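A likely cause of the error in the excerpt above is that `convert_from_path` expects a filesystem path, not a `BytesIO` object. pdf2image also provides `convert_from_bytes`, which takes the raw PDF bytes directly. The following is a minimal sketch under that assumption, not the asker's final code. The helper names (`download_pdf`, `first_page_image`, `image_to_png_buffer`) and the `opener`/`converter` parameters are hypothetical and exist only to make the sketch easy to try without network access.

```python
from io import BytesIO
from urllib.request import urlopen


def download_pdf(url, opener=urlopen):
    """Fetch the PDF at `url` and return its raw bytes (nothing is written to disk)."""
    with opener(url) as response:
        return response.read()


def first_page_image(pdf_bytes, converter=None):
    """Render page 1 of an in-memory PDF and return the resulting image object."""
    if converter is None:
        # Imported lazily so this module still loads where pdf2image is missing.
        # pdf2image itself needs the poppler utilities installed.
        from pdf2image import convert_from_bytes as converter
    pages = converter(pdf_bytes, first_page=1, last_page=1)
    return pages[0]


def image_to_png_buffer(image):
    """Save a PIL-style image into an in-memory PNG buffer, rewound to the start."""
    buffer = BytesIO()
    image.save(buffer, format="PNG")
    buffer.seek(0)
    return buffer
```

With pdf2image installed, `image_to_png_buffer(first_page_image(download_pdf(url)))` yields a file-like object that can be handed to whichever upload call is used afterwards, without touching the disk.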

PDF-A1a document not valid after signing with VisualRepresentation using IText

醉酒当歌 submitted on 2021-02-10 05:12:57
Question: I digitally sign a PDF-A1a document using IText 7.15.0. In addition to the digital signature, I also add a visual representation (image) to the document.

    PdfSignatureAppearance appearance = signer.GetSignatureAppearance();
    appearance.SetPageNumber(1);
    Rectangle pr = new Rectangle(10 + ImageOffset, 10 + ImageOffset, 100, 100);
    appearance.SetPageRect(pr);
    byte[] image = System.IO.File.ReadAllBytes(VisualAppearance);
    appearance.SetRenderingMode(PdfSignatureAppearance.RenderingMode.GRAPHIC);

PDF-A1a document not valid after signing with VisualRepresentation using IText

[亡魂溺海] submitted on 2021-02-10 05:11:44
Question: I digitally sign a PDF-A1a document using IText 7.15.0. In addition to the digital signature, I also add a visual representation (image) to the document.

    PdfSignatureAppearance appearance = signer.GetSignatureAppearance();
    appearance.SetPageNumber(1);
    Rectangle pr = new Rectangle(10 + ImageOffset, 10 + ImageOffset, 100, 100);
    appearance.SetPageRect(pr);
    byte[] image = System.IO.File.ReadAllBytes(VisualAppearance);
    appearance.SetRenderingMode(PdfSignatureAppearance.RenderingMode.GRAPHIC);

Extract PDF Form Data Using JavaScript and write to CSV File

心不动则不痛 submitted on 2021-02-10 04:15:53
Question: I have been given a PDF file with a form. The form is not formatted as a table. My requirement is to extract the form field values and write them to a CSV file that can be imported into Excel. I have tried using the automated "Merge data files to Spreadsheet" menu item in Acrobat Pro, but the output includes both the labels and the form field values. I am mostly interested in just the form field values. I would like to use JavaScript to extract the form data, and instruct JavaScript how to

Extract PDF Form Data Using JavaScript and write to CSV File

泄露秘密 submitted on 2021-02-10 04:14:08
Question: I have been given a PDF file with a form. The form is not formatted as a table. My requirement is to extract the form field values and write them to a CSV file that can be imported into Excel. I have tried using the automated "Merge data files to Spreadsheet" menu item in Acrobat Pro, but the output includes both the labels and the form field values. I am mostly interested in just the form field values. I would like to use JavaScript to extract the form data, and instruct JavaScript how to

How to search a PDF (1.4) byte array for a target string?

霸气de小男生 submitted on 2021-02-09 08:18:50
Question: I know this is probably a bit unusual, but I'd like to find out if a PDF document (a byte array) contains a particular piece of text. I create the docs myself in Java using the iText library v2.1.7, which produces docs compliant with the PDF 1.4 spec. My initial naive attempt was something like this:

    byte[] target = "the target text".getBytes("UTF-8");
    int index = Bytes.indexOf(pdfBytes, target); // Guava lib
    System.out.println( index ); // always -1 (not found)

I just don't understand enough
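One common reason a raw byte search like the one in this excerpt finds nothing is that page content streams in a PDF are often Flate-compressed, so the text never appears literally in the file. The sketch below illustrates that effect in Python (the excerpt itself uses Java), using only the standard `zlib` and `re` modules. All function names are hypothetical, and the toy fragment it builds is not a valid PDF. Even after inflating, real documents may encode text through font-specific encodings or split it across several string operands, so this is not a general text search.

```python
import re
import zlib

# Captures everything between "stream" and "endstream"; the trailing newline
# that ends up in the capture is tolerated by the decompressor below.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def naive_search(pdf_bytes, text):
    """Index of the UTF-8 bytes of `text` in the raw file, or -1 if absent."""
    return pdf_bytes.find(text.encode("utf-8"))


def search_inflated_streams(pdf_bytes, text):
    """True if `text` appears in any stream once it has been inflated."""
    target = text.encode("utf-8")
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj ignores bytes after the end of the zlib data.
            content = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # stream is not Flate-compressed, skip it
        if target in content:
            return True
    return False


def make_toy_pdf_fragment(text):
    """Build a toy, PDF-like byte string with one Flate-compressed stream."""
    content = b"BT /F1 12 Tf 72 720 Td (" + text.encode("utf-8") + b") Tj ET"
    packed = zlib.compress(content)
    header = b"4 0 obj\n<< /Length %d /Filter /FlateDecode >>\n" % len(packed)
    return header + b"stream\n" + packed + b"\nendstream\nendobj\n"
```

For a fragment built with `make_toy_pdf_fragment("the target text")`, `naive_search` reports that the text is absent while `search_inflated_streams` finds it, which mirrors the "always -1" symptom.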

How to remove headers and footers from PDF file using iText in Java

折月煮酒 submitted on 2021-02-09 05:37:44
Question: I am using the iText PDF library to convert PDF to text. Below is my Java code for converting a PDF to a text file.

    public class PdfConverter {

        /** The original PDF that will be parsed. */
        public static final String pdfFileName = "jdbc_tutorial.pdf";

        /** The resulting text file. */
        public static final String RESULT = "preface.txt";

        /**
         * Parses a PDF to a plain text file.
         * @param pdf the original PDF
         * @param txt the resulting text
         * @throws IOException
         */
        public void parsePdf(String pdf,

Open blob files in Firefox not working

爱⌒轻易说出口 submitted on 2021-02-08 15:17:45
Question: I would like to open files sent from the server in Firefox. Currently it works in IE. Here is how I proceed.

    openFile(path, fileName) {
      this.creditPoliciesService
        .openFile(path)
        .toPromise()
        .then(data => {
          var blob = new Blob([data.body], { type: "application/pdf" });
          if (window.navigator && window.navigator.msSaveOrOpenBlob) {
            // if navigator is IE
            window.navigator.msSaveOrOpenBlob(blob, fileName);
          } else {
            // Mozilla case
            var fileURL = URL.createObjectURL(blob); //URL.createObjectURL takes only one

Expo - reduce PDF filesize on iOS devices

你说的曾经没有我的故事 submitted on 2021-02-08 13:24:06
Question: I'm currently creating an A4 PDF within my Expo app, using the "expo-print" API (printToFileAsync). The PDF includes images (photos taken with the device) and some text. I've set the PDF size to 595 width, 842 height (A4 dimensions). Unfortunately the PDF is too large for my requirements (1.9 MB with only 1 image). I was able to reduce the PDF size on Android by decreasing the image size, but that does not work on iOS. I suspect that on iOS Expo is simply "making

EOF marker not found while using PyPDF2 to merge PDF files in Python

橙三吉。 submitted on 2021-02-08 13:15:17
Question: When I use the following code

    from PyPDF2 import PdfFileMerger

    merge = PdfFileMerger()
    for newFile in nlst:
        merge.append(newFile)
    merge.write('newFile.pdf')

the following happens:

    raise utils.PdfReadError("EOF marker not found")
    PyPDF2.utils.PdfReadError: EOF marker not found

Could anybody tell me what happened? Thanks

Answer 1: PDF is a file format where a PDF parser normally starts reading the file by reading some global information located at the end of the file. At the very end of
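To illustrate the point the answer starts to make: a reader looks for the `%%EOF` marker near the end of each input file, so a truncated or non-PDF input can trigger this error. The following is a rough pre-check sketch using only the standard library, not part of PyPDF2. The helper names and the 1024-byte tail window are assumptions chosen for illustration.

```python
import os


def has_eof_marker(path, tail_size=1024):
    """Return True if b'%%EOF' appears in the last `tail_size` bytes of the file."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        # Jump close to the end; small files are simply read from the start.
        handle.seek(max(0, size - tail_size))
        return b"%%EOF" in handle.read()


def split_by_eof_marker(paths):
    """Partition paths into (mergeable, suspicious) lists using the marker check."""
    good, bad = [], []
    for path in paths:
        (good if has_eof_marker(path) else bad).append(path)
    return good, bad
```

Running `split_by_eof_marker(nlst)` before the merge loop would show which inputs are likely incomplete downloads or not PDFs at all; only the first list would then be passed to `merge.append`.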