ghostscript

Ghostscript: convert a PDF and write the output to a text file

家住魔仙堡 submitted on 2020-01-01 05:41:45
Question: 1. I need to convert a PDF file into a .txt file. My command seems to work, since I get the converted text on the screen, but somehow I am unable to direct the output into a text file. public static string[] GetArgs(string inputPath, string outputPath) { return new[] { "-q", "-dNODISPLAY", "-dSAFER", "-dDELAYBIND", "-dWRITESYSTEMDICT", "-dSIMPLE", "-c", "save", "-f", "ps2ascii.ps", inputPath, "-sDEVICE=txtwrite", String.Format("-sOutputFile={0}", outputPath), "-c", "quit" }; } 2. Is there a
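A possible explanation, offered as an assumption rather than a confirmed diagnosis: the argument list above combines ps2ascii.ps (which prints to standard output) with -dNODISPLAY (which suppresses the output device), so -sDEVICE=txtwrite and -sOutputFile may never take effect. The sketch below is written in Python rather than the question's C#, and the function name txtwrite_args is hypothetical. It builds an argument list that relies on the txtwrite device alone.

```python
import shlex


def txtwrite_args(input_path, output_path):
    """Build Ghostscript arguments that write extracted text to a file.

    Assumption: the installed Ghostscript includes the txtwrite device.
    """
    return [
        "-q",                           # suppress startup messages
        "-dSAFER",                      # restrict file access
        "-dBATCH",                      # exit after the last input file
        "-dNOPAUSE",                    # do not prompt between pages
        "-sDEVICE=txtwrite",            # text extraction device
        f"-sOutputFile={output_path}",  # where the text goes
        input_path,                     # the PDF comes last
    ]


# Print the full command line instead of running it, so the sketch has no
# dependency on Ghostscript being installed.
print(shlex.join(["gs", *txtwrite_args("input.pdf", "output.txt")]))
```

The same ordering (switches first, input file last, no -dNODISPLAY, no ps2ascii.ps) can be carried back into the C# array.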

Prevent Ghostscript from writing errors to standard output

谁都会走 submitted on 2020-01-01 05:40:07
Question: I'm using Ghostscript to rasterize the first page of a PDF file to JPEG. To avoid creating temp files, the PDF data is piped into Ghostscript's stdin and the JPEG is "drained" on stdout. This pipeline works like a charm until GS receives invalid PDF data: instead of reporting all error messages on stderr, as I would have expected, it still writes some of the messages to stdout. To reproduce: $ echo "Not a PDF" >test.txt $ /usr/bin/gs -q -sDEVICE=jpeg -dBATCH -dNOPAUSE -dFirstPage=1
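One option worth trying here is Ghostscript's -sstdout=%stderr switch, which redirects the interpreter's own stdout messages to stderr so that only device output reaches stdout. The sketch below only assembles such a command. The function name jpeg_pipe_args and the -dLastPage switch are assumptions added for illustration, not part of the truncated command in the question, and whether this removes every stray message for invalid input has not been verified here.

```python
import shlex


def jpeg_pipe_args(page=1, exe="gs"):
    """Build a gs command that reads a PDF on stdin and writes a JPEG to stdout."""
    return [
        exe,
        "-q",
        "-sstdout=%stderr",      # send interpreter messages to stderr
        "-sDEVICE=jpeg",
        "-dBATCH",
        "-dNOPAUSE",
        f"-dFirstPage={page}",
        f"-dLastPage={page}",    # assumption: render exactly one page
        "-sOutputFile=-",        # device output goes to stdout
        "-",                     # read the input from stdin
    ]


# Print the command line only; nothing is executed.
print(shlex.join(jpeg_pipe_args()))
```

When the command is run from a program, checking the exit status before trusting the bytes read from stdout adds a second line of defence.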

Scale pdf to add border for printing full size pages

被刻印的时光 ゝ submitted on 2020-01-01 05:39:23
Question: When printing a PDF with no border (or margins), the printer chops off around 1 mm of the image data at the edges of the paper. I am therefore looking for a way to scale the PDF page down slightly, adding a white border that corresponds to the white space the printer leaves at the edges. I have tried using gs so far. For instance, suppose I have an A4-size PDF, 1.pdf; then I used: gs -sDEVICE=pdfwrite \ -q -dBATCH -dNOPAUSE \ -dPDFFitPage \ -r300x300

Splitting a PDF with Ghostscript

安稳与你 submitted on 2019-12-31 17:50:25
Question: I am trying to split a multipage PDF with Ghostscript, and I found the same solution on several sites and even on ghostscript.com, namely: gs -sDEVICE=pdfwrite -dSAFER -o outname.%d.pdf input.pdf But it does not seem to work for me: it produces a single file containing all pages, named outname.1.pdf . When I add the start and end pages it works fine, but I want it to work without knowing those parameters. In the gs-devel archive, I found a solution for this: http://ghostscript.com
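The workaround the question mentions (passing explicit start and end pages) can be sketched as one Ghostscript invocation per page. This is a minimal illustration, not the solution from the gs-devel archive. The function name split_commands is hypothetical, and page_count is assumed to be obtained separately, which is exactly the part the asker wanted to avoid.

```python
def split_commands(input_pdf, page_count, pattern="outname.{page}.pdf"):
    """Return one gs command per page, each extracting a single page."""
    commands = []
    for page in range(1, page_count + 1):
        commands.append([
            "gs", "-q", "-dSAFER", "-dBATCH", "-dNOPAUSE",
            "-sDEVICE=pdfwrite",
            f"-dFirstPage={page}",   # first page to process
            f"-dLastPage={page}",    # last page to process (same page)
            f"-sOutputFile={pattern.format(page=page)}",
            input_pdf,
        ])
    return commands


# Show the commands for a hypothetical three-page document.
for cmd in split_commands("input.pdf", 3):
    print(" ".join(cmd))
```

Each command re-reads the whole input file, so this trades speed for not depending on how a given Ghostscript version handles %d in the output file name.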

Convert scanned pdf to text python

爱⌒轻易说出口 submitted on 2019-12-31 12:12:29
Question: I have a scanned PDF file and I am trying to extract text from it. I tried to use pypdfocr to run OCR on it, but I get the error: "could not found ghostscript in the usual place" After searching, I found this solution, Linking Ghostscript to pypdfocr in Windows Platform, and I tried downloading Ghostscript and adding it to an environment variable, but I still get the same error. How can I search for text in my scanned PDF file using Python? Thanks. Edit: here is my code sample: import os import sys import re
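Before debugging pypdfocr itself, it can help to confirm that a Ghostscript executable is actually reachable through PATH from the same Python process. The sketch below uses only the standard library. The function name find_ghostscript is hypothetical, the candidate names are the usual console executable names on Unix and Windows, and how pypdfocr performs its own lookup is not assumed here.

```python
import shutil

# Usual console executable names: Unix, 64-bit Windows, 32-bit Windows.
CANDIDATES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript(candidates=CANDIDATES):
    """Return the full path of the first Ghostscript executable on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


location = find_ghostscript()
if location is None:
    print("No Ghostscript executable found on PATH")
else:
    print("Ghostscript found at", location)
```

If this reports nothing on Windows, the directory holding the Ghostscript bin folder is probably missing from PATH, or the shell was not restarted after PATH was changed.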

Re-encoding only images of a PDF? (or, ghostscript fails on 8-bit RGB while optimizing)

妖精的绣舞 submitted on 2019-12-31 06:23:08
Question: I need to optimize a number of big PDF documents for file size, so I tried using ghostscript , invoked like this: gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dBATCH -sOutputFile=output-my-doc.pdf input-my-doc.pdf I can see this running for some pages, but then it crashes on particular pages. I updated to gs version 9.02 and see the same behavior. After bursting the document into separate pages, and running the command above on each page, I could confirm

How can I use ghost4j on OS X 10.9

馋奶兔 submitted on 2019-12-31 05:12:28
Question: When I try to use ghost4j on OS X 10.9, I see this error: Unable to load library 'gs': dlopen(libgs.dylib, 9): image not found I have installed the Ghostscript library on my MacBook using this site. How can I fix this problem? For some reason I cannot install Ghostscript using port or brew. Answer 1: First you need to find the file libgs.dylib that was installed by the installer package, or compile the libgs.dylib library from source, and make a note of where you installed it to. Hopefully it should

Create a tiff with only text and no images from a postscript file with ghostscript

喜你入骨 submitted on 2019-12-31 04:58:06
Question: Is it possible to convert a PostScript file (created from a PDF document with readable text and images) into a TIFF file that contains only the text, without the images? For example, by adding a maxbuffer so that images are removed and only the text remains? And if boxes and lines around text could be removed as well, that would be awesome. Best regards! Answer 1: You can redefine the various 'image' operators so that they don't do anything: /image { type /dicttype eq not { % uses up argument, only one if dict

Can Ghostscript currently convert a PDF to PDF/X?

不打扰是莪最后的温柔 submitted on 2019-12-30 13:27:52
Question: The print house requires my dissertation's PDF to be compliant with PDF/X-1a:2001. The content file was compiled using XeTeX LaTeX, and the second PDF is the cover design, done with Inkscape 0.48 . The nearest answer I found is in this post: https://stackoverflow.com/a/3483801/1288722, and if I understood correctly, Ghostscript can at least convert the PDF to PDF/X. As stated in the answer above, conversion to PDF/X requires a valid ICC profile. I contacted the printing house

Ghostscript command line parameters to convert EPS to PDF

南楼画角 submitted on 2019-12-29 14:15:29
Question: Just installed Ghostscript 8.54 for Windows. Does anyone know the minimum parameters to pass to gswin32c.exe to make it convert, say, someFile.eps to someFile.eps.pdf ? Answer 1: Since the question was about the "minimum parameters to pass to gswin32c.exe to make it convert, say, someFile.eps to someFile.eps.pdf" , let me give an answer: c:/path/to/gswin32c.exe ^ -sDEVICE=pdfwrite ^ -o c:/path/to/output.pdf ^ c:/path/to/input.eps or even shorter: gswin32c ^ -sDEVICE=pdfwrite ^ -o output.pdf ^
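For scripted use, the command from the answer can be assembled programmatically. The sketch below mirrors the switches shown above. The function name eps_to_pdf_args and the optional crop parameter (which adds Ghostscript's -dEPSCrop switch to size the page to the EPS bounding box) are assumptions added for illustration and are not part of the quoted answer.

```python
def eps_to_pdf_args(eps_path, pdf_path=None, exe="gswin32c", crop=False):
    """Build a gs command line that converts an EPS file to PDF."""
    if pdf_path is None:
        # Follow the question's naming: someFile.eps -> someFile.eps.pdf
        pdf_path = eps_path + ".pdf"
    args = [exe, "-sDEVICE=pdfwrite"]
    if crop:
        args.append("-dEPSCrop")  # optional: crop the page to the EPS bounding box
    args += ["-o", pdf_path, eps_path]
    return args


# Example of running it (commented out so the sketch does not require Ghostscript):
# import subprocess
# subprocess.run(eps_to_pdf_args("someFile.eps"), check=True)
print(eps_to_pdf_args("someFile.eps"))
```

Passing the arguments as a list avoids the ^ line continuations and quoting rules of the Windows command prompt.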