Question
I have a 100-page PDF that is about 50 MB. I am running the command below against it and it's taking about 23 seconds per page. The PDF is a scan of a paper document.
gswin32.exe -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.3
-dPDFSETTINGS=/screen -sOutputFile=out4.pdf 09.pdf
Is there anything I can do to speed this up? I've determined that -dPDFSETTINGS=/screen is what is making it so slow, but I'm not getting good compression without it...
UPDATE:
OK, I tried updating it to what I have below. Am I using the -c 30000000 setvmthreshold portion correctly?
gswin32.exe -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.3
-dPDFSETTINGS=/screen -dNumRenderingThreads=2 -sOutputFile=out7.pdf
-c 30000000 setvmthreshold -f 09.pdf
Answer 1:
If you are on a multicore system, make it use multiple CPU cores with:
-dNumRenderingThreads=<number of cpus>
Let it use up to 30 MB of RAM:
-c "30000000 setvmthreshold"
Try disabling the garbage collector:
-dNOGC
For more details, see the Improving Performance section of the Ghostscript docs.
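As a rough sketch of how these three switches fit into one invocation (and where the -c ... -f part goes, which the question asks about), the Python helper below assembles the argument list. The function name build_gs_args and its parameters are made up for illustration; only the Ghostscript switches come from this thread, and whether they help on a given file is something to measure.

```python
import os


def build_gs_args(input_pdf, output_pdf, exe="gswin32.exe",
                  threads=None, vm_threshold=30_000_000, disable_gc=True):
    """Assemble a Ghostscript pdfwrite command line as a list of arguments.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if threads is None:
        # os.cpu_count() may return None; fall back to a single thread.
        threads = os.cpu_count() or 1
    args = [
        exe,
        "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/screen",
        f"-dNumRenderingThreads={threads}",
        f"-sOutputFile={output_pdf}",
    ]
    if disable_gc:
        args.append("-dNOGC")
    # The PostScript after -c is passed as a single argument (the quoted form
    # shown above); -f ends the -c section so the input file is read as a file.
    args += ["-c", f"{vm_threshold} setvmthreshold", "-f", input_pdf]
    return args


if __name__ == "__main__":
    print(build_gs_args("09.pdf", "out7.pdf", threads=2))
```

The list can be handed to subprocess.run as-is, which avoids shell quoting issues around the -c argument.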
Answer 2:
I was crunching a ~300-page PDF on a Core i7 and found that adding the following options provided a significant speedup:
%-> comments to the right
-dNumRenderingThreads=8 % increasing up to 64 didn't make much difference
-dBandHeight=100 % didn't matter much
-dBandBufferSpace=500000000 % (500MB)
-sBandListStorage=memory % may or may not need to be set when gs is compiled
-dBufferSpace=1000000000 % (1GB)
The -c 1000000000 setvmthreshold -f thing didn't make much difference for me, FWIW.
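Since the effect of these switches clearly depends on the machine and the file, it may be worth timing a few combinations yourself. Below is a minimal Python sketch of such a comparison; the helper names time_command and compare are made up for illustration, and the switch values in VARIANTS are simply the ones quoted above.

```python
import subprocess
import time

# Extra switches to try, grouped under illustrative labels.
VARIANTS = {
    "baseline": [],
    "threads": ["-dNumRenderingThreads=8"],
    "buffers": [
        "-dBandBufferSpace=500000000",
        "-sBandListStorage=memory",
        "-dBufferSpace=1000000000",
    ],
}


def time_command(args):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(args, check=True)
    return time.perf_counter() - start


def compare(base_args, variants, input_pdf):
    """Time base_args plus each named list of extra switches on input_pdf."""
    results = {}
    for name, extra in variants.items():
        # The input file goes last, after all switches.
        results[name] = time_command(base_args + extra + [input_pdf])
    return results
```

For example, compare(["gs", "-o", "out.pdf", "-sDEVICE=pdfwrite", "-dPDFSETTINGS=/screen"], VARIANTS, "input.pdf") would return a dict of seconds per variant; the executable name gs and the file names are placeholders. A single run per variant is noisy, so repeating each a few times gives a fairer picture.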
Answer 3:
You don't say what CPU and what amount of RAM your computer is equipped with.
Your situation is this:
- A scanned document as PDF, sized about 500 kB per page on average. That means each page basically is a picture, using the scan resolution (at least 200 dpi, maybe even 600 dpi).
- You are re-distilling it with Ghostscript, using -dPDFSETTINGS=/screen. This setting will do quite a few things to make the file size smaller. Amongst the most important are:
  - Re-sample all (color or grayscale) images to 72 dpi
  - Convert all colors to sRGB
Both these operations can be quite "expensive" in terms of CPU and/or RAM usage.
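To get a feel for how much image data the resampling step has to touch, here is a small back-of-the-envelope calculation in Python. It assumes a US Letter page (8.5 x 11 inches), which the question does not state, and the helper name page_pixels is illustrative.

```python
def page_pixels(dpi, width_in=8.5, height_in=11.0):
    """Number of pixels in a full-page raster scan at the given resolution.

    The default page size (US Letter) is an assumption, not from the question.
    """
    return round(width_in * dpi) * round(height_in * dpi)


for dpi in (600, 200, 72):
    print(f"{dpi:>3} dpi: {page_pixels(dpi):>10,} pixels")

# Downsampling from 600 dpi to 72 dpi shrinks the pixel count by (600/72)^2.
print(f"reduction factor: {page_pixels(600) / page_pixels(72):.1f}")
```

Under that assumption a 600 dpi page holds about 33.7 million pixels and the 72 dpi result about 0.48 million, roughly a 69-fold reduction, so every page means reading tens of millions of samples before any color conversion happens.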
BTW, your setting of -dCompatibilityLevel=1.3 is not required; it's already implicitly set by -dPDFSETTINGS=/screen.
Try this:
gswin32.exe ^
-o output.pdf ^
-sDEVICE=pdfwrite ^
-dPDFSETTINGS=/screen ^
-dNumRenderingThreads=2 ^
-dMaxPatternBitmap=1000000 ^
-c "60000000 setvmthreshold" ^
-f input.pdf
Also, if you are on a 64-bit system, try to install the most recent 32-bit Ghostscript version (9.00). It performs better than the 64-bit version.
For me, downsampling a 600 dpi scanned page image to 72 dpi usually takes less than 1 second, not 23.
Answer 4:
I may be completely out of place here, but have you tried the DjVu file format? It works like a charm for scanned documents in general (even if there are lots of pictures), and it gives much better compressed files: I generally get a lossless factor-of-two reduction in size on B&W scientific articles.
Source: https://stackoverflow.com/questions/4548919/any-tips-for-speeding-up-ghostscript