Splitting a PDF with Ghostscript

Submitted by 安稳与你 on 2019-12-31 17:50:25

Question


I am trying to split a multipage PDF with Ghostscript. I found the same solution on several sites, and even on ghostscript.com:

gs -sDEVICE=pdfwrite -dSAFER -o outname.%d.pdf input.pdf

But it does not seem to work for me: it produces a single file containing all pages, named outname.1.pdf.

When I add the start and end pages it works fine, but I want it to work without knowing those parameters.

In the gs-devel archive I found a solution for this: http://ghostscript.com/pipermail/gs-devel/2009-April/008310.html, but I would prefer to do it without pdf_info.

When I use a different device, for example pswrite, with the same parameters, it works correctly and produces as many PS files as my input.pdf has pages.

Is this normal when using pdfwrite? Am I doing something wrong?


Answer 1:


What you see is "normal" behaviour: the current version of Ghostscript's pdfwrite output device does not support this feature. This is also (admittedly, somewhat vaguely) documented in Use.htm:

"Note, however that the one page per file feature may not be supported by all devices...."

I seem to remember that one of the Ghostscript developers mentioned on IRC that they may add this feature to pdfwrite in some future release, but it seems to necessitate some major code rewrite, which is why they haven't done it yet...


Update: As Gordon's comment already hinted, as of version 9.06 (released on July 31st, 2012), Ghostscript supports the command line quoted in the question for pdfwrite as well. (Gordon must have discovered the unofficial support for this in 9.05 already, or he compiled his own executable from pre-release sources that were not yet tagged as 9.06.)
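
As an illustration of the update above, here is a minimal Python sketch that checks the installed version before relying on the %d output pattern. The helper names (parse_version, supports_per_page_pdfwrite, per_page_command, split_all_pages) are made up for this example, and the 9.06 threshold is simply the one stated in this answer. It assumes gs is on the PATH.

```python
import subprocess


def parse_version(text):
    """Turn a version string such as '9.06' into a comparable tuple (9, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_per_page_pdfwrite(version_text):
    """True if the version is at least 9.06, the threshold given in this answer."""
    return parse_version(version_text) >= (9, 6)


def per_page_command(input_pdf, pattern="outname.%d.pdf"):
    """Build the command line quoted in the question as an argument list."""
    return ["gs", "-sDEVICE=pdfwrite", "-dSAFER", "-o", pattern, input_pdf]


def split_all_pages(input_pdf, pattern="outname.%d.pdf"):
    """Run Ghostscript once; %d in the pattern numbers the output files."""
    version = subprocess.run(["gs", "--version"], capture_output=True,
                             text=True, check=True).stdout
    if not supports_per_page_pdfwrite(version):
        raise RuntimeError("this Ghostscript predates per-page pdfwrite output")
    subprocess.run(per_page_command(input_pdf, pattern), check=True)
```

Calling split_all_pages("input.pdf") would then run the same command as in the question, but only after the version check passes.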




Answer 2:


I found this script written by Mr Weimer super useful:

#!/bin/sh
#
# pdfsplit [input.pdf] [first_page] [last_page] [output.pdf] 
#
# Example: pdfsplit big_file.pdf 10 20 pages_ten_to_twenty.pdf
#
# written by: Westley Weimer, Wed Mar 19 17:58:09 EDT 2008
#
# The trick: ghostscript (gs) will do PDF splitting for you, it's just not
# obvious and the required defines are not listed in the manual page. 

if [ $# -lt 4 ] 
then
        echo "Usage: pdfsplit input.pdf first_page last_page output.pdf"
        exit 1
fi
# ">/dev/null 2>&1" is the portable spelling of ">&" for /bin/sh
yes | gs -dBATCH -sOutputFile="$4" -dFirstPage="$2" -dLastPage="$3" -sDEVICE=pdfwrite "$1" >/dev/null 2>&1

Source: http://www.cs.virginia.edu/~weimer/pdfsplit/pdfsplit

Save it as pdfsplit.sh and watch the magic happen.

PDFSAM can also do the job. It is available on Windows and Mac.




Answer 3:


#!/bin/bash
# where $1 is the input filename

# ask Ghostscript for the page count
ournum=$(gs -q -dNODISPLAY -c "($1) (r) file runpdfbegin pdfpagecount = quit" 2>/dev/null)
echo "Processing $ournum pages"
# strip the trailing .pdf once, outside the loop
newname="${1%.pdf}"
counter=1
while [ "$counter" -le "$ournum" ] ; do
    reallynewname="$newname-$counter.pdf"
    # make the individual pdf page
    yes | gs -dBATCH -sOutputFile="$reallynewname" -dFirstPage="$counter" -dLastPage="$counter" -sDEVICE=pdfwrite "$1" >& /dev/null
    counter=$((counter+1))
done



Answer 4:


Here is a script for the Windows command prompt (it also works with drag and drop), assuming you have Ghostscript installed:

@echo off
chcp 65001
setlocal enabledelayedexpansion

rem Customize or remove this line if you already have Ghostscript folders in your system PATH
set path=C:\Program Files\gs\gs9.22\lib;C:\Program Files\gs\gs9.22\bin;%path%

:start

echo Splitting "%~n1%~x1" into standalone single pages...
rem /d also switches the drive; the quotes protect paths containing spaces
cd /d "%~dp1"
rem getting number of pages of PDF with GhostScript
for /f "usebackq delims=" %%a in (`gswin64c -q -dNODISPLAY -c "(%~n1%~x1) (r) file runpdfbegin pdfpagecount = quit"`) do set "numpages=%%a"

for /L %%n in (1,1,%numpages%) do (
echo Extracting page %%n of %numpages%...
set "x=00%%n"
set "x=!x:~-3!"
gswin64c.exe -dNumRenderingThreads=2 -dBATCH -dNOPAUSE -dQUIET -dFirstPage=%%n -dLastPage=%%n -sDEVICE=pdfwrite -sOutputFile="%~d1%~p1%~n1-!x!.pdf" "%~1"
)

shift
if not "%~1"=="" goto start

pause

Name this script something like split PDF.bat and put it on your desktop. Drag and drop one (or even more) multipage PDF on it and it will create one standalone PDF file for each page of your PDF, appending the suffix -001, -002 and so on to the name to distinguish the pages.
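
The zero-padded suffix scheme described above can be sketched in Python as follows. The function name page_output_name is hypothetical and not part of the batch script. One difference worth noting: the batch substring !x:~-3! keeps only the last three characters, whereas this format widens for page numbers above 999.

```python
import os


def page_output_name(input_path, page, width=3):
    """Return e.g. 'report-001.pdf' for ('report.pdf', 1)."""
    # Split off the extension so the suffix goes before it
    root, ext = os.path.splitext(input_path)
    # Zero-pad the page number to at least `width` digits
    return f"{root}-{page:0{width}d}{ext}"
```

Padding keeps the generated files in page order when they are sorted by name.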

You might need to customize (with relevant Ghostscript version) or remove the set path=... line if you already have Ghostscript folders in your system PATH environment variable.

It works for me under Windows 10 with Ghostscript 9.22.

Enjoy.




Answer 5:


Here's a simple Python script which does it:

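#!/usr/bin/python3

import subprocess

# Set these by hand: the page count is not detected automatically
number_of_pages = 68
input_pdf = "abstracts_rev09.pdf"

for i in range(1, number_of_pages + 1):
    # One Ghostscript run per page; the list form avoids shell quoting issues
    subprocess.run(["gs", "-q", "-dBATCH", "-dNOPAUSE",
                    "-sOutputFile=page{:04d}.pdf".format(i),
                    "-dFirstPage={}".format(i), "-dLastPage={}".format(i),
                    "-sDEVICE=pdfwrite", input_pdf])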
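
The script above needs the page count entered by hand. As a rough sketch, the pdfpagecount invocation used in the bash and batch answers could supply it instead. The function names here are invented for illustration, and two assumptions apply: gs is on the PATH, and the file name contains no parentheses or backslashes (they would break the PostScript string). On newer Ghostscript releases where SAFER mode is the default, reading the file this way may need extra permission flags, which this sketch does not handle.

```python
import subprocess


def page_count_command(input_pdf):
    """Build the Ghostscript call that prints the number of pages."""
    # Assumption: input_pdf has no parentheses or backslashes in its name
    postscript = "({}) (r) file runpdfbegin pdfpagecount = quit".format(input_pdf)
    return ["gs", "-q", "-dNODISPLAY", "-c", postscript]


def single_page_command(input_pdf, page, output_pdf):
    """Build the Ghostscript call that writes one page to its own PDF."""
    return ["gs", "-q", "-dBATCH", "-dNOPAUSE",
            "-sOutputFile=" + output_pdf,
            "-dFirstPage={}".format(page), "-dLastPage={}".format(page),
            "-sDEVICE=pdfwrite", input_pdf]


def split_pdf(input_pdf):
    """Detect the page count, then extract every page; returns the count."""
    result = subprocess.run(page_count_command(input_pdf),
                            capture_output=True, text=True, check=True)
    number_of_pages = int(result.stdout.strip())
    for page in range(1, number_of_pages + 1):
        output_pdf = "page{:04d}.pdf".format(page)
        subprocess.run(single_page_command(input_pdf, page, output_pdf),
                       check=True)
    return number_of_pages
```

Keeping the command builders separate from split_pdf makes it possible to inspect the exact arguments without running Ghostscript.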


Source: https://stackoverflow.com/questions/10228592/splitting-a-pdf-with-ghostscript
