Question
I am trying to split a multipage PDF with Ghostscript. Several sites, and even ghostscript.com, give the same solution:
gs -sDEVICE=pdfwrite -dSAFER -o outname.%d.pdf input.pdf
But it does not work for me: it produces a single file, named outname.1.pdf, that contains all the pages.
When I add the first and last page numbers it works fine, but I want it to work without knowing those parameters.
In the gs-devel archive I found a solution for this: http://ghostscript.com/pipermail/gs-devel/2009-April/008310.html, but I would like to do it without pdf_info.
When I use a different device, for example pswrite, with the same parameters, it works correctly and produces as many PS files as my input.pdf has pages.
Is this normal when using pdfwrite? Am I doing something wrong?
Answer 1:
What you see is "normal" behaviour: the current version of Ghostscript's pdfwrite output device does not support this feature. This is also (admittedly, somewhat vaguely) documented in Use.htm:
"Note, however that the one page per file feature may not be supported by all devices...."
I seem to remember that one of the Ghostscript developers mentioned on IRC that they may add this feature to pdfwrite in some future release, but it seems to necessitate some major code rewrite, which is why they haven't done it yet...
Update: As Gordon's comment already hinted, as of version 9.06 (released on July 31st, 2012), Ghostscript supports the command line quoted in the question for pdfwrite as well. (Gordon must have discovered the unofficial support for this already in 9.05, or he compiled his own executable from pre-release sources that were not yet tagged as 9.06.)
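Since the behaviour depends on the Ghostscript version, a small wrapper can check the version before relying on the %d output pattern. The following is only a sketch: the function names are made up for illustration, and it assumes that gs is on the PATH and that gs --version prints a string such as 9.22 or 10.02.1. The 9.06 threshold is the one given in the update above.

```python
import re
import subprocess


def supports_per_page_pdfwrite(version):
    """Return True if the version string denotes Ghostscript 9.06 or later.

    The 9.06 threshold comes from the answer above; the parsing assumes a
    'major.minor' prefix such as '9.22' or '10.02.1'.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    return (int(match.group(1)), int(match.group(2))) >= (9, 6)


def split_with_pattern(input_pdf, pattern="outname.%d.pdf"):
    """Run the command from the question, but only on a new enough gs."""
    version = subprocess.run(
        ["gs", "--version"], capture_output=True, text=True, check=True
    ).stdout
    if not supports_per_page_pdfwrite(version):
        raise RuntimeError("pdfwrite would write all pages into one file")
    # Same options as in the question; -o sets the output file pattern.
    subprocess.run(
        ["gs", "-sDEVICE=pdfwrite", "-dSAFER", "-o", pattern, input_pdf],
        check=True,
    )
```

Calling split_with_pattern("input.pdf") should then give outname.1.pdf, outname.2.pdf and so on; on an older Ghostscript it raises instead of silently producing a single file.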
Answer 2:
I found this script written by Mr Weimer super useful:
#!/bin/sh
#
# pdfsplit [input.pdf] [first_page] [last_page] [output.pdf]
#
# Example: pdfsplit big_file.pdf 10 20 pages_ten_to_twenty.pdf
#
# written by: Westley Weimer, Wed Mar 19 17:58:09 EDT 2008
#
# The trick: ghostscript (gs) will do PDF splitting for you, it's just not
# obvious and the required defines are not listed in the manual page.
if [ $# -lt 4 ]
then
echo "Usage: pdfsplit input.pdf first_page last_page output.pdf"
exit 1
fi
# 'yes' answers the prompts gs shows between pages; all gs output is discarded.
yes | gs -dBATCH -sOutputFile="$4" -dFirstPage="$2" -dLastPage="$3" -sDEVICE=pdfwrite "$1" > /dev/null 2>&1
Original source: http://www.cs.virginia.edu/~weimer/pdfsplit/pdfsplit
Save it as pdfsplit.sh and watch the magic happen.
PDFSAM can also do the job. It is available on Windows and Mac.
Answer 3:
#!/bin/bash
# Split a PDF into one file per page; $1 is the input filename.
# Ask Ghostscript for the number of pages.
ournum=$(gs -q -dNODISPLAY -c "($1) (r) file runpdfbegin pdfpagecount = quit" 2>/dev/null)
echo "Processing $ournum pages"
# Strip a trailing .pdf extension to build the output names.
newname=${1%.pdf}
counter=1
while [ "$counter" -le "$ournum" ] ; do
    reallynewname=$newname-$counter.pdf
    # make the individual pdf page
    yes | gs -dBATCH -sOutputFile="$reallynewname" -dFirstPage=$counter -dLastPage=$counter -sDEVICE=pdfwrite "$1" > /dev/null 2>&1
    counter=$((counter+1))
done
Answer 4:
Here is a script for Windows command prompt (working also with drag and drop) assuming you have Ghostscript installed:
@echo off
chcp 65001
setlocal enabledelayedexpansion
rem Customize or remove this line if you already have Ghostscript folders in your system PATH
set path=C:\Program Files\gs\gs9.22\lib;C:\Program Files\gs\gs9.22\bin;%path%
:start
echo Splitting "%~n1%~x1" into standalone single pages...
rem /d also switches the drive; the quotes keep folders with spaces intact
cd /d "%~dp1"
rem getting number of pages of PDF with GhostScript
for /f "usebackq delims=" %%a in (`gswin64c -q -dNODISPLAY -c "(%~n1%~x1) (r) file runpdfbegin pdfpagecount = quit"`) do set "numpages=%%a"
for /L %%n in (1,1,%numpages%) do (
    echo Extracting page %%n of %numpages%...
    rem zero-pad the page number to three digits
    set "x=00%%n"
    set "x=!x:~-3!"
    gswin64c.exe -dNumRenderingThreads=2 -dBATCH -dNOPAUSE -dQUIET -dFirstPage=%%n -dLastPage=%%n -sDEVICE=pdfwrite -sOutputFile="%~d1%~p1%~n1-!x!.pdf" "%~1"
)
shift
if not "%~1"=="" goto start
pause
Name this script something like split PDF.bat and put it on your desktop. Drag and drop one or more multipage PDFs onto it and it will create one standalone PDF file for each page, appending the suffix -001, -002 and so on to the name to distinguish the pages.
You might need to customize the set path=... line (with your Ghostscript version), or remove it if the Ghostscript folders are already in your system PATH environment variable.
It works for me under Windows 10 with Ghostscript 9.22.
Enjoy.
Answer 5:
Here's a simple Python script which does it:
#!/usr/bin/python3
import os

# Set these two values by hand for the PDF you want to split.
number_of_pages = 68
input_pdf = "abstracts_rev09.pdf"

# Write each page to page0001.pdf, page0002.pdf, ...
for i in range(1, number_of_pages + 1):
    os.system("gs -q -dBATCH -dNOPAUSE -sOutputFile=page{page:04d}.pdf"
              " -dFirstPage={page} -dLastPage={page}"
              " -sDEVICE=pdfwrite {input_pdf}"
              .format(page=i, input_pdf=input_pdf))
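The page count in the script above is hard-coded. It could instead be read from Ghostscript with the same pdfpagecount snippet that Answers 3 and 4 use. The following is a rough sketch of that combination, not a tested tool: the helper names are invented here, gs is assumed to be on the PATH, and recent Ghostscript releases that run in SAFER mode by default may refuse to open the file from PostScript unless file access is explicitly permitted (check Use.htm for your version).

```python
import subprocess


def page_count_command(input_pdf, gs="gs"):
    """Build the gs command that prints the number of pages.

    Uses the PostScript snippet from Answers 3 and 4. File names that
    contain parentheses or backslashes would need escaping (not handled).
    """
    ps = "({}) (r) file runpdfbegin pdfpagecount = quit".format(input_pdf)
    return [gs, "-q", "-dNODISPLAY", "-c", ps]


def single_page_command(input_pdf, page, gs="gs"):
    """Build the gs command that extracts one page to pageNNNN.pdf."""
    return [
        gs, "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
        "-sOutputFile=page{:04d}.pdf".format(page),
        "-dFirstPage={}".format(page),
        "-dLastPage={}".format(page),
        input_pdf,
    ]


def split_all_pages(input_pdf):
    """Query the page count, then write every page to its own file."""
    result = subprocess.run(
        page_count_command(input_pdf),
        capture_output=True, text=True, check=True,
    )
    number_of_pages = int(result.stdout.strip())
    for page in range(1, number_of_pages + 1):
        subprocess.run(single_page_command(input_pdf, page), check=True)
    return number_of_pages
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, which the os.system version would trip over.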
Source: https://stackoverflow.com/questions/10228592/splitting-a-pdf-with-ghostscript