I am working on a UNIX system and I'd like to merge thousands of PDF files into one file in order to print it. I don't know in advance how many pages each of them has.
To use the script below:
1. Install the pyPDF package.
2. Save a blank PDF page at /path/to/blank.pdf (I've created blank pdf pages here).
3. Save pdfmerge.py in any directory of your $PATH. (I'm not a Windows user. This is straightforward under Linux. Please let me know if you get errors / if it works.)
4. Make pdfmerge.py executable.
5. Run uniprint.py in a directory that contains only PDF files you want to merge.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Merge all PDFs in a directory, padding odd-length ones with a blank page."""
import os
from argparse import ArgumentParser
from glob import glob

from pyPdf import PdfFileReader, PdfFileWriter


def merge(path, blank_filename, output_filename):
    # The input files are deliberately left open: the readers still need
    # them when output.write() is called at the end.
    blank = PdfFileReader(open(blank_filename, "rb"))
    output = PdfFileWriter()

    # Do not merge the output file or the blank page into the result.
    skip = {os.path.abspath(blank_filename), os.path.abspath(output_filename)}

    # Sorted so that the merge order is predictable.
    for pdffile in sorted(glob(os.path.join(path, "*.pdf"))):
        if os.path.abspath(pdffile) in skip:
            continue
        print("Parse '%s'" % pdffile)
        document = PdfFileReader(open(pdffile, "rb"))
        num_pages = document.getNumPages()
        for i in range(num_pages):
            output.addPage(document.getPage(i))

        # Pad to an even page count so the next document starts on a new sheet.
        if num_pages % 2 == 1:
            output.addPage(blank.getPage(0))
            print("Add blank page to '%s' (had %i pages)" % (pdffile, num_pages))

    print("Start writing '%s'" % output_filename)
    with open(output_filename, "wb") as output_stream:
        output.write(output_stream)


if __name__ == "__main__":
    parser = ArgumentParser()
    # Add more options if you like
    parser.add_argument("-o", "--output", dest="output_filename", default="merged.pdf",
                        help="write merged PDF to FILE", metavar="FILE")
    parser.add_argument("-b", "--blank", dest="blank_filename", default="blank.pdf",
                        help="path to blank PDF file", metavar="FILE")
    parser.add_argument("-p", "--path", dest="path", default=".",
                        help="path of source PDF files")
    args = parser.parse_args()
    merge(args.path, args.blank_filename, args.output_filename)
Please make a comment if this works on Windows and Mac.
Please always leave a comment if it doesn't work / it could be improved.
It works on Linux. Joining 3 PDFs into a single 200-page PDF took less than a second.
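The padding rule used by the script above (add one blank page after every document with an odd page count) can be checked on its own, without any PDF library. The following is only a sketch: `padded_layout` is a hypothetical helper, not part of pyPdf or of the script, and it assumes pages are numbered from 1 so that odd page numbers are the front side of a sheet.

```python
def padded_layout(page_counts):
    """Return (start_page, blanks_added) for each document, in merge order.

    Pages are numbered from 1; odd pages are the front side of a sheet.
    """
    layout = []
    next_page = 1
    for count in page_counts:
        blanks = count % 2  # 1 for an odd page count, 0 for an even one
        layout.append((next_page, blanks))
        next_page += count + blanks
    return layout


# Three documents with 3, 2 and 1 pages.
for start, blanks in padded_layout([3, 2, 1]):
    print("starts on page %d, blank pages added: %d" % (start, blanks))
```

Every start page comes out odd, which is what makes each document begin on a fresh sheet when the merged file is printed double-sided.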
The code by @Chris Lercher in https://stackoverflow.com/a/12761103/1369181 did not quite work for me. I do not know whether that is because I am working on Cygwin/mintty. Also, I have to use qpdf instead of pdftk. Here is the code that worked for me:
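#!/bin/bash
# Pad every PDF with an odd page count so that each aligned_* copy has an even one.
for f in *.pdf; do
    # pdfinfo prints a line such as "Pages: 3"; keep only the digits.
    npages=$(pdfinfo "$f" | grep 'Pages:' | sed 's/[^0-9]*//g')
    modulo=$((npages % 2))
    if [ "$modulo" -eq 1 ]; then
        # Odd: append the blank page.
        qpdf --empty --pages "$f" "path/to/blank.pdf" -- "aligned_$f"
    else
        # Even: copy unchanged.
        cp "$f" "aligned_$f"
    fi
done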
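For anyone who would rather drive this from Python, the page-count step of the loop above (pdfinfo, grep, sed) might look roughly like the sketch below. `parse_page_count` and `needs_blank_page` are hypothetical helper names, and the sample text is made up to resemble what pdfinfo prints; real output has more fields.

```python
import re


def parse_page_count(pdfinfo_output):
    """Extract the number after 'Pages:' from pdfinfo's text output."""
    match = re.search(r"^Pages:\s*(\d+)", pdfinfo_output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Pages:' line found")
    return int(match.group(1))


def needs_blank_page(pdfinfo_output):
    """True when the page count is odd, mirroring the modulo test in the loop."""
    return parse_page_count(pdfinfo_output) % 2 == 1


# Made-up sample in the shape pdfinfo prints; not taken from a real file.
sample = (
    "Title:          example\n"
    "Pages:          3\n"
    "Page size:      595 x 842 pts (A4)\n"
)
print(parse_page_count(sample), needs_blank_page(sample))
```

With real files, the text would come from something like `subprocess.check_output(["pdfinfo", f]).decode()`, assuming pdfinfo is installed and on the PATH.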
Now all "aligned_" files have an even number of pages, and I can join them using qpdf (thanks to https://stackoverflow.com/a/51080927):
qpdf --verbose --empty --pages aligned_* -- all.pdf
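If the merge is launched from Python instead of the shell, the same command line can be assembled as an argument list. This is a sketch only: `build_merge_command` is a hypothetical helper, it uses just the qpdf options that appear in the command above (minus `--verbose`), and it sorts the inputs so that the merge order is explicit.

```python
def build_merge_command(input_files, output_file):
    """Build the argv list for: qpdf --empty --pages <inputs...> -- <output>."""
    if not input_files:
        raise ValueError("nothing to merge")
    return ["qpdf", "--empty", "--pages"] + sorted(input_files) + ["--", output_file]


# Illustrative file names; in practice they would come from glob("aligned_*").
print(build_merge_command(["aligned_b.pdf", "aligned_a.pdf"], "all.pdf"))
```

The resulting list could then be passed to `subprocess.run(..., check=True)`, assuming qpdf is installed. Passing a list avoids shell quoting problems with file names that contain spaces.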
And here is the useful code from https://unix.stackexchange.com/a/272878 that I used to create the blank page:
echo "" | ps2pdf -sPAPERSIZE=a4 - blank.pdf