How to handle multi-page images in PythonMagick?

Submitted by 烂漫一生 on 2019-12-21 04:39:34

Question


I want to convert some multi-page .tif or .pdf files to individual .png images. From the command line (using ImageMagick) I just do:

convert multi_page.pdf file_out.png

And I get all the pages as individual images (file_out-0.png, file_out-1.png, ...)

I would like to handle this file conversion within Python. Unfortunately PIL cannot read .pdf files, so I want to use PythonMagick. I tried:

import PythonMagick
im = PythonMagick.Image('multi_page.pdf')
im.write("file_out%d.png")

or just

im.write("file_out.png")

But I only get one page converted to PNG. Of course I could load each page individually and convert them one by one, but there must be a way to do them all at once?


Answer 1:


ImageMagick is not memory efficient, so reading a large PDF (100 pages or so) in one go requires a huge amount of memory and may crash or seriously slow down your system. Reading all pages at once with PythonMagick is therefore a bad idea. For PDFs I ended up converting page by page, which means getting the number of pages first; pyPdf does that reasonably fast:

import pyPdf
import PythonMagick

# Count the pages first, then convert one page at a time.
with open('multi_page.pdf', 'rb') as f:
    npage = pyPdf.PdfFileReader(f).getNumPages()
for p in range(npage):
    # 'file.pdf[N]' tells ImageMagick to load only page N (zero-based).
    im = PythonMagick.Image('multi_page.pdf[' + str(p) + ']')
    im.write('file_out-' + str(p) + '.png')
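
As a minimal sketch of the per-page approach, the input specifier and output name for each page can be generated up front. The helper name page_jobs and the default stem file_out are only illustrative and not part of PythonMagick; the one ImageMagick-specific piece is the file.pdf[N] syntax, which selects the zero-based page N.

```python
def page_jobs(pdf_path, npage, out_stem="file_out"):
    """Yield (input_spec, output_name) pairs, one per zero-based page index.

    page_jobs and out_stem are hypothetical names used for illustration.
    """
    for p in range(npage):
        # "path[p]" is ImageMagick's syntax for selecting a single page.
        yield "%s[%d]" % (pdf_path, p), "%s-%d.png" % (out_stem, p)
```

Each pair can then be passed to PythonMagick.Image(spec) and im.write(name), in the same way as the loop above.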



Answer 2:


A more complete example based on the answer by Ivo Flipse and http://p-s.co.nz/wordpress/pdf-to-png-using-pythonmagick/

This renders at a higher resolution and uses PyPDF2 instead of the older pyPdf.

import sys
import PyPDF2
import PythonMagick

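pdffilename = sys.argv[1]
# PdfFileReader/getNumPages is the PyPDF2 1.x/2.x API;
# PyPDF2 3.x replaced it with PdfReader and len(reader.pages).
with open(pdffilename, "rb") as f:
    npage = PyPDF2.PdfFileReader(f).getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
    im = PythonMagick.Image()
    # Set the density before reading so the page is rasterized at 300 dpi.
    im.density('300')
    im.read(pdffilename + '[' + str(p) + ']')
    im.write('file_out-' + str(p) + '.png')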
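
To get a feel for what the density setting means for output size, here is a rough sketch of the arithmetic. It assumes the PDF default of 72 points per inch; the function name pixels_at_density is made up for illustration, and the exact size ImageMagick produces may differ slightly because of rounding or the page's crop box.

```python
def pixels_at_density(width_pt, height_pt, dpi):
    """Approximate raster size of a page given in PDF points.

    Assumes 72 points per inch (the PDF default); hypothetical helper.
    """
    return (round(width_pt * dpi / 72.0), round(height_pt * dpi / 72.0))

# A US Letter page is 612 x 792 points.
letter_300 = pixels_at_density(612, 792, 300)
letter_72 = pixels_at_density(612, 792, 72)
```

Under these assumptions a Letter page comes out at about 2550 x 3300 pixels at 300 dpi, compared with 612 x 792 at ImageMagick's default density of 72.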



Answer 3:


I had the same problem and, as a workaround, called ImageMagick's convert directly:

import subprocess
params = ['convert', 'src.pdf', 'out.png']
subprocess.check_call(params)
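
A slightly more flexible sketch of the same workaround, in case you want to control the resolution or the output naming. The function names build_convert_cmd and convert_pages are hypothetical; -density and the %d placeholder in the output name are standard convert features, and I am assuming convert is on your PATH (newer ImageMagick 7 installs may expose it as magick instead).

```python
import subprocess

def build_convert_cmd(src, out_pattern, density=None):
    """Build the argument list for ImageMagick's convert (hypothetical helper)."""
    cmd = ["convert"]
    if density is not None:
        # -density has to come before the input file to affect PDF rasterization.
        cmd += ["-density", str(density)]
    cmd += [src, out_pattern]
    return cmd

def convert_pages(src, out_pattern, density=None):
    """Run convert; raises CalledProcessError if it exits with an error."""
    subprocess.check_call(build_convert_cmd(src, out_pattern, density))
```

For example, convert_pages('src.pdf', 'out-%d.png', density=300) would run convert -density 300 src.pdf out-%d.png, writing one numbered PNG per page.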


Source: https://stackoverflow.com/questions/10489960/how-to-handle-multi-page-images-in-pythonmagick
