Has anyone been able to use poppler new_from_data in python?

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-24 09:03:56

Question


Using Python 3 and Poppler, I can load files with new_from_file without any problem, but new_from_data is problematic. The code below is only a simple test: reading from a file and then calling new_from_data makes little sense when new_from_file works perfectly, but I could not post the full code that generates the PDF file here.

from gi.repository import Poppler, Gtk

def draw(widget, cr):
        # set background.
        cr.set_source_rgb(0.7, 0.6, 0.5)
        cr.paint()

        # set page background
        cr.set_source_rgb(1, 1, 1)
        cr.rectangle(0,0,800,400)

        cr.fill()
        page.render(cr)

filepath = "d:/Mes Documents/A5.pdf" 
f11 = open(filepath, "r", encoding = "cp850")
data1 = f11.read()
f11.close()

document = Poppler.Document.new_from_data(data1, len(data1),  None)
page = document.get_page(0)
print (document.get_n_pages())


window = Gtk.Window(title="Hello World")
window.connect("delete-event", Gtk.main_quit)
window.connect("draw", draw)
window.set_app_paintable(True)

window.show_all()
Gtk.main()

One of four things may happen:

  • With a very simple PDF (the "Hello world" example in PDF Reference 13), it works.
  • With a normal file, there may be no error, but get_n_pages returns 0 and get_page(0) returns None.
  • Or I may get an error: GLib.Error: poppler-quark: PDF document is damaged (4)
  • Or the program crashes.

I wonder whether the problem is the encoding parameter, but I tried everything I could think of without success. I also tried opening the file with "rb" and then converting the bytes array to a string with:

data1 = "".join(map(chr, data1))

That did not help either.

Searching on Google never turned up a working example.
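
On the encoding suspicion raised above, here is a minimal sketch of why opening the file with "r" and encoding="cp850" cannot return the original bytes. It uses only the standard library. The sample byte string is a made-up stand-in for real PDF content (an assumption, not taken from A5.pdf), and the final UTF-8 step assumes the binding converts a Python str to UTF-8 before passing it to the C function, which is how PyGObject normally marshals strings.

```python
import os
import tempfile

# Hypothetical stand-in for PDF content: a header, a binary marker line
# containing bytes above 0x7F, and Windows-style line endings.
sample = b"%PDF-1.4\r\n%\xe2\xe3\xcf\xd3\r\n1 0 obj\r\n"

# Write the sample to a temporary file so it can be read back both ways.
with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # Text mode, as in the question: bytes are decoded with cp850 and
    # "\r\n" is translated to "\n" by universal newline handling.
    with open(path, "r", encoding="cp850") as f:
        as_text = f.read()

    # Binary mode: the bytes come back untouched.
    with open(path, "rb") as f:
        as_bytes = f.read()
finally:
    os.remove(path)

print("original bytes: ", len(sample))
print("text characters:", len(as_text))
print("binary read identical:", as_bytes == sample)
print("text re-encoded as UTF-8 identical:", as_text.encode("utf-8") == sample)
```

If the text version is not byte-identical, the byte offsets stored in a PDF's cross-reference table no longer match the data. That would be consistent with the "PDF document is damaged" error, and with a tiny ASCII-only file being the one case that works.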


Answer 1:


I ran into the same problem and solved it using Gio.MemoryInputStream. It is not really elegant, but it works:

import gi
gi.require_version("Gtk", "3.0")
gi.require_version("Poppler", "0.18")
from gi.repository import Poppler, Gtk, Gio

def draw(widget, cr):
        # set background.
        cr.set_source_rgb(0.7, 0.6, 0.5)
        cr.paint()

        # set page background
        cr.set_source_rgb(1, 1, 1)
        cr.rectangle(0,0,800,400)

        cr.fill()
        page.render(cr)

filepath = "d:/Mes Documents/A5.pdf" 
with open(filepath, "rb") as f11:
    input_stream = Gio.MemoryInputStream.new_from_data(f11.read())
    # Remember to call .close() on the Gio.MemoryInputStream once you are done with the PDF document.

document = Poppler.Document.new_from_stream(input_stream, -1, None, None)
page = document.get_page(0)
print (document.get_n_pages())


window = Gtk.Window(title="Hello World")
window.connect("delete-event", Gtk.main_quit)
window.connect("draw", draw)
window.set_app_paintable(True)

window.show_all()
Gtk.main()


Source: https://stackoverflow.com/questions/42735374/has-anyone-been-able-to-use-poppler-new-from-data-in-python
