Why does combining PDF pages with CGContextDrawPDFPage create very large output files?

Submitted by 一世执手 on 2019-12-11 01:05:48

Question


I ran into this while putting together a simple Automator script to combine several one-page PDF files. I had 88 files to combine, each almost exactly 300KB, so I expected the final product to be about 30MB; the resulting PDF file, produced with the Combine PDFs Automator action, was 300+MB.

Poking around, the Automator action uses a Python script, with Foundation bindings, to create the new PDF document with the CoreGraphics PDF APIs. Nothing seems out of place. Basically, it's doing this (simplified, but these are the high points):

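# Requires PyObjC on macOS; outURL and inURLs are assumed to be CFURL/NSURL objects.
from Quartz import (
    CGPDFContextCreateWithURL, CGPDFContextClose, CGPDFDocumentCreateWithURL,
    CGPDFDocumentGetPage, CGPDFPageGetBoxRect, kCGPDFMediaBox,
    CGContextBeginPage, CGContextDrawPDFPage, CGContextEndPage,
)

writeContext = CGPDFContextCreateWithURL(outURL, None, None)
for url in inURLs:
    doc = CGPDFDocumentCreateWithURL(url)
    page = CGPDFDocumentGetPage(doc, 1)  # page numbers are 1-based
    mediaBox = CGPDFPageGetBoxRect(page, kCGPDFMediaBox)
    CGContextBeginPage(writeContext, mediaBox)
    CGContextDrawPDFPage(writeContext, page)  # draw the source page into the output context
    CGContextEndPage(writeContext)
CGPDFContextClose(writeContext)  # finalize and flush the output PDF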

I can't imagine that CGContextDrawPDFPage, when drawing to a PDF context, would do anything but copy the PDF data for that page (with some window-dressing).

Even when "combining" just one PDF, the output is 2.8MB, compared to the 300KB original one-page PDF.
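To put the numbers side by side, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 MB = 1000 KB) and, more importantly, that every page inflates the way the single-file test did; both are assumptions for illustration, not measurements.

```python
KB = 1000
MB = 1000 * KB

n_files = 88
input_size = 300 * KB       # each source PDF, from the question
single_output = 2800 * KB   # output when "combining" just one PDF (2.8MB)

expected_total = n_files * input_size      # if pages were copied as-is
inflation = single_output / input_size     # per-page growth factor
projected_total = n_files * single_output  # assumes every page grows alike

print(f"expected:  {expected_total / MB:.1f} MB")   # expected:  26.4 MB
print(f"inflation: {inflation:.1f}x")               # inflation: 9.3x
print(f"projected: {projected_total / MB:.1f} MB")  # projected: 246.4 MB
```

So a per-page growth of roughly 9x would already put the combined file in the same range as the 300+MB result, which suggests the growth happens per page rather than in the merge step itself.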

The resulting PDFs look exactly the same page-by-page as the original pages: text is selectable in the same places, graphics look identical, the pages are exactly the same size.

Any ideas?


Answer 1:


Do the input PDFs contain the same set of fonts, or different sets? Maybe if the originals don't contain embedded fonts, but the output does, that could account for some of the growth.
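One rough way to check the font hypothesis is to scan the raw bytes of an input file and of the output file for font-related keys and compare the counts. The sketch below is only a heuristic: `font_markers` is a hypothetical helper, and a raw byte scan cannot see objects stored inside compressed object streams, so a count of zero does not prove that no fonts are embedded.

```python
import re

# Font descriptor keys that point at an embedded font program.
_FONT_FILE_KEY = re.compile(rb"/FontFile[23]?\b")
# Font dictionaries declare /Type /Font (the whitespace is optional).
_FONT_DICT = re.compile(rb"/Type\s*/Font\b")


def font_markers(pdf_bytes):
    """Count font dictionaries and embedded-font references in raw PDF bytes.

    Heuristic only: objects inside compressed object streams are
    invisible to a raw byte scan.
    """
    return {
        "font_dicts": len(_FONT_DICT.findall(pdf_bytes)),
        "embedded_font_files": len(_FONT_FILE_KEY.findall(pdf_bytes)),
    }


# Tiny synthetic fragment: one font dictionary, one embedded font file.
sample = (
    b"1 0 obj << /Type /Font /Subtype /TrueType >> endobj "
    b"2 0 obj << /Type /FontDescriptor /FontFile2 3 0 R >> endobj"
)
print(font_markers(sample))
```

For real files, something like `font_markers(open(path, "rb").read())` could be run on one original and on the combined output (the path is a placeholder). A much higher `embedded_font_files` count in the output would be consistent with fonts being embedded once per page.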



Source: https://stackoverflow.com/questions/3099312/why-does-combining-pdf-pages-with-cgcontextdrawpdfpage-create-very-large-output
