Is there any PHP PDF library that can replace placeholder variables in an existing PDF, ODT or DOCX document, and generate a PDF file as the end result, without screwing up the layout?
Requirements:
Needs no 3rd party web service
Ability to run on shared web hosting would be ideal (no binary installations / packages required)
Mind you, a library that is able to load an existing PDF file and insert text programmatically at a specific position is not enough for my use case.
As far as my research shows, there is no library that can do this:
TCPDF can only generate documents from scratch
FPDI can read existing PDF templates, but can only add contents programmatically (no template variable replacement)
There are various DOCX/ODT template libraries out there but they don't output PDF
PHPDOCx claims to be able to do exactly what I need - but they don't offer a trial version and I'm not going to buy a cat in a bag, especially not when there seems to be no other product on the web that does this. I find it hard to believe they can do this without problems - if you have successfully done this using the product, please drop a line here.
Am I overlooking something?
Is there a way to do this using PDF forms? I am creating the source documents in OpenOffice 3.
I may be able to use standard Linux commands (pdftk is available, for example; I'm trying that out right now).
Update: *Argh!* I was called out of the office and the bounty expired in the meantime. Starting a new bounty: As far as my testing shows, no solution works for me perfectly yet.
Update II: I will be looking into the pdftk approach soon, but I am also starting another bounty for one more round of collecting additional input. This question has now seen 1300 rep points in bounties, which must be some kind of record :)
Pekka,
I looked into this previously. I think you can use pdftk (a command-line utility) to fill in a PDF form using FDF/XFDF data files, which you could easily generate from within PHP. That was the best option I've seen so far, though there may well be a native library.
pdftk is quite useful in general, worth having a look at.
Update: Have a look here: http://php.net/manual/en/book.fdf.php
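A minimal sketch of that approach, assuming the template PDF already contains form fields. The field names, file names and the helper functions buildXfdf() and pdftkFillCommand() are made up for illustration; only pdftk's fill_form, output and flatten keywords come from the tool itself.

```php
<?php
// Build an XFDF document (XML form data) from a field-name => value map.
// Names and values are XML-escaped so special characters stay intact.
function buildXfdf(array $fields): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' . "\n"
         . "  <fields>\n";
    foreach ($fields as $name => $value) {
        $xml .= sprintf(
            "    <field name=\"%s\"><value>%s</value></field>\n",
            htmlspecialchars((string) $name, ENT_XML1 | ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8')
        );
    }
    return $xml . "  </fields>\n</xfdf>\n";
}

// Build the pdftk command line that merges the data into the form.
// "flatten" turns the filled fields into regular page content.
function pdftkFillCommand(string $formPdf, string $xfdfFile, string $outPdf): string
{
    return sprintf(
        'pdftk %s fill_form %s output %s flatten',
        escapeshellarg($formPdf),
        escapeshellarg($xfdfFile),
        escapeshellarg($outPdf)
    );
}

// Hypothetical usage (needs pdftk on the PATH and a form.pdf with matching fields):
// file_put_contents('data.xfdf', buildXfdf(['customer_name' => 'Jane Doe']));
// exec(pdftkFillCommand('form.pdf', 'data.xfdf', 'filled.pdf'), $out, $exitCode);
```

This only works where the host allows exec() and has pdftk installed, so it does not satisfy the "no binaries" wish from the question.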
This is not very practical, but for completeness: if you already have an ODT template, you might very well keep it as the template. Modifying the OpenDocument content.xml and replacing the placeholders in it is pretty simple. You could then use unoconv or pyodconverter to transform the ODT into a final PDF:
unoconv -f pdf -o final.pdf template.odt
Very obviously this requires a full OpenOffice setup (UNO and Writer) on the web server, and obviously not every web host would go along with that, haha, even if it's simple on any Debian or Fedora setup. The execution speed would probably not be stellar either. But then it might be the cleanest approach, since OOo handles both formats far better than any PHP class ever could.
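The content.xml replacement step might look roughly like the sketch below. The {{name}} placeholder syntax and both function names are assumptions, and the sketch assumes each placeholder is stored as one uninterrupted string in content.xml, which the editor does not guarantee (formatting changes inside a placeholder can split it across several tags).

```php
<?php
// Replace {{name}} placeholders in OpenDocument content.xml markup.
// Values are XML-escaped so characters like & or < keep the file well-formed.
function fillOdtPlaceholders(string $contentXml, array $values): string
{
    $map = [];
    foreach ($values as $name => $value) {
        $map['{{' . $name . '}}'] = htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }
    return strtr($contentXml, $map);
}

// Copy the template and rewrite content.xml inside the copy.
// An ODT file is a ZIP archive, so this needs PHP's zip extension.
function renderOdt(string $templateOdt, string $outputOdt, array $values): void
{
    if (!copy($templateOdt, $outputOdt)) {
        throw new RuntimeException("Cannot copy $templateOdt to $outputOdt");
    }
    $zip = new ZipArchive();
    if ($zip->open($outputOdt) !== true) {
        throw new RuntimeException("Cannot open $outputOdt as a ZIP archive");
    }
    $xml = $zip->getFromName('content.xml');
    if ($xml === false) {
        $zip->close();
        throw new RuntimeException('content.xml not found in the template');
    }
    $zip->addFromString('content.xml', fillOdtPlaceholders($xml, $values));
    $zip->close();
}
```

The filled copy can then be handed to the unoconv command shown above in place of template.odt.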
You didn't make free a requirement, so I'm going to suggest LiveDocX. They have a free option, but it does not meet your no-3rd-party-service requirement, which means you would have to buy a server license.
There is also a ZF package for this:
Have you considered using something like XSL Formatting Objects (XSL-FO)? Basically they're XML documents that are processed and turned into PDFs. Doing string (or better, DOM) replacements within them should be pretty simple. XSL-FO supports embedding images, links, annotations, etc.
It's not PHP, but there are a number of PHP wrappers for it, along with ways of using it via exec, etc. Not ideal, but it takes care of the template portion completely. For some more info: http://techportal.inviqa.com/2009/12/16/transforming-xml-with-php-and-xsl/
There's an implementation available as an Apache project - http://xmlgraphics.apache.org/fop/
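As a rough illustration of the DOM replacement idea, the sketch below fills {{name}} placeholders in the text nodes of an XSL-FO template. The placeholder syntax, the template and the function name fillFoTemplate() are assumptions made for this example.

```php
<?php
// A tiny XSL-FO template with one placeholder (hypothetical content).
$template = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="A4" page-height="29.7cm" page-width="21cm" margin="2cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="A4">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="12pt">Dear {{name}},</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
XML;

// Replace placeholders in every text node; the DOM escapes values on output.
function fillFoTemplate(string $foXml, array $values): string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($foXml)) {
        throw new InvalidArgumentException('Template is not well-formed XML');
    }
    $map = [];
    foreach ($values as $name => $value) {
        $map['{{' . $name . '}}'] = (string) $value;
    }
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//text()') as $textNode) {
        $textNode->data = strtr($textNode->data, $map);
    }
    return $doc->saveXML();
}

// Hypothetical usage: write the result to disk and run Apache FOP on it.
// file_put_contents('letter.fo', fillFoTemplate($template, ['name' => 'Jane']));
// exec('fop -fo letter.fo -pdf letter.pdf', $out, $exitCode);
```

Working on text nodes rather than on the raw string means a value such as "Tom & Co" cannot break the XML. FOP itself is a Java program, so this still needs a host where it can be run.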
This won't meet your "no binary" mandate, but PDFlib has a nice templating system using "blocks", which are PDFlib-specific metadata you can "draw" into a PDF using an Acrobat plugin, then programmatically fill in at runtime with a single function call per block. The blocks can be text (with pseudo-HTML for simple text formatting of font face/size/color), images (most anything that can be displayed in a PDF), and PDFs themselves, so you can embed entire other documents, or snippets of documents, in a block.
It can do this from a pre-existing PDF (load it, insert content, fill blocks, etc., then output), or build a new PDF from the ground up (seriously painful to do programmatically, but possible).
I've used the blocks successfully to generate a simple templating system for a client, where they could draw blocks into a document with small bits of metadata embedded, which was then parsed server-side to generate a form to prompt for data to insert into those blocks at document generation time.
There is fpdf, and there is another extension on top of it, whose name I can't remember, which allows you to import templates.
Your best bet would be to generate the entire document on the fly, with the template defined programmatically using fpdf or something similar. That way, your text will not be cut off by paragraphs or anything like that, and you can easily position images and other elements as required.
PDFlib is nice, and your shared host might have it already. I was able to accomplish things like this http://www.housejockey.com/flyers/12/80/MyListing.pdf with PDFlib fairly easily, but the basic technique is the drop-in paragraph text...that you said was no-good.
I also worked with abcPDF for ASP/ASP.NET and found it to be pretty good for poking inside PDFs but, again, not designed for digging around inside existing text areas.
Working with robust templates and placeholders is a leap of complexity above drop-in content. Reflowing existing text in a PDF is generally not possible. It requires additional document structure, which is what expensive variable data software packages like XMPie and Creo are designed to do...
PHPDOCX is starting to sound really good! Good luck.
Late, but you can use the open-source template designer https://github.com/applicius/dhek/releases to define placeholders/areas over any existing PDF, then load the result in PHP (it's in JSON format) and write onto the original PDF accordingly using the fpdf lib, to generate a custom PDF with dynamic data written on it.
Although it's not exactly what you asked for, you may consider doing it in two steps: use some PHP templating system (Smarty, Dwoo) to generate an HTML page, then use a tool like Html2Pdf to convert it to PDF. I am using this, and the results are good (no problems with page layout etc.).
Of course it depends on your input documents (can you use HTML instead of PDF/ODT as the source?) and on the complexity of their layout.
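A bare-bones sketch of the two-step idea. To keep it short, plain strtr() stands in for a templating system such as Smarty or Dwoo, and Dompdf stands in for the HTML-to-PDF converter; both substitutions, the {{name}} syntax and the function names are assumptions, not what this answer actually used.

```php
<?php
// Step 1: fill {{name}} placeholders in an HTML template.
// Values are HTML-escaped so user data cannot inject markup.
function renderHtmlTemplate(string $template, array $values): string
{
    $map = [];
    foreach ($values as $name => $value) {
        $map['{{' . $name . '}}'] = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    }
    return strtr($template, $map);
}

// Step 2: convert the HTML to PDF bytes.
// Assumes the dompdf/dompdf Composer package is installed and autoloaded.
function htmlToPdf(string $html): string
{
    $dompdf = new \Dompdf\Dompdf();
    $dompdf->loadHtml($html);
    $dompdf->setPaper('A4', 'portrait');
    $dompdf->render();
    return $dompdf->output();
}

// Hypothetical usage:
// $html = renderHtmlTemplate('<h1>Invoice for {{customer}}</h1>', ['customer' => 'Jane Doe']);
// file_put_contents('invoice.pdf', htmlToPdf($html));
```

This route runs in pure PHP, but it only helps if the source document can be maintained as HTML rather than as PDF, ODT or DOCX.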
OK, I'll try to help you solve the problem a little.
First, the answers to a couple of your questions.
Q - Am I overlooking something?
A - No. There is a PHP PDF library that can replace placeholder variables in an existing PDF and generate a PDF file as the end result, without screwing up the layout
Q - Is there a way to do this using PDF forms?
A - Yes, absolutely. The trick is to use PDF forms.
For both answers you can use Justin Koivisto's PHP library for filling PDF form fields. For more detail, please go to http://koivi.com/fill-pdf-form-fields/tutorial.php.
Credit to Justin Koivisto for his work
P.S.
As a workaround for displaying table-like output from a PDF form, consider reading the Oracle Business Intelligence Publisher User's Guide, section "Creating a PDF Template".
I was able to replace placeholder values in a .docx via a library on github called phpdocx (https://github.com/djpate/phpdocx) (not the one referenced elsewhere)
I then successfully put a portable version of LibreOffice on my host's web server, which I call from PHP to do a command-line conversion from .docx, etc. to PDF on the fly. I do not have admin rights on my host's web server. Here is my blog post of what I did:
This partially answers your question. I hope it is helpful.
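Calling LibreOffice from PHP for the conversion could be sketched as follows. The path to the soffice binary and both function names are assumptions; the --headless, --convert-to and --outdir options are LibreOffice's own.

```php
<?php
// Build the LibreOffice command line for a headless conversion to PDF.
function libreOfficeConvertCommand(string $sofficeBinary, string $inputFile, string $outputDir): string
{
    return sprintf(
        '%s --headless --convert-to pdf --outdir %s %s',
        escapeshellarg($sofficeBinary),
        escapeshellarg($outputDir),
        escapeshellarg($inputFile)
    );
}

// Run the conversion and return the path of the generated PDF.
// LibreOffice names the output after the input file, with a .pdf extension.
function convertToPdf(string $sofficeBinary, string $inputFile, string $outputDir): string
{
    $command = libreOfficeConvertCommand($sofficeBinary, $inputFile, $outputDir);
    exec($command . ' 2>&1', $output, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("Conversion failed:\n" . implode("\n", $output));
    }
    return rtrim($outputDir, '/') . '/' . pathinfo($inputFile, PATHINFO_FILENAME) . '.pdf';
}

// Hypothetical usage, with a portable LibreOffice unpacked in the home directory:
// $pdf = convertToPdf('/home/user/libreoffice/program/soffice', 'filled.docx', '/tmp');
```

As with the other command-line approaches, this depends on the host permitting exec().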
I'll add this new answer since the FDF PHP extension is now dead.
I've just followed these instructions and ended up executing one Perl script and then the pdftk command.
I'm well aware it's far from being a real PHP solution, but it's reliable and fairly easy to implement on any *nix platform.
The tools described there are also available on Debian, just in case you were wondering.
It's a little bit late, but have a look at the PDFTemplate library; it does exactly what you want. You can create OpenDocument files (ODT) and add placeholders to them. The PDFTemplate library can fill in these placeholders (even with images) and create a PDF file.
If you're willing to use an external service, PDF Otter could solve your problem. There's a free plan that you can sign up for and immediately integrate in your application.
Additional pros with using PDF Otter include fewer dependencies in your application and a friendlier experience selecting the fields you need to fill in (I've used manual trial-and-error to get the positions of fields in the past, and it was time consuming).
Dompdf and mPDF are two libraries you can use to achieve this.
I would advise you to take a look at the Aspose.Words Cloud SDK. With this library you can generate PDFs based on DOCX and ODT, as well as convert, create and modify your files.
It doesn't have any dependencies on MS Office or OpenOffice, so you don't need them to be installed. Aspose has a 30-day trial period.
Note: I am working as Developer Evangelist at Aspose.
Source: https://stackoverflow.com/questions/4416667/php-pdf-template-library-with-pdf-output