Manipulating Microsoft Word (Office 2007) .docx documents from PHP

Submitted by China☆狼群 on 2019-12-08 07:02:07

Question


I need a way to manipulate .docx (Microsoft Office 2007) documents from within PHP.

I need to:

  1. Read the internal text
  2. Convert the document to .html
  3. View it inside a browser
  4. Replace text

I know I can use Word Automation by creating a COM object of Microsoft Word, but it's too slow and unstable, and it requires Word to be installed on the server.

Is there any library or code that can do it from PHP?


Answer 1:


There is PHPWord for that, by the authors of PHPExcel.




Answer 2:


A .docx file is just a ZIP archive containing multiple XML files and embedded media such as images. Because of this, you can read and edit the document fairly easily: unzip it, open word/document.xml, read or modify it, and repack the files.
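
As a rough illustration of the reading half, here is a minimal sketch that pulls the text out of word/document.xml using PHP's ZipArchive and DOM extensions. The function name docxToText() is made up for this example and is not part of any library. It assumes the main part lives at word/document.xml, and it ignores headers, footers, footnotes and tables' layout; paragraphs nested inside text boxes may show up twice.

```php
<?php
// Hypothetical helper: return the text of a .docx, one line per paragraph.
function docxToText(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }
    // Read the main document part straight from the archive (no temp files).
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found');
    }

    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);
    // WordprocessingML namespace used by paragraphs (w:p) and text nodes (w:t).
    $xpath->registerNamespace('w', 'http://schemas.openxmlformats.org/wordprocessingml/2006/main');

    $lines = [];
    foreach ($xpath->query('//w:p') as $paragraph) {
        $text = '';
        // A paragraph's text is spread over one or more w:t elements.
        foreach ($xpath->query('.//w:t', $paragraph) as $t) {
            $text .= $t->textContent;
        }
        $lines[] = $text;
    }
    return implode("\n", $lines);
}
```

Calling `echo docxToText('report.docx');` (the file name is a placeholder) would print the paragraphs separated by newlines.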

Converting to HTML may be difficult. However, if the document was saved with a preview, you'll find a thumbnail of the first page in docProps/thumbnail.jpeg.
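
To give a feel for why full conversion is hard, here is a deliberately naive sketch that turns only top-level paragraphs into `<p>` elements and maps bold and italic runs to `<strong>` and `<em>`. The names docxToHtml(), runHasToggle() and W_NS are invented for this example. Everything else (tables, lists, images, styles, hyperlinks, text inside fields) is simply dropped, so treat it as a starting point rather than a converter.

```php
<?php
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// True when the run carries the given toggle property (e.g. w:b) and it is not switched off.
function runHasToggle(DOMXPath $xpath, DOMNode $run, string $name): bool
{
    $node = $xpath->query("w:rPr/w:$name", $run)->item(0);
    if ($node === null) {
        return false;
    }
    // <w:b/> means "on"; <w:b w:val="0"/> (or "false"/"off") means "off".
    return !in_array($node->getAttributeNS(W_NS, 'val'), ['0', 'false', 'off'], true);
}

// Hypothetical helper: very rough .docx to HTML fragment conversion.
function docxToHtml(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found');
    }

    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('w', W_NS);

    $html = '';
    // Only paragraphs directly under the body; table cells etc. are skipped.
    foreach ($xpath->query('//w:body/w:p') as $p) {
        $html .= '<p>';
        foreach ($xpath->query('w:r', $p) as $run) {
            $text = '';
            foreach ($xpath->query('w:t', $run) as $t) {
                $text .= $t->textContent;
            }
            // Escape the text before wrapping it in markup.
            $piece = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
            if (runHasToggle($xpath, $run, 'b')) {
                $piece = "<strong>$piece</strong>";
            }
            if (runHasToggle($xpath, $run, 'i')) {
                $piece = "<em>$piece</em>";
            }
            $html .= $piece;
        }
        $html .= "</p>\n";
    }
    return $html;
}
```

The returned fragment can be echoed inside an HTML page to view the text in a browser. For anything beyond plain paragraphs, a dedicated library is likely the better route.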

Note that you'll have to familiarize yourself with the XML structure to do any complex edits. There's also a summary file, docProps/app.xml, which holds some metadata for the document, so don't forget to update it. Read more on Wikipedia: http://en.wikipedia.org/wiki/Office_Open_XML




Answer 3:


You may have a look at PHPDocX; I believe it does everything you are asking for.

  1. You may replace variables in a template or just plain text from a preexisting Word document.
  2. It offers quite a few conversion options.
  3. You can also extract the text.



Answer 4:


You can work with the internal format directly.

DOCX is just a zip file, and inside that there's word/document.xml containing the actual document.

It's quite simple to unzip the file, read document.xml, str_replace() what you're looking for, save it, and re-zip the directory. This makes for a lightweight, quick and easy mail-merge capability for Word documents. It also works for other Office formats.

Here are the official docs on the internal structure for more information.
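
A minimal sketch of that replace-and-repack step, working on a copy of the file and letting ZipArchive rewrite the entry instead of unzipping to a directory. The function name docxReplaceText() is invented for this example. One caveat worth knowing: Word often splits visible text across several runs, so a placeholder that looks contiguous in the document may not be contiguous in the XML, in which case a plain str_replace() will not find it.

```php
<?php
// Hypothetical helper: copy $source to $target and replace $search with
// $replace in word/document.xml. Returns the number of replacements made.
function docxReplaceText(string $source, string $target, string $search, string $replace): int
{
    if (!copy($source, $target)) {
        throw new RuntimeException("Cannot copy $source to $target");
    }
    $zip = new ZipArchive();
    if ($zip->open($target) !== true) {
        throw new RuntimeException("Cannot open $target as a ZIP archive");
    }
    $xml = $zip->getFromName('word/document.xml');
    if ($xml === false) {
        $zip->close();
        throw new RuntimeException('word/document.xml not found');
    }

    // The XML stores &, < and > as entities, so escape both strings the same
    // way; this also keeps the document well-formed after the replacement.
    $searchXml  = htmlspecialchars($search, ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
    $replaceXml = htmlspecialchars($replace, ENT_XML1 | ENT_NOQUOTES, 'UTF-8');

    $count = 0;
    $xml = str_replace($searchXml, $replaceXml, $xml, $count);

    // Adding an entry under an existing name replaces that entry.
    $zip->addFromString('word/document.xml', $xml);
    if (!$zip->close()) {
        throw new RuntimeException("Cannot write $target");
    }
    return $count;
}
```

For example, `docxReplaceText('template.docx', 'letter.docx', '{{name}}', 'Alice')` would write a filled-in copy; the file names and the `{{name}}` placeholder syntax are assumptions for illustration only.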




Answer 5:


There is also a PHP class for merging new content into an existing .docx file. It is available here: http://www.tinybutstrong.com/ . The documentation is pretty good and includes many examples, and it is all free and open source. It does require familiarity with the .docx concepts, though.



Source: https://stackoverflow.com/questions/3307163/manipulating-microsoft-word-office-2007-docx-document-from-php
