Question
I have been trying to write a simple Markdown -> docx parser/writer, but I am completely stuck on the last part, which should be the easiest: compressing the folder into a .docx that Word, or any other .docx reader, will recognize.
My parser/writer is really irrelevant: I get the same problem if I simply unzip any old Word-produced *.docx and then try to recompress it with the usual compression utilities, giving it the .docx extension. Is there some mysterious header I should be adding, or do I need a special OPC compression utility, or what?
I don't so much want a tool that will do this, as to figure out what is supposed to be there. It seems to be independent of the WordprocessingML specification.
Needless to say, I don't know anything about compression. Everything I can find via Google has to do with fancy utilities you can use in business, but I'm making a little executable that would be GPL'd or something, and should work on anything.
Answer 1:
The most common problem with manually zipping Open XML documents is that it will not work if you zip the directory instead of its contents. In other words, the [Content_Types].xml file and the word, docProps, and _rels directories need to reside at the root level of the zip file.
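For a program that writes the archive itself, the same rule means each entry name must be relative to the unzipped folder, not to its parent. The following is a minimal Python sketch under that assumption; `zip_package`, `src_dir`, and `out_path` are hypothetical names chosen for illustration, and it uses only the standard `zipfile` module.

```python
import os
import zipfile


def zip_package(src_dir: str, out_path: str) -> None:
    """Zip the *contents* of src_dir so every part sits at the archive root.

    Hypothetical helper: out_path should lie outside src_dir, otherwise the
    archive being written would be picked up by the directory walk.
    """
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for folder, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                # Entry name relative to src_dir, with forward slashes:
                # "word/document.xml", not "unzipped/word/document.xml".
                arcname = os.path.relpath(full_path, src_dir).replace(os.sep, "/")
                zf.write(full_path, arcname)
```

Calling `zip_package("unzipped", "rezipped.docx")` on a folder extracted from a working document should give an archive whose entries start with `[Content_Types].xml`, `_rels/`, `word/` and so on, with no leading `unzipped/` component.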
Answer 2:
Here are steps to unzip my.docx and re-zip:
% mkdir unzipped
% cd unzipped/
% unzip ../my.docx
% zip -r ../rezipped.docx *
% open ../rezipped.docx
Answer 3:
Further to what Mica said, the contents of the ZIP file are organised according to the Open Packaging Conventions (OPC); cf. Microsoft's "Essentials of the Open Packaging Conventions".
You can use the .NET System.IO.Packaging namespace to create and manipulate .docx files; it is also implemented in the Mono project.
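Without a packaging library, a quick sanity check on a re-zipped file is to confirm that the parts a Word-produced package normally keeps at the root are really there. This is a rough Python sketch, not a full OPC validator; `missing_root_parts` is a hypothetical name, and the list of required entries is an assumption based on what Word-generated files typically contain.

```python
import zipfile

# Assumed minimum for a Word-produced package: the content-types part and
# the package-level relationships part, both addressed from the archive root.
REQUIRED_PARTS = ("[Content_Types].xml", "_rels/.rels")


def missing_root_parts(path: str) -> list:
    """Return the expected root-level parts that the archive does not contain."""
    with zipfile.ZipFile(path) as zf:
        names = set(zf.namelist())
    return [part for part in REQUIRED_PARTS if part not in names]
```

An archive made by zipping the parent folder would report both parts as missing, because its entries are named like `unzipped/[Content_Types].xml`.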
Answer 4:
The container is an ordinary ZIP archive, normally compressed with the Deflate method; Base64 is a text encoding and plays no part in it.
7-Zip can create ZIP archives of this kind, though I have not tested it.
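To see which compression method an existing .docx actually uses, rather than guessing, the archive's own directory can be inspected. The sketch below is one way to do that in Python with the standard `zipfile` module; `compression_methods` and `METHOD_NAMES` are hypothetical names introduced for this example.

```python
import zipfile

# Human-readable labels for the two methods most likely to appear.
METHOD_NAMES = {
    zipfile.ZIP_STORED: "stored",
    zipfile.ZIP_DEFLATED: "deflate",
}


def compression_methods(path: str) -> dict:
    """Map each entry name in the archive to the name of its compression method."""
    with zipfile.ZipFile(path) as zf:
        return {
            info.filename: METHOD_NAMES.get(
                info.compress_type, "other ({})".format(info.compress_type)
            )
            for info in zf.infolist()
        }
```

Running this on a Word-produced file and on a re-zipped copy that fails to open makes it easy to compare the two and spot a utility that picked an unexpected method.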
Source: https://stackoverflow.com/questions/1514052/how-to-zip-a-wordprocessingml-folder-into-readable-docx