What is the structure of a .docx and .doc file?

Submitted by 邮差的信 on 2021-02-07 21:11:45

Question


I have learned that .docx files are basically binary files, but I don't know the structure underneath.

What is the essential structure of a .docx file? Like, how long is the header? From what point does the actual document content start? Does it have any signature at the end?

Basically, what's the anatomy of a .docx file?


Answer 1:


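A .docx file is basically a ZIP archive with a lot of XML files in it, so there is no fixed-length header followed by the content: the main document text lives in one of the XML parts (word/document.xml) inside the archive. The format, Office Open XML, is an open standard (ECMA-376, ISO/IEC 29500), and the documentation is available online. The Wikipedia article has a general description and the links you will need.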
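Since a .docx is a ZIP container, the "header" and "end signature" asked about in the question are the ordinary ZIP ones. The following is a minimal sketch, not a validator. It assumes Python's standard zipfile module, and the helper names (build_minimal_docx, list_docx_parts, find_end_signature) are made up for this illustration. The tiny archive it builds is only a stand-in with the usual part names; Word may not accept it as a complete document.

```python
import io
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_minimal_docx():
    """Build a tiny stand-in .docx in memory (hypothetical helper for this demo)."""
    parts = {
        "[Content_Types].xml": (
            '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
            '<Default Extension="rels" '
            'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
            '<Default Extension="xml" ContentType="application/xml"/>'
            "</Types>"
        ),
        "_rels/.rels": (
            '<Relationships '
            'xmlns="http://schemas.openxmlformats.org/package/2006/relationships"/>'
        ),
        "word/document.xml": (
            '<w:document xmlns:w="%s"><w:body>'
            "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
            "</w:body></w:document>" % W_NS
        ),
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in parts.items():
            zf.writestr(name, xml)
    return buf.getvalue()


def list_docx_parts(data):
    """Return the names of the files stored inside a .docx given as bytes."""
    # A ZIP archive with at least one entry starts with a local file header.
    if not data.startswith(b"PK\x03\x04"):
        raise ValueError("not a ZIP archive, so not a .docx")
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def find_end_signature(data):
    """Return the offset of the ZIP end-of-central-directory signature, or -1."""
    return data.rfind(b"PK\x05\x06")


if __name__ == "__main__":
    docx_bytes = build_minimal_docx()
    for name in list_docx_parts(docx_bytes):
        print(name)
    print("end signature at byte offset", find_end_signature(docx_bytes))
```

For a real file, the bytes would come from something like open("example.docx", "rb").read(), where example.docx is a placeholder name.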




Answer 2:


Your question is: "What's the Anatomy of a DocX File?"

Please see the article "Anatomy of OOXML" on officeopenxml.com for an example DocX directory structure:

http://officeopenxml.com/anatomyofOOXML.php

For an example DocX XML document:

http://officeopenxml.com/WPsampleDoc.php

However, after following these meticulously, and guessing where the details got foggy, I was unable to produce a working .docx file.

I chose this shortcut: create a document in LibreOffice (which can save as .docx), shape it into a generic template in the format of the .docx files you expect to be generating, and save it as .docx. Then make a copy of that file and rename the copy with a .zip extension.

Open this .zip archive. I found that what you see inside explains the spec much better than the links above.
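The same inspection can be done without renaming anything, because a ZIP reader does not care about the file extension. Below is a rough sketch that reads word/document.xml from the archive and joins the text runs of each paragraph. It assumes Python's standard zipfile and xml.etree.ElementTree modules. The function names and the sample document are invented for this example, and a real document has far more markup (styles, tables, fields) that this ignores.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, used by the w: prefix in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical sample body: two paragraphs, the first split into two runs.
SAMPLE_DOCUMENT_XML = (
    '<w:document xmlns:w="%s"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r>"
    '<w:r><w:t xml:space="preserve"> world</w:t></w:r></w:p>'
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>" % W_NS
)


def make_sample_docx():
    """Pack the sample XML into an in-memory archive (stand-in for a real file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", SAMPLE_DOCUMENT_XML)
    return buf.getvalue()


def paragraphs_from_docx(data):
    """Return the plain text of each w:p paragraph in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    ns = {"w": W_NS}
    paragraphs = []
    for p in root.iterfind(".//w:p", ns):
        # A paragraph's text is spread over w:t elements inside its runs.
        paragraphs.append("".join(t.text or "" for t in p.iterfind(".//w:t", ns)))
    return paragraphs


if __name__ == "__main__":
    for line in paragraphs_from_docx(make_sample_docx()):
        print(line)
```

To try it on the template saved from LibreOffice, pass the file's bytes instead of make_sample_docx(), for instance open("template.docx", "rb").read() with template.docx standing in for your own file name.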



Source: https://stackoverflow.com/questions/40037905/what-is-the-structure-of-a-docx-and-doc-file
