Question
I am struggling to find the cause of a corrupt docx file.
It seems that there are millions of tools out there for repairing corrupted files. I've tried five that all repaired the file beautifully, but none of them gave any indication of where the error originated.
Does anybody know of one that does?
Open source would be a bonus.
Thanks.
UPDATE:
I tried using the Open XML SDK 2.0 Productivity Tool as recommended by frankpl. It looked promising, but it refused to open my corrupt file either standalone or to compare with another.
I found a difference between the [Content_Types].xml parts of the two files, but on closer inspection it's just the order that's different. I presume this wouldn't account for the corruption?
In the valid (repaired by Word) file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
<Default Extension="xml" ContentType="application/xml"/>
<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
<Override PartName="/word/numbering.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.numbering+xml"/>
<Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml"/>
<Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml"/>
<Override PartName="/word/stylesWithEffects.xml" ContentType="application/vnd.ms-word.stylesWithEffects+xml"/>
<Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml"/>
<Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml"/>
<Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml"/>
<Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml"/>
<Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
</Types>
And in the corrupt file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
<Default Extension="xml" ContentType="application/xml"/>
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
<Override PartName="/word/numbering.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.numbering+xml"/>
<Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml"/>
<Override PartName="/word/stylesWithEffects.xml" ContentType="application/vnd.ms-word.stylesWithEffects+xml"/>
<Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml"/>
<Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml"/>
<Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml"/>
<Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml"/>
<Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
<Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml"/>
</Types>
Answer 1:
Here's a list of DOCX repair tools that are free:
http://www.docxrepairtoolbox.com/
http://sourceforge.net/projects/damageddocx2txt/
http://sourceforge.net/projects/quickwordrecovr/
http://download.cnet.com/SysInfoTools-Docx-Repair/3000-2248_4-75330500.html
Answer 2:
Not a docx repair tool, but the Open XML SDK 2.0 for Microsoft Office contains a tool named "Open XML SDK 2.0 Productivity Tool for Microsoft Office" that you can use to compare two docx files (like the corrupt and the working one).
Answer 3:
Old question, I know, but I'm adding this for anyone with similar problems.
The [Content_Types].xml files shown above won't be the source of the issue. Order doesn't matter; the differences are just what Word does on repair (it renumbers IDs and reorders entries).
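To confirm that two [Content_Types].xml parts differ only in order, one option is to compare their entries as sets. The following is a minimal sketch using only the Python standard library; the function names `content_type_entries` and `diff_content_types` are made up for illustration, and it assumes both parts are well-formed XML.

```python
import xml.etree.ElementTree as ET

# Namespace used by [Content_Types].xml, as seen in the listings in the question.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def content_type_entries(xml_data):
    """Return the Default/Override entries as a set, so element order is ignored."""
    root = ET.fromstring(xml_data)
    entries = set()
    for child in root:
        if child.tag == CT_NS + "Default":
            entries.add(("Default", child.get("Extension").lower(), child.get("ContentType")))
        elif child.tag == CT_NS + "Override":
            entries.add(("Override", child.get("PartName"), child.get("ContentType")))
    return entries


def diff_content_types(xml_a, xml_b):
    """Return (entries only in a, entries only in b); both lists are empty if only the order differs."""
    a = content_type_entries(xml_a)
    b = content_type_entries(xml_b)
    return sorted(a - b), sorted(b - a)
```

The XML can be read straight out of each archive with `zipfile.ZipFile(path).read("[Content_Types].xml")`. If both returned lists are empty, the two parts declare the same content types and the cause of the corruption is probably elsewhere.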
Something that can cause corruption is simply having extra files in the zip that don't belong there.
Most of the time, when Word throws its hands up in the air and doesn't give you a hint, it's the structural metadata that has gone wrong.
By that I mean not an invalid pointer to a relationship ID in document.xml (for example), but an invalid relationship file itself. For example, pointing to a content type in document.xml.rels that isn't in [Content_Types].xml.
However, when Word repairs a file it renumbers all its IDs (and reorders entries), so compare tools are difficult to use.
Check that the list of files is the same, concentrate on things such as [Content_Types].xml and document.xml.rels (and the other .rels files), and good luck!
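The structural checks described in this answer can be roughly automated. The sketch below is an illustration only, not a full validator: `check_package` is a hypothetical helper that uses the Python standard library to flag parts with no declared content type, Override entries for parts that are missing from the zip, and internal relationship targets that don't exist. It assumes part names match case-sensitively and contain no URL-encoded characters or fragments, which is a simplification of the real packaging rules.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def check_package(path):
    """Return a list of human-readable problems found in the package structure."""
    problems = []
    with zipfile.ZipFile(path) as zf:
        # Ignore directory entries; keep only real files.
        names = [n for n in zf.namelist() if not n.endswith("/")]
        present = {"/" + n for n in names}

        types = ET.fromstring(zf.read("[Content_Types].xml"))
        defaults = {e.get("Extension").lower() for e in types.iter(CT_NS + "Default")}
        overrides = {e.get("PartName") for e in types.iter(CT_NS + "Override")}

        # Every part needs a content type, via an Override or a Default extension.
        for name in names:
            if name == "[Content_Types].xml":
                continue
            basename = posixpath.basename(name)
            ext = basename.rsplit(".", 1)[1].lower() if "." in basename else ""
            if "/" + name not in overrides and ext not in defaults:
                problems.append(f"no content type declared for /{name}")

        # An Override should point at a part that is actually in the zip.
        for part in sorted(overrides - present):
            problems.append(f"Override declared for missing part {part}")

        # Internal relationship targets should resolve to existing parts.
        for name in names:
            if not name.endswith(".rels"):
                continue
            # X/_rels/Y.rels describes part X/Y, so targets resolve against X/.
            base = "/" + posixpath.dirname(posixpath.dirname(name))
            rels = ET.fromstring(zf.read(name))
            for rel in rels.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    continue
                target = rel.get("Target")
                resolved = posixpath.normpath(posixpath.join(base, target))
                if resolved not in present:
                    problems.append(f"{name}: {rel.get('Id')} targets missing part {resolved}")
    return problems
```

Running `check_package("corrupt.docx")` (the file name is a placeholder) and printing each entry gives a starting list of things to inspect by hand. An empty list only means these particular checks passed; it does not prove the file is valid.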
Answer 4:
Many years late, but you can create your own error checker using DocumentFormat.OpenXml.Validation: https://msdn.microsoft.com/en-us/library/office/bb497334.aspx
Source: https://stackoverflow.com/questions/18215615/are-there-any-docx-repair-tools-that-give-a-meaningful-error-message