Are there any docx repair tools that give a meaningful error message?

Submitted by 风流意气都作罢 on 2019-12-11 03:42:58

Question


I am struggling to find the cause of a corrupt docx file.

It seems that there are millions of tools out there for repairing corrupted files - I've tried 5 that all repaired beautifully, but none of them gave any indication of where the error originated.

Does anybody know of one that does?

Open source would be a bonus.

Thanks.

UPDATE:

I tried using the Open XML SDK 2.0 Productivity Tool as recommended by frankpl. It looked promising, but it refused to open my corrupt file either standalone or to compare with another.

I found a difference between the [Content_Types].xml parts of the two files, but on closer inspection it's just the order that's different - I presume this wouldn't account for corruption?

In the valid (repaired by Word) file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
    <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
    <Default Extension="xml" ContentType="application/xml"/>
    <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
    <Override PartName="/word/numbering.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.numbering+xml"/>
    <Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml"/>
    <Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml"/>
    <Override PartName="/word/stylesWithEffects.xml" ContentType="application/vnd.ms-word.stylesWithEffects+xml"/>
    <Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml"/>
    <Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml"/>
    <Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml"/>
    <Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml"/>
    <Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
</Types>

And in the corrupt file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
    <Default Extension="xml" ContentType="application/xml"/>
    <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
    <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
    <Override PartName="/word/numbering.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.numbering+xml"/>
    <Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml"/>
    <Override PartName="/word/stylesWithEffects.xml" ContentType="application/vnd.ms-word.stylesWithEffects+xml"/>
    <Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml"/>
    <Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml"/>
    <Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml"/>
    <Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml"/>
    <Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
    <Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml"/>
</Types>
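
One way to confirm that two [Content_Types].xml parts differ only in order is to compare them as sets of entries. The following is a minimal Python sketch, not part of the original post; the function names `content_type_entries` and `diff_content_types` are made up for illustration, and it assumes both parts are well-formed XML.

```python
import xml.etree.ElementTree as ET

# Namespace used by [Content_Types].xml, as shown in the listings above.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def content_type_entries(xml_data):
    """Return the Default/Override entries as a set, so order is ignored."""
    root = ET.fromstring(xml_data)
    entries = set()
    for child in root:
        if child.tag == CT_NS + "Default":
            # Extensions are compared case-insensitively.
            key = child.get("Extension", "").lower()
            entries.add(("Default", key, child.get("ContentType")))
        elif child.tag == CT_NS + "Override":
            entries.add(("Override", child.get("PartName"), child.get("ContentType")))
    return entries


def diff_content_types(xml_a, xml_b):
    """Return (entries only in a, entries only in b), each sorted."""
    a = content_type_entries(xml_a)
    b = content_type_entries(xml_b)
    return sorted(a - b), sorted(b - a)
```

The raw bytes of the part can be read with `zipfile.ZipFile(path).read("[Content_Types].xml")`. If both returned lists are empty, the two parts declare the same content types and differ only in ordering.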

Answer 1:


Here's a list of DOCX repair tools that are free:

http://www.docxrepairtoolbox.com/

http://sourceforge.net/projects/damageddocx2txt/

http://sourceforge.net/projects/quickwordrecovr/

http://download.cnet.com/SysInfoTools-Docx-Repair/3000-2248_4-75330500.html




Answer 2:


Not a docx repair tool, but the Open XML SDK 2.0 for Microsoft Office contains a tool named "Open XML SDK 2.0 Productivity Tool for Microsoft Office" that you can use to compare two docx files (like the corrupt and the working one).




Answer 3:


Old question, I know, but just to say this for anyone with similar problems.

The [Content_Types].xml files above won't be the source of the issue. (Order isn't a problem; it's just what Word does on repair: it renumbers ids and reorders entries.)

Something that can cause corruption is simply having extra files in the zip that don't belong there.

Most of the time, when Word throws its hands up in the air and doesn't give you a hint, it's the structural metadata that has gone wrong.

By that I mean not an invalid pointer to a relationship id in document.xml (for example), but an invalid relationship file itself. For example, pointing to a content type in document.xml.rels that isn't in [Content_Types].xml.
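
This kind of structural check can be scripted. Below is a rough Python sketch using only the standard library; the function name `check_package` is hypothetical and the checks are my own simplification, not an official validator. It assumes part names match the zip entry names exactly (no case differences or percent-encoding) and reports three things: files with no declared content type, Override entries that name a missing part, and internal relationship targets that don't exist in the zip.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def check_package(path):
    """Return a list of human-readable problems found in a docx package."""
    problems = []
    with zipfile.ZipFile(path) as zf:
        names = {n for n in zf.namelist() if not n.endswith("/")}
        if "[Content_Types].xml" not in names:
            return ["[Content_Types].xml is missing"]
        types = ET.fromstring(zf.read("[Content_Types].xml"))
        defaults = {d.get("Extension", "").lower() for d in types.iter(CT_NS + "Default")}
        overrides = {o.get("PartName", "") for o in types.iter(CT_NS + "Override")}

        # Every file needs a content type, via an Override or a Default extension.
        for name in sorted(names - {"[Content_Types].xml"}):
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
            if "/" + name not in overrides and ext not in defaults:
                problems.append(f"no content type declared for /{name}")

        # Every Override should name a part that is actually in the zip.
        for part in sorted(overrides):
            if part.lstrip("/") not in names:
                problems.append(f"Override names a missing part: {part}")

        # Every internal relationship target should exist.
        for rels_name in sorted(n for n in names if n.endswith(".rels")):
            # "word/_rels/document.xml.rels" resolves targets relative to "word/".
            base = posixpath.dirname(posixpath.dirname(rels_name))
            rels = ET.fromstring(zf.read(rels_name))
            for rel in rels.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    continue
                target = rel.get("Target", "")
                if target.startswith("/"):
                    resolved = target.lstrip("/")
                else:
                    resolved = posixpath.normpath(posixpath.join(base, target))
                if resolved not in names:
                    problems.append(f"{rels_name}: {rel.get('Id')} -> missing part {target}")
    return problems
```

Calling `check_package("corrupt.docx")` (the file name is a placeholder) returns an empty list when none of these problems are found. If a part is not well-formed XML, `ET.fromstring` raises `ParseError`, whose message includes a line and column, which is a useful hint in its own right.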

However, when Word repairs a file it renumbers all its ids (and reorders entries), so compare tools are of limited use.

Check that the list of files is the same, concentrate on things such as [Content_Types].xml and document.xml.rels (and other rels files), and good luck!
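
Checking that the list of files is the same takes only a few lines of Python, since a docx is an ordinary zip archive. This is an illustrative sketch; `compare_part_lists` is a made-up name and the argument names are placeholders.

```python
import zipfile


def compare_part_lists(corrupt_path, repaired_path):
    """Return (names only in the corrupt file, names only in the repaired file)."""
    with zipfile.ZipFile(corrupt_path) as a, zipfile.ZipFile(repaired_path) as b:
        corrupt = set(a.namelist())
        repaired = set(b.namelist())
    return sorted(corrupt - repaired), sorted(repaired - corrupt)
```

Anything in the first list is a candidate for an extra file that doesn't belong in the package. Anything in the second list is a part that Word added during repair.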




Answer 4:


Many years late, but you can create your own error checker using DocumentFormat.OpenXml.Validation: https://msdn.microsoft.com/en-us/library/office/bb497334.aspx



Source: https://stackoverflow.com/questions/18215615/are-there-any-docx-repair-tools-that-give-a-meaningful-error-message
