问题
We have a system that stores some custom templating data in a Word document. Sometimes, updating this data causes Word to complain that the document is corrupted. When that happens, if I unzip the docx file and compare the contents to the previous version, the only difference appears to be the expected change in the customXML\item.xml
file. If I re-zip the contents using 7zip, it seems to work OK (Word no longer complains that the document is corrupt).
The (simplified) code:
void CreateOrReplaceCustomXml(string filename, MyCustomData data)
{
using (var doc = WordProcessingDocument.Open(filename, true))
{
var part = GetCustomXmlParts(doc).SingleOrDefault();
if (part == null)
{
part = doc.MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
}
var serializer = new DataContractSerializer(typeof(MyCustomData));
using (var stream = new MemoryStream())
{
serializer.WriteObject(stream, data);
stream.Seek(0, SeekOrigin.Begin);
part.FeedData(stream);
}
}
}
IEnumerable<CustomXmlPart> GetCustomXmlParts(WordProcessingDocument doc)
{
return doc.MainDocumentPart.CustomXmlParts
.Where(part =>
{
using (var stream = doc.Package.GePart(c.Uri).GetStream())
using (var streamReader = new StreamReader(stream))
{
return streamReader.ReadToEnd().Contains("Some.Namespace");
}
});
}
Any suggestions?
回答1:
Since re-zipping works, it seems the content is well-formed.
So it sounds like the zip process is at fault. So open the corrupted docx in 7-Zip, and take note of the values in the "method" column (especially for customXML\item.xml).
Compare that value to a working docx - is it the same or different? Method "Deflate" works.
回答2:
I faced the same issue and it turned out it was due to encoding. Do you already specify the same encoding when serializing/deserializing?
回答3:
Couple of suggestion a. Try doc.Package.Flush(); after you write the data back into the custom xml. b. You may have to delete all custom part and add a new custom part. We are using the following code and it seems working fine.
public static void ReplaceCustomXML(WordprocessingDocument myDoc, string customXML)
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts);
CustomXmlPart customXmlPart = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
using (StreamWriter ts = new StreamWriter(customXmlPart.GetStream()))
{
ts.Write(customXML);
ts.Flush();
ts.Close();
}
}
public static MemoryStream GetCustomXmlPart(MainDocumentPart mainPart)
{
foreach (CustomXmlPart part in mainPart.CustomXmlParts)
{
using (XmlTextReader reader =
new XmlTextReader(part.GetStream(FileMode.Open, FileAccess.Read)))
{
reader.MoveToContent();
if (reader.Name.Equals("aaaa", StringComparison.OrdinalIgnoreCase))
{
string str = reader.ReadOuterXml();
byte[] byteArray = Encoding.ASCII.GetBytes(str);
MemoryStream stream = new MemoryStream(byteArray);
return stream;
}
}
}
return null; //result;
}
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(ms, true))
{
StreamReader reader = new StreamReader(memStream);
string FullXML = reader.ReadToEnd();
ReplaceCustomXML(myDoc, FullXML);
myDoc.Package.Flush();
//Code to save file
}
来源:https://stackoverflow.com/questions/23672955/editing-custom-xml-part-in-word-document-sometimes-corrupts-document