OpenXml: Worksheet Child Elements change in ordering results in a corrupt file

假装没事ソ 提交于 2019-11-30 20:13:14

According to the Standard ECMA-376 Office Open XML File Formats, CT_Worksheet has a required sequence:

The reason the following is crashing:

workSheet.Append(sheetViews);
workSheet.Append(columns);
workSheet.Append(sheetData);
workSheet.Append(drawing);
workSheet.Append(mergeCells);

Is because you have drawing before mergeCells. As long as you append your mergeCells after drawing, your code should work fine.

Note: You can find the full XSD in ECMA-376 3rd edition Part 1 (.zip) -> OfficeOpenXML-XMLSchema-Strict -> sml.xsd.

I found that for all "Singleton" children where the parent objects has a Property defined (such as Worksheet.sheetViews) use the singleton property and assign the new object to that instead of using "Append" This causes the class itself to ensure the order is correct.

workSheet.Append(sheetViews);
workSheet.Append(columns);
workSheet.Append(sheetData);  // bad idea(though it does work if the order is good)
workSheet.Append(drawing);
workSheet.Append(mergeCells);

More correct format...

workSheet.sheetViews=sheetViews; // order doesn't matter.
workSheet.columns=columns;
...
helb

As Joe Masilotti already explained, the order is defined in the schema.

Unfortunately, the OpenXML library does not ensure the correct order of child elements in the serialized XML as required by the underlying XML schema. Applications may not be able to parse the XML successfully if the order is not correct.

Here is a generic solution which I am using in my code:

private T GetOrCreateWorksheetChildCollection<T>(Spreadsheet.Worksheet worksheet) 
    where T : OpenXmlCompositeElement, new()
{
    T collection = worksheet.GetFirstChild<T>();
    if (collection == null)
    {
        collection = new T();
        if (!worksheet.HasChildren)
        {
            worksheet.AppendChild(collection);
        }
        else
        {
            // compute the positions of all child elements (existing + new collection)
            List<int> schemaPositions = worksheet.ChildElements
                .Select(e => _childElementNames.IndexOf(e.LocalName)).ToList();
            int collectionSchemaPos = _childElementNames.IndexOf(collection.LocalName);
            schemaPositions.Add(collectionSchemaPos);
            schemaPositions = schemaPositions.OrderBy(i => i).ToList();

            // now get the index where the position of the new child is
            int index = schemaPositions.IndexOf(collectionSchemaPos);

            // this is the index to insert the new element
            worksheet.InsertAt(collection, index);
        }
    }
    return collection;
}

// names and order of possible child elements according to the openXML schema
private static readonly List<string> _childElementNames = new List<string>() { 
    "sheetPr", "dimension", "sheetViews", "sheetFormatPr", "cols", "sheetData", 
    "sheetCalcPr", "sheetProtection", "protectedRanges", "scenarios", "autoFilter",
    "sortState", "dataConsolidate", "customSheetViews", "mergeCells", "phoneticPr",
    "conditionalFormatting", "dataValidations", "hyperlinks", "printOptions", 
    "pageMargins", "pageSetup", "headerFooter", "rowBreaks", "colBreaks", 
    "customProperties", "cellWatches", "ignoredErrors", "smartTags", "drawing",
    "drawingHF", "picture", "oleObjects", "controls", "webPublishItems", "tableParts",
    "extLst"
};

The method always inserts the new child element at the correct position, ensuring that the resulting document is valid.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!