Question
I have two XML documents: one from before the user edited it and one from after. I need to check that the user has only added new elements and has not deleted or changed existing ones.
Can anybody suggest a good algorithm for that comparison?
P.S.: My XML has a very trivial schema; it only represents an object's structure (with nested objects) in a naive way. Only a few tags are allowed: an <object> tag can only contain a <name> tag, a <type> tag, or a <list> tag. The <name> and <type> tags can only contain a string; a <list> tag can contain a <name> tag and a single <object> tag (representing the structure of the objects in the list). The string in the <name> tag can be freely chosen, whereas the string in the <type> tag can only be "string", "int", "float", "bool", "date", or "composite".
Here is an example:
<object>
  <name>Person</name>
  <type>composite</type>
  <object>
    <name>Person_Name</name>
    <type>string</type>
  </object>
  <object>
    <name>Person_Surname</name>
    <type>string</type>
  </object>
  <object>
    <name>Person_Age</name>
    <type>int</type>
  </object>
  <object>
    <name>Person_Weight</name>
    <type>float</type>
  </object>
  <object>
    <name>Person_Address</name>
    <type>string</type>
  </object>
  <object>
    <name>Person_BirthDate</name>
    <type>date</type>
  </object>
  <list>
    <name>Person_PhoneNumbers</name>
    <object>
      <name>Person_PhoneNumber</name>
      <type>composite</type>
      <object>
        <name>Person_PhoneNumber_ProfileName</name>
        <type>string</type>
      </object>
      <object>
        <name>Person_PhoneNumber_CellNumber</name>
        <type>string</type>
      </object>
      <object>
        <name>Person_PhoneNumber_HomeNumber</name>
        <type>string</type>
      </object>
      <object>
        <name>Person_PhoneNumber_FaxNumber</name>
        <type>string</type>
      </object>
      <object>
        <name>Person_PhoneNumber_Mail</name>
        <type>string</type>
      </object>
      <object>
        <name>Person_PhoneNumber_Social</name>
        <type>string</type>
      </object>
      <object>
        <name>Person_PhoneNumber_IsActive</name>
        <type>bool</type>
      </object>
    </object>
  </list>
</object>
Answer 1:
You said:
I need to check that user have only added new elements
but have not deleted or changed old ones.
Can you be more precise about what you mean?
For example, if I insert a new "object" element somewhere, I've changed every element that contains it, right? That means every enclosing list and object. In fact, any insertion at all is a change to the root element.
So, presumably you want to not count changes that change nothing but the root element. How about adding a new item to the list you show? Do you want the list to count as changed? Or what if the objects in the list, or the list itself, are moved to new places without having their content changed at all?
Each of those possibilities is pretty easy to write, but one has to decide what counts as a change first.
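To make the granularity question concrete, here is a minimal sketch using XNode.DeepEquals from System.Xml.Linq. The class name, the sample XML, and the Person_Email element are made up for illustration. It shows that a single insertion makes the root compare as "changed" while a pre-existing child still compares as identical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ChangeGranularityDemo
{
    // Returns whether the root and the first pre-existing child are
    // deep-equal before and after one <object> is appended to the root.
    public static (bool RootUnchanged, bool LeafUnchanged) Run()
    {
        XElement before = XElement.Parse(
            "<object><name>Person</name><type>composite</type>" +
            "<object><name>Person_Age</name><type>int</type></object></object>");

        // Deep copy of the tree, then one pure addition (hypothetical element).
        XElement after = new XElement(before);
        after.Add(XElement.Parse(
            "<object><name>Person_Email</name><type>string</type></object>"));

        bool rootUnchanged = XNode.DeepEquals(before, after);
        bool leafUnchanged = XNode.DeepEquals(
            before.Elements("object").First(),
            after.Elements("object").First());
        return (rootUnchanged, leafUnchanged);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine($"root unchanged: {result.RootUnchanged}"); // False
        Console.WriteLine($"leaf unchanged: {result.LeafUnchanged}"); // True
    }
}
```

So a whole-tree equality test cannot answer the question on its own; the comparison has to be done per element, at whatever level you decide matters.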
If, for example, you only care about bottom-level objects, and "the same" means precisely the same text content (no attributes, white-space variations, etc. etc.), then the easiest way is to load the "before" file into a list of (name,type) pairs; then load the "after" file into a similar but separate list. Sort both lists, then run down them simultaneously and report anything in the new one that's not in the old one (you'll probably want to report any deletions too, just in case).
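The sorted-list approach could look roughly like the sketch below. It assumes that a "bottom-level object" is an <object> with no nested <object> or <list>, and that surrounding whitespace in <name> and <type> should be ignored; LeafPairDiff, LoadLeafPairs, and Diff are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LeafPairDiff
{
    // Collects a sorted list of (name, type) pairs, one per bottom-level <object>.
    public static List<(string Name, string Type)> LoadLeafPairs(XElement root)
    {
        return root.DescendantsAndSelf("object")
            .Where(o => !o.Elements("object").Any() && !o.Elements("list").Any())
            .Select(o => (Name: ((string)o.Element("name") ?? "").Trim(),
                          Type: ((string)o.Element("type") ?? "").Trim()))
            .OrderBy(p => p.Name, StringComparer.Ordinal)
            .ThenBy(p => p.Type, StringComparer.Ordinal)
            .ToList();
    }

    // Walks both sorted lists in step and reports additions and deletions.
    public static (List<(string Name, string Type)> Added,
                   List<(string Name, string Type)> Removed)
        Diff(List<(string Name, string Type)> before,
             List<(string Name, string Type)> after)
    {
        var added = new List<(string Name, string Type)>();
        var removed = new List<(string Name, string Type)>();
        int i = 0, j = 0;
        while (i < before.Count && j < after.Count)
        {
            int c = Compare(before[i], after[j]);
            if (c == 0) { i++; j++; }                   // present in both
            else if (c < 0) removed.Add(before[i++]);   // only in the old file
            else added.Add(after[j++]);                 // only in the new file
        }
        while (i < before.Count) removed.Add(before[i++]);
        while (j < after.Count) added.Add(after[j++]);
        return (added, removed);
    }

    // Same ordering as the sort above: by name, then by type, ordinal.
    private static int Compare((string Name, string Type) a,
                               (string Name, string Type) b)
    {
        int c = string.CompareOrdinal(a.Name, b.Name);
        return c != 0 ? c : string.CompareOrdinal(a.Type, b.Type);
    }
}
```

With this sketch, the edit is "additions only" when Removed is empty. A changed type shows up as one removal plus one addition under the same name, so it is caught as well. Calling it would be something like Diff(LoadLeafPairs(XElement.Load("before.xml")), LoadLeafPairs(XElement.Load("after.xml"))), where the file names are placeholders.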
Answer 2:
I need to check that user have only added new elements but have not deleted or changed old ones.
You can represent your two XML files as objects. Traverse the nodes, get the child element count for each node, and check whether its child nodes exist in the other file. To compare two complex objects, you can implement the IEquatable<T>.Equals() interface method.
The code below doesn't care about the structure of your XML document or about the position of a particular element, since each element is represented as an XElement object. All it knows is 1.) the name of the element, 2.) whether the element has children, 3.) whether it has attributes, 4.) whether it has inner XML, etc. If you want to be strict about the structure of your XML, you can represent each level as a single class.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Program
{
    static void Main(string[] args)
    {
        XDocument xdoc1 = XDocument.Load("file1.xml");
        XDocument xdoc2 = XDocument.Load("file2.xml");
        RootElement file1 = new RootElement(xdoc1.Root);
        RootElement file2 = new RootElement(xdoc2.Root);
        bool isEqual = file1.Equals(file2);
        Console.WriteLine(isEqual);
        Console.ReadLine();
    }
}

// Common shape of a node: its tag name and its wrapped child elements.
public abstract class ElementBase<T>
{
    public string Name;
    public List<T> ChildElements;

    // The derived constructors populate Name and ChildElements.
    public ElementBase(XElement xElement)
    {
    }
}

public class RootElement : ElementBase<ChildElement>, IEquatable<RootElement>
{
    public RootElement(XElement xElement)
        : base(xElement)
    {
        ChildElements = new List<ChildElement>();
        Name = xElement.Name.ToString();
        foreach (XElement e in xElement.Elements())
        {
            ChildElements.Add(new ChildElement(e));
        }
    }

    public bool Equals(RootElement other)
    {
        if (other is null) return false;

        bool flag = true;
        // A count mismatch means children were added or removed.
        if (this.ChildElements.Count != other.ChildElements.Count)
        {
            //--Your error handling logic here
            flag = false;
        }
        List<ChildElement> otherChildElements = other.ChildElements;
        foreach (ChildElement c in this.ChildElements)
        {
            // Matches on the tag name only, so siblings that share a tag
            // are all compared against the first one found.
            ChildElement otherElement = otherChildElements.FirstOrDefault(x => x.Name == c.Name);
            if (otherElement == null)
            {
                //--Your error handling logic here
                flag = false;
            }
            else if (!c.Equals(otherElement))
            {
                flag = false;
            }
        }
        return flag;
    }
}

public class ChildElement : ElementBase<ChildElement>, IEquatable<ChildElement>
{
    public ChildElement(XElement xElement)
        : base(xElement)
    {
        ChildElements = new List<ChildElement>();
        Name = xElement.Name.ToString();
        foreach (XElement e in xElement.Elements())
        {
            ChildElements.Add(new ChildElement(e));
        }
    }

    // Recursively compares the subtrees rooted at this element and at other.
    public bool Equals(ChildElement other)
    {
        if (other is null) return false;

        bool flag = true;
        if (this.ChildElements.Count != other.ChildElements.Count)
        {
            //--Your error handling logic here
            flag = false;
        }
        List<ChildElement> otherList = other.ChildElements;
        foreach (ChildElement e in this.ChildElements)
        {
            // Same tag-name-only matching as in RootElement.Equals.
            ChildElement otherElement = otherList.FirstOrDefault(x => x.Name == e.Name);
            if (otherElement == null)
            {
                //--Your error handling logic here
                flag = false;
            }
            else if (!e.Equals(otherElement))
            {
                flag = false;
            }
        }
        return flag;
    }
}
If you also want to check attributes or inner XML, you can do it like this:
// Field on the element class:
public List<XAttribute> ElementAttributes = new List<XAttribute>();

// In the constructor, collect the attributes:
foreach (XAttribute attr in xElement.Attributes())
{
    ElementAttributes.Add(attr);
}

// In Equals(), compare the attributes by name and value:
List<XAttribute> otherAttributes = other.ElementAttributes;
foreach (XAttribute attr in ElementAttributes)
{
    XAttribute otherAttribute = otherAttributes.FirstOrDefault(x => x.Name == attr.Name);
    if (otherAttribute == null)
    {
        //--Your error handling logic here
        flag = false;
    }
    else if (otherAttribute.Value != attr.Value)
    {
        //--Your error handling logic here
        flag = false;
    }
}
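The classes above test for equality, so any added element also makes them report a difference. For the "additions only" requirement in the question, a one-directional check may fit better: every old node must still exist in the new tree with the same <type>, while extra new nodes are allowed. The following is only a sketch under the assumption that the text of <name> is unique among siblings and identifies a node; AdditionsOnlyCheck and FindViolations are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AdditionsOnlyCheck
{
    // Returns a list of problems; an empty list means every node of "before"
    // survives unchanged in "after" (extra nodes in "after" are allowed).
    public static List<string> FindViolations(XElement before, XElement after, string path = "")
    {
        var problems = new List<string>();
        string here = path + "/" + NameOf(before);

        if (before.Name != after.Name || NameOf(before) != NameOf(after))
        {
            problems.Add($"{here}: replaced by a different element");
            return problems;
        }
        if (TypeOf(before) != TypeOf(after))
            problems.Add($"{here}: type changed from '{TypeOf(before)}' to '{TypeOf(after)}'");

        // Index the new children by (tag, <name> text) so that each old child
        // is matched against its own counterpart, not just the first sibling.
        var newChildren = Children(after).ToLookup(c => (c.Name.LocalName, NameOf(c)));
        foreach (XElement oldChild in Children(before))
        {
            XElement match = newChildren[(oldChild.Name.LocalName, NameOf(oldChild))].FirstOrDefault();
            if (match == null)
                problems.Add($"{here}/{NameOf(oldChild)}: deleted or renamed");
            else
                problems.AddRange(FindViolations(oldChild, match, here));
        }
        return problems;
    }

    // Only <object> and <list> children are structural nodes in this schema.
    private static IEnumerable<XElement> Children(XElement e) =>
        e.Elements().Where(c => c.Name == "object" || c.Name == "list");

    private static string NameOf(XElement e) => ((string)e.Element("name") ?? "").Trim();

    private static string TypeOf(XElement e) => ((string)e.Element("type") ?? "").Trim();
}
```

Calling FindViolations(XDocument.Load("file1.xml").Root, XDocument.Load("file2.xml").Root) and checking that the result is empty would then answer the original question. A renamed element is reported as deleted, because the name is what identifies it in this sketch.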
Source: https://stackoverflow.com/questions/28820209/c-sharp-xml-diffing-algorithm