Compare two XML documents and print the differences using LINQ

我寻月下人不归 2021-01-02 11:38

I am comparing two XML documents and need to print the differences. How can I achieve this using LINQ? I know I can use Microsoft's XML Diff and Patch tool, but I would prefer to use LINQ. If you

3 Answers
  • 2021-01-02 12:23

    Here is the solution:

    // Requires: using System; using System.Linq; using System.Xml.Linq;
    // Sanitised XML samples:
    string s1 = @"<Books>
                     <book id='20504' image='C01' name='C# in Depth'/>
                     <book id='20505' image='C02' name='ASP.NET'/>
                     <book id='20506' image='C03' name='LINQ in Action '/>
                     <book id='20507' image='C04' name='Architecting Applications'/>
                    </Books>";
    string s2 = @"<Books>
                      <book id='20504' image='C011' name='C# in Depth'/>
                      <book id='20505' image='C02' name='ASP.NET 2.0'/>
                      <book id='20506' image='C03' name='LINQ in Action '/>
                      <book id='20508' image='C04' name='Architecting Applications'/>
                    </Books>";
    
    XDocument xml1 = XDocument.Parse(s1);
    XDocument xml2 = XDocument.Parse(s2);
    
    // Cartesian product: every book in xml1 paired with every book in xml2
    var result1 =   from xmlBooks1 in xml1.Descendants("book")
                    from xmlBooks2 in xml2.Descendants("book")
                    select new { 
                                book1 = new {
                                            id=xmlBooks1.Attribute("id").Value,
                                            image=xmlBooks1.Attribute("image").Value,
                                            name=xmlBooks1.Attribute("name").Value
                                          }, 
                                book2 = new {
                                            id=xmlBooks2.Attribute("id").Value,
                                            image=xmlBooks2.Attribute("image").Value,
                                            name=xmlBooks2.Attribute("name").Value
                                          } 
                                 };
    
    //get every record that has at least one attribute the same, but not all
    var result2 = from i in result1
                     where (i.book1.id == i.book2.id 
                            || i.book1.image == i.book2.image 
                            || i.book1.name == i.book2.name) &&
                            !(i.book1.id == i.book2.id 
                            && i.book1.image == i.book2.image 
                            && i.book1.name == i.book2.name) 
                     select i;
    
    
    
    foreach (var aa in result2)
    {
        // Print each mismatching pair; adapt the output format as needed
        Console.WriteLine("{0} <-> {1}", aa.book1, aa.book2);
    }
    

    The two LINQ statements could probably be merged, but I leave that as an exercise for you.
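
    As an illustration of that exercise, here is a minimal sketch of one way to fold the two queries into a single statement. It is not the answerer's own code: the class name `MergedQueryDemo`, the `same` helper array, and the trimmed two-book sample data are assumptions made for the example. Like the original, it throws if a `book` element lacks one of the three attributes.

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    class MergedQueryDemo
    {
        static void Main()
        {
            // Trimmed-down sample data, for illustration only.
            XDocument xml1 = XDocument.Parse(@"<Books>
                  <book id='20504' image='C01' name='C# in Depth'/>
                  <book id='20507' image='C04' name='Architecting Applications'/>
                </Books>");
            XDocument xml2 = XDocument.Parse(@"<Books>
                  <book id='20504' image='C011' name='C# in Depth'/>
                  <book id='20508' image='C04' name='Architecting Applications'/>
                </Books>");

            var diffs =
                from e1 in xml1.Descendants("book")
                from e2 in xml2.Descendants("book")
                let book1 = new { id = e1.Attribute("id").Value,
                                  image = e1.Attribute("image").Value,
                                  name = e1.Attribute("name").Value }
                let book2 = new { id = e2.Attribute("id").Value,
                                  image = e2.Attribute("image").Value,
                                  name = e2.Attribute("name").Value }
                // One flag per attribute: true when both books agree on it.
                let same = new[] { book1.id == book2.id,
                                   book1.image == book2.image,
                                   book1.name == book2.name }
                // Keep pairs that agree on at least one attribute but not on all.
                where same.Any(s => s) && !same.All(s => s)
                select new { book1, book2 };

            foreach (var d in diffs)
                Console.WriteLine("{0} <-> {1}", d.book1, d.book2);
        }
    }
    ```

    The `let` clauses replace the intermediate anonymous-type projection, so the filter can run in the same query that builds the pairs.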

  • 2021-01-02 12:25

    For fun, here is a general solution based on grega g's reading of the problem. To illustrate my objection to this approach, I've added a "correct" entry for 'PowerShell in Action'.

    string s1 = @"<Books>
         <book id='20504' image='C01' name='C# in Depth'/>
         <book id='20505' image='C02' name='ASP.NET'/>
         <book id='20506' image='C03' name='LINQ in Action '/>
         <book id='20507' image='C04' name='Architecting Applications'/>
         <book id='20508' image='C05' name='PowerShell in Action'/>
        </Books>";
    string s2 = @"<Books>
         <book id='20504' image='C011' name='C# in Depth'/>
         <book id='20505' image='C02' name='ASP.NET 2.0'/>
         <book id='20506' image='C03' name='LINQ in Action '/>
         <book id='20508' image='C04' name='Architecting Applications'/>
         <book id='20508' image='C05' name='PowerShell in Action'/>
        </Books>";
    
    XDocument xml1 = XDocument.Parse(s1);
    XDocument xml2 = XDocument.Parse(s2);
    
    var res = from b1 in xml1.Descendants("book")
              from b2 in xml2.Descendants("book")
              let issues = from a1 in b1.Attributes()
                           join a2 in b2.Attributes()
                             on a1.Name equals a2.Name
                           select new
                           {
                               Name = a1.Name,
                               Value1 = a1.Value,
                               Value2 = a2.Value
                           }
              where issues.Any(i => i.Value1 == i.Value2)
              from issue in issues
              where issue.Value1 != issue.Value2
              select issue;
    

    Which reports the following:

    { Name = image, Value1 = C01, Value2 = C011 }
    { Name = name, Value1 = ASP.NET, Value2 = ASP.NET 2.0 }
    { Name = id, Value1 = 20507, Value2 = 20508 }
    { Name = image, Value1 = C05, Value2 = C04 }
    { Name = name, Value1 = PowerShell in Action, Value2 = Architecting Applications }
    

    Note that the last two entries are the "conflict" between the 20508 typo and the otherwise correct 20508 entry.

  • 2021-01-02 12:25

    The operation you want here is a Zip, which pairs up corresponding elements of your two sequences of books. Enumerable.Zip is available from .NET 4.0 onward; on earlier versions we can fake it by using Select to grab each book's index and joining on that:

    var res = from b1 in xml1.Descendants("book")
                             .Select((b, i) => new { b, i })
              join b2 in xml2.Descendants("book")
                             .Select((b, i) => new { b, i })
                on b1.i equals b2.i
    

    We'll then use a second join to compare the values of attributes by name. Note that this is an inner join; if you did want to include attributes missing from one or the other you would have to do quite a bit more work.

              select new
              {
                  Row = b1.i,
                  Diff = from a1 in b1.b.Attributes()
                         join a2 in b2.b.Attributes()
                           on a1.Name equals a2.Name
                         where a1.Value != a2.Value
                         select new
                         {
                             Name = a1.Name,
                             Value1 = a1.Value,
                             Value2 = a2.Value
                         }
              };
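
    The inner join above silently skips any attribute that exists on only one of the two books. A minimal sketch of one way to include them, assuming it is acceptable to report a missing attribute as null: walk the union of attribute names from both elements and look each one up on either side. The class name `MissingAttributeDemo` and the `isbn` attribute are hypothetical, introduced only for this example.

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    class MissingAttributeDemo
    {
        static void Main()
        {
            // b1 has no isbn attribute, b2 has no image attribute (illustrative data).
            XElement b1 = XElement.Parse("<book id='20504' image='C01' name='C# in Depth'/>");
            XElement b2 = XElement.Parse("<book id='20504' name='C# in Depth' isbn='n/a'/>");

            var diff =
                // Union of attribute names, so one-sided attributes are still visited.
                from name in b1.Attributes().Concat(b2.Attributes())
                               .Select(a => a.Name)
                               .Distinct()
                // The explicit string conversion yields null when the attribute is absent.
                let v1 = (string)b1.Attribute(name)
                let v2 = (string)b2.Attribute(name)
                where v1 != v2
                select new { Name = name, Value1 = v1, Value2 = v2 };

            foreach (var d in diff)
                Console.WriteLine(d);
        }
    }
    ```

    Here `image` and `isbn` are both reported, each with a null value on the side where the attribute is missing.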
    

    The result will be a nested collection:

    foreach (var b in res)
    {
        Console.WriteLine("Row {0}: ", b.Row);
        foreach (var d in b.Diff)
            Console.WriteLine(d);
    }
    

    Or, to flatten the result into one row per difference:

    var report = from r in res
                 from d in r.Diff
                 select new { r.Row, Diff = d };
    
    foreach (var d in report)
        Console.WriteLine(d);
    

    Which reports the following:

    { Row = 0, Diff = { Name = image, Value1 = C01, Value2 = C011 } }
    { Row = 1, Diff = { Name = name, Value1 = ASP.NET, Value2 = ASP.NET 2.0 } }
    { Row = 3, Diff = { Name = id, Value1 = 20507, Value2 = 20508 } }
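
    On .NET 4.0 or later, the index-based join above can be written with Enumerable.Zip directly. The following is a minimal sketch under that assumption; the class name `ZipDemo` and the shortened sample data are made up for the example. Note that Zip stops at the end of the shorter sequence, so extra books in the longer document are not reported.

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    class ZipDemo
    {
        static void Main()
        {
            // Shortened sample data, for illustration only.
            XDocument xml1 = XDocument.Parse(@"<Books>
                  <book id='20504' image='C01' name='C# in Depth'/>
                  <book id='20505' image='C02' name='ASP.NET'/>
                </Books>");
            XDocument xml2 = XDocument.Parse(@"<Books>
                  <book id='20504' image='C011' name='C# in Depth'/>
                  <book id='20505' image='C02' name='ASP.NET 2.0'/>
                </Books>");

            var res = xml1.Descendants("book")
                // Pair the n-th book of xml1 with the n-th book of xml2.
                .Zip(xml2.Descendants("book"), (b1, b2) => new { b1, b2 })
                // Attach the position and the per-attribute differences to each pair.
                .Select((p, row) => new
                {
                    Row = row,
                    Diff = from a1 in p.b1.Attributes()
                           join a2 in p.b2.Attributes() on a1.Name equals a2.Name
                           where a1.Value != a2.Value
                           select new { Name = a1.Name, Value1 = a1.Value, Value2 = a2.Value }
                });

            foreach (var r in res)
                foreach (var d in r.Diff)
                    Console.WriteLine("Row {0}: {1}", r.Row, d);
        }
    }
    ```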
    