Determining duplicates in a datatable

走了就别回头了 2021-01-25 03:22

I have a data table I've loaded from a CSV file. I need to determine which rows are duplicates based on two columns (product_id and owner_org_id) in t

2 Answers
  •  Happy的楠姐
    2021-01-25 03:44

    You could use LINQ-To-DataSet and Enumerable.Except/Intersect:

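    // Field<T> needs an explicit type argument. string is assumed here because
    // the data was loaded from a CSV file; substitute each column's actual type.

    // Project each table down to its two key columns.
    var tbl1ID = tbl1.AsEnumerable()
            .Select(r => new
            {
                product_id = r.Field<string>("product_id"),
                owner_org_id = r.Field<string>("owner_org_id"),
            });
    var tbl2ID = tbl2.AsEnumerable()
            .Select(r => new
            {
                product_id = r.Field<string>("product_id"),
                owner_org_id = r.Field<string>("owner_org_id"),
            });

    // Keys found only in tbl1, and keys found in both tables.
    var unique = tbl1ID.Except(tbl2ID);
    var both = tbl1ID.Intersect(tbl2ID);

    // Join the keys back to tbl1 to recover the full rows.
    var tblUnique = (from uniqueRow in unique
                    join row in tbl1.AsEnumerable()
                    on uniqueRow equals new
                    {
                        product_id = row.Field<string>("product_id"),
                        owner_org_id = row.Field<string>("owner_org_id")
                    }
                    select row).CopyToDataTable();
    var tblBoth = (from bothRow in both
                  join row in tbl1.AsEnumerable()
                  on bothRow equals new
                  {
                      product_id = row.Field<string>("product_id"),
                      owner_org_id = row.Field<string>("owner_org_id")
                  }
                  select row).CopyToDataTable();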
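
    The comparison works because anonymous types compare by the values of their properties, so two keys with the same product_id and owner_org_id count as equal. A minimal self-contained sketch of that behaviour follows. It assumes C# 9 top-level statements and string columns; the MakeTable helper and the sample values are hypothetical and not part of the original question.

    ```csharp
    using System;
    using System.Data;
    using System.Linq;

    // Hypothetical helper: builds a table with the two key columns, both typed as string.
    DataTable MakeTable(params (string Product, string Org)[] rows)
    {
        var t = new DataTable();
        t.Columns.Add("product_id", typeof(string));
        t.Columns.Add("owner_org_id", typeof(string));
        foreach (var (product, org) in rows)
            t.Rows.Add(product, org);
        return t;
    }

    // Made-up sample data: only (P2, A) appears in both tables.
    var tbl1 = MakeTable(("P1", "A"), ("P2", "A"), ("P3", "B"));
    var tbl2 = MakeTable(("P2", "A"), ("P4", "C"));

    // Project each table down to its key columns.
    var keys1 = tbl1.AsEnumerable().Select(r => new
    {
        product_id = r.Field<string>("product_id"),
        owner_org_id = r.Field<string>("owner_org_id")
    });
    var keys2 = tbl2.AsEnumerable().Select(r => new
    {
        product_id = r.Field<string>("product_id"),
        owner_org_id = r.Field<string>("owner_org_id")
    });

    // Set operations use the value-based equality of the anonymous type.
    var onlyInFirst = keys1.Except(keys2).ToList();
    var inBoth = keys1.Intersect(keys2).ToList();

    Console.WriteLine($"Only in tbl1: {onlyInFirst.Count}, in both: {inBoth.Count}");
    ```

    Note that CopyToDataTable throws an InvalidOperationException when the sequence contains no rows, so real code may want to check for an empty result before calling it.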
    

    Edit: Obviously I misunderstood your requirement a little. If you have only one DataTable and want to get all unique and all duplicate rows, that's even more straightforward. You can use Enumerable.GroupBy with an anonymous type containing both fields:

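    // Field<T> needs an explicit type argument. string is assumed here because
    // the data was loaded from a CSV file; substitute each column's actual type.
    var groups = tbl1.AsEnumerable()
        .GroupBy(r => new
        {
            product_id = r.Field<string>("product_id"),
            owner_org_id = r.Field<string>("owner_org_id")
        });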
    var tblUniques = groups
        .Where(grp => grp.Count() == 1)
        .Select(grp => grp.Single())
        .CopyToDataTable();
    var tblDuplicates = groups
        .Where(grp => grp.Count() > 1)
        .SelectMany(grp => grp)
        .CopyToDataTable();
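
    As a rough end-to-end illustration of the grouping approach, the sketch below runs it on a small in-memory table. It assumes C# 9 top-level statements and string columns; the "name" column, the sample rows, and the ToTable helper are hypothetical. The helper is there because CopyToDataTable throws an InvalidOperationException on an empty sequence, which would happen if the table had no duplicates (or no unique rows) at all.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;

    // Made-up sample data: (P1, A) occurs twice, the other keys once.
    var table = new DataTable();
    table.Columns.Add("product_id", typeof(string));
    table.Columns.Add("owner_org_id", typeof(string));
    table.Columns.Add("name", typeof(string)); // hypothetical extra column
    table.Rows.Add("P1", "A", "first");
    table.Rows.Add("P1", "A", "second");
    table.Rows.Add("P2", "A", "third");
    table.Rows.Add("P1", "B", "fourth");

    // Group rows by the composite key; ToList avoids re-running the grouping twice.
    var groups = table.AsEnumerable()
        .GroupBy(r => new
        {
            product_id = r.Field<string>("product_id"),
            owner_org_id = r.Field<string>("owner_org_id")
        })
        .ToList();

    var uniqueRows = groups.Where(g => g.Count() == 1).SelectMany(g => g).ToList();
    var duplicateRows = groups.Where(g => g.Count() > 1).SelectMany(g => g).ToList();

    // Hypothetical helper: fall back to an empty table with the same schema
    // instead of letting CopyToDataTable throw on an empty list.
    DataTable ToTable(List<DataRow> rows) =>
        rows.Count > 0 ? rows.CopyToDataTable() : table.Clone();

    var tblUniques = ToTable(uniqueRows);
    var tblDuplicates = ToTable(duplicateRows);

    Console.WriteLine($"Unique rows: {tblUniques.Rows.Count}, duplicate rows: {tblDuplicates.Rows.Count}");
    ```

    Materialising the groups once with ToList is a design choice for the sketch: the deferred query would otherwise be evaluated once for the unique rows and again for the duplicates.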
    
