I have a data table I've loaded from a CSV file. I need to determine which rows are duplicates based on two columns (product_id and owner_org_id) in t
Your criterion is off. You are comparing sets of objects that you are not interested in (Except excludes them).
Instead, be as explicit as possible about the data type and keep it simple:
public bool Equals(DataRow x, DataRow y)
{
    // Usually you are dealing with INT keys
    return (x["PRODUCT_ID"] as int?) == (y["PRODUCT_ID"] as int?)
        && (x["OWNER_ORG_ID"] as int?) == (y["OWNER_ORG_ID"] as int?);
    // If you really are dealing with strings, this is the equivalent:
    // return (x["PRODUCT_ID"] as string) == (y["PRODUCT_ID"] as string)
    //     && (x["OWNER_ORG_ID"] as string) == (y["OWNER_ORG_ID"] as string);
}
Check for null if that is a possibility. Maybe you want to exclude rows that are equal because their IDs are null.
Observe the int?. This is not a typo. The question mark is required if you are dealing with database values from columns that can be NULL. The reason is that NULL values are represented by the type DBNull in C#. Using the as operator just gives you null in this case (instead of an InvalidCastException).
If you are sure you are dealing with INT NOT NULL, cast with (int).
The same is true for strings: (string) asserts that you are expecting non-null DB values.
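To make the difference between the two casts concrete, here is a minimal sketch. The one-row table and the class name CastDemo are made up for illustration; only the column name comes from the code above.

```csharp
using System;
using System.Data;

class CastDemo
{
    static void Main()
    {
        // Hypothetical one-row table with a nullable INT column.
        var table = new DataTable();
        table.Columns.Add("PRODUCT_ID", typeof(int));
        DataRow row = table.Rows.Add(DBNull.Value);

        // The as operator yields null for DBNull instead of throwing.
        int? viaAs = row["PRODUCT_ID"] as int?;
        Console.WriteLine(viaAs.HasValue); // False

        // A direct cast throws because DBNull cannot be unboxed to int.
        try
        {
            int viaCast = (int)row["PRODUCT_ID"];
            Console.WriteLine(viaCast);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```

Keep in mind that two int? values that are both null compare as equal with ==, which is why rows with NULL IDs would count as equal unless you check for null explicitly.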
EDIT1:
Had the type wrong. ItemArray is not a hashtable. Use the row directly.
EDIT2:
Added a string example and some comments.
For a more straightforward way, see "How to select distinct rows in a datatable and store into an array".
EDIT3:
Some explanation regarding the casts.
The other link I suggested does the same as your code. I forgot your original intent ;-) I just saw your code and responded to the most obvious error I saw. Sorry.
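Here is how I would solve the problem:
using System.Data; // AsEnumerable() for DataTable lives here, not in System.Data.Linq
using System.Linq;

var q = dtCSV
    .AsEnumerable()
    // Group rows by the two key columns
    .GroupBy(r => new { ProductId = (int)r["PRODUCT_ID"], OwnerOrgId = (int)r["OWNER_ORG_ID"] })
    // Keep only groups with more than one row, then flatten them back into rows
    .Where(g => g.Count() > 1)
    .SelectMany(g => g);
var duplicateRows = q.ToList();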
I don't know if this is 100% correct; I don't have an IDE at hand. You'll also need to adjust the casts to the appropriate type. See my addition above.
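For anyone who wants to try the grouping approach in isolation, here is a self-contained sketch. The sample rows, the class name DuplicateDemo and the assumption that both columns are typed as int are mine, not from the question; a table loaded from CSV may well hold strings, in which case Field<string> would be the matching call. On .NET Framework, AsEnumerable() and Field<T>() additionally need a reference to the System.Data.DataSetExtensions assembly.

```csharp
using System;
using System.Data;
using System.Linq;

class DuplicateDemo
{
    static void Main()
    {
        // Hypothetical stand-in for the table loaded from CSV.
        var dtCSV = new DataTable();
        dtCSV.Columns.Add("PRODUCT_ID", typeof(int));
        dtCSV.Columns.Add("OWNER_ORG_ID", typeof(int));
        dtCSV.Rows.Add(1, 10);
        dtCSV.Rows.Add(1, 10); // duplicate of the first row
        dtCSV.Rows.Add(2, 10); // unique key pair

        var duplicateRows = dtCSV
            .AsEnumerable()
            // Anonymous types compare by value, so they work as a composite key.
            .GroupBy(r => new
            {
                ProductId = r.Field<int>("PRODUCT_ID"),
                OwnerOrgId = r.Field<int>("OWNER_ORG_ID")
            })
            .Where(g => g.Count() > 1)
            .SelectMany(g => g)
            .ToList();

        Console.WriteLine(duplicateRows.Count); // 2
    }
}
```

Note that this returns every row of each duplicated group, including the first occurrence. If you only want the extra copies, something like g.Skip(1) inside SelectMany should do it.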