问题
I have a question, want to join 2 tables with same column. Table 1 has Name, LastName Columns and many other columns, Table 2 has Name, Comment and many other Columns. I want to join them with Name column and as Result should be Name, LastName, Comment and other Columns. I tried with outer left Linq but don't know how to write select new because don't know how many other columns i have.
My Table 1:
Name1 LastName ...
Niki Row ...
Hube Slang ...
Koke Mi ...
... ... ...
... ... ...
Table 2:
Name Comment ...
Koke "Hello" ...
Niki "Hi" ...
Result should be:
Name LastName Comment ...
Niki Row "Hi" ...
Hube Sland ...
Koke Mi "Hello"...
... ... ...
So i tried to concat the rows with each others. But it said that the array from table1 is longer than array from table 2. Is there another way to join it?
foreach (DataRow tbE in Table1.Rows)
{
foreach (DataRow opT in Table2.Rows)
{
if (tbE["Name"].ToString() == opT["Name"].ToString())
{
var row = Result.NewRow();
row.ItemArray = tbE.ItemArray
.Concat(opT.ItemArray).ToArray();
Result.Rows.Add(row);
}
else
Result.ImportRow(tbE);
}
}
Result.Columns.Remove(Name);
回答1:
You could use this method here which i've written from scratch recently for SO for another question(so it's not really tested). It allows to merge multiple tables by a common key. If no key is specified it will just use the default DataTable.Merge method:
public static DataTable MergeAll(this IList<DataTable> tables, String primaryKeyColumn)
{
if (!tables.Any())
throw new ArgumentException("Tables must not be empty", "tables");
if(primaryKeyColumn != null)
foreach(DataTable t in tables)
if(!t.Columns.Contains(primaryKeyColumn))
throw new ArgumentException("All tables must have the specified primarykey column " + primaryKeyColumn, "primaryKeyColumn");
if(tables.Count == 1)
return tables[0];
DataTable table = new DataTable("TblUnion");
table.BeginLoadData(); // Turns off notifications, index maintenance, and constraints while loading data
foreach (DataTable t in tables)
{
table.Merge(t); // same as table.Merge(t, false, MissingSchemaAction.Add);
}
table.EndLoadData();
if (primaryKeyColumn != null)
{
// since we might have no real primary keys defined, the rows now might have repeating fields
// so now we're going to "join" these rows ...
var pkGroups = table.AsEnumerable()
.GroupBy(r => r[primaryKeyColumn]);
var dupGroups = pkGroups.Where(g => g.Count() > 1);
foreach (var grpDup in dupGroups)
{
// use first row and modify it
DataRow firstRow = grpDup.First();
foreach (DataColumn c in table.Columns)
{
if (firstRow.IsNull(c))
{
DataRow firstNotNullRow = grpDup.Skip(1).FirstOrDefault(r => !r.IsNull(c));
if (firstNotNullRow != null)
firstRow[c] = firstNotNullRow[c];
}
}
// remove all but first row
var rowsToRemove = grpDup.Skip(1);
foreach(DataRow rowToRemove in rowsToRemove)
table.Rows.Remove(rowToRemove);
}
}
return table;
}
You can call it in this way:
var tables = new[] { Table1, Table2 };
tables.MergeAll("Name");
Edit: here's the screenshot from the debugger with your sample-data:

So it works :)
Sample data and test here:
var Table1 = new DataTable();
var Table2 = new DataTable();
Table1.Columns.Add("Name");
Table1.Columns.Add("LastName");
Table2.Columns.Add("Name");
Table2.Columns.Add("Comment");
Table1.Rows.Add("Niki", "Row");
Table1.Rows.Add("Hube", "Slang");
Table1.Rows.Add("Koke", "Mi");
Table2.Rows.Add("Koke", "Hello");
Table2.Rows.Add("Niki", "Hi");
var tables = new DataTable[] { Table1, Table2 };
DataTable merged = tables.MergeAll("Name");
回答2:
Here is a bit of my contribution. This partial code can be used to Join any two DataTables on specified column names. (You do not need to know the rest of the columns) Here are some of the features:
- The resultant DataTable will not have duplicate columns for those used in join. e.g. if you join on "Name" column, you will have only one "Name" column in the end, instead of one copy from each table.
- In case of duplicate columns that are NOT used in join, the duplicate column in the second table will be renamed by appending "_2" to the end. It can behave in other ways, just change that part of code.
- Multiple Join columns are supported. For this purpose, a JoinKey class is created for them to be comparable by LINQ.
This code is kind of a mixture of code I found online and my trial and error. I am new to LINQ so feel free to critique~
public class JoinKey { List<object> objects { get; set; } public JoinKey(List<object> objects) { this.objects = objects; } public override bool Equals(object obj) { if (obj == null || obj.GetType() != typeof(JoinKey)) return false; return objects.SequenceEqual(((JoinKey)obj).objects); } public override int GetHashCode() { int hash = 0; foreach (var foo in objects) { hash = hash * 31 + foo.GetHashCode(); } return hash; } } public enum JoinType { Inner = 0, Left = 1 } //Joins two tables and spits out the joined new DataTable. Tables are joined on onCol column names //If the right table has column name clashes with the left column, the column names will be appended "_2" and added to joined table public static DataTable Join(DataTable left, DataTable right, JoinType joinType, params string[] onCol) { Func<DataRow, object> getKey = (row) => { return new JoinKey(onCol.Select(str => row[str]).ToList()); }; var dt = new DataTable(left.TableName); var colNumbersToRemove = new List<int>(); //Populate the columns foreach (DataColumn col in left.Columns) { if (dt.Columns[col.ColumnName] == null) dt.Columns.Add(new DataColumn(col.ColumnName, col.DataType, col.Expression, col.ColumnMapping)); } for (int colIdx = 0; colIdx < right.Columns.Count; ++colIdx) { var col = right.Columns[colIdx]; //if this is joined column, it will be removed. if (onCol.Contains(col.ColumnName)) { colNumbersToRemove.Add(colIdx); } else { //if this is duplicate column, it will be renamed. if (dt.Columns[col.ColumnName] != null) { col.ColumnName += "_2"; } dt.Columns.Add(new DataColumn(col.ColumnName, col.DataType, col.Expression, col.ColumnMapping)); } } if (joinType == JoinType.Left) { var res = from l in left.AsEnumerable() join r in right.AsEnumerable() on getKey(l) equals getKey(r) into temp from r in temp.DefaultIfEmpty() select l.ItemArray.Concat(((r == null) ? (right.NewRow().ItemArray) : r.ItemArray).Minus(colNumbersToRemove)).ToArray(); foreach (object[] values in res) dt.Rows.Add(values); } else { //Inner Join var res = from l in left.AsEnumerable() join r in right.AsEnumerable() on getKey(l) equals getKey(r) into temp from r in temp select l.ItemArray.Concat(((r == null) ? (right.NewRow().ItemArray) : r.ItemArray).Minus(colNumbersToRemove)).ToArray(); foreach (object[] values in res) dt.Rows.Add(values); } return dt; }
来源:https://stackoverflow.com/questions/13156626/join-2-datatables-with-many-columns