How to remove duplicate records from excel sheet using oledb

允我心安 提交于 2019-12-12 04:14:17

问题


I have an excel sheet with 4 column(JobCode,JobName,StartDate,EndDate). On the basis of one rule I have to validate the 1st excel sheet and insert all the record in 2nd excel sheet except the duplicate record which is present in 1st excel sheet. I tried to using list. But it's working as expected.

List<string> JobCodeList = new List<string>();
for (int iRowCount = 0; iRowCount < hrms_jobdata.Tables[0].Rows.Count; iRowCount++)
{
    JobCode = hrms_jobdata.Tables[0].Rows[iRowCount]["Job Code"].ToString();
    JobName = hrms_jobdata.Tables[0].Rows[iRowCount]["Job Name"].ToString();
    StartDate = hrms_jobdata.Tables[0].Rows[iRowCount]["Start Date"].ToString();
    EndDate = hrms_jobdata.Tables[0].Rows[iRowCount]["End Date"].ToString();
    JobCodeList.Add(JobCode + JobName);
}

connectionhrms_job.Close();


for (int iRowCount = 0; iRowCount < hrms_jobdata.Tables[0].Rows.Count; iRowCount++)
{
    JobCode = hrms_jobdata.Tables[0].Rows[iRowCount]["Job Code"].ToString();
    JobName = hrms_jobdata.Tables[0].Rows[iRowCount]["Job Name"].ToString();
    StartDate = hrms_jobdata.Tables[0].Rows[iRowCount]["Start Date"].ToString();
    EndDate = hrms_jobdata.Tables[0].Rows[iRowCount]["End Date"].ToString();

    DateTime convertedstart = DateTime.Parse(StartDate);
    StartDateFormated = convertedstart.ToString("dd-MM-yyyy");

    DateTime convertedend = DateTime.Parse(EndDate);
    EndDateFormated = convertedend.ToString("dd-MM-yyyy");

    List<string> dupvalue = removeDuplicates(JobCodeList);

    foreach (string value in dupvalue)
    {
        string jobcodename = value; 
    }

    string connectionStringdest = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + pathdestination + ";Extended Properties=Excel 12.0;";
    DbProviderFactory factorydest = DbProviderFactories.GetFactory("System.Data.OleDb");
    DbConnection connectiondest = factorydest.CreateConnection();
    connectiondest.ConnectionString = connectionStringdest;
    DbCommand command = connectiondest.CreateCommand();
    StringBuilder inserthrms_job = new StringBuilder();
    inserthrms_job = inserthrms_job.Append("Insert into [hrms_job$] values ('" + JobCode + "', '" + JobName + "', '" + StartDateFormated + "', '" + EndDateFormated + "','" + JobCode + " " + JobName + "') ");
    inserthrms_job = inserthrms_job.Append(";");
    command.CommandText = inserthrms_job.ToString();
    connectiondest.Open();
    command.ExecuteNonQuery();
    connectiondest.Close();
}

回答1:


When you query the source spreadsheet, just do a "Select top 1 field1, field2, field3 from [Sheet$] Group by field1, field2, field3". That way you only read the first record, not the duplicates.




回答2:


Good question - I don't think it's possible via the SELECT statement in the oledbadapter per the one answer I see (if so, please advise).

See this OleDBAdapter Excel QA I posted via stack overflow.

Put your into DataSet into an object like I did at the bottom of the post.

Then extend your object via the enumerable.distinct (see MSDN example), so you your object returns distinct elements from a sequence by using the default equality comparer to compare values.

then at the bottom of that post:

var noduplicates = query.Distinct();
foreach (var rec in noduplicates)
    Console.WriteLine(rec.ManagedLocationID + " " + rec.PartID + " " + rec.Quantity);


来源:https://stackoverflow.com/questions/4660768/how-to-remove-duplicate-records-from-excel-sheet-using-oledb

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!