C# Efficiently delete 50000 records in batches using SQLBulkCopy or equivalent library

Submitted by 本秂侑毒 on 2020-06-25 18:52:51

Question


I'm using the EntityFramework.Utilities library to perform a bulk delete in batches, as follows:

  while (castedEndedItems.Any())
  {
    var subList = castedEndedItems.Take(4000).ToList();
    DBRetry.Do(() => EFBatchOperation.For(ctx, ctx.SearchedUserItems).Where(r => subList.Any(a => a == r.ItemID)).Delete(), TimeSpan.FromSeconds(2));
    castedEndedItems.RemoveRange(0, subList.Count);
    Console.WriteLine("Completed a batch of ended items");
  }

As you can see, I take a batch of 4000 items to delete at once and pass them as an argument to the query.

The library I'm referring to is:

https://github.com/MikaelEliasson/EntityFramework.Utilities

However, the performance is terrible. I tested the application a couple of times, and deleting 80000 records, for example, takes 40 minutes.

I should note that the column I'm deleting by (ItemID) is of type varchar(400) and is indexed for performance reasons.

Is there another library I could use, or a way to tweak this query so that it runs faster? The current performance is unacceptable.


Answer 1:


If you are prepared to use a stored procedure then you can do this without any external library:

  • Create the sproc using a table valued parameter @ids
  • Define a SQL type for that table valued parameter (just an id column assuming a simple PK)
  • In the sproc use

    delete from table where id in (select id from @ids);
    
  • In your application create a DataTable and populate it to match the SQL table type

  • Pass the DataTable as a command parameter when calling the sproc.

This answer illustrates the process.

Any other option will need to do the equivalent of this – or something less efficient.
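The steps in this answer can be sketched roughly as follows. This is a minimal sketch, not the answerer's code: the type name dbo.ItemIdList, the sproc name dbo.DeleteSearchedUserItems, the table name dbo.SearchedUserItems and the helper names are all assumptions made for illustration, and the one-time SQL setup is shown only as a comment.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // Microsoft.Data.SqlClient on newer .NET (assumption about your setup)

public static class TvpDelete
{
    // Assumed one-time SQL setup (all names hypothetical):
    //   CREATE TYPE dbo.ItemIdList AS TABLE (ItemID varchar(400) NOT NULL);
    //   CREATE PROCEDURE dbo.DeleteSearchedUserItems @ids dbo.ItemIdList READONLY AS
    //       DELETE FROM dbo.SearchedUserItems WHERE ItemID IN (SELECT ItemID FROM @ids);

    // Builds a DataTable whose single column matches the table type above.
    public static DataTable BuildIdTable(IEnumerable<string> itemIds)
    {
        var table = new DataTable();
        table.Columns.Add("ItemID", typeof(string));
        foreach (var id in itemIds)
        {
            table.Rows.Add(id);
        }
        return table;
    }

    // Sends all ids to the server in one round trip as a table-valued parameter.
    public static int DeleteItems(string connectionString, IEnumerable<string> itemIds)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.DeleteSearchedUserItems", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            var idsParam = cmd.Parameters.Add("@ids", SqlDbType.Structured);
            idsParam.TypeName = "dbo.ItemIdList"; // must match the SQL table type
            idsParam.Value = BuildIdTable(itemIds);

            conn.Open();
            // Rows affected, unless the sproc sets NOCOUNT ON.
            return cmd.ExecuteNonQuery();
        }
    }
}
```

With this shape the whole id list travels in a single call, so the loop over batches of 4000 in the question would probably not be needed, although very large deletes may still be worth splitting to keep transactions short.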




Answer 2:


Any EF solution here is probably going to perform lots of discrete operations. Instead, I would suggest manually building your SQL in a loop, something like:

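// Requires: using System.Text; and a SqlCommand (AddWithValue is SqlClient-specific).
using(var cmd = db.CreateCommand())
{
    int index = 0;
    var sql = new StringBuilder("delete from [SomeTable] where [SomeId] in (");
    foreach(var item in items)
    {
        // Comma-separate the parameter names: @id_0,@id_1,...
        if (index != 0) sql.Append(',');
        var name = "@id_" + index++;
        sql.Append(name);
        // One parameter per id, so values are never concatenated into the SQL text.
        cmd.Parameters.AddWithValue(name, item.SomeId);
    }
    // Note: an empty items collection would produce "in ()", which is invalid SQL.
    cmd.CommandText = sql.Append(");").ToString();
    cmd.ExecuteNonQuery();
}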

You may need to run this in batches, though, as there is an upper limit on the number of parameters allowed on a command (2,100 for SQL Server).
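A batched variant of the snippet above might look like the sketch below. It is only an illustration: the class and method names are made up, the batch size of 2000 is an assumed value chosen to stay under SQL Server's limit of 2,100 parameters per command, and the connection is assumed to be open already. The parameters are declared as varchar(400) to match the ItemID column from the question, because AddWithValue infers nvarchar for .NET strings, which can force a conversion against a varchar column and hurt index use.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // Microsoft.Data.SqlClient on newer .NET (assumption about your setup)
using System.Linq;

public static class BatchedInDelete
{
    // Assumed batch size, kept below SQL Server's 2,100-parameter limit.
    public const int BatchSize = 2000;

    // BuildDeleteSql(2) returns "delete from [SomeTable] where [SomeId] in (@id_0,@id_1);"
    public static string BuildDeleteSql(int parameterCount)
    {
        if (parameterCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(parameterCount));

        var names = Enumerable.Range(0, parameterCount).Select(i => "@id_" + i);
        return "delete from [SomeTable] where [SomeId] in (" + string.Join(",", names) + ");";
    }

    // Deletes all ids in slices of at most BatchSize; returns the total rows affected.
    // The connection is assumed to be open.
    public static int DeleteAll(SqlConnection conn, IReadOnlyList<string> ids)
    {
        int deleted = 0;
        for (int offset = 0; offset < ids.Count; offset += BatchSize)
        {
            int count = Math.Min(BatchSize, ids.Count - offset);
            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = BuildDeleteSql(count);
                for (int i = 0; i < count; i++)
                {
                    // Explicit varchar(400) instead of the nvarchar that AddWithValue would infer.
                    cmd.Parameters.Add("@id_" + i, SqlDbType.VarChar, 400).Value = ids[offset + i];
                }
                deleted += cmd.ExecuteNonQuery();
            }
        }
        return deleted;
    }
}
```

For 80000 ids this would issue 40 commands rather than one, so it is still more chatty than a table-valued parameter, but each command is a plain indexed delete.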




Answer 3:


If you don't mind the extra dependency, you could use the NuGet package Z.EntityFramework.Plus.

The code is roughly as follows:

using Z.EntityFramework.Plus;
[...]
         using (yourDbContext context = new yourDbContext())
         {
              // Call Delete() on the context instance, not on the type name.
              context.yourDbSet.Where( yourWhereExpression ).Delete();
         }

It is simple and efficient. The documentation contains exact numbers about the performance.

Regarding licensing: as far as I know, version 1.8 has an MIT license: https://github.com/zzzprojects/EntityFramework-Plus/blob/master/LICENSE Newer versions are not free to use.



Source: https://stackoverflow.com/questions/54403944/c-sharp-efficiently-delete-50000-records-in-batches-using-sqlbulkcopy-or-equival
