Need recommendations on pushing the envelope with SqlBulkCopy on SQL Server

For recommendations on tuning SQL Server for bulk loads, see the Data Loading and Performance Guide paper from Microsoft, and also Guidelines for Optimizing Bulk Import in Books Online. Although they focus on bulk loading from within SQL Server, most of the advice applies to bulk loading through the client API. These papers apply to SQL Server 2008 - you don't say which SQL Server version you're targeting.
Both contain quite a lot of information that is worth going through in detail. However, some highlights:

  • Minimally log the bulk operation. Use bulk-logged or simple recovery. You may need to enable trace flag 610 (but see the caveats on doing this)
  • Tune the batch size (see the sketch after this list)
  • Consider partitioning the target table
  • Consider dropping indexes during bulk load
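
On the client side, a minimal sketch of what the batch-size and table-lock parts of that look like with SqlBulkCopy is below. The connection string, destination table and batch size are placeholders, and the T-SQL shown in the comments has to be run separately with appropriate permissions; treat it as a starting point, not a tuned solution.

    // Minimal sketch only - "dbo.BulkTarget" and the batch size are placeholders.
    using System.Data;
    using System.Data.SqlClient;

    static void BulkLoad(string connectionString, IDataReader source)
    {
        // Server-side prerequisites for minimal logging (run as T-SQL, after
        // reading the caveats in the Data Loading and Performance Guide):
        //   ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;
        //   DBCC TRACEON (610, -1);  -- optional, for loads into indexed tables
        //
        // TableLock is required for a minimally logged load into a heap.
        using (var bulk = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock))
        {
            bulk.DestinationTableName = "dbo.BulkTarget";
            bulk.BatchSize = 10000;    // tune this; 0 means everything in one batch
            bulk.BulkCopyTimeout = 0;  // no timeout for long-running loads
            bulk.WriteToServer(source);
        }
    }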

These options are nicely summarised in a flow chart in the Data Loading and Performance Guide.

As others have said, you need to capture some performance counters to establish the source of the bottleneck, since your experiments suggest that IO might not be the limitation. The Data Loading and Performance Guide includes a list of SQL wait types and performance counters to monitor (there are no anchors in the document to link to, but this is about 75% of the way through, in the section "Optimizing Bulk Load").
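
As a quick complement to those counters, one rough way to see which wait types dominate while a load is running is to poll sys.dm_os_wait_stats from the client. This is only a generic sketch (the connection string is a placeholder), not the guide's own methodology:

    using System;
    using System.Data.SqlClient;

    static void ShowTopWaits(string connectionString)
    {
        const string sql = @"SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count
                             FROM sys.dm_os_wait_stats
                             ORDER BY wait_time_ms DESC;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine("{0,-40} {1,12} ms  ({2} waits)",
                        reader["wait_type"], reader["wait_time_ms"], reader["waiting_tasks_count"]);
                }
            }
        }
    }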

UPDATE

It took me a while to find the link, but this SQLBits talk by Thomas Kejser is also well worth watching - the slides are available if you don't have time to watch the whole thing. It repeats some of the material linked here, but also covers a couple of other suggestions for how to respond when particular performance counters or wait types are running high.

It seems you have already done a lot, but I am not sure whether you have had a chance to study Alberto Ferrari's SqlBulkCopy Performance Analysis report, which describes several factors that affect SqlBulkCopy performance. Many of the things discussed in that paper are still worth trying first.

I am not sure why you are not getting 100% utilization of CPU, IO, or memory. But if you simply want to improve your bulk load speeds, here is something to consider:

  1. Partition your data into separate files. Or, if the rows come from different sources, simply create separate data files.
  2. Then run multiple bulk inserts simultaneously.

Depending on your situation this may not be feasible, but if you can do it, it should improve your load speeds.
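
A rough sketch of what running the bulk inserts in parallel could look like from the client is below - loadFile, the destination table and the batch size are placeholders for your own parsing code and schema:

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.Threading.Tasks;

    static void ParallelBulkLoad(string connectionString,
                                 string[] dataFiles,
                                 Func<string, DataTable> loadFile)
    {
        // One bulk copy per source file, run concurrently.
        Parallel.ForEach(dataFiles, file =>
        {
            DataTable rows = loadFile(file);  // your own parsing code

            // Note: with TableLock, parallel loads into the same heap can run
            // concurrently (bulk update locks are compatible with each other),
            // but loads into an indexed table will serialize - drop the option
            // or load into separate staging tables/partitions in that case.
            using (var bulk = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock))
            {
                bulk.DestinationTableName = "dbo.BulkTarget";
                bulk.BatchSize = 10000;
                bulk.BulkCopyTimeout = 0;
                bulk.WriteToServer(rows);
            }
        });
    }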
