Why are batch inserts/updates faster? How do batch updates work?

长情又很酷 2020-11-30 21:11

Why are batch inserts faster? Is it because the connection and setup overhead for inserting a single row is the same for a set of rows? What other factors make batch inserts

4 answers
  •  庸人自扰
    2020-11-30 21:51

    In a batch update, the database works against a set of data; in a row-by-row update, it has to run the same command as many times as there are rows. So if you insert a million rows in a batch, the command is sent and processed once, whereas in a row-by-row update it is sent and processed a million times. This is also why you should avoid cursors and correlated subqueries in SQL Server.

    An example of a set-based update in SQL Server:

    update mytable
    set myfield = 'test'
    where myfield is null
    

    This would update all 1 million records where myfield is null in one step. A cursor update (which is how you would update a million rows in a non-batch fashion) would iterate through the rows one at a time and update each one separately.
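To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example runs anywhere). The helper names `make_table`, `update_set_based`, and `update_row_by_row` are hypothetical; the table and column names follow the example above. Each function returns how many statements it issued and how many rows it changed.

```python
import sqlite3

def make_table(n):
    """Create an in-memory table with n rows whose myfield is NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, myfield TEXT)")
    conn.executemany(
        "INSERT INTO mytable (id, myfield) VALUES (?, NULL)",
        ((i,) for i in range(n)),
    )
    conn.commit()
    return conn

def update_set_based(conn):
    """One statement: the engine finds and updates every matching row."""
    cur = conn.execute(
        "UPDATE mytable SET myfield = 'test' WHERE myfield IS NULL"
    )
    conn.commit()
    return 1, cur.rowcount  # (statements issued, rows changed)

def update_row_by_row(conn):
    """Cursor-style: collect the keys, then issue one UPDATE per row."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM mytable WHERE myfield IS NULL"
    )]
    for row_id in ids:
        conn.execute(
            "UPDATE mytable SET myfield = 'test' WHERE id = ?", (row_id,)
        )
    conn.commit()
    return len(ids), len(ids)  # (statements issued, rows changed)
```

Both functions leave the table in the same state; the difference is that the second one sends and processes one statement per row, which is the overhead the answer describes. Against a real server each of those statements would also cost a network round trip.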

    The problem with a batch operation is the size of the batch. If you try to insert or update too many records at once, the database may lock the table for the duration of the process, locking all other users out. So you may need a loop that takes only part of the batch at a time (pretty much any number of rows greater than one will be faster than one row at a time). This is slower than updating, inserting, or deleting the whole batch in one go, but faster than row-by-row operations, and it may be needed in a production environment with many users and little downtime in which nobody is trying to read or update other records in the same table. The right batch size depends greatly on the database structure and on exactly what is happening: tables with triggers or lots of constraints are slower, as are tables with lots of fields, and so require smaller batches.
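The loop described above can be sketched as follows, again using Python's sqlite3 as a runnable stand-in for SQL Server. The function name `update_in_chunks` and the chunk sizes are hypothetical, and SQLite's locking behaves differently from SQL Server's, so this only illustrates the shape of the loop: update a limited number of rows, commit, and repeat until nothing is left. In T-SQL the same pattern is usually written with `UPDATE TOP (n)` inside a `WHILE` loop that checks `@@ROWCOUNT`.

```python
import sqlite3

def make_table(n):
    """Create an in-memory table with n rows whose myfield is NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, myfield TEXT)")
    conn.executemany(
        "INSERT INTO mytable (id, myfield) VALUES (?, NULL)",
        ((i,) for i in range(n)),
    )
    conn.commit()
    return conn

def update_in_chunks(conn, chunk_size):
    """Update NULL rows chunk_size at a time; return the number of chunks."""
    chunks = 0
    while True:
        cur = conn.execute(
            "UPDATE mytable SET myfield = 'test' "
            "WHERE id IN (SELECT id FROM mytable "
            "             WHERE myfield IS NULL LIMIT ?)",
            (chunk_size,),
        )
        # Committing after each chunk ends the transaction, so other
        # sessions get a chance to work between chunks.
        conn.commit()
        if cur.rowcount == 0:  # nothing left to update
            break
        chunks += 1
    return chunks
```

For example, `update_in_chunks(make_table(2500), 1000)` processes the rows in three chunks (1000, 1000, and 500). Each chunk is still a set-based statement, so this stays far cheaper than one statement per row while keeping each transaction short.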
