Why are batch inserts faster? Is it because the connection and setup overhead for inserting a single row is the same for a set of rows? What other factors make batch inserts faster?
Why are batch inserts faster?
For numerous reasons, but the major three are these:

1. Fewer network round trips: one request carries many rows instead of one.
2. Less per-statement overhead: the statement is parsed and planned once rather than once per row.
3. Fewer commits: writing many rows in a single transaction amortizes the cost of flushing the log to disk.
Is it because the connection and setup overhead for inserting a single row is the same for a set of rows?
Partially yes, see above.
How do batch updates work?
This depends on the RDBMS.
In Oracle you can transmit all values as a collection and use this collection as a table in a JOIN.
In PostgreSQL and MySQL, you can use the following syntax:
INSERT
INTO mytable
VALUES
(value1),
(value2),
…
You can also prepare a query once and execute it repeatedly in a loop; most client libraries provide methods for this.
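As a sketch of that prepare-once, execute-many pattern, here is a minimal example using Python's built-in sqlite3 module (the table and values are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT)")

# The statement is prepared once; the driver binds each parameter
# tuple and re-executes it, avoiding per-row parsing overhead.
rows = [(1, "a"), (2, "b"), (3, "c")]
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0])  # → 3
```

Other client libraries expose the same idea under different names (e.g. addBatch/executeBatch in JDBC).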
Assuming the table has no uniqueness constraints, insert statements don't really have any effect on other insert statements in the batch. But, during batch updates, an update can alter the state of the table and hence can affect the outcome of other update queries in the batch.
Yes, and you may or may not benefit from this behavior.
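A small illustration of that order dependence, again using Python's sqlite3 (table and values invented for the example): the same two updates applied in different orders leave different final states, whereas inserts into an unconstrained table do not interact this way.

```python
import sqlite3

def run(statements):
    # Apply a batch of statements in order; return the final value.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    for s in statements:
        conn.execute(s)
    return conn.execute("SELECT v FROM t").fetchone()[0]

double = "UPDATE t SET v = v * 2"
add_ten = "UPDATE t SET v = v + 10"

print(run([double, add_ten]))  # (1 * 2) + 10 = 12
print(run([add_ten, double]))  # (1 + 10) * 2 = 22
```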
I know that batch insert queries have a syntax where you have all the insert values in one big query. What do batch update queries look like?
In Oracle, you use a collection in a join:
MERGE
INTO mytable
USING TABLE(:mycol)
ON …
WHEN MATCHED THEN
UPDATE
SET …
In PostgreSQL:
UPDATE mytable
SET s_start = 1
FROM (
VALUES
(value1),
(value2),
…
) q
WHERE …
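To make the pattern concrete, here is a hedged sketch of the same UPDATE … FROM (VALUES …) construct, run through Python's sqlite3 module (SQLite supports this syntax since 3.33; the table and column names are invented, and SQLite names the VALUES columns column1, column2, …):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, s_start INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, 0)", [(1,), (2,), (3,)])

# Join the target table against an inline VALUES list, so a single
# statement updates many rows, each with its own value.
conn.execute("""
    UPDATE mytable
    SET s_start = v.column2
    FROM (VALUES (1, 10), (2, 20)) AS v
    WHERE mytable.id = v.column1
""")
conn.commit()

print(conn.execute("SELECT id, s_start FROM mytable ORDER BY id").fetchall())
```

Rows 1 and 2 get their matched values (10 and 20); row 3 has no match in the VALUES list and stays untouched.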