How to guarantee atomic SQL inserts with subqueries?

Question

Given a simplified table structure like this:

CREATE TABLE t1 (
    id  INT,
    num INT,
    CONSTRAINT t1_pk PRIMARY KEY (id),
    CONSTRAINT t1_uk UNIQUE (id, num)
)

Can I use a subquery like this for inserting records without causing a race condition?

INSERT INTO t1 (
    id,
    num
) VALUES (
    1,
    (
        SELECT MAX(num) + 1
        FROM   t1
    )
)

Or are subqueries not atomic? I'm worried about simultaneous INSERTs grabbing the same value for num and then causing a unique constraint violation.

Answer 1:

Yes, this can most certainly create a race condition. Every statement is guaranteed to be atomic, but atomicity does not require the statement to operate on an unchanging data set throughout the separate parts of its execution.

Suppose a client submits the query above. As long as the engine finds MAX(num) while holding only locks that are compatible with other readers, another client can find the same MAX(num) before the first client's INSERT is performed.
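To make the interleaving concrete, here is a minimal sketch that replays it by hand. It uses Python's standard sqlite3 module with an in-memory database as a stand-in for the real engine, and it assumes a UNIQUE constraint on num alone so the collision is visible. The read and the write are split into separate statements purely to force the ordering that two concurrent sessions could produce on their own.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption).
# num is UNIQUE by itself here so that a duplicate is rejected.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, num INTEGER UNIQUE)")
conn.execute("INSERT INTO t1 (id, num) VALUES (1, 1)")


def read_next_num(db):
    """The subquery half of the statement: compute MAX(num) + 1."""
    return db.execute("SELECT MAX(num) + 1 FROM t1").fetchone()[0]


# Both clients evaluate the subquery before either of them writes.
num_a = read_next_num(conn)
num_b = read_next_num(conn)

# Client A writes first and succeeds.
conn.execute("INSERT INTO t1 (id, num) VALUES (?, ?)", (2, num_a))

# Client B now writes the same num and hits the unique constraint.
try:
    conn.execute("INSERT INTO t1 (id, num) VALUES (?, ?)", (3, num_b))
    collided = False
except sqlite3.IntegrityError:
    collided = True
```

Both clients computed the same value, so the second INSERT is rejected. Without the unique constraint it would have silently stored a duplicate instead.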

There are four ways around this problem that I know of:

Use a sequence. In the INSERT you can just do sequencename.nextval to return the next unique number to be inserted.

SQL> create sequence t1num;

Sequence created.

SQL> select t1num.nextval from dual;

   NEXTVAL
----------
         1

SQL> select t1num.nextval from dual;

   NEXTVAL
----------
         2

Retry on failure. I read a credible article about a very high transactions-per-second system that had a scenario not exactly like this one, but suffering from the same race condition of the INSERT possibly using the wrong value. They found that the highest TPS was achieved by giving num a unique constraint and having the client simply retry whenever the INSERT was rejected for violating that constraint.

Add a locking hint that forces the engine to block other readers until the INSERT is completed. While this may be easy, it may or may not be suitable for high concurrency. If the MAX() is performed with a single seek, and the blocking is not long and does not block many clients, it could be perfectly acceptable.

Use a separate one-row helper table to record the next/most recent value for num. Perform an UPDATE on the helper table, simultaneously pulling out the value, then use that value in a separate INSERT into the main table. To my mind this has the annoyance of not being a single query, and it also has the issue that if a client manages to "reserve" a value of num but then fails for any reason to actually perform the INSERT, a gap can occur in the values of num in the table.
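The retry-on-failure option can be sketched roughly as follows. This is an illustration only, not the code from the article mentioned above: it uses Python's standard sqlite3 module with an in-memory database, retry_on_unique_violation and attempt_insert are hypothetical names, and the "rival" insert merely simulates a concurrent client winning the race once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, num INTEGER UNIQUE)")
conn.execute("INSERT INTO t1 (id, num) VALUES (1, 1)")
conn.commit()


def retry_on_unique_violation(attempt_insert, max_attempts=5):
    """Call attempt_insert until it succeeds; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            attempt_insert()
            return attempt
        except sqlite3.IntegrityError:
            # Lost the race: loop and recompute, or give up on the last try.
            if attempt == max_attempts:
                raise


rival_pending = [True]  # simulation only: one rival beats us exactly once


def attempt_insert():
    num = conn.execute(
        "SELECT COALESCE(MAX(num), 0) + 1 FROM t1"
    ).fetchone()[0]
    if rival_pending:
        # Simulated concurrent client grabs the same num and commits first.
        rival_pending.pop()
        conn.execute("INSERT INTO t1 (id, num) VALUES (?, ?)", (100, num))
        conn.commit()
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO t1 (id, num) VALUES (?, ?)", (2, num))


attempts = retry_on_unique_violation(attempt_insert)
```

The first attempt is rejected by the unique constraint and the second succeeds with the next value. Real code would also want to check which constraint failed, since retrying cannot fix a duplicate id.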
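The helper-table option, including the gap it can leave behind, might look something like this. Again this is a sketch using Python's standard sqlite3 module with an in-memory database. The table t1_counter and the functions reserve_num and insert_row are hypothetical names, and the UPDATE followed by a SELECT in one transaction is a portable substitute for an engine-specific "update and return the value" statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, num INTEGER UNIQUE)")
# Hypothetical one-row helper table holding the most recently issued num.
conn.execute("CREATE TABLE t1_counter (last_num INTEGER NOT NULL)")
conn.execute("INSERT INTO t1_counter (last_num) VALUES (0)")
conn.commit()


def reserve_num(db):
    """Bump the counter and return the value that was just reserved."""
    with db:  # one transaction: the UPDATE takes the write lock first
        db.execute("UPDATE t1_counter SET last_num = last_num + 1")
        return db.execute("SELECT last_num FROM t1_counter").fetchone()[0]


def insert_row(db, row_id):
    """Reserve a num, then use it in a separate INSERT into t1."""
    num = reserve_num(db)
    with db:
        db.execute("INSERT INTO t1 (id, num) VALUES (?, ?)", (row_id, num))
    return num


first = insert_row(conn, 1)
second = insert_row(conn, 2)
skipped = reserve_num(conn)  # reserved, but the INSERT never happens
third = insert_row(conn, 3)

nums = [row[0] for row in conn.execute("SELECT num FROM t1 ORDER BY num")]
```

Because the reserved value in skipped is never inserted, nums ends up with a hole between the second and third rows, which is exactly the gap described above.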

Answer 2:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO t1 (id, num) VALUES (1, (SELECT MAX(num) + 1 FROM t1));
COMMIT;

LOCK TABLE t1 IN EXCLUSIVE MODE;
INSERT INTO t1 (id, num) VALUES (1, (SELECT MAX(num) + 1 FROM t1));
COMMIT;

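Both of the options above (serializable isolation, or an exclusive table lock) cause performance issues for simultaneous processes doing the same operation. But if a guaranteed gapless sequence is a requirement, then this is the cost. Note that under SERIALIZABLE the losing session is not necessarily queued behind the winner; depending on the engine it may be rejected with an error, so the client should still be prepared to retry.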

Answer 3:

DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;

All subqueries should be evaluated as if they were snapshots taken at query start. This works without additional measures in Postgres:

CREATE TABLE hopla
        ( the_id SERIAL NOT NULL PRIMARY KEY
        , tralala varchar
        );

INSERT INTO hopla(tralala)
SELECT 'tralala_' || gs::text
FROM generate_series(1,4) gs
        ;

SELECT * FROM hopla;
INSERT INTO hopla(the_id, tralala)
SELECT mx.mx + row_number() OVER (ORDER BY org.the_id)
        , org.tralala
FROM hopla org
, (SELECT MAX(the_id) AS mx FROM hopla) mx
        ;

SELECT * FROM hopla;

Result/output:

CREATE TABLE
INSERT 0 4
 the_id |  tralala  
--------+-----------
      1 | tralala_1
      2 | tralala_2
      3 | tralala_3
      4 | tralala_4
(4 rows)

INSERT 0 4
 the_id |  tralala  
--------+-----------
      1 | tralala_1
      2 | tralala_2
      3 | tralala_3
      4 | tralala_4
      5 | tralala_1
      6 | tralala_2
      7 | tralala_3
      8 | tralala_4
(8 rows)

Source: https://stackoverflow.com/questions/15750301/how-to-guarantee-atomic-sql-inserts-with-subqueries

Tags: sql, Oracle, race-condition