Get Id from a conditional INSERT

Submitted by 拜拜、爱过 on 2019-11-30 08:31:50

Question


For a table like this one:

CREATE TABLE Users(
    id SERIAL PRIMARY KEY,
    name TEXT UNIQUE
);

What would be the correct one-query insert for the following operation:

Given a user name, insert a new record and return the new id. But if the name already exists, just return the id.

I am aware of the new syntax within PostgreSQL 9.5 for ON CONFLICT(column) DO UPDATE/NOTHING, but I can't figure out how, if at all, it can help, given that I need the id to be returned.

It seems that RETURNING id and ON CONFLICT do not belong together.


Answer 1:


The UPSERT implementation is hugely complex because it has to be safe against concurrent write access. Take a look at the Postgres Wiki page that served as a log during initial development. The Postgres hackers decided not to include "excluded" rows in the RETURNING clause for the first release in Postgres 9.5. They might build something in for the next release.

This is the crucial statement in the manual to explain your situation:

The syntax of the RETURNING list is identical to that of the output list of SELECT. Only rows that were successfully inserted or updated will be returned. For example, if a row was locked but not updated because an ON CONFLICT DO UPDATE ... WHERE clause condition was not satisfied, the row will not be returned.

The key sentence here is the second one: only rows that were successfully inserted or updated are returned.
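
To see that behavior in isolation, here is a minimal sketch against the Users table from the question. The name 'alice' is only an illustrative value. Run twice, the same statement returns an id only the first time:

```sql
-- First run: the name is new, the row is inserted,
-- and RETURNING yields the new id.
INSERT INTO users(name)
VALUES ('alice')                   -- illustrative value, not from the question
ON     CONFLICT (name) DO NOTHING
RETURNING id;

-- Second run: the name already exists, so nothing is inserted or updated
-- and RETURNING yields zero rows (not the existing id).
INSERT INTO users(name)
VALUES ('alice')
ON     CONFLICT (name) DO NOTHING
RETURNING id;
```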

For a single row to insert:

WITH ins AS (
   INSERT INTO users(name)
   VALUES ('new_usr_name')         -- input value
   ON     CONFLICT(name) DO UPDATE
   SET    name = NULL WHERE FALSE  -- never executed, just to lock row
   RETURNING users.id
   )
SELECT id FROM ins
UNION  ALL
SELECT id FROM users          -- 2nd SELECT never executed if INSERT successful
WHERE  name = 'new_usr_name'  -- input value a 2nd time
LIMIT  1;

Or wrap it in a function, so that the new name only has to be provided once, as demonstrated here (also consider the explanation for LIMIT 1):

  • Is SELECT or INSERT in a function prone to race conditions?
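
A minimal sketch of such a wrapper, not the function from the linked answer. It simply puts the single-row query above into a plain SQL function. The function name f_user_id and the parameter name _name are made up for illustration, and the return type assumes the SERIAL (integer) id from the question:

```sql
-- Hypothetical helper: f_user_id and _name are illustrative names.
CREATE OR REPLACE FUNCTION f_user_id(_name text)
  RETURNS int
  LANGUAGE sql AS
$func$
WITH ins AS (
   INSERT INTO users(name)
   VALUES (_name)                   -- input value, passed only once
   ON     CONFLICT (name) DO UPDATE
   SET    name = NULL WHERE FALSE   -- never executed because of WHERE FALSE
   RETURNING users.id
   )
SELECT id FROM ins
UNION  ALL
SELECT id FROM users                -- returns the existing id on conflict
WHERE  name = _name
LIMIT  1;
$func$;
```

Call it like this:

```sql
SELECT f_user_id('new_usr_name');
```

The race described in this answer still applies to the wrapped query. The function only removes the need to repeat the input value.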

The possible race: a concurrent transaction might change / remove the existing row between the INSERT attempt and the SELECT. Highly unlikely, but possible.

If you don't have (possible) concurrent write access (or just don't care), simplify:

...
ON     CONFLICT(name) DO NOTHING
...
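
Spelled out in full, on the assumption that the elided lines stand for the rest of the single-row query above, the simplified variant might look like this:

```sql
WITH ins AS (
   INSERT INTO users(name)
   VALUES ('new_usr_name')          -- input value
   ON     CONFLICT (name) DO NOTHING
   RETURNING users.id
   )
SELECT id FROM ins
UNION  ALL
SELECT id FROM users                -- returns the existing id on conflict
WHERE  name = 'new_usr_name'        -- input value a 2nd time
LIMIT  1;
```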

To insert a set of rows:

  • How to include excluded rows in RETURNING from INSERT ... ON CONFLICT
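
As a rough sketch of the multi-row case (not copied from the linked answer), the same idea can be extended by joining the input rows back to the table. The CTE name input_rows and the sample values are assumptions for illustration. The sketch also assumes distinct input names and no concurrent writes to the same names:

```sql
WITH input_rows(name) AS (
   VALUES ('foo'), ('bar'), ('baz')   -- illustrative input values
   )
, ins AS (
   INSERT INTO users(name)
   SELECT name FROM input_rows
   ON     CONFLICT (name) DO NOTHING
   RETURNING id, name                 -- only the newly inserted rows
   )
SELECT id, name FROM ins
UNION  ALL
SELECT u.id, u.name                   -- rows that existed before the statement
FROM   input_rows
JOIN   users u USING (name);
```

The second SELECT does not see the rows inserted by the same statement, so each name shows up only once in the result.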



Answer 2:


For a single row insert and no update:

with i as (
    insert into users (name)
    select 'the name'
    where not exists (
        select 1
        from users
        where name = 'the name'
    )
    returning id
)
select id
from users
where name = 'the name'

union all

select id from i

The manual on the primary query and the WITH subqueries:

The primary query and the WITH queries are all (notionally) executed at the same time

Although that sounds like "same snapshot" to me, I'm not sure, since I don't know what "notionally" means in that context.

But there is also:

The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot

If I understand correctly, that "same snapshot" bit prevents a race condition. But again, I'm not sure whether "all the statements" refers only to the statements in the WITH subqueries, excluding the main query. To avoid any doubt, move the SELECT from the previous query into a WITH subquery:

with s as (
    select id
    from users
    where name = 'the name'
), i as (
    insert into users (name)
    select 'the name'
    where not exists (select 1 from s)
    returning id
)
select id from s
union all
select id from i


Source: https://stackoverflow.com/questions/36083669/get-id-from-a-conditional-insert
