Return pre-UPDATE column values using SQL only

萌比男神i 2020-11-27 03:25

I posted a related question, but this is another part of my puzzle.

I would like to get the OLD value of a column from a row that was UPDATEd, WITHOUT using triggers.

4 Answers
  • 2020-11-27 03:34

    Problem

    The manual explains:

    The optional RETURNING clause causes UPDATE to compute and return value(s) based on each row actually updated. Any expression using the table's columns, and/or columns of other tables mentioned in FROM, can be computed. The new (post-update) values of the table's columns are used. The syntax of the RETURNING list is identical to that of the output list of SELECT.

    Emphasis mine, on the sentence about the new (post-update) values. In the PostgreSQL versions covered here, there is no way to access the old row in a RETURNING clause. You can work around this restriction with a trigger, with a separate SELECT before the UPDATE wrapped in a transaction (as @Flimzy and @wildplasser commented), or with a CTE (as @MattDiPasquale posted).
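
    The transaction-wrapped workaround could look roughly like the following. This is a minimal sketch, assuming the same tbl table with columns tbl_id and name used in the examples of this answer; the FOR UPDATE lock is an addition to keep the row from changing between the two statements.

    ```sql
    BEGIN;

    -- Read the old values and lock the row so that no concurrent
    -- transaction can change it before our UPDATE runs.
    SELECT tbl_id, name
    FROM   tbl
    WHERE  tbl_id = 3
    FOR    UPDATE;

    -- RETURNING reports the new (post-update) values.
    UPDATE tbl
    SET    name = 'New Guy'
    WHERE  tbl_id = 3
    RETURNING tbl_id, name;

    COMMIT;
    ```

    The client has to combine the two result sets itself, which is the main drawback compared to a single statement.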

    Solution without concurrent writes

    However, what you are trying to achieve works perfectly fine if you join in another instance of the table in the FROM clause:

    UPDATE tbl x
    SET    tbl_id = 23
         , name = 'New Guy'
    FROM   tbl y                -- using the FROM clause
    WHERE  x.tbl_id = y.tbl_id  -- must be UNIQUE NOT NULL
    AND    x.tbl_id = 3
    RETURNING y.tbl_id AS old_id, y.name AS old_name
            , x.tbl_id          , x.name;
    

    Returns:

     old_id | old_name | tbl_id |  name
    --------+----------+--------+---------
      3     | Old Guy  | 23     | New Guy
    

    The column(s) used to self-join must be UNIQUE NOT NULL. In the simple example, the WHERE condition is on the same column tbl_id, but that's just coincidence. Works for any conditions.
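
    To illustrate that the filter is independent of the join column, here is a small sketch. The table definition and sample rows are assumptions made up for illustration; only the column names tbl_id and name come from the example above.

    ```sql
    -- Hypothetical setup matching the example above.
    CREATE TABLE tbl (
       tbl_id int PRIMARY KEY   -- PRIMARY KEY implies UNIQUE NOT NULL
     , name   text
    );
    INSERT INTO tbl VALUES (3, 'Old Guy'), (4, 'Old Gal');

    -- Same self-join, but the filter is on a different column.
    UPDATE tbl x
    SET    name = 'Renamed'
    FROM   tbl y
    WHERE  x.tbl_id = y.tbl_id   -- self-join on the unique key
    AND    y.name LIKE 'Old%'    -- arbitrary filter on pre-update values
    RETURNING y.name AS old_name, x.name AS new_name;
    ```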

    I tested this with PostgreSQL versions from 8.4 to 13.

    It's different for INSERT:

    • INSERT INTO ... FROM SELECT ... RETURNING id mappings

    Solutions with concurrent write load

    There are various ways to avoid race conditions with concurrent write operations on the same rows. (Note that concurrent write operations on unrelated rows are no problem at all.) The simple, slow and sure (but expensive) method is to run the transaction with SERIALIZABLE isolation level:

    BEGIN ISOLATION LEVEL SERIALIZABLE;
    UPDATE ... ;
    COMMIT;
    

    But that's probably overkill. And you need to be prepared to repeat the operation in case of a serialization failure.
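
    Spelled out with the self-join from above, the serializable variant might look like this. It is a sketch, not a complete retry loop: the retry itself has to be driven by the client, which is outside plain SQL.

    ```sql
    BEGIN ISOLATION LEVEL SERIALIZABLE;

    UPDATE tbl x
    SET    name = 'New Guy'
    FROM   tbl y
    WHERE  x.tbl_id = y.tbl_id
    AND    x.tbl_id = 3
    RETURNING y.name AS old_name, x.name;

    COMMIT;

    -- If any statement (including COMMIT) fails with SQLSTATE 40001
    -- (serialization_failure), issue ROLLBACK and rerun the whole
    -- transaction from BEGIN.
    ```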

    Simpler and faster (and just as reliable with concurrent write load) is an explicit lock on the one row to be updated:

    UPDATE tbl x
    SET    tbl_id = 24
         , name = 'New Gal'
    FROM  (SELECT tbl_id, name FROM tbl WHERE tbl_id = 4 FOR UPDATE) y 
    WHERE  x.tbl_id = y.tbl_id
    RETURNING y.tbl_id AS old_id, y.name AS old_name
            , x.tbl_id          , x.name;
    

    Note how the WHERE condition moved to the subquery (again, it can be anything), and only the self-join (on UNIQUE NOT NULL column(s)) remains in the outer query. This guarantees that only rows locked by the inner SELECT are processed. Without the lock, the WHERE conditions might resolve to a different set of rows a moment later.

    See:

    • Atomic UPDATE .. SELECT in Postgres


  • 2020-11-27 03:44

    You can use a SELECT subquery.

    Example: Update a user's email RETURNING the old value.

    1. RETURNING Subquery

      UPDATE users SET email = 'new@gmail.com' WHERE id = 1
      RETURNING (SELECT email FROM users WHERE id = 1);
      
    2. PostgreSQL WITH Query (Common Table Expressions)

      WITH u AS (
          SELECT email FROM users WHERE id = 1
      )
      UPDATE users SET email = 'new@gmail.com' WHERE id = 1
      RETURNING (SELECT email FROM u);
      

      This has worked several times on my local database without fail, but I'm not sure if the SELECT in WITH is guaranteed to consistently execute before the UPDATE since "the sub-statements in WITH are executed concurrently with each other and with the main query."

  • 2020-11-27 03:54

    The CTE variant proposed by @MattDiPasquale should work, too.
    With a CTE at hand, I would be more explicit, though:

    WITH sel AS (
       SELECT tbl_id, name FROM tbl WHERE tbl_id = 3  -- assuming unique tbl_id
       )
    , upd AS (
       UPDATE tbl SET name = 'New Guy' WHERE tbl_id = 3
       RETURNING tbl_id, name
       )
    SELECT s.tbl_id AS old_id, s.name AS old_name
         , u.tbl_id, u.name
    FROM   sel s, upd u;
    

    Without testing I claim this works: SELECT and UPDATE see the same snapshot of the database. The SELECT is bound to return the old values (even if you place the CTE after the CTE with the UPDATE), while the UPDATE returns the new values by definition. Voilà.

    But it will be slower than my first answer.

  • 2020-11-27 03:54

    When faced with this dilemma, I added junk columns to the table and copied the old values into them (which I then return) whenever I update the record. This bloats the table a bit but avoids the need for joins.
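
    A minimal sketch of this approach, assuming the users table from the earlier answer and a hypothetical extra column named old_email:

    ```sql
    -- Hypothetical helper column that keeps the previous value.
    ALTER TABLE users ADD COLUMN old_email text;

    -- Expressions on the right-hand side of SET see the row as it was
    -- before the UPDATE, so old_email receives the pre-update email.
    UPDATE users
    SET    old_email = email
         , email     = 'new@gmail.com'
    WHERE  id = 1
    RETURNING old_email, email;
    ```

    Only the most recent previous value is kept this way, since each UPDATE overwrites the helper column.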
