Postgres UPSERT (INSERT or UPDATE) only if value is different

青春壹個敷衍的年華 submitted on 2019-12-03 12:49:50

Question


I'm updating a Postgres 8.4 database (from C# code) and the basic task is simple enough: either UPDATE an existing row or INSERT a new one if one doesn't exist yet. Normally I would do this:

UPDATE my_table
SET value1 = :newvalue1, ..., updated_time = now(), updated_username = 'evgeny'
WHERE criteria1 = :criteria1 AND criteria2 = :criteria2

and if 0 rows were affected then do an INSERT:

INSERT INTO my_table(criteria1, criteria2, value1, ...)
VALUES (:criteria1, :criteria2, :newvalue1, ...)

There is a slight twist, though. I don't want to change the updated_time and updated_username columns unless at least one of the new values actually differs from the existing value, so that users are not misled about when the data was last updated.

If I was only doing an UPDATE then I could add WHERE conditions for the values as well, but that won't work here, because if the DB is already up to date the UPDATE will affect 0 rows and then I would try to INSERT.

Can anyone think of an elegant way to do this, other than SELECT, then either UPDATE or INSERT?


Answer 1:


Take a look at a BEFORE UPDATE trigger to check and set the correct values:

CREATE OR REPLACE FUNCTION my_trigger() RETURNS TRIGGER LANGUAGE plpgsql AS
$$
BEGIN
    IF NEW.content IS NOT DISTINCT FROM OLD.content THEN  -- NULL-safe "unchanged" test
        NEW.updated_time := OLD.updated_time; -- use the old value, not a new one.
    ELSE
        NEW.updated_time := NOW();
    END IF;
    RETURN NEW;
END;
$$;

Now you don't even have to mention the updated_time field in your UPDATE query; the trigger handles it.

http://www.postgresql.org/docs/current/interactive/plpgsql-trigger.html
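The function only takes effect once it is attached to a table. A minimal sketch, assuming a hypothetical table trigger_demo that has the content and updated_time columns the function refers to (the table and trigger names are made up for illustration):

```sql
-- Hypothetical table with the two columns the trigger function refers to.
CREATE TABLE trigger_demo (
    id           integer PRIMARY KEY,
    content      text,
    updated_time timestamp NOT NULL DEFAULT now()
);

-- Run my_trigger() for every row before it is updated.
CREATE TRIGGER trigger_demo_updated_time
BEFORE UPDATE ON trigger_demo
FOR EACH ROW EXECUTE PROCEDURE my_trigger();

INSERT INTO trigger_demo (id, content) VALUES (1, 'first');

UPDATE trigger_demo SET content = 'first'  WHERE id = 1;  -- same value: updated_time is kept
UPDATE trigger_demo SET content = 'second' WHERE id = 1;  -- new value: updated_time is set to now()
```

Note that the row is still rewritten in the first case; the trigger only preserves the timestamp.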




Answer 2:


Two things here. Firstly, depending on activity levels in your database, you may hit a race condition between checking for a record and inserting it: another process may create that record in the interim. The manual contains an example of how to handle this.

Secondly, to avoid doing a redundant update there is the suppress_redundant_updates_trigger() procedure. To use it the way you want, you would need two BEFORE UPDATE triggers: the first calls suppress_redundant_updates_trigger() to abort the update if nothing changed, and the second sets the timestamp and username if the update goes ahead. Triggers on the same event fire in alphabetical order by name. Doing this would also mean changing the code in the manual's example to try the INSERT first, before the UPDATE.

Example of how suppress update works:

    DROP TABLE IF EXISTS sru_test;

    CREATE TABLE sru_test(id integer not null primary key,
    data text,
    updated timestamp(3));

    CREATE TRIGGER z_min_update
    BEFORE UPDATE ON sru_test
    FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();

    DROP FUNCTION IF EXISTS set_updated();

    CREATE FUNCTION set_updated()
    RETURNS TRIGGER
    AS $$
    DECLARE
    BEGIN
        NEW.updated := now();
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER zz_set_updated
    BEFORE INSERT OR UPDATE ON sru_test
    FOR EACH ROW EXECUTE PROCEDURE  set_updated();

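    insert into sru_test(id,data) VALUES (1,'Data 1');
    insert into sru_test(id,data) VALUES (2,'Data 2');

    select * from sru_test;

    -- Both rows change, so both get a new "updated" value.
    update sru_test set data = 'NEW';

    select * from sru_test;

    -- Nothing changes: the update is suppressed and "updated" stays as it was.
    update sru_test set data = 'NEW';

    select * from sru_test;

    -- Row 1 changes, so only its "updated" value moves.
    update sru_test set data = 'ALTERED'  where id = 1;

    select * from sru_test;

    -- Row 2 already holds 'NEW': suppressed again.
    update sru_test set data = 'NEW' where id = 2;

    select * from sru_test;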
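The insert-first variant mentioned above could look roughly like this. It is only a sketch: sru_upsert is a hypothetical helper name, it relies on the sru_test table and the two triggers from the example, and it does not retry if the row is deleted by another session between the failed INSERT and the UPDATE.

```sql
-- Hypothetical helper; relies on the sru_test table and triggers defined above.
CREATE FUNCTION sru_upsert(_id integer, _data text) RETURNS void AS
$$
BEGIN
    BEGIN
        -- Try the INSERT first; zz_set_updated stamps the new row.
        INSERT INTO sru_test(id, data) VALUES (_id, _data);
        RETURN;
    EXCEPTION WHEN unique_violation THEN
        NULL;  -- the row already exists, fall through to the UPDATE
    END;

    -- z_min_update silently skips this UPDATE when data is unchanged,
    -- so "updated" keeps its old value in that case.
    UPDATE sru_test SET data = _data WHERE id = _id;
END;
$$ LANGUAGE plpgsql;

SELECT sru_upsert(3, 'Data 3');   -- inserts a new row
SELECT sru_upsert(3, 'Data 3');   -- existing row, same data: update suppressed
SELECT sru_upsert(3, 'Changed');  -- existing row, new data: "updated" is refreshed
```

Trying the INSERT first matters here because a suppressed UPDATE reports 0 affected rows, which would otherwise be indistinguishable from "row does not exist".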



Answer 3:


Postgres is getting UPSERT support. It has been in the tree since 8 May 2015 (commit):

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted.

A snapshot is available for download. It has not yet made it into a release.
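The feature described in that commit shipped as INSERT ... ON CONFLICT in PostgreSQL 9.5. Applied to the question, a sketch might look like the following; upsert_demo is a hypothetical table that mirrors the question's column names, and the statement will not run on 8.4.

```sql
-- Hypothetical demo table; the primary key is what ON CONFLICT targets.
CREATE TABLE upsert_demo (
    criteria1        text NOT NULL,
    criteria2        text NOT NULL,
    value1           text,
    updated_time     timestamp NOT NULL DEFAULT now(),
    updated_username text,
    PRIMARY KEY (criteria1, criteria2)
);

INSERT INTO upsert_demo AS t (criteria1, criteria2, value1, updated_username)
VALUES ('C1', 'C2', 'new value', 'evgeny')
ON CONFLICT (criteria1, criteria2) DO UPDATE
SET value1           = EXCLUDED.value1,
    updated_time     = now(),
    updated_username = EXCLUDED.updated_username
-- Skip the update (and leave the audit columns alone) when nothing changed.
WHERE t.value1 IS DISTINCT FROM EXCLUDED.value1;
```

Running the INSERT a second time with the same values leaves updated_time and updated_username untouched, because the WHERE clause filters out the conflicting row.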




Answer 4:


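The RETURNING clause lets you chain your queries: the second query uses the results of the first (in this case, to avoid touching the same rows again). RETURNING itself has been available since PostgreSQL 8.2, but using it inside a WITH clause, as done here, needs data-modifying CTEs, which arrived in PostgreSQL 9.1.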

Shown here embedded in a function, but it works as plain SQL, too:

DROP SCHEMA IF EXISTS tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;

CREATE TABLE my_table
        ( updated_time timestamp NOT NULL DEFAULT now()
        , updated_username varchar DEFAULT '_none_'
        , criteria1 varchar NOT NULL
        , criteria2 varchar NOT NULL
        , value1 varchar
        , value2 varchar
        , PRIMARY KEY (criteria1,criteria2)
        );

INSERT INTO  my_table (criteria1,criteria2,value1,value2)
SELECT 'C1_' || gs::text
        , 'C2_' || gs::text
        , 'V1_' || gs::text
        , 'V2_' || gs::text
FROM generate_series(1,10) gs
        ;

SELECT * FROM my_table ;

CREATE function funky(_criteria1 text,_criteria2 text, _newvalue1 text, _newvalue2 text)
RETURNS VOID
AS $funk$
WITH ins AS (
        INSERT INTO my_table(criteria1, criteria2, value1, value2, updated_username)
        SELECT $1, $2, $3, $4, COALESCE(current_user, 'evgeny' )
        WHERE NOT EXISTS (
                SELECT * FROM my_table nx
                WHERE nx.criteria1 = $1 AND nx.criteria2 = $2
                )
        RETURNING criteria1 AS criteria1, criteria2 AS criteria2
        )
        UPDATE my_table upd
        SET value1 = $3, value2 = $4
        , updated_time = now()
        , updated_username = COALESCE(current_user, 'evgeny')
        WHERE 1=1
        AND criteria1 = $1 AND criteria2 = $2 -- key-condition
        AND (value1 IS DISTINCT FROM $3 OR value2 IS DISTINCT FROM $4) -- row must have changed (NULL-safe)
        AND NOT EXISTS (
                SELECT * FROM ins -- the result from the INSERT
                WHERE ins.criteria1 = upd.criteria1
                AND ins.criteria2 = upd.criteria2
                )
        ;
$funk$ language sql
        ;

SELECT funky('AA', 'BB' , 'CC', 'DD' );            -- INSERT
SELECT funky('C1_3', 'C2_3' , 'V1_3', 'V2_3' );    -- no-op UPDATE (values unchanged)
SELECT funky('C1_7', 'C2_7' , 'V1_7', 'V2_7777' ); -- real UPDATE (value2 changed)

SELECT * FROM my_table ;

RESULT:

        updated_time        | updated_username | criteria1 | criteria2 | value1 | value2  
----------------------------+------------------+-----------+-----------+--------+---------
 2013-03-13 16:37:55.405267 | _none_           | C1_1      | C2_1      | V1_1   | V2_1
 2013-03-13 16:37:55.405267 | _none_           | C1_2      | C2_2      | V1_2   | V2_2
 2013-03-13 16:37:55.405267 | _none_           | C1_3      | C2_3      | V1_3   | V2_3
 2013-03-13 16:37:55.405267 | _none_           | C1_4      | C2_4      | V1_4   | V2_4
 2013-03-13 16:37:55.405267 | _none_           | C1_5      | C2_5      | V1_5   | V2_5
 2013-03-13 16:37:55.405267 | _none_           | C1_6      | C2_6      | V1_6   | V2_6
 2013-03-13 16:37:55.405267 | _none_           | C1_8      | C2_8      | V1_8   | V2_8
 2013-03-13 16:37:55.405267 | _none_           | C1_9      | C2_9      | V1_9   | V2_9
 2013-03-13 16:37:55.405267 | _none_           | C1_10     | C2_10     | V1_10  | V2_10
 2013-03-13 16:37:55.463651 | postgres         | AA        | BB        | CC     | DD
 2013-03-13 16:37:55.472783 | postgres         | C1_7      | C2_7      | V1_7   | V2_7777
(11 rows)



Answer 5:


Start a transaction. Use a SELECT to check whether the row you'd be inserting already exists. If it exists with the same values, do nothing; if it exists with different values, UPDATE it; if it does not exist, INSERT it. Finally, commit the transaction.
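As a rough sketch of that flow, wrapped in a PL/pgSQL function so that it runs inside a single transaction. The table select_first_demo and the function select_then_write are hypothetical names, and two sessions inserting the same new key at the same time can still collide on the primary key.

```sql
-- Hypothetical table mirroring the question's column names.
CREATE TABLE select_first_demo (
    criteria1        text NOT NULL,
    criteria2        text NOT NULL,
    value1           text,
    updated_time     timestamp NOT NULL DEFAULT now(),
    updated_username text,
    PRIMARY KEY (criteria1, criteria2)
);

CREATE FUNCTION select_then_write(_c1 text, _c2 text, _v1 text) RETURNS void AS
$$
DECLARE
    _old_value text;
BEGIN
    -- Lock the row (if any) so it cannot change between the check and the write.
    SELECT value1 INTO _old_value
    FROM select_first_demo
    WHERE criteria1 = _c1 AND criteria2 = _c2
    FOR UPDATE;

    IF NOT FOUND THEN
        INSERT INTO select_first_demo (criteria1, criteria2, value1, updated_username)
        VALUES (_c1, _c2, _v1, 'evgeny');
    ELSIF _old_value IS DISTINCT FROM _v1 THEN
        UPDATE select_first_demo
        SET value1 = _v1, updated_time = now(), updated_username = 'evgeny'
        WHERE criteria1 = _c1 AND criteria2 = _c2;
    END IF;  -- otherwise the row is already up to date: do nothing
END;
$$ LANGUAGE plpgsql;

SELECT select_then_write('C1', 'C2', 'first');   -- INSERT
SELECT select_then_write('C1', 'C2', 'first');   -- nothing to do
SELECT select_then_write('C1', 'C2', 'second');  -- UPDATE
```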



Source: https://stackoverflow.com/questions/3464750/postgres-upsert-insert-or-update-only-if-value-is-different
