Insert trigger ends up inserting duplicate rows in partitioned table

Question

I have a partitioned table with what I believe is the appropriate INSERT trigger and a few constraints on it. Somehow, each INSERT statement inserts 2 rows: one into the parent table and one into the appropriate partition.

Briefly, the setup is as follows:

CREATE TABLE foo (
id SERIAL NOT NULL,
d_id INTEGER NOT NULL,
label VARCHAR(4) NOT NULL);

CREATE TABLE foo_0 (CHECK (d_id % 2 = 0)) INHERITS (foo);
CREATE TABLE foo_1 (CHECK (d_id % 2 = 1)) INHERITS (foo);

ALTER TABLE ONLY foo ADD CONSTRAINT foo_pkey PRIMARY KEY (id);
ALTER TABLE ONLY foo_0 ADD CONSTRAINT foo_0_pkey PRIMARY KEY (id);
ALTER TABLE ONLY foo_1 ADD CONSTRAINT foo_1_pkey PRIMARY KEY (id);

ALTER TABLE ONLY foo ADD CONSTRAINT foo_d_id_key UNIQUE (d_id, label);
ALTER TABLE ONLY foo_0 ADD CONSTRAINT foo_0_d_id_key UNIQUE (d_id, label);
ALTER TABLE ONLY foo_1 ADD CONSTRAINT foo_1_d_id_key UNIQUE (d_id, label);

CREATE OR REPLACE FUNCTION foo_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.id IS NULL THEN
       NEW.id := nextval('foo_id_seq');
    END IF;

    EXECUTE 'INSERT INTO foo_' || (NEW.d_id % 2) || ' SELECT $1.*' USING NEW;
    RETURN NEW;
END
$$
LANGUAGE plpgsql;

CREATE TRIGGER insert_foo_trigger
    BEFORE INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE foo_insert_trigger();

After further debugging I isolated the cause: the INSERT trigger returns NEW rather than NULL. However, I do want my INSERT statements to return the auto-incremented id, and if the trigger returns NULL they no longer do.

What's the solution? Why does returning NEW cause this seemingly "strange" behavior?
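To make the duplication visible, a query along these lines can show which physical table holds each copy. This is only a sketch that assumes the tables and trigger defined above; the values 2 and 'a' are arbitrary sample data.

```sql
-- Assumes foo, foo_0, foo_1 and insert_foo_trigger exist as defined above.
INSERT INTO foo (d_id, label) VALUES (2, 'a');

-- tableoid::regclass names the table that physically stores each row.
-- Querying the parent without ONLY also scans the child tables.
SELECT tableoid::regclass AS stored_in, id, d_id, label
FROM foo
WHERE d_id = 2;

-- Rows stored in the parent itself, excluding the partitions.
SELECT id, d_id, label FROM ONLY foo;
```

With the trigger returning NEW, the first SELECT should list the same id twice, once stored in foo and once in foo_0.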

UPDATE #1

I now understand why the rows are inserted twice; the trigger documentation makes it clear:

Trigger functions invoked by per-statement triggers should always return NULL. Trigger functions invoked by per-row triggers can return a table row (a value of type HeapTuple) to the calling executor, if they choose. A row-level trigger fired before an operation has the following choices:

It can return NULL to skip the operation for the current row. This instructs the executor to not perform the row-level operation that invoked the trigger (the insertion, modification, or deletion of a particular table row).

For row-level INSERT and UPDATE triggers only, the returned row becomes the row that will be inserted or will replace the row being updated. This allows the trigger function to modify the row being inserted or updated.

My question remains, though: how can the trigger avoid returning NEW while the INSERT still reports the auto-incremented id, or a correct ROW_COUNT?

UPDATE #2

I found a solution, but I hope there is a better one. Basically, you can add an AFTER INSERT trigger that deletes the row inserted into the parent table. This seems horribly inefficient, so if anyone has a better solution, please post it!

For reference, the solution is:

CREATE TRIGGER insert_foo_trigger
    BEFORE INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE foo_insert_trigger();


CREATE OR REPLACE FUNCTION foo_delete_master() 
RETURNS TRIGGER AS $$
BEGIN
    DELETE FROM ONLY foo WHERE id = NEW.id;
    RETURN NEW;
END
$$
LANGUAGE plpgsql;

CREATE TRIGGER after_insert_foo_trigger
    AFTER INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE foo_delete_master();
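A quick check of this workaround might look like the following. It is a sketch that assumes both triggers above are installed on foo; the values 4 and 'c' are arbitrary sample data.

```sql
-- Assumes insert_foo_trigger and after_insert_foo_trigger are both installed on foo.
-- RETURNING still works because the BEFORE trigger returns NEW.
INSERT INTO foo (d_id, label) VALUES (4, 'c') RETURNING id;

-- The AFTER trigger should have removed the copy stored in the parent.
SELECT count(*) AS rows_in_parent FROM ONLY foo;

-- The copy written by the BEFORE trigger should remain in the partition.
SELECT id, d_id, label FROM ONLY foo_0 WHERE d_id = 4;
```

The cost is one extra insert and one delete on the parent per row, which is the inefficiency mentioned above.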

Answer 1:

A simpler way is to create a stored function instead of the triggers, for example add_foo( [parameters] ), which decides which partition the row belongs in, inserts it there, and returns the id (or the whole new record, including the id). For example:

CREATE OR REPLACE FUNCTION add_foo(
    _d_id   INTEGER
,   _label  VARCHAR(4)
) RETURNS BIGINT AS $$
DECLARE
    _rec    foo%ROWTYPE;
BEGIN
    _rec.id := nextval('foo_id_seq');
    _rec.d_id := _d_id;
    _rec.label := _label;
    EXECUTE 'INSERT INTO foo_' || ( _d_id % 2 ) || ' SELECT $1.*' USING _rec;
    RETURN _rec.id;
END $$ LANGUAGE plpgsql;
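Callers would then use the function instead of a plain INSERT. A minimal usage sketch, assuming add_foo is defined as above and using arbitrary sample values:

```sql
-- Assumes add_foo() from this answer and the foo tables from the question.
-- The function returns the id drawn from foo_id_seq.
SELECT add_foo(3, 'abc') AS new_id;

-- d_id = 3 is odd, so the row should be stored in foo_1 only.
SELECT tableoid::regclass AS stored_in, id, d_id, label
FROM foo
WHERE d_id = 3;
```

The trade-off is that every writer has to call add_foo; a direct INSERT INTO foo would bypass the routing and land in the parent table.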

Answer 2:

Another solution to this problem is offered by this question: Postgres trigger-based insert redirection without breaking RETURNING

In summary, create a view over your table and attach an INSTEAD OF INSERT trigger to the view. The trigger handles the insert itself, and because it can still return NEW, RETURNING keeps working.

Untested code, but you get the idea:

CREATE TABLE foo_base (
  id SERIAL NOT NULL,
  d_id INTEGER NOT NULL,
  label VARCHAR(4) NOT NULL
);

CREATE OR REPLACE VIEW foo AS SELECT * FROM foo_base;

CREATE TABLE foo_0 (CHECK (d_id % 2 = 0)) INHERITS (foo_base);
CREATE TABLE foo_1 (CHECK (d_id % 2 = 1)) INHERITS (foo_base);

ALTER TABLE ONLY foo_base ADD CONSTRAINT foo_base_pkey PRIMARY KEY (id);
ALTER TABLE ONLY foo_0 ADD CONSTRAINT foo_0_pkey PRIMARY KEY (id);
ALTER TABLE ONLY foo_1 ADD CONSTRAINT foo_1_pkey PRIMARY KEY (id);

ALTER TABLE ONLY foo_base ADD CONSTRAINT foo_base_d_id_key UNIQUE (d_id, label);
ALTER TABLE ONLY foo_0 ADD CONSTRAINT foo_0_d_id_key UNIQUE (d_id, label);
ALTER TABLE ONLY foo_1 ADD CONSTRAINT foo_1_d_id_key UNIQUE (d_id, label);

CREATE OR REPLACE FUNCTION foo_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.id IS NULL THEN
       NEW.id := nextval('foo_base_id_seq');
    END IF;

    EXECUTE 'INSERT INTO foo_' || (NEW.d_id % 2) || ' SELECT $1.*' USING NEW;
    RETURN NEW;
END
$$
LANGUAGE plpgsql;

CREATE TRIGGER insert_foo_trigger
    INSTEAD OF INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE foo_insert_trigger();
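With that in place, inserts go through the view. A usage sketch, equally untested, that assumes the objects defined above and uses arbitrary sample values:

```sql
-- Assumes the view foo, foo_base, its partitions and the INSTEAD OF trigger above.
-- The row returned by the trigger (NEW) feeds the RETURNING clause.
INSERT INTO foo (d_id, label) VALUES (5, 'xyz') RETURNING id, d_id, label;

-- d_id = 5 is odd, so the only stored copy should be in foo_1.
SELECT tableoid::regclass AS stored_in, id, d_id, label
FROM foo_base
WHERE d_id = 5;
```

Since an INSTEAD OF trigger replaces the insert on the view entirely, nothing is written to foo_base itself, so no cleanup trigger is needed.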

Source: https://stackoverflow.com/questions/22026354/insert-trigger-ends-up-inserting-duplicate-rows-in-partitioned-table

Tags: sql, postgresql, postgresql-9.1