PostgreSQL multiple nullable columns in unique constraint

我只是一个虾纸丫 submitted on 2019-12-01 15:05:42
Erwin Brandstetter

You are striving for compatibility with your existing Oracle and SQL Server implementations.
Here is a presentation comparing physical row storage formats of the three involved RDBMS.

Since Oracle does not implement NULL values at all in row storage, it can't tell the difference between an empty string and NULL anyway. So wouldn't it be prudent to use empty strings ('') instead of NULL values in Postgres as well - for this particular use case?

Define columns included in the unique constraint as NOT NULL DEFAULT '', problem solved:

CREATE TABLE example (
   example_id serial PRIMARY KEY
 , field1 text NOT NULL DEFAULT ''
 , field2 text NOT NULL DEFAULT ''
 , field3 text NOT NULL DEFAULT ''
 , field4 text NOT NULL DEFAULT ''
 , field5 text NOT NULL DEFAULT ''
 , CONSTRAINT example_index UNIQUE (field1, field2, field3, field4, field5)
);

Notes

  • What you demonstrate in the question is a unique index:

    CREATE UNIQUE INDEX ...
    

    not the unique constraint you keep talking about. There are subtle, important differences!

    I changed that to an actual constraint like you made it the subject of the post.

  • The keyword ASC is just noise, since that is the default sort order. I left it out.

  • I use a serial PK column for simplicity. That is totally optional, but typically better than numbers stored as text.

Working with it

Just omit empty / null fields from the INSERT:

INSERT INTO example(field1) VALUES ('F1_DATA');
INSERT INTO example(field1, field2, field5) VALUES ('F1_DATA', 'F2_DATA', 'F5_DATA');

Repeating any of these inserts would violate the unique constraint.
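For contrast, here is a minimal sketch of what happens when the key columns are left nullable. The table name example_nullable_demo and the constraint name are hypothetical, made up for illustration, and the sketch assumes the default NULL handling of unique constraints in Postgres:

```sql
-- Hypothetical contrast table: same idea, but the key columns allow NULL.
CREATE TABLE example_nullable_demo (
   example_id serial PRIMARY KEY
 , field1 text
 , field2 text
 , CONSTRAINT example_nullable_demo_uni UNIQUE (field1, field2)
);

-- field2 is NULL in both rows. NULL values are not considered equal
-- by the unique constraint, so the second insert is accepted as well.
INSERT INTO example_nullable_demo(field1) VALUES ('F1_DATA');
INSERT INTO example_nullable_demo(field1) VALUES ('F1_DATA');
```

With NOT NULL DEFAULT '' on the key columns, as in the table above, the second insert would be rejected instead.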

Or, if you insist on omitting target columns (which is a bit of an antipattern in persisted INSERT statements), or for bulk inserts where all columns need to be listed:

INSERT INTO example VALUES
  ('1', 'F1_DATA', DEFAULT, DEFAULT, DEFAULT, DEFAULT)
, ('2', 'F1_DATA','F2_DATA', DEFAULT, DEFAULT,'F5_DATA');

Or simply:

INSERT INTO example VALUES
  ('1', 'F1_DATA', '', '', '', '')
, ('2', 'F1_DATA','F2_DATA', '', '','F5_DATA');

Or you can write a trigger BEFORE INSERT OR UPDATE that converts NULL to ''.
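A minimal sketch of such a trigger, assuming the example table defined above. The name example_null_to_empty is hypothetical:

```sql
-- Hypothetical trigger function: replace NULL with '' in the key columns.
CREATE FUNCTION example_null_to_empty() RETURNS trigger AS $func$
BEGIN
   NEW.field1 := COALESCE(NEW.field1, '');
   NEW.field2 := COALESCE(NEW.field2, '');
   NEW.field3 := COALESCE(NEW.field3, '');
   NEW.field4 := COALESCE(NEW.field4, '');
   NEW.field5 := COALESCE(NEW.field5, '');
   RETURN NEW;  -- the modified row is what gets stored
END
$func$ LANGUAGE plpgsql;

-- BEFORE row triggers fire before the NOT NULL constraints are checked,
-- so an explicit NULL is converted in time.
CREATE TRIGGER example_null_to_empty
BEFORE INSERT OR UPDATE ON example
FOR EACH ROW EXECUTE PROCEDURE example_null_to_empty();
```

With this in place, an insert like INSERT INTO example(field1, field2) VALUES ('F1_DATA', NULL) should store '' in field2 instead of failing on the NOT NULL constraint.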

Alternative solutions

If you need to use actual NULL values I would suggest the unique index with COALESCE like you mentioned as option (2) and @wildplasser provided as his last example.

The index on an array like @Rudolfo presented is simple, but considerably more expensive. Array handling isn't very cheap in Postgres and there is an array overhead similar to that of a row (24 bytes).

Arrays are limited to columns of the same data type. You could cast all columns to text if some are not, but it will typically further increase storage requirements. Or you could use a well-known row type for heterogeneous data types ...

A corner case: array (or row) types with all NULL values are considered equal (!), so there can only be 1 row with all involved columns NULL. May or may not be as desired. If you want to disallow all columns NULL:
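One possible way to do that is a CHECK constraint over the key columns. This is only a sketch: the table name example_not_all_null and the constraint name are hypothetical, and it assumes all key columns share the type text so that they can go into a single COALESCE call:

```sql
-- Hypothetical table with nullable key columns.
CREATE TABLE example_not_all_null (
   example_id serial PRIMARY KEY
 , field1 text
 , field2 text
 , field3 text
 , field4 text
 , field5 text
   -- COALESCE returns the first non-NULL argument,
   -- so it is NULL only if every key column is NULL.
 , CONSTRAINT example_not_all_null_chk
   CHECK (COALESCE(field1, field2, field3, field4, field5) IS NOT NULL)
);
```

A row with at least one non-NULL key column passes the check; a row where all five are NULL is rejected.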

Third method: use IS NOT DISTINCT FROM instead of = for comparing the key columns. (This could make use of the existing index on the candidate natural key.) Example (look at the last column):

SELECT *
     , EXISTS (
         SELECT * FROM example x
         WHERE x.FIELD1 IS NOT DISTINCT FROM e.FIELD1
         AND   x.FIELD2 IS NOT DISTINCT FROM e.FIELD2
         AND   x.FIELD3 IS NOT DISTINCT FROM e.FIELD3
         AND   x.FIELD4 IS NOT DISTINCT FROM e.FIELD4
         AND   x.FIELD5 IS NOT DISTINCT FROM e.FIELD5
         AND   x.ID <> e.ID
       ) AS other_exists
FROM   example e;

Next step would be to put that into a trigger function, and put a trigger on it. (don't have the time now, maybe later)


And here is the trigger-function (which is not perfect yet, but appears to work):


CREATE FUNCTION example_check() RETURNS trigger AS $func$
BEGIN
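    -- Reject the row if another row has the same key columns (NULLs compare as equal).
    -- Note: this check is not safe against concurrent transactions without extra locking.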
    IF EXISTS (
     SELECT 666 FROM example x
     WHERE x.FIELD1 IS NOT DISTINCT FROM NEW.FIELD1
     AND x.FIELD2 IS NOT DISTINCT FROM NEW.FIELD2
     AND x.FIELD3 IS NOT DISTINCT FROM NEW.FIELD3
     AND x.FIELD4 IS NOT DISTINCT FROM NEW.FIELD4
     AND x.FIELD5 IS NOT DISTINCT FROM NEW.FIELD5
     AND x.ID <> NEW.ID
            ) THEN
        RAISE EXCEPTION 'MultiLul BV';
    END IF;


    RETURN NEW;
END;
$func$ LANGUAGE plpgsql;

CREATE TRIGGER example_check BEFORE INSERT OR UPDATE ON example
  FOR EACH ROW EXECUTE PROCEDURE example_check();

UPDATE: a unique index can sometimes be wrapped into a constraint (see the postgres-9.4 docs, final example). You do need to invent a sentinel value; I used the empty string '' here.


CREATE UNIQUE INDEX ex_12345 ON example
        (coalesce(FIELD1, '')
        , coalesce(FIELD2, '')
        , coalesce(FIELD3, '')
        , coalesce(FIELD4, '')
        , coalesce(FIELD5, '')
        )
        ;

ALTER TABLE example
        ADD CONSTRAINT con_ex_12345
        UNIQUE USING INDEX ex_12345;

But the "functional" index on coalesce() is not allowed in this construct. The unique index (OP's option 2) still works, though:


ERROR:  index "ex_12345" contains expressions
LINE 2:  ADD CONSTRAINT con_ex_12345
             ^
DETAIL:  Cannot create a primary key or unique constraint using such an index.
INSERT 0 1
INSERT 0 1
ERROR:  duplicate key value violates unique constraint "ex_12345"

This actually worked well for me:

CREATE UNIQUE INDEX index_name ON table_name ((
   ARRAY[field1, field2, field3, field4]
));

I don't know how performance is affected, but it should be close to ideal (depending on how well optimized arrays are in Postgres).

You can create a rule that redirects all rows containing NULL values away from the original table and into partitions like partition_field1_nullable, partition_field2_nullable, etc. This way you create a unique index on the original table only (which then holds no NULLs). The original table accepts only non-NULL values and enforces uniqueness, while the "nullable partitions" accept as many rows with NULLs as you like (not unique, accordingly). You can then apply the COALESCE or trigger method to the nullable partitions only, which avoids many scattered partial indexes and a trigger firing on every DML statement against the original table.
