Storing and comparing unique combinations

Submitted by 感情迁移 on 2019-12-07 02:55:29
Erwin Brandstetter

Store as array (denormalized)

I would consider the additional module intarray that provides the convenient (and fast) functions uniq() and sort(). In a typical modern Postgres installation, installing it is as easy as:

CREATE EXTENSION intarray;

Using these, a simple CHECK constraint can enforce ascending arrays with distinct elements.

CHECK (uniq(sort(cat_arr)) = cat_arr)
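To see why the constraint works, it may help to look at what the two functions return for an unsorted array with a duplicate. This is a small sketch with made-up IDs, assuming intarray is installed. Note that uniq() only removes adjacent duplicates, which is why sort() has to run first:

```sql
-- Example values only; requires the intarray extension.
SELECT sort('{102,84,95,84}'::int[]);        -- {84,84,95,102}
SELECT uniq(sort('{102,84,95,84}'::int[]));  -- {84,95,102}
```

An array passes the CHECK only if it is already identical to this normalized form.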

You can additionally (optionally) have a trigger that normalizes array values ON INSERT OR UPDATE automatically. Then you can just pass any array (possibly unsorted and with dupes) and everything just works. Like:

CREATE OR REPLACE FUNCTION trg_search_insup_bef()
  RETURNS trigger AS
$func$
BEGIN
   NEW.cat_arr := uniq(sort(NEW.cat_arr));
   RETURN NEW;
END
$func$ LANGUAGE plpgsql;

CREATE TRIGGER insup_bef
BEFORE INSERT OR UPDATE OF cat_arr ON search
FOR EACH ROW
EXECUTE PROCEDURE trg_search_insup_bef();

The additional module intarray is optional; there are other ways to get sorted, distinct arrays.

But the intarray functions deliver superior performance.
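One such alternative, as a rough sketch with core Postgres only: the helper name f_sort_uniq is hypothetical and not taken from the original answer. It could stand in for uniq(sort(...)) in the constraint and the trigger, most likely at some cost in speed.

```sql
-- Hypothetical helper (name is an assumption): sort and de-duplicate
-- an integer array without the intarray extension.
CREATE OR REPLACE FUNCTION f_sort_uniq(int[])
  RETURNS int[]
  LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT ARRAY(SELECT DISTINCT x FROM unnest($1) AS x ORDER BY x);
$func$;

SELECT f_sort_uniq('{102,84,95,84}');  -- {84,95,102}
```

STRICT makes the function return NULL for NULL input instead of an empty array.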

Then you can just create a UNIQUE constraint on the array column to enforce uniqueness of the whole array.

UNIQUE (cat_arr)
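Put together, the table definition might look something like the sketch below. The constraint names and the NOT NULL are my own assumptions for illustration; the table and column names follow the trigger above, and intarray is assumed to be installed.

```sql
-- Sketch only: constraint names and NOT NULL are illustrative assumptions.
-- Requires: CREATE EXTENSION intarray;
CREATE TABLE search (
  search_id serial PRIMARY KEY
, cat_arr   int[] NOT NULL
, CONSTRAINT search_cat_arr_sorted_distinct CHECK (uniq(sort(cat_arr)) = cat_arr)
, CONSTRAINT search_cat_arr_uni UNIQUE (cat_arr)
);

-- Already sorted and distinct, so it passes the CHECK as is:
INSERT INTO search (cat_arr) VALUES ('{84,95,102}');

-- Without the normalizing trigger this violates the CHECK constraint.
-- With the trigger it is rewritten to '{84,95,102}' first and then
-- violates the UNIQUE constraint, because that combination already exists.
INSERT INTO search (cat_arr) VALUES ('{102,84,95,84}');
```

Either way the second row is rejected, which is the point: the same combination cannot be stored twice, no matter how the elements are ordered on input.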

I wrote more about the advantages of combining (very strict and reliable) constraints with (less reliable but more convenient) triggers in this related answer just two days ago:

If, for each combination, all you need to store per category is the ID (and no additional info), this should be good enough.
However, referential integrity is not easily enforced this way. There are no foreign key constraints for array elements (yet), as documented in your link: if one of the categories is deleted or you change IDs, references break.

Normalized schema

If you need to store more, or you'd rather go with a normalized schema to enforce referential integrity or for some other reason, you can do that, too, and add a trigger to populate a hand-made materialized view (a redundant table) and enforce uniqueness in a similar way:

CREATE TABLE search (
  search_id serial PRIMARY KEY
, ... more columns
);

CREATE TABLE cat (
  cat_id serial PRIMARY KEY
, cat text NOT NULL
);

CREATE TABLE search_cat (
  search_id int REFERENCES search ON DELETE CASCADE
, cat_id    int REFERENCES cat
, PRIMARY KEY (search_id, cat_id)
);
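With this layout, checking whether a given combination already exists means aggregating the rows of search_cat per search. A minimal sketch, not part of the original answer, with example IDs:

```sql
-- Sketch: find the search that consists of exactly categories 84, 95 and 102.
-- The ID values are examples only.
SELECT search_id
FROM   search_cat
GROUP  BY search_id
HAVING array_agg(cat_id ORDER BY cat_id) = '{84,95,102}'::int[];
```

The primary key on (search_id, cat_id) already rules out duplicate categories within one search, so the sorted aggregate is directly comparable. The query aggregates every search on each lookup, though, which is presumably the cost the redundant table is meant to avoid.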

Related answer (not for unique combinations, but for unique elements) that demonstrates the trigger:

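For your current approach you can use string_agg to build a string representation of all categories in each combination and compare that against the new search. The aggregate has to be tested in a HAVING clause (aggregates are not allowed in WHERE) and needs an ORDER BY to produce a deterministic string:

SELECT CombinationId
FROM CombinedCategories
GROUP BY CombinationId
-- assumes CategoryId is an integer column; string_agg() needs text input
HAVING string_agg(CategoryId::text, ',' ORDER BY CategoryId) = '84,95,102'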
But a simpler approach would be to calculate a hash for each search based on all its parameters, store it in the Searches table, and compare the hash of a new search against the search history.
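As a rough sketch of the hashing idea in Postgres itself: the query below assumes a table CombinedCategories(CombinationId, CategoryId) with an integer CategoryId, as in the query above, and uses the built-in md5() purely as an example hash function. Only the categories are hashed here; any other search parameters would have to go into the hashed string as well.

```sql
-- Assumed columns: CombinedCategories(CombinationId, CategoryId integer).
-- One hash per stored combination, built from the sorted category list:
SELECT CombinationId
     , md5(string_agg(CategoryId::text, ',' ORDER BY CategoryId)) AS combo_hash
FROM   CombinedCategories
GROUP  BY CombinationId;

-- Hash of a new search, built from the same sorted representation:
SELECT md5('84,95,102') AS new_search_hash;
```

Two different combinations can in principle produce the same hash, so a match is best confirmed by comparing the actual category lists.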
