I have a table with an array column:
title tags
"ridealong";"{comedy,other}"
"ridealong";"{comedy,tragedy}"
"freddyjason";"{horror
Approach 1: define a custom aggregate. Here's one I wrote earlier.
CREATE TABLE my_test(title text, tags text[]);
INSERT INTO my_test(title, tags) VALUES
('ridealong', '{comedy,other}'),
('ridealong', '{comedy,tragedy}'),
('freddyjason', '{horror,silliness}');
CREATE AGGREGATE array_cat_agg(anyarray) (
    SFUNC = array_cat,
    STYPE = anyarray
);
SELECT title, array_cat_agg(tags) FROM my_test GROUP BY title;
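One caveat, as far as I know: from PostgreSQL 14 on, array_cat is declared with anycompatiblearray arguments rather than anyarray, so the definition above is rejected there. A sketch of the equivalent definition, assuming PostgreSQL 14 or later and the same my_test table:

```sql
-- Assumption: PostgreSQL 14+, where the built-in is
-- array_cat(anycompatiblearray, anycompatiblearray).
CREATE AGGREGATE array_cat_agg(anycompatiblearray) (
    SFUNC = array_cat,
    STYPE = anycompatiblearray
);

-- Plain concatenation: a tag that appears in several rows
-- (such as 'comedy' for 'ridealong') is kept once per row.
SELECT title, array_cat_agg(tags) FROM my_test GROUP BY title;
```

Either way, this aggregate only concatenates; it does not deduplicate.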
... or, since you don't need to preserve order and do want to deduplicate, you could use a LATERAL query like:
SELECT title, array_agg(DISTINCT tag ORDER BY tag)
FROM my_test, unnest(tags) tag
GROUP BY title;
in which case you don't need the custom aggregate. This one is probably a fair bit slower for big data sets due to the deduplication. Removing the ORDER BY
if not required may help, though.
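The query above relies on a function call in FROM being implicitly lateral. For readers who prefer it spelled out, here is a sketch of the same thing with explicit LATERAL syntax (PostgreSQL 9.3 or later); the aliases t and u and the trailing ORDER BY are my additions for illustration:

```sql
SELECT t.title,
       array_agg(DISTINCT u.tag ORDER BY u.tag) AS tags
FROM my_test AS t
CROSS JOIN LATERAL unnest(t.tags) AS u(tag)  -- one output row per array element
GROUP BY t.title
ORDER BY t.title;
-- With the sample rows inserted earlier this gives:
--  freddyjason | {horror,silliness}
--  ridealong   | {comedy,other,tragedy}
```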
The obvious solution would be the LATERAL join (which was also suggested by @CraigRinger), but LATERAL was only added to PostgreSQL in 9.3.
In 9.1 you cannot avoid the sub-query, but you can simplify it:
SELECT title, array_agg(DISTINCT tag)
FROM (SELECT title, unnest(tags) FROM my_test) AS t(title, tag)
GROUP BY title;