Aggregate a single column in query with many columns

Posted by 六月ゝ 毕业季﹏ on 2019-11-27 02:23:42
Erwin Brandstetter

Simple query

This can be much simpler with PostgreSQL 9.1 or later. As explained in this closely related answer:

It is enough to GROUP BY the primary key of a table. Since:

foo1 is a primary key

.. you can simplify your example to:

SELECT foo1, foo2, foo3, foo4, foo5, foo6, string_agg(aggregated_field, ', ')
FROM   tbl1
GROUP  BY 1
ORDER  BY foo7, foo8;  -- have to be spelled out, since not in select list!
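
To make the rule concrete, here is a minimal, self-contained sketch. The tables demo_parent and demo_child and their columns are hypothetical and not taken from the question; the only assumption is PostgreSQL 9.1 or later, where grouping by a table's primary key covers all other columns of that table.

```sql
-- Hypothetical demo tables; names and data are illustrative only.
CREATE TEMP TABLE demo_parent (
   id   int PRIMARY KEY
 , name text
 , rank int
);

CREATE TEMP TABLE demo_child (
   parent_id int REFERENCES demo_parent
 , tag       text
);

INSERT INTO demo_parent VALUES (1, 'alpha', 20), (2, 'beta', 10);
INSERT INTO demo_child  VALUES (1, 'x'), (1, 'y'), (2, 'z');

SELECT p.id, p.name
     , string_agg(c.tag, ', ' ORDER BY c.tag) AS tags
FROM   demo_parent p
LEFT   JOIN demo_child c ON c.parent_id = p.id
GROUP  BY p.id     -- primary key: p.name and p.rank are functionally dependent on it
ORDER  BY p.rank;  -- not in the select list, so it cannot be a positional reference
```

With this data the query should return the row for 'beta' (tags 'z') before the row for 'alpha' (tags 'x, y'), since it sorts by rank.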

Query with multiple tables

However, since you have:

many more fields and LEFT JOINs, the important part is that all these fields have 1 to 1 or 1 to 0 relationship except one field that is 1 to n which I want to aggregate

.. it should be faster and simpler to aggregate first, join later:

SELECT t1.foo1, t1.foo2, ...
     , t2.bar1, t2.bar2, ...
     , a.aggregated_col 
FROM   tbl1 t1
LEFT   JOIN tbl2 t2 ON ...
...
LEFT   JOIN (
   SELECT some_id, string_agg(agg_col, ', ') AS aggregated_col
   FROM   agg_tbl
   GROUP  BY some_id
   ) a ON a.some_id = ?.some_id
ORDER  BY ...

This way the big portion of your query does not need aggregation at all.
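
A small runnable sketch of the "aggregate first, join later" pattern may help. All names here (demo_main, demo_detail, demo_item and their columns) are made up for illustration; only the shape of the query follows the outline above.

```sql
-- Hypothetical demo tables; names and data are illustrative only.
CREATE TEMP TABLE demo_main   (main_id int PRIMARY KEY, title text);
CREATE TEMP TABLE demo_detail (main_id int PRIMARY KEY, note  text);  -- 1 : 0..1
CREATE TEMP TABLE demo_item   (main_id int, label text);              -- 1 : n

INSERT INTO demo_main   VALUES (1, 'first'), (2, 'second');
INSERT INTO demo_detail VALUES (1, 'has a note');
INSERT INTO demo_item   VALUES (1, 'a'), (1, 'b'), (2, 'c');

SELECT m.main_id, m.title
     , d.note
     , i.labels
FROM   demo_main m
LEFT   JOIN demo_detail d ON d.main_id = m.main_id
LEFT   JOIN (
   -- Collapse the 1:n table to one row per key before joining.
   SELECT main_id, string_agg(label, ', ' ORDER BY label) AS labels
   FROM   demo_item
   GROUP  BY main_id
   ) i ON i.main_id = m.main_id
ORDER  BY m.main_id;
```

Because the subquery already returns at most one row per main_id, the outer query needs no GROUP BY, no matter how many columns or 1:1 joins it carries.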

I recently provided a test case in an SQL Fiddle to prove the point in a related answer.

Since you are referring to this related answer: No, DISTINCT is not going to help at all in this case.

If the main problem is that the fields (foox) are computed expressions, then this can help:

SELECT foo1, foo2, foo3, foo4, foo5, foo6, string_agg(aggregated_field, ', ')
FROM tbl1
GROUP BY 1, 2, 3, 4, 5, 6
ORDER BY 5, 6;

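The numbers 1, 2, ... are positional references to the select list items, in the order they appear there, so computed expressions do not have to be repeated in GROUP BY or ORDER BY.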
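
As a rough illustration of positional references with computed columns, the sketch below uses an inline VALUES list. The CTE name demo and its columns are hypothetical.

```sql
-- Hypothetical inline data; names are illustrative only.
WITH demo(name, tag) AS (
   VALUES ('Alpha', 'x'), ('ALPHA', 'y'), ('Beta', 'z')
)
SELECT lower(name)  AS name_lc    -- select list item 1 (computed)
     , length(name) AS name_len   -- select list item 2 (computed)
     , string_agg(tag, ', ' ORDER BY tag) AS tags
FROM   demo
GROUP  BY 1, 2   -- same as GROUP BY lower(name), length(name)
ORDER  BY 2, 1;  -- sort by name_len, then name_lc
```

Here 'Alpha' and 'ALPHA' fall into the same group because both computed values match, so their tags are aggregated into one row.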
