PostgreSQL: joining arrays within group by clause

Question

We are having trouble merging arrays into a single array per group. We want to combine the values from two columns into one array and aggregate these arrays across multiple rows.

Given the following input:

| id | name | col_1 | col_2 |
| 1  |  a   |   1   |   2   |
| 2  |  a   |   3   |   4   |
| 4  |  b   |   7   |   8   |
| 3  |  b   |   5   |   6   |
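For anyone who wants to reproduce this, the sample input could be set up roughly as follows. This is only a sketch: the table name is taken from the query further down, and the column types are assumptions (the varchar columns are inferred from the "character varying[]" in the error message below).

```sql
-- Hypothetical setup for the sample data.
-- Column types are assumptions; varchar is inferred from the error message.
CREATE TABLE mytable (
    id    integer PRIMARY KEY,
    name  text,
    col_1 varchar,
    col_2 varchar
);

-- Rows are deliberately inserted with id 4 before id 3, as in the table above.
INSERT INTO mytable (id, name, col_1, col_2) VALUES
    (1, 'a', '1', '2'),
    (2, 'a', '3', '4'),
    (4, 'b', '7', '8'),
    (3, 'b', '5', '6');
```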

We want the following output:

| a | { 1, 2, 3, 4 } |
| b | { 5, 6, 7, 8 } |

The order of the elements matters: it should follow the id of the aggregated rows, with col_1 before col_2 within each row.

We tried the array_agg function:

SELECT array_agg(ARRAY[col_1, col_2]) FROM mytable GROUP BY name;

Unfortunately, this statement raises an error:

ERROR: could not find array type for data type character varying[]

It seems to be impossible to merge arrays in a group by clause using array_agg.

Any ideas?

Answer 1:

`UNION ALL`

You could "counter-pivot" with UNION ALL first:

SELECT name, array_agg(c) AS c_arr
FROM  (
   SELECT name, id, 1 AS rnk, col1 AS c FROM tbl
   UNION ALL
   SELECT name, id, 2, col2 FROM tbl
   ORDER  BY name, id, rnk
   ) sub
GROUP  BY 1;

This is adapted to produce the order of values you later requested. Per the documentation:

The aggregate functions array_agg, json_agg, string_agg, and xmlagg, as well as similar user-defined aggregate functions, produce meaningfully different result values depending on the order of the input values. This ordering is unspecified by default, but can be controlled by writing an ORDER BY clause within the aggregate call, as shown in Section 4.2.7. Alternatively, supplying the input values from a sorted subquery will usually work.
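The quoted passage prefers an ORDER BY inside the aggregate call over a sorted subquery. Applied to the UNION ALL query above, that might look like the following sketch (same assumed names tbl, col1, col2 as in that query):

```sql
-- Same unpivot via UNION ALL, but the ordering is stated in the aggregate call
-- itself instead of relying on the order of rows coming out of the subquery.
SELECT name, array_agg(c ORDER BY id, rnk) AS c_arr
FROM  (
   SELECT name, id, 1 AS rnk, col1 AS c FROM tbl
   UNION ALL
   SELECT name, id, 2, col2 FROM tbl
   ) sub
GROUP  BY name;
```

Here rnk breaks ties within a row, so the value from col1 comes before the value from col2 for each id.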

Custom aggregate function

Or you could create a custom aggregate function, as discussed in these related answers:
Selecting data into a Postgres array
Is there something like a zip() function in PostgreSQL that combines two arrays?

CREATE AGGREGATE array_agg_mult (anyarray)  (
    SFUNC     = array_cat
   ,STYPE     = anyarray
   ,INITCOND  = '{}'
);
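One caveat, as far as I know: in PostgreSQL 14 and later, array_cat is declared with anycompatiblearray instead of anyarray, so the definition above would need the matching pseudo-type there. A sketch of that variant (same aggregate name, to be used in place of the definition above, not in addition to it):

```sql
-- Variant for PostgreSQL 14 or later, where array_cat takes anycompatiblearray.
CREATE AGGREGATE array_agg_mult (anycompatiblearray)  (
    SFUNC     = array_cat
   ,STYPE     = anycompatiblearray
   ,INITCOND  = '{}'
);
```

Either definition is called the same way.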

Then you can:

SELECT name, array_agg_mult(ARRAY[col1, col2] ORDER BY id) AS c_arr
FROM   tbl
GROUP  BY 1
ORDER  BY 1;

Or, typically faster, though not standard SQL (it relies on the sorted subquery feeding rows to the aggregate in order):

SELECT name, array_agg_mult(ARRAY[col1, col2]) AS c_arr
FROM  (SELECT * FROM tbl ORDER BY name, id) t
GROUP  BY 1;

The ORDER BY id inside the aggregate call in the first query (such a clause can be added to any aggregate function of this kind) guarantees your desired result:

{1,2,3,4}
{5,6,7,8}

Or you might be interested in this alternative:

SELECT name, array_agg_mult(ARRAY[ARRAY[col1, col2]] ORDER BY id) AS c_arr
FROM   tbl
GROUP  BY 1
ORDER  BY 1;

Which produces 2-dimensional arrays:

{{1,2},{3,4}}
{{5,6},{7,8}}
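On PostgreSQL 9.5 or later, the built-in array_agg accepts array input and concatenates it into an array of one higher dimension, so the 2-dimensional result should be obtainable without a custom aggregate. A sketch, assuming the same tbl, col1 and col2; this does not apply to the 9.1 version the question is tagged with:

```sql
-- PostgreSQL 9.5+ only: array_agg over arrays yields a 2-dimensional array.
-- All input arrays must have the same dimensions and must not be NULL or empty.
SELECT name, array_agg(ARRAY[col1, col2] ORDER BY id) AS c_arr
FROM   tbl
GROUP  BY 1
ORDER  BY 1;
```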

Answer 2:

select n, array_agg(c) as c
from (
    select n, unnest(array[c1, c2]) as c
    from t
) s
group by n

Or, more simply:

select
    n,
    array_agg(c1) || array_agg(c2) as c
from t
group by n

The simpler form places all c1 values before all c2 values, so the pairs are not interleaved. To address the new ordering requirement:

select n, array_agg(c order by id, o) as c
from (
    select
        id, n,
        unnest(array[c1, c2]) as c,
        unnest(array[1, 2]) as o
    from t
) s
group by n
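On PostgreSQL 9.4 or later, the position of each element can be taken from unnest ... WITH ORDINALITY instead of a second, parallel unnest. A sketch under that version assumption, using the same names t, n, c1 and c2 as above (the aliases u, c and o are chosen here for illustration):

```sql
-- WITH ORDINALITY numbers the unnested elements 1, 2, ... within each row,
-- so o records whether a value came from c1 (o = 1) or c2 (o = 2).
SELECT t.n, array_agg(u.c ORDER BY t.id, u.o) AS c
FROM   t
CROSS  JOIN LATERAL unnest(ARRAY[t.c1, t.c2]) WITH ORDINALITY AS u(c, o)
GROUP  BY t.n;
```

This keeps the element order tied to the array itself, so the two lists cannot drift apart if a column is added later.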

Source: https://stackoverflow.com/questions/24557344/postgresql-joining-arrays-within-group-by-clause

Tags: sql, arrays, postgresql, group-by, postgresql-9.1