Return column name and distinct values

Question

Say I have a simple table in Postgres like the following:

+--------+--------+----------+
|  Car   |  Pet   |   Name   |
+--------+--------+----------+
| BMW    |  Dog   |   Sam    |
| Honda  |  Cat   |   Mary   |
| Toyota |  Dog   |   Sam    |
| ...    |  ...   |   ...    |

I would like to run an SQL query that returns the column name in the first column and its unique values in the second column. For example:

+--------+--------+
|  Col   |  Vals  |
+--------+--------+
| Car    |  BMW   |
| Car    | Toyota |
| Car    | Honda  |
| Pet    |  Dog   |
| Pet    |  Cat   |
| Name   |  Sam   |
| Name   |  Mary  |
| ...    |  ...   |

I found a bit of code that can be used to return all of the unique values from multiple fields into one column:

-- Query 4b.  (104 ms, 128 ms)
select distinct unnest( array_agg(a)||
                        array_agg(b)||
                        array_agg(c)||
                        array_agg(d) )
from t ;

But I don't understand the code well enough to know how to append the column name into another column.

I also found a query that can return the column names in a table. Maybe a sub-query of this in combination with the "Query 4b" shown above?

Answer 1:

SELECT distinct
       unnest(array['car', 'pet', 'name']) AS col,
       unnest(array[car, pet, name]) AS vals
FROM t
order by col

Answer 2:

It's bad style to put set-returning functions in the SELECT list, and the SQL standard does not allow it. Postgres supports it for historical reasons, but since LATERAL was introduced in Postgres 9.3, the practice is largely obsolete. We can use LATERAL here as well:

SELECT x.col, x.val
FROM   tbl, LATERAL (VALUES ('car', car)
                          , ('pet', pet)
                          , ('name', name)) x(col, val)
GROUP  BY 1, 2
ORDER  BY 1, 2;

You'll find this LATERAL (VALUES ...) technique discussed under the very same question on dba.SE you already found yourself. Just don't stop reading at the first answer.

Up until Postgres 9.4 there was still one exception: a "parallel unnest" required combining multiple set-returning functions in the SELECT list. Postgres 9.4 brought a new variant of unnest() that removes that necessity, too. More:

Unnest multiple arrays in parallel

The new function also does not derail into a Cartesian product when the arrays yield different numbers of rows, which is (was) the very odd behavior of multiple set-returning functions in the SELECT list. The new syntax variant should be preferred over the now outdated one:

SELECT DISTINCT x.*
FROM   tbl t, unnest('{car, pet, name}'::text[]
                   , ARRAY[t.car, t.pet, t.name]) AS x(col, val)
ORDER  BY 1, 2;

Also shorter and faster than two unnest() calls in parallel.

Returns:

 col  |  val
------+--------
 car  | BMW
 car  | Honda
 car  | Toyota
 name | Mary
 name | Sam
 pet  | Cat
 pet  | Dog

DISTINCT or GROUP BY, either is good for the task.
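
To make the row-count point above concrete, here is a minimal sketch (not part of the original answer) that calls the multi-argument unnest() on two literal arrays of different lengths. It assumes Postgres 9.4 or later; the arrays and the aliases x, n, and s are made up for illustration.

```sql
-- Hypothetical literal arrays of unequal length (3 vs. 2 elements).
-- The multi-argument form of unnest() is only allowed in the FROM clause.
SELECT x.n, x.s
FROM   unnest(ARRAY[1, 2, 3], ARRAY['a', 'b']) AS x(n, s);
```

This should return three rows, (1, a), (2, b) and (3, NULL): the shorter array is padded with NULL instead of being multiplied out against the longer one.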

Answer 3:

With the JSON functions row_to_json() and json_each_text() you can do it without specifying the number and names of the columns:

select distinct key as col, value as vals
from (
    select row_to_json(t) r
    from a_table t
    ) t,
    json_each_text(r)
order by 1, 2;
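
To try the query above locally, a minimal self-contained sketch follows. It assumes the three sample rows from the question are loaded into a temporary table; the name a_table matches the query above, while the text column types are a guess, since the question does not state them.

```sql
-- Assumed setup: the question does not give column types, so text is a guess.
CREATE TEMP TABLE a_table (car text, pet text, name text);

INSERT INTO a_table (car, pet, name) VALUES
    ('BMW',    'Dog', 'Sam'),
    ('Honda',  'Cat', 'Mary'),
    ('Toyota', 'Dog', 'Sam');

-- row_to_json() turns each row into one JSON object keyed by column name;
-- json_each_text() then expands that object into (key, value) pairs.
SELECT DISTINCT key AS col, value AS vals
FROM   a_table t,
       json_each_text(row_to_json(t))
ORDER  BY 1, 2;
```

With only these three rows, the result should contain seven rows: three for car, two for name, and two for pet.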

Source: https://stackoverflow.com/questions/34319753/return-column-name-and-distinct-values

Tags

sql

postgresql

distinct

set-returning-functions