I have a table with N columns. Let's call them c1
, c2
, c3
, c4
, ... cN
. Among multiple rows, I want to get a single row with COUNT DISTINCT(cX)
for each X in [1, N].
c1 | c2 | ... | cn
0 | 4 | ... | 1
Is there a way I can do this (in a stored procedure) without writing every column name into the query manually?
Why?
We've had a problem where bugs in application servers mean we rewrite good column values with garbage inserted later. To solve this, I'm storing the information log-structure, where each row represents a logical UPDATE
query. Then, when given a signal that the record is complete, I can determine if any values were (erroneously) overwritten.
An example of a single correct record in multiple rows: there is at most one value for each column.
| id | initialize_time | start_time | end_time |
| 1 | 12:00am | NULL | NULL |
| 1 | 12:00am | 1:00pm | NULL |
| 1 | 12:00am | NULL | 2:00pm |
Reconciled row:
| 1 | 12:00am | 1:00pm | 2:00pm |
An example of an irreconcilable record that I want to detect:
| id | initialize_time | start_time | end_time |
| 1 | 12:00am | NULL | NULL |
| 1 | 12:00am | 1:00pm | NULL |
| 1 | 9:00am | 1:00pm | 2:00pm | -- New initialize time => irreconcilable!
You need dynamic SQL for that, which means you have to create a function or run a DO
command. Since you cannot return values directly from the latter, a plpgsql function it is:
CREATE OR REPLACE function f_count_all(_tbl text
, OUT columns text[], OUT counts bigint[])
RETURNS record LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE (
SELECT 'SELECT
ARRAY[' || string_agg('''' || quote_ident(attname) || '''', ', ') || '],
ARRAY[' || string_agg('count(' || quote_ident(attname) || ')', ', ') || ']
FROM ' || _tbl
FROM pg_attribute
WHERE attrelid = _tbl::regclass
AND attnum >= 1 -- exclude tableoid & friends (neg. attnum)
AND attisdropped is FALSE -- exclude deleted columns
GROUP BY attrelid
)
INTO columns, counts;
END
$func$;
Call:
SELECT * FROM f_count_all('myschema.mytable');
Returns:
columns | counts
--------------+--------
{c1, c2, c3,} | {17 1,0}
More explanation and links about dynamic SQL and EXECUTE
in this related question - or a couple more here on SO, try this serach.
Very similar to this question:
postgresql - count (no null values) of each column in a table
You could even try and return a polymorphic record type to get single columns dynamically, but that's rather complex and advanced. Probably too much effort for your case. More in this related answer.
来源:https://stackoverflow.com/questions/13760230/how-to-perform-the-same-aggregation-on-every-column-without-listing-the-columns