Question
I am trying to pivot a table using a BigQuery stored procedure, as explained in this link.
Input Table:
Output Required:
The first part of the stored procedure generates the list of values that become the new columns:
EXECUTE IMMEDIATE (
"SELECT STRING_AGG(' "||aggregation
||"""(IF('||@pivot_col_name||'="'||x.value||'", '||@pivot_col_value||', null)) '||x.value)
FROM UNNEST((
SELECT APPROX_TOP_COUNT("""||pivot_col_name||", @max_columns) FROM `"||table_name||"`)) x"
) INTO header_pivot
USING pivot_col_name AS pivot_col_name, pivot_col_value AS pivot_col_value, max_columns AS max_columns;
This query generates the following output:
MAX(IF(EXTENDED_PROPERTY_KEY="key1", EXTENDED_PROPERTY_VALUE, null)) key1,
MAX(IF(EXTENDED_PROPERTY_KEY="key2", EXTENDED_PROPERTY_VALUE, null)) key2,
MAX(IF(EXTENDED_PROPERTY_KEY="key3", EXTENDED_PROPERTY_VALUE, null)) key3,
MAX(IF(EXTENDED_PROPERTY_KEY="key4", EXTENDED_PROPERTY_VALUE, null)) key4
The second part:
SELECT STRING_AGG(''||(i+1)) FROM UNNEST(row_ids) WITH OFFSET i
This is where I am stuck: the offset value does not increase by 1 for each row_ids[Account_id].
The approach did work for me after some manipulation, but I want to understand the underlying query that generates the desired output. I tried to break it down and arrived at this final query:
SELECT (SELECT STRING_AGG(x) FROM UNNEST([row_ids]) x) # this part extracts individual row_ids ,
(SELECT STRING_AGG(DISTINCT "MAX(IF(pivot_key= '" || pivot_key|| "', pivot_value, NULL)) AS " || pivot_key)
FROM `project.dataset.table`
) # generates string for each key of pivot_key column
FROM `project.dataset.table`
GROUP BY (SELECT STRING_AGG(''||(i+1)) FROM UNNEST([row_ids]) WITH OFFSET i) # generates offset for each row_id , though in this case I see it is always 1
ORDER BY (SELECT STRING_AGG(''||(i+1)) FROM UNNEST([row_ids]) WITH OFFSET i)
But when I run the query above, it fails at
a) (SELECT STRING_AGG(x) FROM UNNEST([row_ids]) x)
with an error saying that STRING_AGG() requires STRING values. In my case row_id is an integer, so I changed it to (SELECT STRING_AGG('' || x) FROM UNNEST([row_ids]) x).
b) After fixing a), it fails again, saying that the UNNEST expression references column row_id, which is neither grouped nor aggregated.
Additionally:
Row_Id == Account_Id
Pivot_key == Extended_Property_Key
Pivot_value == Extended_Property_Value
Please explain what I am missing here.
Answer 1:
The final query executed in your case is probably:
SELECT
  account_id,
  MAX(IF(extended_property_key="Key 3", extended_property_value, NULL)) e_Key3,
  MAX(IF(extended_property_key="Key 2", extended_property_value, NULL)) e_Key2,
  MAX(IF(extended_property_key="Key 1", extended_property_value, NULL)) e_Key1
FROM
  `your-table`
GROUP BY
  1
ORDER BY
  1
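Keep in mind that this is just a static query created by that script. The subqueries over row_ids run inside the procedure, against its array parameter, to build the query text; they are not part of the final query and never read the table's rows. Pivoting tables programmatically using plain SQL alone is not possible in BigQuery.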
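To make the "static query created by that script" point concrete, here is a rough plain-Python model of the string building. It is only a sketch under assumptions: that row_ids is an array of column names passed to the procedure (for example ['Account_id']), that the procedure joins the fragments from the question into one statement, and that STRING_AGG uses its default "," delimiter. The function name build_pivot_query and the table name are hypothetical and exist only for illustration.

```python
# Plain-Python model of the text the procedure assembles before executing it.
# build_pivot_query and the table name are illustrative, not part of BigQuery.

def build_pivot_query(table_name, row_ids, header_pivot):
    # Models: SELECT STRING_AGG(x) FROM UNNEST(row_ids) x
    # row_ids holds column NAMES, so this yields the SELECT column list.
    select_ids = ",".join(row_ids)

    # Models: SELECT STRING_AGG(''||(i+1)) FROM UNNEST(row_ids) WITH OFFSET i
    # i is the 0-based position inside the array parameter, not a table row,
    # so one id column gives "1" and two id columns give "1,2".
    positions = ",".join(str(i + 1) for i, _ in enumerate(row_ids))

    return (
        "SELECT " + select_ids + ", " + header_pivot
        + " FROM `" + table_name + "`"
        + " GROUP BY " + positions
        + " ORDER BY " + positions
    )


# One generated column, copied from the output shown in the question.
header_pivot = (
    'MAX(IF(EXTENDED_PROPERTY_KEY="key1", EXTENDED_PROPERTY_VALUE, null)) key1'
)

query = build_pivot_query("project.dataset.table", ["Account_id"], header_pivot)
print(query)
```

Under these assumptions, a single id column always produces GROUP BY 1, which would explain why the offset never appears to increase. Writing UNNEST([row_ids]) in the final query instead wraps a table column in an array, which is a different operation and would account for error b).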
Source: https://stackoverflow.com/questions/62659762/facing-issue-in-understanding-bigquery-stored-procedure-used-in-pivoting