问题
I am working with BigQuery. I have two tables:
organisations:
org_code STRING
name STRING
spending:
org STRING
month DATE
quantity INTEGER
code STRING
And then quite a complicated query to get results by each organisation, by month:
SELECT
organisations.org_code AS org,
num.month AS month,
(num.quantity / denom.quantity) AS ratio_quantity
FROM (
SELECT
org_code, name
FROM
[mytable.organisations]) AS organisations
LEFT OUTER JOIN EACH (
SELECT
org,
month,
SUM(quantity) AS quantity
FROM
[mytable.spending]
GROUP BY
org,
month) AS denom
ON
denom.org = organisations.org_code
LEFT OUTER JOIN EACH (
SELECT
org,
month,
SUM(quantity) AS quantity
FROM
[hscic.spending]
WHERE
code LIKE 'XXXX%'
GROUP BY
org,
month) AS num
ON
denom.month = num.month
AND denom.org = num.org
ORDER BY org, month
My final results look like this, with a row per org/month combination:
org,month,ratio_quantity
A81001,2015-10-01 00:00:00 UTC,28
A82001,2015-11-01 00:00:00 UTC,43
A82002,2015-10-01 00:00:00 UTC,16
Now I would like to pivot the results to look like this, with one row per month, and one column per organisation:
month,items.A81001,items.A82002...
2015-10-01 00:00:00 UTC,28,16
2015-11-01 00:00:00 UTC,43,...
Is this possible in the same BigQuery call? Or should I create a new table and pivot it from there? Or should I just do the reshaping in Python?
UPDATE: There are about 500,000 results, fyi.
回答1:
Q. Is this possible in the same BigQuery call? Or should I create a new table and pivot it from there?
In general, you can use that “complicated query” as a subquery for extra logic to be applied to your current result. So, it is definitely doable. But code can quickly become un-manageable or hard to manage – so you can consider writing this result into new table and then pivot it from there
If you stuck with direction of doing pivot (the way you described in your question) - check below link to see detailed intro on how you can implement pivot within BigQuery.
How to scale Pivoting in BigQuery?
Please note – there is a limitation of 10K columns per table - so you are limited with 10K organizations.
You can also see below as simplified examples (if above one is too complex/verbose):
How to transpose rows to columns with large amount of the data in BigQuery/SQL?
How to create dummy variable columns for thousands of categories in Google BigQuery?
Pivot Repeated fields in BigQuery
Q. Or should I just do the reshaping in Python?
If above will not work for you – pivoting on client is always an option but now you should consider client side limitations
Hope this helped!
来源:https://stackoverflow.com/questions/35174110/pivot-data-in-bigquery-sql