Transpose nested rows into columns in bigquery with google analytics data

自作多情 提交于 2020-02-01 06:23:20

问题


I am interested in pulling visitors with custom dimension attributes, where each row is a unique fullvisitorid and columns are desired customdimension.values.

Using the london helmets as an example, here I am pulling visitors with the two custom dimensions I am interested in:

SELECT fullvisitorid, customDimensions.index, customDimensions.value
FROM [google.com:analytics-bigquery:LondonCycleHelmet.ga_sessions_20130910]
where customDimensions.index in (2,3)
group by fullvisitorid, customDimensions.index, customDimensions.value

It gives results like:

+---------------+------------------------+------------------------+
| fullvisitorid | customDimensions_index | customDimensions_value |
+---------------+------------------------+------------------------+
|             1 |                      2 | Bronze                 |
|             1 |                      3 | Yes                    |
|             2 |                      2 | Bronze                 |
|             2 |                      3 | No                     |
|             3 |                      2 | Bronze                 |
|             3 |                      3 | Yes                    |
|             4 |                      2 | Platinum               |
|             4 |                      3 | Yes                    |
+---------------+------------------------+------------------------+

I would like the values transposed, where customDimension_index 2 is color, and customDimension_value 3 is yesno, so results would look like this instead:

+---------------+----------+-------+
| fullvisitorid |  color   | yesno |
+---------------+----------+-------+
|             1 | Bronze   | Yes   |
|             2 | Bronze   | No    |
|             3 | Bronze   | Yes   |
|             4 | Platinum | Yes   |
+---------------+----------+-------+

I could pull one then the other separately and join on fullvisitorid, but hoping to be able to pull out data this way in a single step. Thanks!


回答1:


Here is the solution:

SELECT
  fullvisitorid,
  FIRST(IF(customDimensions.index=2, customDimensions.value, NULL)) color, 
  FIRST(IF(customDimensions.index=3, customDimensions.value, NULL)) yesno
FROM [google.com:analytics-bigquery:LondonCycleHelmet.ga_sessions_20130910]
where customDimensions.index in (2,3)
group by fullvisitorid 

It relies on the fact that any aggregation function, including FIRST, ignores NULLs




回答2:


2020-01 update: Update for #standard SQL

SELECT
  fullvisitorid,
  (SELECT value FROM UNNEST(customDimensions) WHERE index=2) color, 
  (SELECT value FROM UNNEST(customDimensions) WHERE index=3) yesno, 
FROM `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910`

Previously:

Mosha's answer is correct, but I wanted to add this one as it takes advantage of the nested nature of the GA records:

SELECT
  fullvisitorid,
  FIRST(IF(customDimensions.index=2, customDimensions.value, NULL)) WITHIN RECORD color, 
  FIRST(IF(customDimensions.index=3, customDimensions.value, NULL)) WITHIN RECORD yesno
FROM [google.com:analytics-bigquery:LondonCycleHelmet.ga_sessions_20130910]
WHERE customDimensions.index in (2,3)

Why: Instead of running a GROUP BY (which consumes resources as it has to look and group by any record that might have the same customerid), WITHIN RECORD looks only inside the individual rows.

If a customerid has more than one row (for example, they visited once Bronze/Yes and afterwards with Platinum/No), the results will emit every row and combination, instead of only the first one.



来源:https://stackoverflow.com/questions/29662325/transpose-nested-rows-into-columns-in-bigquery-with-google-analytics-data

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!