Question
Using GA BigQuery data, I am trying to calculate the total pageviews across 3 dimensions: date, device category, and a custom dimension (called "type" here).
The desired output lists total pageviews for each combination of date, device, and type.
I used the following query to get this result. I need to unnest the "type" dimension because it is a custom dimension.
#standardsql
SELECT date, device, cd6_type, SUM(pvs) AS pageviews
FROM (
  SELECT
    date,
    fullVisitorId,
    visitId,
    totals.pageviews AS pvs,
    device.deviceCategory AS device,
    MAX(IF(hcd.index = 6, hcd.value, NULL)) AS cd6_type
  FROM `ga360-173318.62903073.ga_sessions_*` AS t,
    UNNEST(t.hits) AS h,
    UNNEST(h.customDimensions) AS hcd
  WHERE _TABLE_SUFFIX BETWEEN (SELECT FORMAT_DATE('%Y%m%d', '2019-07-08'))
      AND (SELECT FORMAT_DATE('%Y%m%d', '2019-07-08'))
    AND h.type = "PAGE"
  GROUP BY
    date,
    fullVisitorId,
    visitId,
    totals.pageviews,
    device
)
GROUP BY date, device, cd6_type
The problem is that my results do not match what appears in GA; the query returns lower pageview counts. In GA, the corresponding figures are:
- 180,812 mobile, Type A pageviews (compared to 149,149 in GBQ)
- 30,949 tablet, Type A pageviews (compared to 16,863 in GBQ)
I'm not sure why they don't match across the 2 systems, and am wondering how others calculate total pageviews across dimensions.
Answer 1:
You're cross joining with customDimensions, so you're no longer counting pages but custom dimensions on pages, and any hit with no custom dimensions drops out of the result entirely. You don't need the cross join: fetch the custom dimension with a scalar subquery and count the hits directly.
#standardsql
SELECT
  date,
  device.deviceCategory AS device,
  -- Scalar subquery: read custom dimension 6 from this hit without adding rows
  (SELECT hcd.value FROM UNNEST(h.customDimensions) AS hcd WHERE hcd.index = 6) AS cd6_type,
  COUNT(1) AS pageviews  -- one row per PAGE hit
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*` AS t,
  UNNEST(t.hits) AS h
WHERE _TABLE_SUFFIX BETWEEN '20170801' AND '20170801'
  AND h.type = "PAGE"
GROUP BY date, device, cd6_type
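To see the effect of the cross join in isolation, here is a minimal sketch on made-up inline data rather than a real GA export. The `sessions` CTE, its single session, and the custom dimension values are all hypothetical; only the field names (`hits`, `type`, `customDimensions`, `index`, `value`) mirror the GA schema used above. Hits 1 and 2 each carry two custom dimensions and hit 3 carries none.

```sql
#standardsql
-- Hypothetical inline data: one session with three PAGE hits.
WITH sessions AS (
  SELECT
    '20190708' AS date,
    [
      STRUCT(1 AS hitNumber, 'PAGE' AS type,
             [STRUCT(6 AS index, 'Type A' AS value),
              STRUCT(7 AS index, 'other' AS value)] AS customDimensions),
      STRUCT(2 AS hitNumber, 'PAGE' AS type,
             [STRUCT(6 AS index, 'Type A' AS value),
              STRUCT(7 AS index, 'other' AS value)] AS customDimensions),
      -- Hit 3 has an empty customDimensions array.
      STRUCT(3 AS hitNumber, 'PAGE' AS type,
             ARRAY<STRUCT<index INT64, value STRING>>[] AS customDimensions)
    ] AS hits
)
-- Cross join: one row per (hit, custom dimension) pair.
SELECT 'cross join' AS method, COUNT(1) AS row_count
FROM sessions AS t,
  UNNEST(t.hits) AS h,
  UNNEST(h.customDimensions) AS hcd
WHERE h.type = 'PAGE'
UNION ALL
-- Hits only: one row per PAGE hit.
SELECT 'hits only' AS method, COUNT(1) AS row_count
FROM sessions AS t,
  UNNEST(t.hits) AS h
WHERE h.type = 'PAGE'
```

With this data the cross join yields 4 rows (two each for hits 1 and 2, none for hit 3), while unnesting only the hits yields 3, one per pageview. Real exports will differ in how many custom dimensions each hit carries, so the size and direction of the discrepancy depend on the property's setup.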
Source: https://stackoverflow.com/questions/56992924/ga-bigquery-calculating-pageviews-with-a-custom-dimension