Google Analytics Metrics are inflated when extracting hit level data using BigQuery

迷失自我 2020-12-22 11:52

I'm trying to display the source property name within the Google Analytics roll-up property I have linked to BigQuery. The issue is that when I try the below, some of the me

4 answers
  •  慢半拍i (original poster)
    2020-12-22 12:33

    Since you need to compute several values at the hit level, unnesting the hits field is probably the best approach. The downside is that you lose the session-level aggregates in the totals field, but you can work around that.
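
To make the effect of unnesting concrete, here is a minimal Python sketch (not BigQuery code) that mimics the row fan-out. The session records and the field name `totals_visits` are made up for illustration. The point is only that session-level values repeat once per hit after unnesting, so they have to be deduplicated rather than summed.

```python
# Hypothetical, simplified sessions: session-level fields plus a list of hit types.
sessions = [
    {"fullVisitorId": "A", "visitId": 1, "totals_visits": 1, "hits": ["PAGE", "EVENT", "PAGE"]},
    {"fullVisitorId": "A", "visitId": 2, "totals_visits": 1, "hits": ["PAGE"]},
    {"fullVisitorId": "B", "visitId": 1, "totals_visits": 1, "hits": ["PAGE", "PAGE"]},
]

# Rough equivalent of `FROM sessions, UNNEST(hits) AS h`: one output row per hit,
# with every session-level field copied onto each row.
rows = [
    {**{k: v for k, v in s.items() if k != "hits"}, "hit_type": h}
    for s in sessions
    for h in s["hits"]
]

# Summing a session-level total over hit rows counts each session once per hit.
inflated_visits = sum(r["totals_visits"] for r in rows)

# Deduplicating on (visitId, fullVisitorId) recovers the real session count,
# which is what COUNT(DISTINCT CONCAT(...)) does in the query.
distinct_visits = len({(r["visitId"], r["fullVisitorId"]) for r in rows})
distinct_users = len({r["fullVisitorId"] for r in rows})

print(inflated_visits, distinct_visits, distinct_users)
```

With three sessions and six hits in total, the naive sum reports six visits while the deduplicated count reports three.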

    As an example, the query could look like this:

    SELECT
      date,
      CASE
        WHEN REGEXP_CONTAINS(h.sourcePropertyInfo.sourcePropertyTrackingId, r'82272640') THEN 'MUG'
        WHEN h.sourcePropertyInfo.sourcePropertyTrackingId = 'Social' THEN 'Social'
        ELSE 'Website'
      END AS Property,
      geoNetwork.country AS Country,
      COUNT(DISTINCT CONCAT(CAST(visitId AS STRING),fullVisitorId)) AS visits,
      COUNT(DISTINCT(fullVisitorId)) AS Users,
      h.sourcePropertyInfo.sourcePropertyDisplayName AS display,
      SUM(CASE
          WHEN REGEXP_CONTAINS(h.page.pagepath, r'/') THEN h.latencyTracking.pageLoadTime END) / SUM(CASE
          WHEN REGEXP_CONTAINS(h.page.pagepath, r'/') THEN h.latencyTracking.pageLoadSample END) AS pageloadspeed,
      COUNT(DISTINCT
        CASE
          WHEN totals.newVisits = 1 THEN CONCAT(CAST(visitId AS STRING),fullVisitorId) END) new_visits,
      COUNT(CASE
             WHEN h.type = 'PAGE' THEN h.page.pagepath END) pageviews,
      SUM(CASE
           WHEN (h.isentrance = TRUE AND h.isexit = TRUE) THEN 1 END) bounces,
      COUNT(DISTINCT (CASE
            WHEN device.isMobile = TRUE THEN CONCAT(CAST(visitId AS STRING),fullVisitorId) END)) mobilevisits,
      COUNT(DISTINCT (CASE
            WHEN trafficSource.medium = 'organic' THEN CONCAT(CAST(visitId AS STRING),fullVisitorId) END)) organicvisits,
      SUM(CASE
           WHEN REGEXP_CONTAINS(h.eventInfo.eventAction,'register$|registersuccess|new registration|account signup|registro') THEN 1 END) AS NewRegistrations,
      SUM(CASE
           WHEN REGEXP_CONTAINS(h.eventInfo.eventAction, 'add to cart|add to bag|click to buy|ass to basket|comprar|addtobasket::') THEN 1 END) AS ClickToBuy,
      COUNT(h.transaction.transactionid) transactions
    FROM
      `project_id.dataset_id.ga_sessions_*`,
      UNNEST(hits) AS h
    WHERE
      1 = 1
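      -- Note: r'.*_(.*)' only matches suffixes that contain an underscore, such as
      -- intraday_YYYYMMDD; for daily ga_sessions_YYYYMMDD tables the suffix is just YYYYMMDD.
      AND PARSE_TIMESTAMP('%Y%m%d', REGEXP_EXTRACT(_table_suffix, r'.*_(.*)')) BETWEEN TIMESTAMP('2017-05-01') AND TIMESTAMP('2017-05-01')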
    GROUP BY
      date,
      Country,
      display,
      Property
    

    I ran it against our dataset and it seems to work. The changes I made:

    • Removed the MAX operation on Property and added Property to the GROUP BY instead.
    • pageviews is computed as the count of hits where h.type = 'PAGE'. I'm not sure whether the same holds for screenviews, though.
    • A bounce is counted when a hit is both the entrance and the exit of its session.
    • Total transactions is a count of transaction IDs (hopefully this field is populated in your dataset as well).
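
The hit-level definitions listed above can be sketched in plain Python. The `Hit` class and the `summarize` function are hypothetical stand-ins for the unnested hit rows, not part of any Google API. The logic is meant to mirror the CASE expressions in the query, under the assumption that each element represents one row after UNNEST(hits).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Hit:
    """Hypothetical, simplified stand-in for one unnested hit row."""
    type: str
    is_entrance: bool = False
    is_exit: bool = False
    transaction_id: Optional[str] = None


def summarize(hits: List[Hit]) -> Dict[str, int]:
    """Compute the hit-level metrics the same way the query's CASE expressions do."""
    return {
        # COUNT(CASE WHEN h.type = 'PAGE' THEN ... END)
        "pageviews": sum(1 for h in hits if h.type == "PAGE"),
        # SUM(CASE WHEN h.isentrance AND h.isexit THEN 1 END)
        "bounces": sum(1 for h in hits if h.is_entrance and h.is_exit),
        # COUNT(h.transaction.transactionid) counts hits with a non-null ID
        "transactions": sum(1 for h in hits if h.transaction_id is not None),
    }


hits = [
    Hit("PAGE", is_entrance=True, is_exit=True),      # single-page session: a bounce
    Hit("PAGE", is_entrance=True),                    # start of a longer session
    Hit("EVENT"),                                     # not a pageview
    Hit("PAGE", is_exit=True, transaction_id="T-1"),  # exit page carrying a transaction
]
summary = summarize(hits)
print(summary)
```

One difference from the sketch: in BigQuery, SUM over a CASE with no matching rows returns NULL rather than 0, so wrapping such sums in IFNULL(..., 0) may be useful.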
