Calculating a current day 7 day active user with BigQuery?

此生再无相见时 submitted on 2020-01-05 04:20:23

Question


Calculating current-day N-day active users should be simple, if I'm not mistaken: take today plus the preceding x days (for 7-day active users, that is 6 days back) and count the distinct IDs. I have the following query for 2-day active users:

WITH allTables AS (
  SELECT 
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
    event.date,
    user_dim.app_info.app_instance_id as users
  FROM `dataset.app_events_intraday_20170407`
  CROSS JOIN
    UNNEST(event_dim) AS event

  UNION ALL
  SELECT 
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
    event.date,
    user_dim.app_info.app_instance_id as users
  FROM `dataset.app_events_20170406`
  CROSS JOIN
    UNNEST(event_dim) AS event
)
SELECT
  COUNT(DISTINCT users) AS unique,
  COUNT(users) AS total
FROM allTables

This covers 2-day active users; for 7-day or 30-day active users I would just add a UNION ALL for each additional daily table. Is this correct, or does it need modification?


Answer 1:


Instead of adding one UNION ALL per daily table, you should try a wildcard table; see "Querying Multiple Tables Using a Wildcard Table" in the BigQuery documentation.

Try something like the following:

#standardSQL
WITH allTables AS (
  SELECT 
    CONCAT(user_dim.app_info.app_instance_id, ':', user_dim.app_info.app_platform) AS app,
    event.date,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset.app_events_intraday_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX BETWEEN '20170401' AND '20170407' 
  UNION ALL
  SELECT 
    CONCAT(user_dim.app_info.app_instance_id, ':', user_dim.app_info.app_platform) AS app,
    event.date,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset.app_events_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX BETWEEN '20170401' AND '20170407' 
) 
SELECT 
  COUNT(DISTINCT(users)) AS unique,
  COUNT(users) AS total
FROM allTables

You can use the following WHERE clause to make it more generic, so that the 7-day window always ends on the current date:

WHERE _TABLE_SUFFIX 
   BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 6 DAY)) 
   AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
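
Dropped into the full query, the dynamic filter might look like the sketch below. It assumes the placeholder dataset name and table prefixes from the question, and the output column names (active_users_7d, total_events) are illustrative rather than taken from the original. CURRENT_DATE() uses UTC by default; if the daily tables are dated in a different time zone, passing a time zone name, for example CURRENT_DATE('America/Los_Angeles'), is one option.

```sql
#standardSQL
-- Sketch only: 7-day active users for the window ending today.
-- `dataset` and the app_events_* prefixes are the question's placeholders.
WITH allTables AS (
  -- Intraday table: events recorded so far for the current day
  SELECT user_dim.app_info.app_instance_id AS users
  FROM `dataset.app_events_intraday_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX
    BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 6 DAY))
    AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
  UNION ALL
  -- Finalized daily tables. Suffixes such as 'intraday_20170407' also match
  -- this wildcard, but they sort after any 'YYYYMMDD' string and therefore
  -- fall outside the BETWEEN range.
  SELECT user_dim.app_info.app_instance_id AS users
  FROM `dataset.app_events_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX
    BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 6 DAY))
    AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
)
SELECT
  COUNT(DISTINCT users) AS active_users_7d,  -- distinct app instances in the window
  COUNT(users) AS total_events               -- one row per unnested event
FROM allTables
```

Changing INTERVAL 6 DAY to INTERVAL 29 DAY in both branches would give the 30-day figure mentioned in the question.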

Also, please note: I changed app_id in user_dim.app_info.app_id to app_instance_id because I thought it was a typo on your side, but I could be wrong.
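
If app_id was in fact what the question intended, the app column could be used to report active users per app instead of a single overall figure. A minimal sketch under that assumption, reusing the placeholder table names and the fixed date range from the query above; the output column names are illustrative:

```sql
#standardSQL
-- Sketch only: assumes app_id (not app_instance_id) identifies the app.
WITH allTables AS (
  SELECT
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset.app_events_intraday_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX BETWEEN '20170401' AND '20170407'
  UNION ALL
  SELECT
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset.app_events_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX BETWEEN '20170401' AND '20170407'
)
SELECT
  app,
  COUNT(DISTINCT users) AS active_users,  -- distinct app instances per app
  COUNT(users) AS total_events            -- one row per unnested event
FROM allTables
GROUP BY app
ORDER BY app
```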



Source: https://stackoverflow.com/questions/43283945/calculating-a-current-day-7-day-active-user-with-bigquery
