Understanding self joins and flattening

Submitted by 对着背影说爱祢 on 2020-01-06 06:57:07

Question


I'll start with the fact that I'm a newbie and I managed to hack this original query together. I've looked over many examples but I'm just not wrapping my head around self joins and displaying the data I want to see.

I'm feeding BQ with mobile app data daily and thus am querying multiple tables. I'm trying to query for a count of fatal crashes by IMEI by date. This query does give me most of the output I want as it returns Date, IMEI and Count.

However, I want the output to be Date, IMEI, Branch, Truck and Count. user_dim.user_properties.key is a nested field, and in my query I'm specifically asking for user_dim.user_properties.key = 'imei_id' and getting its value from user_dim.user_properties.value.value.string_value.

I don't understand how I would perform the join to also get back the values where user_dim.user_properties.key = 'truck_id' and user_dim.user_properties.key = 'branch_id', so that my output ends up as Date, IMEI, Branch, Truck and Count in one row.

Thanks for your help.

SELECT
  event_dim.date AS Date,
  user_dim.user_properties.value.value.string_value AS IMEI,
  COUNT(*) AS Count
FROM
  FLATTEN((
    SELECT
      *
    FROM
      TABLE_QUERY([smarttruck-6d137:com_usiinc_android_ANDROID], 'table_id CONTAINS "app_events_"')), user_dim.user_properties)
WHERE
  user_dim.user_properties.key = 'imei_id'
  AND event_dim.name = 'app_exception'
  AND event_dim.params.key = 'fatal'
  AND event_dim.params.value.int_value = 1
  AND event_dim.date = '20170807'
GROUP BY
  Date,
  IMEI
ORDER BY
  Count DESC

Answer 1:


Here is a query that should work for you, using standard SQL:

#standardSQL
SELECT
  event_dim.date AS Date,
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'imei_id') AS IMEI,
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'branch_id') AS branch_id,
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'truck_id') AS truck_id,
  COUNT(*) AS Count
FROM `smarttruck-6d137.com_usiinc_android_ANDROID.app_events_*`
CROSS JOIN UNNEST(event_dim) AS event_dim
WHERE
  event_dim.name = 'app_exception' AND
  EXISTS (
    SELECT 1 FROM UNNEST(event_dim.params)
    WHERE key = 'fatal' AND value.int_value = 1
  ) AND
  event_dim.date = '20170807'
GROUP BY
  Date,
  IMEI,
  branch_id,
  truck_id
ORDER BY
  Count DESC;
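
The reason no self join is needed is that each row already carries all of its user properties in the repeated field user_dim.user_properties, so one scalar subquery per key can pick out each value. The self-contained sketch below shows that pattern on a single made-up row. The table name sample_events and the values '123456', 'B01', and 'T42' are hypothetical stand-ins, not data from the real tables.

```sql
#standardSQL
-- Hypothetical inline row standing in for the app_events_* tables.
WITH sample_events AS (
  SELECT
    STRUCT(
      [
        STRUCT('imei_id' AS key,
               STRUCT(STRUCT('123456' AS string_value) AS value) AS value),
        STRUCT('branch_id' AS key,
               STRUCT(STRUCT('B01' AS string_value) AS value) AS value),
        STRUCT('truck_id' AS key,
               STRUCT(STRUCT('T42' AS string_value) AS value) AS value)
      ] AS user_properties
    ) AS user_dim
)
SELECT
  -- Each scalar subquery scans the same array and keeps one key,
  -- which turns three array elements into three columns of one row.
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'imei_id') AS IMEI,
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'branch_id') AS branch_id,
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'truck_id') AS truck_id
FROM sample_events;
```

Note that a scalar subquery must produce at most one row, so this pattern assumes each key appears only once per user_properties array.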

A couple of thoughts/suggestions, though:

  • To restrict how much data you scan, you probably want to filter on _TABLE_SUFFIX = '20170807' instead of event_dim.date = '20170807'. This will be cheaper and (if I understand correctly) will return the same results.
  • If the combinations of IMEI, branch_id, and truck_id are unique, there probably isn't a benefit to computing the count, so you can remove the COUNT(*) and also the GROUP BY/ORDER BY clauses.
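
The first suggestion could look roughly like the sketch below. It keeps only the IMEI column for brevity and assumes the daily tables are named app_events_YYYYMMDD, so that the wildcard leaves the date as the table suffix. It has not been run against the real dataset.

```sql
#standardSQL
-- Sketch: filter on _TABLE_SUFFIX so only the matching daily table is scanned.
-- Assumes daily tables are named app_events_YYYYMMDD.
SELECT
  event_dim.date AS Date,
  (SELECT value.value.string_value
   FROM UNNEST(user_dim.user_properties)
   WHERE key = 'imei_id') AS IMEI,
  COUNT(*) AS Count
FROM `smarttruck-6d137.com_usiinc_android_ANDROID.app_events_*`
CROSS JOIN UNNEST(event_dim) AS event_dim
WHERE
  _TABLE_SUFFIX = '20170807' AND
  event_dim.name = 'app_exception' AND
  EXISTS (
    SELECT 1 FROM UNNEST(event_dim.params)
    WHERE key = 'fatal' AND value.int_value = 1
  )
GROUP BY
  Date,
  IMEI
ORDER BY
  Count DESC;
```

If a table for a given day can contain events whose event_dim.date differs from the table's suffix, the two filters are not interchangeable, and keeping both conditions would be the more cautious choice.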


Source: https://stackoverflow.com/questions/45574941/understanding-self-joins-and-flattening
