BigQuery: Select most recent of group of rows with ARRAY type field

Submitted by 风格不统一 on 2021-02-11 13:43:24

Question


I have a table with 3 columns: String, Datetime, ARRAY().

Name  |       LastLogin       | FavoriteNumbers
Paul  | "2019-03-03T06:29:35" | (1, 3, 6, 8)
Paul  | "2019-03-03T02:29:35" | (1, 3, 6, 8)
Paul  | "2019-03-01T01:29:35" | (1, 3, 6, 8)
Anna  | "2019-03-03T02:29:35" | (1, 2, 3, 4)
Anna  | "2019-03-03T01:29:35" | (1, 2, 3, 4)
Maya  | "2019-03-02T10:29:35" | (9, 11, 13, 8)

This is the result I want:

Paul  | "2019-03-03T06:29:35" | (1, 3, 6, 8)
Anna  | "2019-03-03T02:29:35" | (1, 2, 3, 4)
Maya  | "2019-03-02T10:29:35" | (9, 11, 13, 8)

I tried to use GROUP BY with ARRAY_AGG to get the latest timestamp for each Name, but it doesn't work because GROUP BY can't be used on an ARRAY type field.

How can I get the result that I want using Standard SQL?


Answer 1:


Group by Name only, aggregate the remaining columns into an array of structs with ARRAY_AGG, keep just the most recent element, then expand its fields:

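SELECT
  Name,
  -- Pack the non-grouped columns into a struct so the ARRAY column
  -- never has to appear in GROUP BY.
  ARRAY_AGG(
    STRUCT(LastLogin, FavoriteNumbers)
    ORDER BY LastLogin DESC LIMIT 1  -- keep only the latest login per Name
  )[OFFSET(0)].*                     -- take that element and expand its fields
FROM dataset.table
GROUP BY Name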
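
To try the query without an existing table, the sample rows from the question can be inlined as a CTE. This is only a minimal sketch: the CTE name `logins` is made up for illustration, and it assumes that LastLogin is a DATETIME and that FavoriteNumbers is an array of integers, as the sample data suggests.

```sql
-- Hypothetical inline copy of the question's sample data (the name `logins` is an assumption).
WITH logins AS (
  SELECT 'Paul' AS Name,
         DATETIME '2019-03-03T06:29:35' AS LastLogin,
         [1, 3, 6, 8] AS FavoriteNumbers UNION ALL
  SELECT 'Paul', DATETIME '2019-03-03T02:29:35', [1, 3, 6, 8] UNION ALL
  SELECT 'Paul', DATETIME '2019-03-01T01:29:35', [1, 3, 6, 8] UNION ALL
  SELECT 'Anna', DATETIME '2019-03-03T02:29:35', [1, 2, 3, 4] UNION ALL
  SELECT 'Anna', DATETIME '2019-03-03T01:29:35', [1, 2, 3, 4] UNION ALL
  SELECT 'Maya', DATETIME '2019-03-02T10:29:35', [9, 11, 13, 8]
)
SELECT
  Name,
  -- Same pattern as the answer: one struct per row, latest one kept per Name.
  ARRAY_AGG(
    STRUCT(LastLogin, FavoriteNumbers)
    ORDER BY LastLogin DESC LIMIT 1
  )[OFFSET(0)].*
FROM logins
GROUP BY Name
```

Run as written, this should return the three rows listed as the desired result in the question, one per Name. Row order is not guaranteed unless an ORDER BY is added to the outer query.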


Source: https://stackoverflow.com/questions/54966855/bigquery-select-most-recent-of-group-of-rows-with-array-type-field
