UNNEST expression references column which is neither grouped nor aggregated

坚强是说给别人听的谎言 提交于 2019-12-01 21:49:46

问题


Google Analytics BigQuery tables are structured like this (Legacy SQL notations - only relevant fields are shown):

visitId:                      INTEGER
hits:                         RECORD/REPEATED
hits.hour:                    INTEGER

On one such table, the following query works well:

SELECT
  visitId,
  MIN(h.hour) AS firstHitHour
FROM
  `my-table.ga_sessions_20161122`, UNNEST(hits) AS h
GROUP BY
  visitId

But using this alternative syntax:

SELECT
  visitId,
  (SELECT MIN(hour) FROM UNNEST(hits)) as firstHitHour
FROM
  `my-table.ga_sessions_20161122`
GROUP BY
  visitId

Triggers the following error:

Error: UNNEST expression references column hits which is neither grouped nor aggregated

I understand that UNNEST(hits) must be somehow grouped or aggregated, but since this column is an array (repeated), what does it mean exactly?

If I try to "group the column hits", as requested, like this:

(SELECT MIN(hour) FROM UNNEST(hits) as h GROUP BY h) as firstHitHour

Then I get a Grouping by expressions of type STRUCT is not allowed error.

How can this alternative syntax be corrected to produce the same result as the first one?


回答1:


My first Answer is for original version of this question.
When I answered, I realized you have changed it to quite different one :o)

So below answer is for most recent version of your question:

I think that in "alternative" version you just do not need GROUP BY at all, because you operate on original (un-flattened) row by row and for each row (visitId) you calculate firstHitHour

SELECT
  visitId,
  (SELECT MIN(hour) FROM UNNEST(hits)) as firstHitHour
FROM
  `my-table.ga_sessions_20161122`

In your initial query - you kind of flattening all records for each row - so that's why you need then to group them back




回答2:


Try below (it uses example from your original question):

SELECT
  visitId, source, medium, browser,
  MIN(hour) AS firstHitHour,
  LOGICAL_OR(hasValue) AS hasValue
FROM (
  SELECT
    visitId,
    trafficSource.source AS source,
    trafficSource.medium AS medium,
    device.browser AS browser,
    h.hour AS hour,
    EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventCategory = "SomeValue") AS hasValue
  FROM
   `my-table.ga_sessions_20161122`, UNNEST(hits) AS h
)
GROUP BY
visitId, source, medium, browser;


来源:https://stackoverflow.com/questions/40792981/unnest-expression-references-column-which-is-neither-grouped-nor-aggregated

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!