Computing a moving maximum in BigQuery

长发绾君心 2021-01-03 00:18

Given a BigQuery table with some ordering, and some numbers, I'd like to compute a "moving maximum" of the numbers -- similar to a moving average, but for a maximum instead.

3 answers
  • 2021-01-03 00:19

    There's an example of computing a moving average with a window function in the docs.

    Quoting:

    The following example calculates a moving average of the values in the current row and the row preceding it. The window frame comprises two rows that move with the current row.

    #legacySQL
    SELECT
      name,
      value,
      AVG(value)
        OVER (ORDER BY value
              ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
        AS MovingAverage
    FROM
      (SELECT "a" AS name, 0 AS value),
      (SELECT "b" AS name, 1 AS value),
      (SELECT "c" AS name, 2 AS value),
      (SELECT "d" AS name, 3 AS value),
      (SELECT "e" AS name, 4 AS value);
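
    For the moving maximum asked about here, the same window frame should work with MAX in place of AVG. A minimal sketch, assuming BigQuery standard SQL rather than the legacy dialect quoted above; the inline sample rows and the moving_max alias are made up for illustration:

    ```sql
    #standardSQL
    -- Hypothetical sample data, ordered by name; the values are
    -- deliberately non-monotonic so the maximum is not trivial.
    SELECT
      name,
      value,
      -- Frame: the current row plus the two rows before it.
      MAX(value)
        OVER (ORDER BY name
              ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
        AS moving_max
    FROM UNNEST([
      STRUCT('a' AS name, 3 AS value),
      ('b', 1),
      ('c', 4),
      ('d', 1),
      ('e', 5)
    ])
    ORDER BY name;
    ```

    Each result covers at most three rows, so for this sample moving_max should come out as 3, 3, 4, 4, 5. The first two rows see a shorter frame because fewer than two rows precede them.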
    
  • 2021-01-03 00:44

    A trick I use for rolling windows: CROSS JOIN with a table of numbers. Here, for a moving window of 3 years, I cross join with the numbers 0, 1, 2, so each row is copied once into each of the three windows that contain it. Each copy gets a group id (ending_at_year = year + i, the last year of its window), and grouping by that id yields one row per window. The HAVING clause keeps only windows that cover all 3 years.

    SELECT ending_at_year, MAX(mean_temp) max_temp, COUNT(DISTINCT year) c
    FROM 
    (
     SELECT mean_temp, year+i ending_at_year, year
     FROM [publicdata:samples.gsod] a
     CROSS JOIN 
      (SELECT i FROM [fh-bigquery:public_dump.numbers_255] WHERE i<3) b
     WHERE station_number=722860
    )
    GROUP BY ending_at_year
    HAVING c=3
    ORDER BY ending_at_year;
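
    The same idea can be tried without access to the tables above. A minimal sketch, assuming BigQuery standard SQL; the temps rows are invented sample data, and GENERATE_ARRAY(0, 2) stands in for the numbers table. Here each window is labelled by its last year, using year + i:

    ```sql
    #standardSQL
    WITH temps AS (
      -- Hypothetical yearly readings for a single station.
      SELECT * FROM UNNEST([
        STRUCT(2000 AS year, 10.0 AS mean_temp),
        (2001, 12.0),
        (2002, 11.0),
        (2003, 9.0),
        (2004, 13.0)
      ])
    )
    SELECT
      year + i AS ending_at_year,      -- last year of the 3-year window
      MAX(mean_temp) AS max_temp,
      COUNT(DISTINCT year) AS c
    FROM temps
    -- Copy every row into the three windows that contain it.
    CROSS JOIN UNNEST(GENERATE_ARRAY(0, 2)) AS i
    GROUP BY ending_at_year
    HAVING c = 3                       -- drop incomplete windows at the edges
    ORDER BY ending_at_year;
    ```

    For this sample only the windows ending in 2002, 2003 and 2004 are complete, with maxima 12.0, 12.0 and 13.0. Unlike a ROWS frame, this groups by the year value itself, so a missing year shows up as an incomplete window instead of silently widening the range.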
    
  • 2021-01-03 00:44

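    Here is another way to get the same result: union the table with two copies of itself whose counts are shifted by one and two rows using LEAD (in legacy SQL, a comma between subqueries in FROM means UNION ALL), then take the MAX per word. See the query below: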

    SELECT word, max(words)
    FROM 
      (SELECT word,
        word_count AS words
      FROM [publicdata:samples.shakespeare]
      WHERE corpus = 'macbeth'), 
      (SELECT word,
        LEAD(word_count, 1) OVER (ORDER BY word) AS words
      FROM [publicdata:samples.shakespeare]
      WHERE corpus = 'macbeth'), 
      (SELECT word,
        LEAD(word_count, 2) OVER (ORDER BY word) AS words
      FROM [publicdata:samples.shakespeare]
      WHERE corpus = 'macbeth')
    GROUP BY word
    ORDER BY word
    

    You can try it and compare its performance with your approach (I haven't benchmarked it).
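
    A standard SQL rendering of the same shifted-copies idea, as a sketch only: the rows in t are invented sample data, and UNION ALL is written out explicitly because the comma in FROM has a different meaning outside legacy SQL.

    ```sql
    #standardSQL
    WITH t AS (
      -- Hypothetical word counts, one row per word.
      SELECT * FROM UNNEST([
        STRUCT('a' AS word, 3 AS word_count),
        ('b', 1),
        ('c', 4),
        ('d', 1),
        ('e', 5)
      ])
    )
    SELECT word, MAX(words) AS max_words
    FROM (
      -- The row's own count.
      SELECT word, word_count AS words FROM t
      UNION ALL
      -- The count from one row ahead (NULL on the last row).
      SELECT word, LEAD(word_count, 1) OVER (ORDER BY word) AS words FROM t
      UNION ALL
      -- The count from two rows ahead (NULL on the last two rows).
      SELECT word, LEAD(word_count, 2) OVER (ORDER BY word) AS words FROM t
    )
    GROUP BY word
    ORDER BY word;
    ```

    This window looks forward: each word gets the maximum of its own count and the next two. MAX skips the NULLs that LEAD produces near the end, so the sample should give 4, 4, 5, 5, 5. It also relies on word being unique within the input, since the final step groups by it.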
