I want “live materialized views”, with the latest info for any row

萌比男神i 2021-01-06 09:12

I saw this solution as an alternative to materialized views:

  • I want a "materialized view" of the latest records

But it's using the sc

1 answer
  • 予麋鹿 (original poster)
    2021-01-06 09:57

    2018-10: BigQuery doesn't support materialized views, but you can use this approach:

    • Use the previous solution to "materialize" a summary of the latest data, up to the time the scheduled query last ran.
    • Create a view that combines the materialized data with a live view of the latest data in the append-only table.

    Code would look like this:

    CREATE OR REPLACE VIEW `wikipedia_vt.just_latest_rows_live` AS
    SELECT latest_row.*
    FROM (
      # keep only the most recent row per title
      SELECT ARRAY_AGG(a ORDER BY datehour DESC LIMIT 1)[OFFSET(0)] latest_row
      FROM (
        # previously "materialized" results
        SELECT * FROM `fh-bigquery.wikipedia_vt.just_latest_rows`
        UNION ALL
        # append-only table, source of truth
        SELECT * FROM `fh-bigquery.wikipedia_v3.pageviews_2018`
        WHERE datehour > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 DAY)
      ) a
      GROUP BY title
    )
    

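    Note that BigQuery can use the datehour > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 DAY) filter to prune partitions effectively, so the live half of the UNION ALL reads only the most recent partitions of the append-only table.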
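
    The "previous solution" behind the first step is not reproduced in this thread, so here is only a rough sketch of what the scheduled query might look like. It assumes the query is written as a DDL statement and that it rebuilds the same `just_latest_rows` table the view reads from; the schedule itself (for example, daily) is also an assumption.

    ```sql
    # Hypothetical scheduled query: rebuilds the "materialized" table
    # with the latest row per title from the append-only table.
    CREATE OR REPLACE TABLE `fh-bigquery.wikipedia_vt.just_latest_rows` AS
    SELECT latest_row.*
    FROM (
      SELECT ARRAY_AGG(a ORDER BY datehour DESC LIMIT 1)[OFFSET(0)] latest_row
      FROM `fh-bigquery.wikipedia_v3.pageviews_2018` a
      GROUP BY title
    )
    ```

    Two things to watch if you adapt this. The view uses SELECT * on both sides of the UNION ALL, which matches columns by position, so the materialized table has to keep the same columns in the same order as the append-only table; latest_row.* preserves that. Also, the 2-day window in the view only covers rows newer than two days, so the scheduled query needs to run more often than that or some rows will be missed.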
