Use Google BigQuery to build a histogram graph

再見小時候 2020-12-17 22:11

How can I write a query that makes histogram graph rendering easier?

For example, we have 100 million people with ages, we want to draw the histogram/buckets for age 0

7 answers
  •  南笙 (OP)
    2020-12-17 22:26

    With #standardSQL and an auxiliary stats query, we can define the range the histogram should look into.

    Here is an example for the flight time between SFO and JFK, with 10 buckets:

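    WITH data AS (
        SELECT *, ActualElapsedTime AS datapoint
        FROM `fh-bigquery.flights.ontime_201903`
        WHERE FlightDate_year = "2018-01-01"
        AND Origin = 'SFO' AND Dest = 'JFK'
    )
    , stats AS (
      -- One row per bucket with half-open bounds [min, max) of width step.
      -- i = 10 adds a final bucket starting at the overall maximum,
      -- so the largest datapoint is still counted.
      SELECT min + step*i AS min, min + step*(i+1) AS max
      FROM (
        SELECT max-min diff, min, max, (max-min)/10 step, GENERATE_ARRAY(0, 10, 1) i
        FROM (
          SELECT MIN(datapoint) min, MAX(datapoint) max
          FROM data
        )
      ), UNNEST(i) i
    )

    -- Count datapoints per bucket; avg is the bucket midpoint.
    SELECT COUNT(*) count, (min+max)/2 avg
    FROM data
    JOIN stats
    ON data.datapoint >= stats.min AND data.datapoint < stats.max
    GROUP BY avg
    ORDER BY avg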

    If you need round numbers, see: https://stackoverflow.com/a/60159876/132438
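
    For the age example in the question, fixed-width buckets may not need the stats step at all. A minimal sketch, assuming a hypothetical `people` table with an integer `age` column (inlined here as sample data so the query runs on its own) and an assumed bucket width of 10:

    ```sql
    #standardSQL
    WITH people AS (
      -- Hypothetical sample data standing in for the real table.
      SELECT age FROM UNNEST([3, 17, 25, 29, 41]) AS age
    )
    SELECT
      DIV(age, 10) * 10 AS bucket_start,  -- integer division: 0, 10, 20, ...
      COUNT(*) AS people_count
    FROM people
    GROUP BY bucket_start
    ORDER BY bucket_start
    ```

    Buckets with no rows are simply absent from the result, so the charting layer may need to fill them in with zero.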
