Partition by week/year/month to get over the partition limit?

久未见 submitted on 2019-11-27 14:33:40

Instead of partitioning by day, you could partition by week/month/year.

In my case, each year holds around 3 GB of data, so I'll get the most benefit from clustering if I partition by year.

For this, I'll create a year date column, and partition by it:

CREATE TABLE `fh-bigquery.flights.ontime_201903`
PARTITION BY FlightDate_year
CLUSTER BY Origin, Dest
AS
-- Add a column holding January 1 of each flight's year, used as the partitioning column
SELECT *, DATE_TRUNC(FlightDate, YEAR) AS FlightDate_year
FROM `fh-bigquery.flights.raw_load_fixed`

Note that I created the extra column DATE_TRUNC(FlightDate, YEAR) AS FlightDate_year in the process.
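The same pattern should carry over to the other granularities mentioned at the top. As a minimal sketch, assuming a monthly layout suits the data better, only the truncation unit and the column name change. The table name `my_dataset.ontime_by_month` and the column `FlightDate_month` are hypothetical, chosen only for illustration.

```sql
-- Hypothetical monthly variant of the table above.
-- `my_dataset.ontime_by_month` and FlightDate_month are illustrative names,
-- not part of the original dataset.
CREATE TABLE `my_dataset.ontime_by_month`
PARTITION BY FlightDate_month
CLUSTER BY Origin, Dest
AS
SELECT
  *,
  DATE_TRUNC(FlightDate, MONTH) AS FlightDate_month  -- first day of each flight's month
FROM `fh-bigquery.flights.raw_load_fixed`
```

Keep in mind that a monthly layout creates twelve partitions per year of data instead of one, so it uses up the partition limit faster than the yearly version.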

Table stats:
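One rough way to get comparable numbers yourself is to count the rows behind each yearly partition value. This is only a sketch; the aliases `flights`, `first_flight_date`, and `last_flight_date` are names I made up for the output columns.

```sql
-- Rows per yearly partition, with the date range each partition covers.
-- Output column aliases are illustrative.
SELECT
  FlightDate_year,
  COUNT(*) AS flights,
  MIN(FlightDate) AS first_flight_date,
  MAX(FlightDate) AS last_flight_date
FROM `fh-bigquery.flights.ontime_201903`
GROUP BY FlightDate_year
ORDER BY FlightDate_year
```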

Since the table is clustered, I'll get the benefits of partitioning even if I don't use the partitioning column (year) as a filter:

SELECT *
FROM `fh-bigquery.flights.ontime_201903`
WHERE FlightDate BETWEEN '2008-01-01' AND '2008-01-10'

Predicted cost: 83.4 GB
Actual cost: 3.2 GB
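
If the year is known up front, the partitioning column can be added as a filter too, so that only the matching partition is considered. A minimal sketch of that variant, assuming the table created above:

```sql
SELECT *
FROM `fh-bigquery.flights.ontime_201903`
WHERE FlightDate_year = DATE '2008-01-01'  -- restricts the scan to the 2008 partition
  AND FlightDate BETWEEN '2008-01-01' AND '2008-01-10'
```

The figures quoted above apply only to the original query. The bytes scanned by this variant depend on the table and are not measured here.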