Creating External table from GCS with hive partition information in BigQuery using Command Line

Submitted by 北战南征 on 2020-04-17 22:05:12

Question


I have a bucket in Google Cloud Storage with the following naming hierarchy.

gs://<bucket>/events/year=2020/month=03/day=23/hour=12
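For reference, `--hive_partitioning_mode=AUTO` infers partition columns from the `key=value` segments that follow the source URI prefix. A minimal local sketch (the object path below is hypothetical) showing which columns such a layout yields:

```shell
#!/bin/sh
# Prints one "partition column" line per hive-style key=value segment
# in an object path laid out like the bucket above.
parse_partitions() {
  for segment in $(printf '%s\n' "$1" | tr '/' ' '); do
    case "$segment" in
      *=*) echo "partition column: ${segment%%=*} = ${segment#*=}" ;;
    esac
  done
}

# Hypothetical object under the prefix gs://<bucket>/events/
parse_partitions "events/year=2020/month=03/day=23/hour=12/data.json"
# -> partition column: year = 2020
#    partition column: month = 03
#    partition column: day = 23
#    partition column: hour = 12
```

With AUTO mode, BigQuery would expose `year`, `month`, `day`, and `hour` as queryable columns, which is what the partition-filter queries below rely on.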

I've used the following command to create a BigQuery native table with hive partition information, importing from Google Cloud Storage. This works.

bq load --project_id=<projectId> --source_format=NEWLINE_DELIMITED_JSON --autodetect --hive_partitioning_mode=AUTO --hive_partitioning_source_uri_prefix=gs://<bucket>/events/ <targetTableName> "gs://<bucket>/events/*" "<schema>"

And the following query works as well.

SELECT * from table WHERE year=2020

But when I try to do the same with an external table, the table is created with the hive partition information, yet the queries do not recognise any of the partitions.

bq mkdef --source_format=NEWLINE_DELIMITED_JSON --autodetect --hive_partitioning_mode=AUTO --hive_partitioning_source_uri_prefix="gs://<bucket>/events" --require_hive_partition_filter=True "gs://<bucket>/events/*" > <tableDef>

bq mk --dataset_id=<datasetId> --data_source=google_cloud_storage --external_table_definition=<tableDef> --schema=<schema> --table <tableName>
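For reference, here is a minimal sketch of the same two-step flow with hypothetical concrete names filled in (bucket, dataset, table, and definition-file path are all assumptions). It only assembles and prints the commands rather than executing them, since running them requires GCP credentials; note it passes the target as a `dataset.table` positional argument, which is the usual `bq mk` form:

```shell
#!/bin/sh
# Hypothetical names -- substitute your own project, dataset, bucket, and table.
BUCKET="my-bucket"
DATASET="my_dataset"
TABLE="events_ext"
TABLE_DEF="/tmp/events_table_def.json"

# Step 1: generate the external table definition with hive partition detection.
MKDEF="bq mkdef --source_format=NEWLINE_DELIMITED_JSON --autodetect \
  --hive_partitioning_mode=AUTO \
  --hive_partitioning_source_uri_prefix=gs://${BUCKET}/events \
  gs://${BUCKET}/events/* > ${TABLE_DEF}"

# Step 2: create the external table from that definition.
MK="bq mk --external_table_definition=${TABLE_DEF} ${DATASET}.${TABLE}"

# Print rather than execute, as the real commands need BigQuery access.
echo "$MKDEF"
echo "$MK"
```

The sketch omits `--require_hive_partition_filter` and the explicit `--schema` to keep the minimal case visible; whether those flags interact with partition recognition is exactly the question being asked here.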

And the following query does not work.

SELECT * from table WHERE year=2020

The documentation states that this should be supported. Could someone please tell me what I am missing here?

Source: https://stackoverflow.com/questions/60838904/creating-external-table-from-gcs-with-hive-partition-information-in-bigquery-usi
