Question
I have a bucket in Google Cloud Storage with the following naming hierarchy.
gs://<bucket>/events/year=2020/month=03/day=23/hour=12
I've used the following command to create a BigQuery native table with Hive partitioning information, importing from Google Cloud Storage. This works.
bq load --project_id=<projectId> --source_format=NEWLINE_DELIMITED_JSON --autodetect --hive_partitioning_mode=AUTO --hive_partitioning_source_uri_prefix=gs://<bucket>/events/ <targetTableName> "gs://<bucket>/events/*" "<schema>"
And the following query works as well.
SELECT * from table WHERE year=2020
But when I try to do the same with an external table, the table is created with the Hive partitioning information, yet queries do not recognise any of the partitions.
bq mkdef --source_format=NEWLINE_DELIMITED_JSON --autodetect --hive_partitioning_mode=AUTO --hive_partitioning_source_uri_prefix="gs://<bucket>/events" --require_hive_partition_filter=True "gs://<bucket>/events/*" > <tableDef>
bq mk --dataset_id=<datasetId> --data_source=google_cloud_storage --external_table_definition=<tableDef> --schema=<schema> --table <tableName>
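For reference, the definition file produced by `bq mkdef` should contain a `hivePartitioningOptions` block roughly like the sketch below (the exact fields emitted can vary by `bq` version; the values here mirror the flags used above and are illustrative, not copied from my actual output):

```json
{
  "autodetect": true,
  "hivePartitioningOptions": {
    "mode": "AUTO",
    "requirePartitionFilter": true,
    "sourceUriPrefix": "gs://<bucket>/events"
  },
  "sourceFormat": "NEWLINE_DELIMITED_JSON",
  "sourceUris": [
    "gs://<bucket>/events/*"
  ]
}
```

In my case the generated file does include the Hive partitioning options, which is why I would expect the partition columns to be usable in queries.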
And the following query does not work.
SELECT * from table WHERE year=2020
The documentation states that this should be supported. Could someone please tell me what I am missing here?
Source: https://stackoverflow.com/questions/60838904/creating-external-table-from-gcs-with-hive-partition-information-in-bigquery-usi