Command to import multiple files from Cloud Storage into BigQuery

Submitted by £可爱£侵袭症+ on 2021-01-29 09:36:54

Question


I've figured that this command lists paths to all files:

gsutil ls "gs://bucket/foldername/*.csv"

This command imports a file to BQ and autodetects schema:

bq load --autodetect --source_format=CSV dataset.tableName gs://bucket/foldername/something.csv

Now I need to make these work together to import all the files into their respective tables in BQ. If a table already exists, it should be replaced. Could you give me a hand?


Answer 1:


First, create a file listing all the CSV files you want to load into BigQuery:

gsutil ls "gs://bucket/foldername/*.csv" > allmynicetables.txt

Then, create a simple loop that repeats the load operation for every CSV file listed in allmynicetables.txt:

while read -r p ; do bq load --autodetect --replace=true --source_format=CSV dataset.tableName "$p" ; done < allmynicetables.txt

Just a couple of clarifications:

--replace=true does the trick of overwriting an existing table.

Also, I am not sure why you put dataset.tableName: are you always loading into the same table? Can you extract the desired dataset/table name from the name of your .csv source file? This is not clear to me from your question, so please clarify.
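If the intent is one table per file, a common convention (an assumption, not stated in the question) is to name each table after the CSV file's base name. A minimal sketch of that derivation, with the actual bq invocation shown as a comment since it depends on your project setup:

```shell
# Hypothetical helper (not from the original answer): derive a table name
# from a GCS path by taking the file's base name without the .csv suffix,
# e.g. gs://bucket/foldername/sales.csv -> sales.
derive_table() {
  basename "$1" .csv
}

# The loop from the answer would then become something like:
#   while read -r p ; do
#     bq load --autodetect --replace=true --source_format=CSV \
#       "dataset.$(derive_table "$p")" "$p"
#   done < allmynicetables.txt
derive_table "gs://bucket/foldername/sales.csv"   # prints: sales
```

Note that BigQuery table names only allow letters, digits, and underscores, so file names containing other characters would need additional sanitizing.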



Source: https://stackoverflow.com/questions/61210660/command-to-import-multiple-files-from-cloud-storage-into-bigquery
