Insert large amount of data to BigQuery via bigquery-python library

Submitted by 怎甘沉沦 on 2019-12-01 16:47:45
Felipe Hoffa

When handling large files, don't stream; batch load instead. Streaming will easily handle up to 100,000 rows per second, which is great for streaming, but not for loading large files.
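As a rough illustration of the two paths, here is a minimal sketch using the official google-cloud-bigquery client (an assumption on my part; the question uses the third-party bigquery-python library, but the streaming-vs-batch distinction is the same). The project, dataset, table, and file names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # hypothetical table

# Streaming insert: fine for a trickle of rows, not for bulk-loading a big file.
errors = client.insert_rows_json(table_id, [{"name": "a", "value": 1}])

# Batch load: hand BigQuery the whole file as a single load job instead.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
)
with open("rows.json", "rb") as f:
    job = client.load_table_from_file(f, table_id, job_config=job_config)
job.result()  # wait for the load job to finish
```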

The linked sample code is doing the right thing (batch load instead of streaming), so what we're seeing is a different problem: the sample code tries to push all of this data straight into BigQuery, and it's the upload through a plain POST that fails. gsutil has a more robust upload algorithm than a single plain POST.
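You can get the same robustness from Python by uploading the file to Google Cloud Storage with a resumable, chunked upload, which is roughly what `gsutil cp rows.json gs://my-staging-bucket/staging/` does. A minimal sketch, assuming the google-cloud-storage client and a hypothetical bucket name:

```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-staging-bucket")  # hypothetical staging bucket
blob = bucket.blob("staging/rows.json")

# Smaller chunks keep the resumable upload retry-friendly for large files.
blob.chunk_size = 16 * 1024 * 1024  # 16 MB chunks

blob.upload_from_filename("rows.json")
```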

Solution: Instead of loading big chunks of data through POST, stage them in Google Cloud Storage first, then tell BigQuery to read files from GCS.
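Once the file is staged in GCS, the load step can be a single job that points BigQuery at the GCS URI. A sketch with the official google-cloud-bigquery client (again an assumption; table, bucket, and object names are hypothetical):

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"      # hypothetical table
uri = "gs://my-staging-bucket/staging/rows.json"  # file staged in GCS

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,  # or pass an explicit schema
)

# BigQuery reads the file directly from GCS; nothing is POSTed from your machine.
load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
load_job.result()  # wait for completion
print(client.get_table(table_id).num_rows, "rows loaded")
```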

See also BigQuery script failing for large file
