Error Loading Large CSV into Google BigQuery

Mosha Pasumansky

The BigQuery documentation lists various limits for import jobs here: https://cloud.google.com/bigquery/quota-policy#load_jobs. In particular, it notes that the size limit for a compressed CSV file is 4 GB.

The error message about a "not splittable" CSV file can occur in two cases:

  1. The CSV file was compressed.
  2. There is a quote-character mismatch in one of the fields: a stray unescaped quote makes the parser treat everything up to the next quote as one very long string in that field, which also makes the file not splittable (this is likely what happened in your case).

Try this:

  • Turn off quoting.
  • Set the field delimiter to a character that does not occur in the data (see the example command below).

bq help load:

    --quote: Quote character to use to enclose records. Default is ". To indicate no quote character at all, use an empty string.
    -F,--field_delimiter: The character that indicates the boundary between columns in the input file. "\t" and "tab" are accepted names for tab.
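
Put together, a load command along these lines should work. This is a sketch, not a drop-in solution: the dataset, table, and bucket path are placeholders, and it assumes the data contains no tab characters:

    bq load --source_format=CSV --quote='' -F '\t' \
        mydataset.raw_lines gs://my-bucket/big_file.csv line:STRING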

This will import each CSV line into a one-column table. You can then split it afterwards within BigQuery (with REGEXP_EXTRACT(), SPLIT(), or a JavaScript UDF), as in the sketch below.
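
A minimal sketch in Standard SQL, assuming the single column is named line, the table is mydataset.raw_lines, and the real delimiter is a comma (all three are placeholders):

    SELECT
      -- naive split; workable here because quoting was disabled and the
      -- fields themselves are assumed not to contain commas
      SPLIT(line, ',')[SAFE_OFFSET(0)] AS first_col,
      SPLIT(line, ',')[SAFE_OFFSET(1)] AS second_col
    FROM mydataset.raw_lines;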
