google-bigquery

BigQuery: use conditions to create a table from other tables (managing a large number of columns)

夙愿已清 submitted on 2020-08-08 05:15:35
Question: I am facing an issue with one of my projects. Here is a summary of what I would like to do. I have a large daily file (100 GB) with no header; an extract looks like this:

    ID_A|segment_1
    ID_A|segment_2
    ID_B|segment_2
    ID_B|segment_3
    ID_B|segment_4
    ID_B|segment_5
    ID_C|segment_1
    ID_D|segment_2
    ID_D|segment_4

Every ID (from A to D) can be linked to one or more segments (from 1 to 5). I would like to process this file in order to obtain the following result (the result file contains a header)
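The target layout is cut off in the excerpt above, so the following is only a minimal sketch of the first step the question describes: reading `ID|segment` lines and collecting the segments linked to each ID. The function name `group_segments` is hypothetical, and the sample is the extract from the question; a 100 GB file would not be processed in memory like this.

```python
from collections import defaultdict


def group_segments(lines):
    """Map each ID to the sorted list of segments it is linked to."""
    segments_by_id = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record_id, segment = line.split("|", 1)
        segments_by_id[record_id].add(segment)
    return {record_id: sorted(segs) for record_id, segs in segments_by_id.items()}


# Sample taken from the extract in the question.
extract = [
    "ID_A|segment_1", "ID_A|segment_2",
    "ID_B|segment_2", "ID_B|segment_3", "ID_B|segment_4", "ID_B|segment_5",
    "ID_C|segment_1",
    "ID_D|segment_2", "ID_D|segment_4",
]
grouped = group_segments(extract)
```

From a mapping like this, one column per segment could be derived by checking membership for each ID, if that turns out to be the intended output.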

pandas to_gbq claims a schema mismatch although the schemas are exactly the same. On GitHub, all related issues are reported as solved in 2017

∥☆過路亽.° submitted on 2020-08-07 06:50:28
Question: I am trying to append a table to a different table through pandas, pulling the data from BigQuery and sending it to a different BigQuery dataset. Although the table schema is exactly the same, I get the error "pandas_gbq.gbq.InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table." This error occurred earlier, and I worked around it by overwriting the table, but in this case the datasets are too large to do
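The cause in the asker's case is not visible in the excerpt. One way to check whether two schemas really are identical is to compare them field by field, including column order and mode, which is easy to overlook. The sketch below assumes both schemas are available as lists of dicts with `name`, `type` and `mode` keys, in the shape of BigQuery schema JSON; `diff_schemas` and the sample fields are hypothetical.

```python
def diff_schemas(source, destination):
    """Return a list of human-readable differences between two schemas."""
    problems = []
    src_names = [f["name"] for f in source]
    dst_names = [f["name"] for f in destination]
    if src_names != dst_names:
        problems.append(f"column names or order differ: {src_names} vs {dst_names}")

    dst_by_name = {f["name"]: f for f in destination}
    for field in source:
        other = dst_by_name.get(field["name"])
        if other is None:
            continue  # missing column was already reported above
        if field["type"] != other["type"]:
            problems.append(f"{field['name']}: type {field['type']} != {other['type']}")
        # A field without an explicit mode is treated as NULLABLE.
        src_mode = field.get("mode", "NULLABLE")
        dst_mode = other.get("mode", "NULLABLE")
        if src_mode != dst_mode:
            problems.append(f"{field['name']}: mode {src_mode} != {dst_mode}")
    return problems


# Hypothetical schemas that look the same at a glance but differ in one mode.
source_schema = [
    {"name": "id", "type": "INTEGER", "mode": "NULLABLE"},
    {"name": "created", "type": "TIMESTAMP"},
]
destination_schema = [
    {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "created", "type": "TIMESTAMP", "mode": "NULLABLE"},
]
problems = diff_schemas(source_schema, destination_schema)
```

An empty result would suggest the mismatch comes from somewhere other than the declared schemas, for example from the dtypes pandas inferred for the DataFrame.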

List all the tables in a dataset in BigQuery using the bq CLI and store them in Google Cloud Storage

≡放荡痞女 submitted on 2020-08-02 05:34:28
Question: I have around 108 tables in a dataset. I am trying to extract all of those tables using the following bash script:

    # get list of tables
    tables=$(bq ls "$project:$dataset" | awk '{print $1}' | tail -n +3)

    # extract into storage
    for table in $tables
    do
        bq extract --destination_format "NEWLINE_DELIMITED_JSON" --compression "GZIP" "$project:$dataset.$table" "gs://$bucket/$dataset/$table.json.gz"
    done

But it seems that bq ls only shows around 50 tables at a time, and as a result I cannot extract them to
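The `bq ls` command accepts a `-n` (`--max_results`) flag that raises the number of entries it returns, which is one way around the listing cap described above. The sketch below only builds the command lines as argument lists and does not run them; the helper names and the project, dataset, table and bucket values are hypothetical.

```python
def list_tables_command(project, dataset, max_results=1000):
    """Build a `bq ls` command that asks for up to max_results entries."""
    return ["bq", "ls", "-n", str(max_results), f"{project}:{dataset}"]


def extract_command(project, dataset, table, bucket):
    """Build the `bq extract` command used in the question for one table."""
    return [
        "bq", "extract",
        "--destination_format", "NEWLINE_DELIMITED_JSON",
        "--compression", "GZIP",
        f"{project}:{dataset}.{table}",
        f"gs://{bucket}/{dataset}/{table}.json.gz",
    ]


# Hypothetical identifiers, for illustration only.
ls_cmd = list_tables_command("my-project", "my_dataset")
extract_cmd = extract_command("my-project", "my_dataset", "events", "my-bucket")
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where the `bq` CLI is installed and authenticated.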

Is it possible to add a new field to an existing field of RECORD type in BigQuery from the UI?

与世无争的帅哥 submitted on 2020-08-01 06:30:58
Question: Is it possible to add a new field to an existing field of RECORD type in BigQuery? For example, if my current schema is:

    {u'fields': [{u'mode': u'NULLABLE', u'name': u'test1', u'type': u'STRING'}, {u'fields': [{u'mode': u'NULLABLE', u'name': u'field1', u'type': u'STRING'}], u'mode': u'NULLABLE', u'name': u'recordtest', u'type': u'RECORD'}]}

Can I change it to add a field "field2" to recordtest? The new schema would then look like:

    {u'fields': [{u'mode': u'NULLABLE', u'name': u'test1', u'type':
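The schema change described above can be expressed as a small transformation of the schema dictionary. The sketch below only produces the updated schema; it does not apply it to a table, and whether the UI supports this is not answered here. `add_nested_field` is a hypothetical helper, and giving `field2` the type STRING and mode NULLABLE is an assumption, since the target schema is truncated in the excerpt.

```python
import copy


def add_nested_field(schema, record_name, new_field):
    """Return a copy of schema with new_field appended to the named RECORD."""
    updated = copy.deepcopy(schema)  # leave the caller's schema untouched
    for field in updated["fields"]:
        if field["name"] == record_name and field["type"] == "RECORD":
            field.setdefault("fields", []).append(new_field)
            return updated
    raise KeyError(f"no RECORD field named {record_name!r}")


# Current schema from the question.
current = {
    "fields": [
        {"mode": "NULLABLE", "name": "test1", "type": "STRING"},
        {
            "fields": [{"mode": "NULLABLE", "name": "field1", "type": "STRING"}],
            "mode": "NULLABLE",
            "name": "recordtest",
            "type": "RECORD",
        },
    ]
}
# Assumed definition of the new nested field.
field2 = {"mode": "NULLABLE", "name": "field2", "type": "STRING"}
new_schema = add_nested_field(current, "recordtest", field2)
```

The resulting `fields` list could then be handed to whichever tool is used to update the table schema.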

BigQuery Python 409 Already Exists: Table

不打扰是莪最后的温柔 submitted on 2020-07-31 04:06:27
Question: I'm writing a Python script that writes query results to a BQ table. After the first run, the script always fails with the following error: google.api_core.exceptions.Conflict: 409 Already Exists: Table project-id.dataset-id. I do not understand why it attempts to create a table every time I run the script. Do I have to specify any particular parameters? This is from Google's documentation. I'm using this as an example, under the assumption that a current table
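The script itself is not shown in the excerpt, so the cause can only be guessed. One possibility, assuming the query job writes to an explicit destination table, is the write disposition: a query job defaults to `WRITE_EMPTY`, which fails once the destination already holds data. The sketch below builds the `query` section of a BigQuery job configuration in its REST shape with an explicit `writeDisposition`; the helper name and all identifiers are hypothetical. In the Python client, the corresponding setting is the `write_disposition` argument of `QueryJobConfig`.

```python
VALID_DISPOSITIONS = {"WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"}


def query_job_configuration(sql, project_id, dataset_id, table_id,
                            write_disposition="WRITE_APPEND"):
    """Build the `query` part of a job configuration for a destination table."""
    if write_disposition not in VALID_DISPOSITIONS:
        raise ValueError(f"unknown write disposition: {write_disposition}")
    return {
        "query": {
            "query": sql,
            "useLegacySql": False,
            "destinationTable": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            # WRITE_APPEND adds rows, WRITE_TRUNCATE replaces the table contents.
            "writeDisposition": write_disposition,
        }
    }


# Hypothetical identifiers, for illustration only.
config = query_job_configuration(
    "SELECT 1 AS x", "my-project", "my_dataset", "results",
    write_disposition="WRITE_TRUNCATE",
)
```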

BigQuery deduplication query example explanation

元气小坏坏 submitted on 2020-07-30 07:50:21
Question: Can anybody explain this BigQuery query for deduplication? Why do we need to use [OFFSET(0)]? I think it is used to take the first element of the aggregated array, right? Isn't that the same as LIMIT 1? Why do we need to aggregate the entire table? Why can we aggregate an entire table in a single cell?

    # take the one name associated with a SKU
    WITH product_query AS (
      SELECT DISTINCT v2ProductName, productSKU
      FROM `data-to-insights.ecommerce.all_sessions_raw`
      WHERE v2ProductName IS NOT NULL
    )
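The rest of the query is cut off, so the following assumes it groups by `productSKU` and applies something like `ARRAY_AGG(v2ProductName LIMIT 1)[OFFSET(0)]`. Under that assumption, the aggregation builds one array per SKU rather than one array for the whole table, `LIMIT 1` caps that array at a single element, and `[OFFSET(0)]` uses a zero-based index to turn the one-element array into a plain value. A minimal Python analogue, with a hypothetical function name and made-up rows:

```python
from collections import defaultdict


def first_name_per_sku(rows):
    """Keep one product name per SKU, mimicking ARRAY_AGG(...)[OFFSET(0)]."""
    names_by_sku = defaultdict(list)
    for name, sku in rows:
        # One list per SKU plays the role of the per-group ARRAY_AGG.
        names_by_sku[sku].append(name)
    # names[0] is the zero-based first element, like [OFFSET(0)].
    return {sku: names[0] for sku, names in names_by_sku.items()}


# Made-up rows: the same SKU appears under two slightly different names.
rows = [
    ("Red Shirt", "SKU1"),
    ("Shirt, Red", "SKU1"),
    ("Blue Hat", "SKU2"),
]
deduplicated = first_name_per_sku(rows)
```

The difference from a query-level `LIMIT 1` is that the latter would return a single row overall, whereas this keeps one row for every SKU.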