google-bigquery

Providing keyFilename of Google Client Service Account from Google Cloud Storage

痞子三分冷 submitted on 2021-02-19 08:15:29
Question: To connect from a Google Cloud Function to a Google Cloud BigQuery dataset that lives in a different GCP project, I am creating the BigQuery client as follows:

const {BigQuery} = require('@google-cloud/bigquery');
const options = {
  keyFilename: 'path/to/service_account.json',
  projectId: 'my_project',
};
const bigquery = new BigQuery(options);

But instead of storing the service_account.json in my Cloud Function, I want to store the service account in Google Cloud Storage and provide the Google Cloud
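The question is cut off above, but the usual pattern for this is to download the key object into the function's writable /tmp directory and point the client at it. Below is a minimal sketch of that pattern, shown in Python rather than the question's Node.js; the bucket and object names are hypothetical.

```python
# A minimal sketch, assuming a hypothetical bucket "my-config-bucket" holding
# the key object: fetch the key into /tmp (the only writable path in a Cloud
# Function), then build the BigQuery client from it.
from google.cloud import bigquery, storage

KEY_PATH = "/tmp/service_account.json"

def get_bigquery_client():
    # The function's default credentials are used to read the key from GCS.
    blob = storage.Client().bucket("my-config-bucket").blob("service_account.json")
    blob.download_to_filename(KEY_PATH)
    # Build a client for the other project using the downloaded key.
    return bigquery.Client.from_service_account_json(KEY_PATH, project="my_project")
```

A simpler alternative, where possible, is to grant the Cloud Function's runtime service account BigQuery permissions in the other project, so no key file needs to be stored anywhere.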

how to load data from AWS RDS to Google BigQuery in streaming mode?

删除回忆录丶 submitted on 2021-02-19 08:09:07
Question: How to load data from AWS RDS to Google BigQuery in streaming mode? Description: I have data in RDS (SQL Server) and want to load it into Google BigQuery in real time.

Answer 1: There is no direct way to insert changes from Amazon RDS into Google BigQuery. It can be done with a pipeline like this:

Amazon RDS ----Lambda/DMS----> Kinesis Data Streams ----Lambda----> BigQuery

Read changes from Amazon RDS into Kinesis Data Streams using Lambda, or use AWS DMS. You can also push it to
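As an illustration of the last hop in that pipeline (my own sketch, not code from the answer), a Lambda handler can drain Kinesis change records into BigQuery with streaming inserts; the target table ID here is hypothetical.

```python
# Hypothetical Lambda handler: decode Kinesis records and stream them into
# BigQuery. With insert_rows_json the rows are queryable within seconds.
import base64
import json

from google.cloud import bigquery

client = bigquery.Client()
TABLE_ID = "my_project.my_dataset.rds_changes"  # hypothetical target table

def handler(event, context):
    # Kinesis delivers each record's payload base64-encoded.
    rows = [json.loads(base64.b64decode(record["kinesis"]["data"]))
            for record in event["Records"]]
    errors = client.insert_rows_json(TABLE_ID, rows)
    if errors:
        raise RuntimeError(f"BigQuery streaming insert failed: {errors}")
```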

Python: How to update a value in Google BigQuery in less than 40 seconds?

穿精又带淫゛_ submitted on 2021-02-19 03:43:07
Question: I have a table in Google BigQuery that I access and modify in Python using the pandas functions read_gbq and to_gbq. The problem is that appending 100,000 rows takes about 150 seconds, while appending a single row takes about 40 seconds. Rather than appending a row, I would like to update a value in the table. Is there a way to update a value using Python that is very fast, or at least faster than 40 seconds?

Answer 1: Not sure whether you can do it with pandas, but you certainly can with the google-cloud-bigquery library.
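A sketch of what the answer points at, with hypothetical table and column names: a single DML UPDATE through google-cloud-bigquery changes one value in place instead of re-appending rows via to_gbq.

```python
# Parameterized DML UPDATE: touches one value without uploading any data.
from google.cloud import bigquery

client = bigquery.Client()
job = client.query(
    """
    UPDATE `my_project.my_dataset.my_table`
    SET rate = @rate
    WHERE currency = @currency
    """,
    job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("rate", "FLOAT64", 1.23),
            bigquery.ScalarQueryParameter("currency", "STRING", "EUR"),
        ]
    ),
)
job.result()  # a small DML job typically finishes in a few seconds
```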

Querying a Partitioned table in BigQuery using a reference from a joined table

血红的双手。 submitted on 2021-02-19 03:39:19
Question: I would like to run a query that prunes the partitions of table A using a value from table B. For example:

#standardSQL
select A.user_id
from my_project.xxx A
inner join my_project.yyy B
on A._partitiontime = timestamp(B.date)
where B.date = '2018-01-01'

This query scans all the partitions of table A and does not take the date I specified in the WHERE clause into consideration (for partitioning purposes). I have tried running this query in several different ways, but all produced the same result -
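BigQuery can only prune partitions when the filter on _PARTITIONTIME is a constant it can evaluate before the join, not a value arriving through the join. A common workaround (my sketch, not necessarily the thread's accepted answer) is to repeat the literal date as a direct filter on _PARTITIONTIME:

```python
# The Python client runs standard SQL by default; the added constant filter
# on _PARTITIONTIME is what enables partition pruning here.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT A.user_id
    FROM `my_project.xxx` A
    INNER JOIN `my_project.yyy` B
      ON A._PARTITIONTIME = TIMESTAMP(B.date)
    WHERE B.date = '2018-01-01'
      AND A._PARTITIONTIME = TIMESTAMP('2018-01-01')  -- constant, so pruning applies
"""
for row in client.query(sql):
    print(row.user_id)
```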

Convert BigQuery results to a pandas DataFrame

末鹿安然 submitted on 2021-02-19 02:38:02
Question: Below is code that converts BigQuery results into a pandas DataFrame. I'm learning Python and pandas and wonder if I can get suggestions/ideas about any kind of improvements to the code?

#...code to run query, that returns 3 columns: 'date' DATE, 'currency' STRING, 'rate' FLOAT...
rows, total_count, token = query.fetch_data()
currency = []
rate = []
dates = []
for row in rows:
    dates.append(row[0])
    currency.append(row[1])
    rate.append(row[2])
dict = { 'currency' : currency, 'date' : dates, 'rate' :
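A possible tidier version (my sketch, with a hypothetical table name): build the DataFrame in one step instead of three parallel lists, and avoid shadowing the built-in name dict.

```python
import pandas as pd
from google.cloud import bigquery

# Route 1: let the current client build the DataFrame itself.
client = bigquery.Client()
df = client.query(
    "SELECT date, currency, rate FROM `my_project.my_dataset.rates`"
).to_dataframe()

# Route 2: keep the question's row iterator but skip the parallel lists.
rows = [("2021-02-19", "EUR", 1.21), ("2021-02-19", "USD", 1.00)]  # stand-in data
df = pd.DataFrame(rows, columns=["date", "currency", "rate"])
```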

Synchronize Amazon RDS with Google BigQuery

旧巷老猫 submitted on 2021-02-19 00:45:58
Question: The company where I work has some MySQL databases on AWS (Amazon RDS). We are doing a POC with BigQuery, and what I am researching now is how to replicate the databases to BigQuery (both the existing records and the new ones in the future). My doubts are:

- How to replicate the MySQL tables and rows to BigQuery? Is there any tool to do that (I am reading about Amazon Database Migration Service)?
- Should I replicate to Google Cloud SQL and then export to BigQuery?
- How to replicate the future
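For the batch side of this, one possible approach (my illustration, not an answer from the thread) is to dump each MySQL table with pandas and load it into BigQuery; the connection string and table names below are hypothetical, and replicating ongoing changes would still need CDC tooling such as AWS DMS on top.

```python
# Full-reload sketch: read an RDS MySQL table and load it into BigQuery.
import pandas as pd
import sqlalchemy
from google.cloud import bigquery

engine = sqlalchemy.create_engine("mysql+pymysql://user:pass@rds-host/mydb")
df = pd.read_sql("SELECT * FROM orders", engine)

client = bigquery.Client()
client.load_table_from_dataframe(df, "my_project.my_dataset.orders").result()
```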

Default values for columns in BigQuery tables

纵饮孤独 submitted on 2021-02-18 21:09:30
Question: Is there a way to set default values for columns in BigQuery tables? I would like to set false as the default value for a column of boolean type.

Answer 1: A nullable column can (trivially) have a NULL default value, but there is no other notion of a default in BigQuery (you either insert a particular value, or omit the value and it will be NULL). That said, if you wrap your raw table in a view, you can map a NULL column value to any default that you like.

Answer 2: There is
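A sketch of Answer 1's view workaround, with hypothetical table and column names: the view rewrites NULL in the boolean column to FALSE, so readers of the view see the desired default.

```python
# Create a view over the raw table that substitutes the default value.
from google.cloud import bigquery

client = bigquery.Client()
client.query("""
    CREATE OR REPLACE VIEW `my_project.my_dataset.users_v` AS
    SELECT * REPLACE (IFNULL(is_active, FALSE) AS is_active)
    FROM `my_project.my_dataset.users`
""").result()
```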
