google-bigquery

create a table with a column type RECORD

魔方 西西 submitted on 2020-01-09 11:53:16
Question: I'm using BigQuery and I want to create a job that populates a table with "record"-type columns. The data will be populated by a query, so how can I write a query that returns "record"-type columns? Thanks! Answer 1: Somehow the option proposed by Pentium10 never worked for me in the GBQ UI or the API Explorer; I might be missing something. In the meantime, the workaround I found is as in the example below: SELECT location.state, location.city FROM JS( ( // input table SELECT NEST(CONCAT(state, ',', city)) AS
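For context, a hedged sketch of how the same result is usually obtained today with standard SQL, where STRUCT(...) in the SELECT list is what becomes a RECORD column in the destination table. This assumes the google-cloud-bigquery Python client; project, dataset and table names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    sql = """
    CREATE OR REPLACE TABLE `my_project.my_dataset.cities_nested` AS
    SELECT
      STRUCT(state AS state, city AS city) AS location  -- becomes a RECORD column
    FROM `my_project.my_dataset.cities_flat`
    """

    client.query(sql).result()  # wait for the CREATE TABLE AS SELECT job to finish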

Does Google BigQuery charge for GetQueryResults()

夙愿已清 submitted on 2020-01-07 08:37:23
Question: I am running the query in a C#.NET application using the .NET API library, added through the NuGet package manager. My query: Select * From DataSetID.TableID LIMIT 10000 Note: data processing for the above query is 1 GB (taken from the Web UI). When I run the same query in the C#.NET application, I get ~5,600 rows in a single request; I then pass the PageToken to the GetQueryResults method to fetch the remaining records (through pagination). So there are 2 query requests to get the 10K records.
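For illustration only (the question uses the .NET library), a hedged sketch of the same pagination pattern with the Python client; whether fetching extra pages adds to the bill is exactly what the question asks, so the sketch simply surfaces the bytes the query job itself reports. Dataset and table names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()
    job = client.query("SELECT * FROM `my_project.DataSetID.TableID` LIMIT 10000")

    # result() pages through getQueryResults under the hood; page_size roughly
    # mirrors passing a PageToken between successive requests.
    rows = job.result(page_size=5000)
    row_count = sum(1 for _ in rows)  # iterating pulls the remaining pages

    print(row_count, "rows fetched")
    print(job.total_bytes_processed, "bytes processed by the query job")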

Creating a Data Pipeline to BigQuery Using Cloud Functions and Cloud Scheduler

夙愿已清 submitted on 2020-01-07 08:07:39
Question: I am trying to build a data pipeline that will download the data from this website and push it to a BigQuery table. def OH_Data_Pipeline(trigger='Yes'): if trigger=='Yes': import pandas as pd import pandas_gbq import datetime schema=[{'name': 'SOS_VOTERID', 'type': 'STRING'},{'name': 'COUNTY_NUMBER', 'type': 'STRING'}, {'name': 'COUNTY_ID', 'type': 'INT64'}, {'name': 'LAST_NAME', 'type': 'STRING'}, {'name': 'FIRST_NAME', 'type': 'STRING'}, {'name': 'MIDDLE_NAME', 'type': 'STRING'}, {'name':
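A hedged sketch of the upload step this function appears to build toward, assuming pandas and pandas-gbq are available in the Cloud Function; the source URL, project id and table id are hypothetical placeholders, and the schema list is truncated the same way as above.

    import pandas as pd
    import pandas_gbq

    schema = [{'name': 'SOS_VOTERID', 'type': 'STRING'},
              {'name': 'COUNTY_NUMBER', 'type': 'STRING'},
              {'name': 'COUNTY_ID', 'type': 'INT64'}]  # ... remaining fields

    # Download the source file and coerce types to match the declared schema.
    df = pd.read_csv('https://example.com/voter_file.txt', sep='\t', dtype=str)
    df['COUNTY_ID'] = df['COUNTY_ID'].astype('int64')

    # Push the dataframe into BigQuery, replacing the table on each run.
    pandas_gbq.to_gbq(df, 'my_dataset.oh_voter_file', project_id='my-project',
                      if_exists='replace', table_schema=schema)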

Making a Google BigQuery from Python on Windows

天大地大妈咪最大 submitted on 2020-01-07 07:17:11
Question: I am trying to do something that is very simple in other data services: make a relatively simple SQL query and return it as a dataframe in Python. I am on Windows 10 and using Python 2.7 (specifically Canopy 1.7.4). Typically this would be done with pandas.read_sql_query, but due to some specifics of BigQuery it requires a different method, pandas.io.gbq.read_gbq. This method works fine unless you want to make a big query. If you make a big query on BigQuery you get the error
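A hedged sketch of the call in question, plus the kind of job configuration (legacy SQL "allow large results" with a destination table) that very large result sets typically need. Whether this addresses the exact error shown here depends on the pandas/pandas-gbq version installed; project, dataset and table names are hypothetical.

    import pandas as pd

    query = "SELECT * FROM [my-project:my_dataset.big_table]"

    config = {
        "query": {
            "allowLargeResults": True,
            "destinationTable": {
                "projectId": "my-project",
                "datasetId": "my_dataset",
                "tableId": "big_table_results",
            },
        }
    }

    df = pd.io.gbq.read_gbq(query, project_id="my-project", configuration=config)
    print(len(df))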

Streaming data into google bigquery template table with date partition

你说的曾经没有我的故事 submitted on 2020-01-07 06:42:09
Question: I am trying to stream data into BigQuery using templateSuffix and a date partition appended to the table name, using the Java API, but I am getting the exception below: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request { "code" : 400, "errors" : [ { "domain" : "global", "location" : "suffix", "locationType" : "other", "message" : "Table name should only contain _, a-z, A-Z, or 0-9.", "reason" : "invalid" } ], "message" : "Table name should only contain _, a-z, A-Z,
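The error points at "location": "suffix": the insertAll API is rejecting the templateSuffix itself, which may only contain underscores, letters and digits, so the "$" used by a partition decorator cannot appear there. For illustration only (the question uses the Java API), a hedged sketch of the same streaming call with the Python client, using a suffix made of allowed characters; table and field names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    rows = [{"user_id": "u1", "event_ts": "2020-01-07T06:42:09Z"}]

    # insert_rows_json maps to tabledata.insertAll; template_suffix is the same
    # templateSuffix field the Java request sets.
    errors = client.insert_rows_json(
        "my_dataset.events_template",   # base template table
        rows,
        template_suffix="_20200107",    # only _, a-z, A-Z, 0-9 are accepted here
    )
    print(errors)  # [] on success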

Pandas/Google BigQuery: Schema mismatch makes the upload fail

风流意气都作罢 submitted on 2020-01-07 04:38:28
Question: The schema of my Google BigQuery table looks like this: price_datetime : DATETIME, symbol : STRING, bid_open : FLOAT, bid_high : FLOAT, bid_low : FLOAT, bid_close : FLOAT, ask_open : FLOAT, ask_high : FLOAT, ask_low : FLOAT, ask_close : FLOAT. After I do a pandas.read_gbq I get a dataframe with column dtypes like this: price_datetime object symbol object bid_open float64 bid_high float64 bid_low float64 bid_close float64 ask_open float64 ask_high float64 ask_low float64 ask_close float64 dtype: object
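A hedged sketch of one common way to line the dtypes back up before uploading: the DATETIME column comes back as a generic object dtype, so converting it to a real datetime before to_gbq usually removes the mismatch. Exact behaviour depends on the pandas-gbq version; project and table ids are hypothetical.

    import pandas as pd
    import pandas_gbq

    df = pandas_gbq.read_gbq("SELECT * FROM my_dataset.prices",
                             project_id="my-project")

    # price_datetime arrives as dtype "object"; make it a datetime again so the
    # uploaded column matches the table's DATETIME field.
    df["price_datetime"] = pd.to_datetime(df["price_datetime"])

    pandas_gbq.to_gbq(df, "my_dataset.prices", project_id="my-project",
                      if_exists="append")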

How to create a table in Bigquery using Python when schema keeps changing?

不问归期 submitted on 2020-01-07 03:53:32
Question: My data source is based on events happening in a 3rd-party tool, e.g. customer.created, customer.updated, customer.plan.updated. Every event has a different JSON schema, and it is possible that even the same event, e.g. customer.updated, has a different schema from a previous customer.updated event. I am planning to load this data into BigQuery, but it appears that BigQuery doesn't support dynamic schemas. I am building a data warehouse and want to store all the events related to customer in
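One possible pattern, sketched with the google-cloud-bigquery client: load each batch of events as JSON with schema auto-detection and ALLOW_FIELD_ADDITION, so newly appearing fields extend the table instead of failing the load. This is only a sketch of one option (it does not cover a field changing type, for example), and all names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    events = [
        {"type": "customer.created", "customer_id": "c1"},
        {"type": "customer.updated", "customer_id": "c1", "plan": "pro"},  # new field
    ]

    job_config = bigquery.LoadJobConfig(
        autodetect=True,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_ADDITION],
    )

    client.load_table_from_json(
        events, "my_project.my_dataset.customer_events", job_config=job_config
    ).result()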

Can I denormalize data in google cloud sql in prep for bigquery

这一生的挚爱 submitted on 2020-01-06 18:34:54
Question: Given that BigQuery is not meant as a platform to denormalize data, can I denormalize the data in Google Cloud SQL prior to importing it into BigQuery? I have the following tables: Table1 with 500M rows, Table2 with 2M rows, Table3 with 800K rows. I can't denormalize in our existing relational database for various reasons, so I'd like to do a SQL dump of the database, load it into Google Cloud SQL, then use SQL join scripts to create one large flat table to be imported into BigQuery. Thanks. Answer 1: That should
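A hedged sketch of the last step described above, once the joined flat table has been exported (for example from Cloud SQL to CSV files in Cloud Storage): load those files into BigQuery with the Python client. Bucket, dataset and table names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,  # or supply an explicit schema instead
    )

    load_job = client.load_table_from_uri(
        "gs://my-bucket/flat_table_export-*.csv",
        "my_project.my_dataset.flat_table",
        job_config=job_config,
    )
    load_job.result()  # wait for the load to finish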

What will be the wait time before big query executes a query?

白昼怎懂夜的黑 submitted on 2020-01-06 16:05:53
Question: Every time I execute a query in Google BigQuery, I can see in the Explanation tab that there is an average waiting time involved. Is it possible to know the wait time as a percentage or in seconds? Answer 1: Since BigQuery is a managed service, a lot of customers around the globe are using it. It has an internal scheduling system based on the billingTier (explained here: https://cloud.google.com/bigquery/pricing#high-compute) and other internals of your project. Based on this, the query is scheduled to be
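As a rough way to put a number on it, a hedged sketch with the Python client: the job statistics expose creation, start and end timestamps, and the gap between creation and start approximates how long the query sat waiting to be scheduled. This is an approximation, not an official wait-time metric.

    from google.cloud import bigquery

    client = bigquery.Client()
    job = client.query(
        "SELECT COUNT(*) FROM `bigquery-public-data.samples.shakespeare`")
    job.result()  # wait for the query to finish

    wait_s = (job.started - job.created).total_seconds()
    run_s = (job.ended - job.started).total_seconds()
    print("approx. queued/wait: %.2fs, execution: %.2fs" % (wait_s, run_s))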