google-bigquery

Airflow BigQueryOperator: how to save query result in a partitioned Table?

不想你离开。 submitted on 2020-01-22 12:47:33
Question: I have a simple DAG:

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.contrib.operators.bigquery_operator import BigQueryOperator

    with DAG(dag_id='my_dags.my_dag') as dag:
        start = DummyOperator(task_id='start')
        end = DummyOperator(task_id='end')
        sql = """ SELECT * FROM `another_dataset.another_table` """
        bq_query = BigQueryOperator(bql=sql,
                                    destination_dataset_table='my_dataset.my_table20180524',
                                    task_id='bq_query',
                                    bigquery_conn_id='my_bq_connection',
                                    use_legacy_sql=False,
                                    write_disposition='WRITE
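
For context on the partitioned-table part of the question: BigQuery lets a job address one partition of a date-partitioned table by appending a partition decorator ($YYYYMMDD) to the table name. Below is a minimal sketch of building such a destination name. The helper name partition_destination is hypothetical and not part of Airflow or BigQuery. Whether a given BigQueryOperator version accepts the decorated name as-is, or also needs partitioning options, is an assumption to check against the installed Airflow version.

```python
from datetime import date


def partition_destination(dataset_table: str, day: date) -> str:
    """Return 'dataset.table$YYYYMMDD', the partition-decorator form of a table name.

    Hypothetical helper for illustration; it only formats a string.
    """
    if "$" in dataset_table:
        raise ValueError("table name already carries a partition decorator")
    return f"{dataset_table}${day:%Y%m%d}"


if __name__ == "__main__":
    # Same dataset, table, and date as in the question above.
    print(partition_destination("my_dataset.my_table", date(2018, 5, 24)))
```

The result would be passed as destination_dataset_table in place of 'my_dataset.my_table20180524', so the query output lands in one partition of my_table rather than in a separately named table.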

Using a SQL source file with the bigquery cli

允我心安 submitted on 2020-01-22 08:33:33
Question: Is it possible to use an input file with the bigquery CLI?

    bq query < my_query.sql

Answer 1: If you're using Unix (or have Cygwin installed on Windows), you can use xargs:

    xargs -a my_query.sql -0 bq query

Alternatively, you can use backticks:

    bq query `cat my_query.sql`

Note that bq can only process one command at a time; if your .sql script has several queries, you'll need to split the file on semicolons (;).

Answer 2: On Windows I use this method. The prerequisite is that each command is listed on a single line.
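
The "split the file on ;" step from Answer 1 could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names split_statements and run_file are hypothetical, the split is naive (it assumes no semicolons inside string literals or comments), and run_file assumes the bq executable is on PATH.

```python
import subprocess
from typing import List


def split_statements(sql_text: str) -> List[str]:
    """Naively split a SQL script on ';' and drop empty pieces.

    Assumption: no semicolons occur inside string literals or comments.
    """
    return [part.strip() for part in sql_text.split(";") if part.strip()]


def run_file(path: str) -> None:
    """Run each statement of the file through `bq query`, one at a time."""
    with open(path, encoding="utf-8") as handle:
        statements = split_statements(handle.read())
    for statement in statements:
        # The query text is passed as one positional argument,
        # mirroring the backtick form: bq query `cat my_query.sql`
        subprocess.run(["bq", "query", statement], check=True)
```

Calling run_file("my_query.sql") would then issue one bq query call per statement and stop at the first failing one, because check=True raises on a non-zero exit code.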

Weird error in BigQuery

旧街凉风 submitted on 2020-01-22 08:08:25
Question: I am trying to execute a query directly from the web console at https://bigquery.cloud.google.com. One time the query executed and I got a result; another time I got this error on the same query:

    Error: TABLE_QUERY expressions cannot query BigQuery tables.

I also tried the different query options ("Use Cached Results", "Interactive", and "Batch"); the behaviour is the same. Why could this happen?

Answer 1: TABLE_QUERY filters are intended to query only metadata. For a brief period of time, it was possible to query table data in

Authorization for accessing BigQuery from R session on server

独自空忆成欢 submitted on 2020-01-22 07:53:30
Question: I am using R and the bigrquery package to access BigQuery from an R session. This works great as long as I am on my local machine. However, when I try to access BigQuery from R on a remote server, it does not work at all. I tried copying the .httr-oauth file into my home directory on the server, but this does not work. I get the error message:

    Auto-refreshing stale OAuth token.
    Error in refresh_oauth2.0(self$endpoint, self$app, self$credentials) : client error: (400) Bad Request

I really have no

Using a CASE statement to change the value of a new BigQuery column based on finding one specific entry inside a PARTITION

落爺英雄遲暮 submitted on 2020-01-22 03:30:07
Question: I am trying to write some CASE statements that might change the value of all entries in the column if a particular condition is satisfied INSIDE the partition. Here is the specific context. Imagine that I have a particular data set that was created using the following SQL query:

    SELECT
      date,
      CONCAT(fullVisitorId, STRING(visitId)) AS unique_visit_id,
      visitId,
      visitNumber,
      fullVisitorId,
      totals.pageviews,
      totals.bounces,
      LAG(hits.page.pagePath,1) OVER(PARTITION BY unique_visit_id ORDER BY hits.time

How can I load data in same order as CSV on BigQuery

允我心安 submitted on 2020-01-22 02:57:48
Question: Is it possible to load data in the same row order as in the original input CSV file? These files are not sorted in any particular order or by any particular column. It looks like, because BigQuery loading is distributed, the order is not predictable, although it tends to group nulls first.

Answer 1: The only way to achieve this, given the way BigQuery works behind the scenes, would be to add an extra column to the CSV that defines the desired order. BigQuery shuffles data around behind the scenes to optimise
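
The extra-column idea from Answer 1 could look roughly like this before the load. It is a minimal sketch, assuming the CSV has a header row; the function name add_row_order and the column name row_order are hypothetical choices, not anything BigQuery requires.

```python
import csv
import io
from typing import TextIO


def add_row_order(src: TextIO, dst: TextIO, column: str = "row_order") -> int:
    """Copy a CSV that has a header row, prepending a 1-based sequence column.

    Returns the number of data rows written.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return 0  # empty input, nothing to write
    writer.writerow([column] + header)
    count = 0
    for count, row in enumerate(reader, start=1):
        writer.writerow([count] + row)
    return count


if __name__ == "__main__":
    source = io.StringIO("name,score\nb,2\na,\n")
    target = io.StringIO()
    add_row_order(source, target)
    print(target.getvalue())
```

After loading the rewritten file, the original order can be recovered at query time with ORDER BY row_order. When working with real files rather than StringIO, open them with newline="" as the csv module documentation recommends.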

How to Query Multiple Firebase Projects in BigQuery?

血红的双手。 submitted on 2020-01-22 02:25:22
Question: How can I write a SELECT query that pulls data from multiple Firebase projects, that is, query project abc and project xyz in a single query? At present I run two separate queries to extract data from projects abc and xyz and their dataset tables.

Querying the abc project:

    SELECT app_info.id, event_date, app_info.version,
      count(*) as sessions,
      count(distinct user_pseudo_id) as uniqueusers
    FROM `abc-1075.analytics_151541058.events_*`
    where event_name='session_start'
      and _TABLE_SUFFIX BETWEEN '20190901' AND '20190931'

BigQuery filter per the last Date and use Partition

好久不见. submitted on 2020-01-21 22:18:38
Question: I asked how to filter on the last date and got excellent answers (BigQuery, how to use alias in where clause?). They all work, but they scan the whole table. The field SETTLEMENTDATE is a partition field; is there a way to scan only one partition? As an example, I am using this query:

    #standardSQL
    SELECT * EXCEPT(isLastDate)
    FROM (
      SELECT *, DATE(SETTLEMENTDATE) = MAX(DATE(SETTLEMENTDATE)) OVER() isLastDate
      FROM `biengine-252003.aemo2.daily`
    )
    WHERE isLastDate

Edit: please last date is not