google-bigquery

Export Firebase to BigQuery dataset time to live

核能气质少年 submitted on 2021-02-07 19:51:48
Question: Update: I found a solution for my first question, i.e. changing the expiration to "never expires". I applied this command to my dataset and the Firebase console now shows "never expires":

bq update --default_partition_expiration 0 myotherproject:mydataset

But the second question remains: how do I retrieve the data that already expired, given that the default option kept only the last 60 days? (Before someone asks: yes, I did start the export and the table was available 3 months ago; it's not a…
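For reference, the same change can also be made with the BigQuery Python client instead of the bq CLI. This is a minimal sketch; the project and dataset names are the placeholders from the question:

```python
# Sketch: clear the default partition expiration with the Python client,
# equivalent to `bq update --default_partition_expiration 0`.
from google.cloud import bigquery

client = bigquery.Client(project="myotherproject")        # placeholder project from the question
dataset = client.get_dataset("myotherproject.mydataset")  # placeholder dataset from the question

dataset.default_partition_expiration_ms = None  # None means new partitions never expire
client.update_dataset(dataset, ["default_partition_expiration_ms"])
```

Note that this only affects future expirations; partitions that already expired well outside BigQuery's short time-travel window are generally not recoverable this way.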

How can I set up scheduled queries in BigQuery with timezone support (through Python SDK)

时光总嘲笑我的痴心妄想 submitted on 2021-02-07 17:27:58
Question: In the BigQuery UI, I can schedule a query with a specific timezone (as you can see in the screenshot below). With these settings I'm able to run my query at the correct local time, but when I try to automate this process with Python I can't see any option to specify the timezone (https://cloud.google.com/bigquery/docs/scheduling-queries):

def create_scheduled_query(project_id, dataset_id, query_string, dest_table, write_disposition=WriteDisposition.WRITE_TRUNCATE):
    parent = client…
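One way to approach this, sketched below under the assumption that the Data Transfer Service schedule string is interpreted in UTC, is to convert the desired local time to UTC yourself before creating the transfer config. The location, timezone, and run time are illustrative placeholders, not values from the question:

```python
# Sketch: create a scheduled query via the BigQuery Data Transfer Service client,
# converting a desired local run time (09:00 Europe/Paris here) to a UTC schedule.
from datetime import datetime, time
from zoneinfo import ZoneInfo
from google.cloud import bigquery_datatransfer

def create_scheduled_query(project_id, dataset_id, query_string, dest_table):
    client = bigquery_datatransfer.DataTransferServiceClient()
    parent = f"projects/{project_id}/locations/us"  # use the dataset's location

    # Convert the local wall-clock time to UTC (one-off conversion; DST shifts
    # would move the UTC hour and are not handled here).
    local_run = datetime.combine(datetime.now(ZoneInfo("UTC")).date(),
                                 time(9, 0), tzinfo=ZoneInfo("Europe/Paris"))
    utc_hhmm = local_run.astimezone(ZoneInfo("UTC")).strftime("%H:%M")

    transfer_config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id=dataset_id,
        display_name="scheduled query example",
        data_source_id="scheduled_query",
        params={
            "query": query_string,
            "destination_table_name_template": dest_table,
            "write_disposition": "WRITE_TRUNCATE",
        },
        schedule=f"every day {utc_hhmm}",
    )
    return client.create_transfer_config(parent=parent, transfer_config=transfer_config)
```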

Google BigQuery possible to do Case-Insensitive REGEXP_Match?

天大地大妈咪最大 submitted on 2021-02-07 13:46:00
Question: In Google BigQuery I wanted to check for 'confirm' or 'Confirm':

REGEXP_CONTAINS(h.page.PagePath, r'Confirm') OR REGEXP_CONTAINS(h.page.PagePath, r'confirm')

I am a Perl person, and in Perl we do $foo =~ /confirm/i # case-insensitive. Does Google BigQuery have any flags to modify REGEXP_MATCH? I did not see any examples in their online docs.

Answer 1: REGEXP_CONTAINS uses the RE2 library, so you may use inline modifiers like this (note the (?i) modifier):

REGEXP_CONTAINS(h.page.PagePath, r'(?i)confirm')

See the RE2 docs: (…
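For completeness, a minimal sketch of running such a case-insensitive match from the Python client; the project, dataset, table, and column names are placeholders, not from the question:

```python
# Sketch: case-insensitive match using the (?i) RE2 inline modifier in standard SQL.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT page_path
    FROM `my_project.my_dataset.my_table`            -- placeholder table
    WHERE REGEXP_CONTAINS(page_path, r'(?i)confirm')  -- matches confirm, Confirm, CONFIRM, ...
"""
for row in client.query(sql).result():
    print(row.page_path)
```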

Determine what project id my App Engine code is running on

不羁岁月 submitted on 2021-02-07 11:39:28
Question: From within an App Engine app, is there a way to determine the project ID a GAE (App Engine) instance is running on? I want to access a BigQuery table in the same project that the App Engine instance is running in. I'd rather not hard-code it or put it in another config file if possible. Edit: forgot to mention that this is from Python.

Answer 1: You can get a lot of info from environment variables:

import os
print os.getenv('APPLICATION_ID')
print os.getenv('CURRENT_VERSION_ID')
print os…
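The answer above is Python 2 / first-generation App Engine. On the newer Python 3 runtimes the same idea applies with different variable names; the sketch below assumes the GOOGLE_CLOUD_PROJECT variable that second-generation runtimes set, and the dataset/table names are placeholders:

```python
# Sketch: read the project ID from the environment and use it with BigQuery.
import os
from google.cloud import bigquery

project_id = os.environ.get("GOOGLE_CLOUD_PROJECT")  # set on 2nd-gen App Engine runtimes
client = bigquery.Client(project=project_id)          # Client() also infers it by default

rows = client.query(
    f"SELECT COUNT(*) AS n FROM `{project_id}.my_dataset.my_table`"  # placeholder table
).result()
```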

Writing nested schema to BigQuery from Dataflow (Python)

﹥>﹥吖頭↗ submitted on 2021-02-07 07:14:13
Question: I have a Dataflow job that writes to BigQuery. It works well for a non-nested schema, but fails for a nested schema. Here is my Dataflow pipeline:

pipeline_options = PipelineOptions()
p = beam.Pipeline(options=pipeline_options)
wordcount_options = pipeline_options.view_as(WordcountTemplatedOptions)
schema = 'url: STRING,' \
         'ua: STRING,' \
         'method: STRING,' \
         'man: RECORD,' \
         'man.ip: RECORD,' \
         'man.ip.cc: STRING,' \
         'man.ip.city: STRING,' \
         'man.ip.as: INTEGER,' \
         'man.ip.country: STRING,'…
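For context, nested fields are normally declared by nesting a "fields" list under a RECORD column rather than by using dotted names. A minimal sketch of the question's schema expressed as a dict that beam.io.WriteToBigQuery accepts; the field names come from the question, while the NULLABLE modes and the table spec are assumptions:

```python
# Sketch: nested BigQuery schema as a dict for beam.io.WriteToBigQuery(schema=...).
# RECORD columns carry their children in a nested "fields" list, not dotted names.
table_schema = {
    "fields": [
        {"name": "url",    "type": "STRING", "mode": "NULLABLE"},
        {"name": "ua",     "type": "STRING", "mode": "NULLABLE"},
        {"name": "method", "type": "STRING", "mode": "NULLABLE"},
        {"name": "man",    "type": "RECORD", "mode": "NULLABLE", "fields": [
            {"name": "ip", "type": "RECORD", "mode": "NULLABLE", "fields": [
                {"name": "cc",      "type": "STRING",  "mode": "NULLABLE"},
                {"name": "city",    "type": "STRING",  "mode": "NULLABLE"},
                {"name": "as",      "type": "INTEGER", "mode": "NULLABLE"},
                {"name": "country", "type": "STRING",  "mode": "NULLABLE"},
            ]},
        ]},
    ]
}

# Usage (placeholder table spec):
#   ... | beam.io.WriteToBigQuery("my_project:my_dataset.my_table", schema=table_schema)
```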

BigQuery - schedule stored procedure not working anymore

和自甴很熟 submitted on 2021-02-05 12:20:34
Question: Recently there was a change in the BigQuery UI, and it seems it is no longer possible to schedule a stored procedure to execute automatically. The UI just keeps asking me to insert a destination table. If I put in a dummy table, the schedule is created, but when it tries to execute it throws an error saying that we can't have a destination table when executing a stored procedure. Is anyone else having this issue, and is there any kind of workaround? Thanks in advance.

Answer 1: You can opt out of the…
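The truncated answer points at a UI setting; independently of that, one possible workaround is to create the schedule through the Data Transfer Service API, which does not require a destination table when the scheduled statement is a CALL. A sketch, with the project, location, procedure name, and schedule all as placeholders:

```python
# Sketch: schedule a stored-procedure CALL without a destination table,
# going through the Data Transfer Service API instead of the UI.
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
transfer_config = bigquery_datatransfer.TransferConfig(
    display_name="nightly stored procedure",                           # placeholder name
    data_source_id="scheduled_query",
    params={"query": "CALL `my_project.my_dataset.my_procedure`()"},   # placeholder procedure
    schedule="every 24 hours",
)
client.create_transfer_config(
    parent="projects/my_project/locations/us",  # placeholder project/location
    transfer_config=transfer_config,
)
```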

Loading text files (.txt) in Cloud Storage into a BigQuery table

♀尐吖头ヾ submitted on 2021-02-05 12:17:37
Question: I have a set of text files that are uploaded to Google Cloud Storage every 5 minutes. I want to load them into BigQuery every 5 minutes as well (since the text files arrive in Cloud Storage every 5 minutes). I know text files can't be uploaded into BigQuery directly. What is the best approach for this? Sample of a text file. Thanks in advance.

Answer 1: Here is an alternative approach, which uses an event-based Cloud Function to load data into BigQuery. Create a Cloud Function with "Trigger Type" as…
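A sketch of what such an event-triggered function might look like, assuming a Cloud Storage "finalize" trigger, a delimited (CSV-like) text format, and a placeholder destination table, since the sample file is not shown in the excerpt:

```python
# Sketch: Cloud Function triggered on object finalize that loads each new file into BigQuery.
from google.cloud import bigquery

def load_to_bigquery(event, context):
    """Runs whenever a new file lands in the triggering bucket."""
    client = bigquery.Client()
    uri = f"gs://{event['bucket']}/{event['name']}"

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,  # assumption: delimited text
        autodetect=True,                          # infer the schema from the file
        write_disposition="WRITE_APPEND",
    )
    load_job = client.load_table_from_uri(uri, "my_dataset.my_table",  # placeholder table
                                          job_config=job_config)
    load_job.result()  # wait so any load errors surface in the function logs
```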