google-bigquery

Is there a way to use an ORDER BY clause in the COUNT aggregate analytic function? If not, what is a suitable alternative?

穿精又带淫゛_ submitted on 2020-06-28 04:02:18
Question: I have a table of orders that looks something like this:

    WITH my_table_of_orders AS (
      SELECT 1 AS order_id, DATE(2019, 5, 12) AS date, 5 AS customer_id, TRUE AS is_from_particular_store
      UNION ALL
      SELECT 2 AS order_id, DATE(2019, 5, 11) AS date, 5 AS customer_id, TRUE AS is_from_particular_store
      UNION ALL
      SELECT 3 AS order_id, DATE(2019, 5, 11) AS date, 4 AS customer_id, FALSE AS is_from_particular_store
    )

My actual table contains ~59 million rows. What I would like to do is essentially return
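
The excerpt is cut off, but for reference, BigQuery does accept ORDER BY inside the OVER clause of COUNT when it is used as an analytic function. A minimal sketch against the sample table above; the running-count reading of the question is an assumption, since the original text is truncated:

    SELECT
      order_id,
      customer_id,
      -- Running count of this customer's orders up to and including the current row
      COUNT(*) OVER (
        PARTITION BY customer_id
        ORDER BY date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
      ) AS orders_so_far
    FROM my_table_of_orders;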

How to count the frequency of elements in a BigQuery array field

丶灬走出姿态 submitted on 2020-06-28 02:43:25
Question: I have a table that looks like this: I am looking for a table that gives a frequency count of the elements in the fields l_0, l_1, l_2, l_3. For example, the output should look like this:

    | author_id  | year | l_0.name         | l_0.count | l_1.name   | l_1.count | l_2.name            | l_2.count | l_3.name           | l_3.count |
    | 2164089123 | 1987 | biology          | 3         | botany     | 3         |                     |           |                    |           |
    | 2595831531 | 1987 | computer science | 2         | simulation | 2         | computer simulation | 2         | mathematical model | 2         |

Edit: In some cases the array field
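
A minimal sketch of the usual pattern, assuming a hypothetical table papers in which each of l_0 … l_3 is an ARRAY (the real schema is not shown in the excerpt): UNNEST flattens the array so a plain GROUP BY can count element frequencies.

    -- Frequency of elements in one array field, l_0; repeat per field as needed.
    SELECT
      author_id,
      year,
      element AS l_0_name,
      COUNT(*) AS l_0_count
    FROM papers,
      UNNEST(l_0) AS element
    GROUP BY author_id, year, element;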

BigQuery JavaScript UDF process - per row or per processing node?

北城余情 submitted on 2020-06-27 05:21:05
Question: I'm thinking of using BigQuery's JavaScript UDFs as a critical component in a new data architecture. They would be used to logically process each row loaded into the main table, and also to process each row during periodic and ad-hoc aggregation queries. Using a SQL UDF for the same purpose seems infeasible because each row represents a complex object, and implementing the business logic in SQL, including things such as parsing complex text fields, gets ugly very fast. I just read the
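
The question is about the execution model rather than syntax, but for context, this is the shape of row-level JavaScript UDF being described; the function name, field name, and parsing logic are hypothetical stand-ins for the asker's business logic:

    CREATE TEMP FUNCTION parse_complex_field(raw STRING)
    RETURNS STRUCT<key STRING, value FLOAT64>
    LANGUAGE js AS """
      // Hypothetical parsing of a 'key=value' text field.
      var parts = raw.split('=');
      return {key: parts[0], value: parseFloat(parts[1])};
    """;

    SELECT parse_complex_field(complex_text_field) AS parsed
    FROM main_table;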

Multi-column input to ML.PREDICT for a TensorFlow model in BigQuery ML

生来就可爱ヽ(ⅴ<●) submitted on 2020-06-27 04:27:28
Question: I trained a TensorFlow classifier and registered it as a model in BigQuery ML using CREATE MODEL. Now I would like to use ML.PREDICT to batch-predict with this model, but I get the error "Invalid table-valued function ml.predict Column inputs is not found in the input data to the PREDICT function." Here's my query:

    SELECT *
    FROM ML.PREDICT(
      MODEL test.digital_native_classifier_kf,
      (SELECT * FROM dataset_id.features_table_id)
    );

In the BigQuery documentation, they give an example for a TensorFlow
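
The error message suggests the model's serving signature expects a column named inputs. A hedged sketch of one common fix: alias the selected columns so their names match the TensorFlow model's input tensor names (feature_col and the tensor name inputs are assumptions here; the real names come from the exported model's signature):

    SELECT *
    FROM ML.PREDICT(
      MODEL `test.digital_native_classifier_kf`,
      (
        -- Each column alias must match an input tensor name in the saved model.
        SELECT feature_col AS inputs
        FROM `dataset_id.features_table_id`
      )
    );

For a model exported with several named inputs, alias one column per tensor name instead of using SELECT *.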

How to Catch ST_MAKEPOLYGON Error in BigQuery

做~自己de王妃 submitted on 2020-06-24 16:02:01
Question: I am using the ST_MAKEPOLYGON function in BigQuery as follows:

    WITH data AS (
      SELECT 61680 AS id, 139.74862575531006 AS lon, 35.674973127377314 AS lat UNION ALL
      SELECT 61680, 139.75087881088257, 35.673909836018375 UNION ALL
      SELECT 61680, 139.747037887573, 35.6765767531247 UNION ALL
      SELECT 61680, 139.75308895111, 35.6813525780394 UNION ALL
      SELECT 61680, 139.747509956359, 35.6798884869144 UNION ALL
      SELECT 61680, 139.754590988159, 35.6799930657428 UNION ALL
      SELECT 61680, 139.754977226257, 35
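
One hedged sketch of a way to trap the failure: BigQuery's SAFE. prefix makes a scalar function return NULL instead of raising an error, so invalid polygons can be filtered out rather than aborting the query. The construction below (points aggregated into a line per id) is an assumption about how the asker builds the polygon, since the excerpt is cut off:

    SELECT
      id,
      -- NULL instead of an error when the ring is invalid
      SAFE.ST_MAKEPOLYGON(
        ST_MAKELINE(ARRAY_AGG(ST_GEOGPOINT(lon, lat)))
      ) AS polygon
    FROM data
    GROUP BY id;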

GCP: can't write to BigQuery using to_gbq

非 Y 不嫁゛ submitted on 2020-06-23 14:12:32
Question: I can't write to BigQuery due to the following error.

Environment: Python 3.5.6, pandas-gbq 0.13.1, google-cloud-bigquery 1.24.0

    ImportError: pandas-gbq requires google-cloud-bigquery: cannot import name 'TimeoutGuard'

Code:

    sample_dataframe = pd.DataFrame(
        data_rows,
        columns=['shop_name', 'category', 'nearest_station', 'telephone_number', 'address', 'DL_time']
    )
    print(sample_dataframe)
    sample_dataframe.to_gbq('NTT.aikidou2025', 'robotic-column-270803', if_exists='replace')

Answer 1: I tried !pip install google-cloud-bigquery==1.10

What is the transaction isolation level in BigQuery?

那年仲夏 submitted on 2020-06-23 07:44:05
Question: Can anyone explain what the transaction isolation level is in Google Cloud BigQuery? There does not appear to be any documentation on this. We know that other databases, e.g. SQL Server, have the transaction isolation levels read uncommitted, read committed, repeatable read, snapshot, and serializable. Thanks.

Answer 1: There is not much information about it, but in this migration guide we can find some comparisons between BigQuery and Teradata which can give us a clue. As it's said in the

How to round BigQuery results up to 4 digits after the decimal point?

独自空忆成欢 submitted on 2020-06-23 06:29:30
Question: We don't have a decimal data type in BigQuery now, so I have to use FLOAT64. But in BigQuery float division, 0.029 * 50 / 100 = 0.014500000000000002, although 0.021 * 50 / 100 = 0.0105. To round the value I have to use ROUND(float_value * 10000) / 10000. Is this the right way to deal with decimal data in BigQuery now?

Answer 1: Note that this question deserves a different answer now. The premise of the question is "We don't have a decimal data type in BigQuery now." But now we do: you can use NUMERIC: SELECT CAST(
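
A minimal sketch contrasting the two approaches (the answer above is cut off mid-statement, so the exact completion is an assumption). NUMERIC is an exact decimal type, so the binary floating-point artifact disappears without manual scaling:

    SELECT
      CAST(0.029 AS NUMERIC) * 50 / 100 AS numeric_result,  -- exactly 0.0145
      ROUND(0.029 * 50 / 100, 4) AS float_rounded;          -- FLOAT64 rounded to 4 digits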