How do I update a bigquery table with an array?

Submitted by 陌路散爱 on 2020-08-26 07:51:13

Question


I have a table of log data, and I want to update it with the results of the query below, writing those results into the row that matches the filter.

I want to use a UNION ALL to keep the current values and append the new ones, but I get the following error:

Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN.

UPDATE LOGGING.table_logs a
SET a.pinged = ARRAY(
      (SELECT AS STRUCT 
      CURRENT_TIMESTAMP() as date,b.size_bytes,timestamp_millis(b.last_modified_time) AS last_modified_time,b.row_count
      FROM  `<DATASETNAME>.__TABLES__` b WHERE table_id = CONCAT("ga_sessions_intraday_",FORMAT_DATE("%Y%m%d", CURRENT_DATE())))

      )

WHERE table_id = CONCAT("ga_sessions_intraday_",FORMAT_DATE("%Y%m%d", CURRENT_DATE()))

Answer 1:


The following is untested; it simply rearranges your syntax (hopefully correctly) so that the __TABLES__ reference moves out of the subquery and into the UPDATE's FROM clause, which avoids the "correlated subqueries that reference other tables" error.

UPDATE LOGGING.table_logs a
SET a.pinged = ARRAY(
  SELECT AS STRUCT 
    CURRENT_TIMESTAMP() AS DATE,
    b.size_bytes,
    TIMESTAMP_MILLIS(b.last_modified_time) AS last_modified_time,
    b.row_count
)
FROM  `<DATASETNAME>.__TABLES__` b 
WHERE a.table_id = b.table_id
AND a.table_id = CONCAT("ga_sessions_intraday_",FORMAT_DATE("%Y%m%d", CURRENT_DATE())) 

Please check it and let me know whether it works or still needs adjustments.

P.S. This assumes that the rest of the logic is correct.

Update, in response to: How do I retain what's already in a.pinged and append the result of the query to it?

Try the following:

UPDATE LOGGING.table_logs a
SET a.pinged = ARRAY_CONCAT(a.pinged, ARRAY(
  SELECT AS STRUCT 
    CURRENT_TIMESTAMP() AS DATE,
    b.size_bytes,
    TIMESTAMP_MILLIS(b.last_modified_time) AS last_modified_time,
    b.row_count
))
FROM  `<DATASETNAME>.__TABLES__` b 
WHERE a.table_id = b.table_id
AND a.table_id = CONCAT("ga_sessions_intraday_",FORMAT_DATE("%Y%m%d", CURRENT_DATE()))
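
To see the append behaviour in isolation, here is a minimal sketch that can be run as a multi-statement query without touching real data. The temp table demo_logs and every value in it are hypothetical stand-ins for LOGGING.table_logs, and only three of the four fields are kept for brevity.

```sql
-- Hypothetical stand-in for LOGGING.table_logs with one existing ping.
CREATE TEMP TABLE demo_logs AS
SELECT
  'ga_sessions_intraday_20200319' AS table_id,
  [STRUCT(TIMESTAMP '2020-03-18 00:00:00+00' AS date,
          100 AS size_bytes,
          10 AS row_count)] AS pinged;

-- Append one new STRUCT to the array already stored in the row.
-- Both arguments to ARRAY_CONCAT must have the same array type.
UPDATE demo_logs a
SET pinged = ARRAY_CONCAT(
  a.pinged,
  [STRUCT(CURRENT_TIMESTAMP() AS date,
          250 AS size_bytes,
          25 AS row_count)]
)
WHERE a.table_id = 'ga_sessions_intraday_20200319';

-- Check how many pings the row now holds.
SELECT table_id, ARRAY_LENGTH(pinged) AS ping_count
FROM demo_logs;
```

The final SELECT should report a ping_count of 2: the original element plus the appended one.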



Answer 2:


You may have oversimplified your query, because what you posted doesn't look like a correlated subquery to me. If you haven't, then your subquery always generates the same array regardless of which row of LOGGING.table_logs is being updated. In that case you may be able to store the array in a script variable first and assign it afterwards:

DECLARE field_value DEFAULT 
ARRAY(
      (SELECT AS STRUCT 
      CURRENT_TIMESTAMP() as date,b.size_bytes,timestamp_millis(b.last_modified_time) AS last_modified_time,b.row_count
      FROM  `<DATASETNAME>.__TABLES__` b WHERE table_id = CONCAT("ga_sessions_intraday_",FORMAT_DATE("%Y%m%d", CURRENT_DATE()))));

UPDATE LOGGING.table_logs a
SET a.pinged = field_value
WHERE table_id = CONCAT("ga_sessions_intraday_",FORMAT_DATE("%Y%m%d", CURRENT_DATE()))
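
If the goal is to append rather than overwrite, the script-variable approach can probably be combined with ARRAY_CONCAT. The following is an untested sketch: new_pings is a hypothetical variable name, and it assumes the STRUCT built here has the same type as the elements of the pinged column. Note that in a BigQuery script, DECLARE has to come before any other statement.

```sql
-- Build the new element(s) once, outside the UPDATE.
-- new_pings is a hypothetical name; its type is inferred from the expression.
DECLARE new_pings DEFAULT ARRAY(
  SELECT AS STRUCT
    CURRENT_TIMESTAMP() AS date,
    b.size_bytes,
    TIMESTAMP_MILLIS(b.last_modified_time) AS last_modified_time,
    b.row_count
  FROM `<DATASETNAME>.__TABLES__` b
  WHERE b.table_id = CONCAT("ga_sessions_intraday_", FORMAT_DATE("%Y%m%d", CURRENT_DATE()))
);

-- Keep what is already in pinged and append the new element(s).
UPDATE LOGGING.table_logs a
SET pinged = ARRAY_CONCAT(a.pinged, new_pings)
WHERE a.table_id = CONCAT("ga_sessions_intraday_", FORMAT_DATE("%Y%m%d", CURRENT_DATE()));
```

Because the array is computed before the UPDATE runs, the UPDATE itself contains no subquery against another table.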


Source: https://stackoverflow.com/questions/60764187/how-do-i-update-a-bigquery-table-with-an-array
