SPLIT key-value-pairs to Columns in Google BigQuery

Submitted by 血红的双手。 on 2021-02-10 06:56:12

Question


I am quite new to Google BigQuery and definitely struggling.

My table has the following content:

+----------+----------------------------------------+
| order_id |               line_items               |
+----------+----------------------------------------+
|      123 | id:1|qy:1|sum:1.00;id:2|qy:6|sum:4.50; |
+----------+----------------------------------------+
|      456 | id:1|qy:3|sum:3.00;id:3|qy:4|sum:3.20; |
+----------+----------------------------------------+

I need it to look like this:

+----------+----+----+------+
| order_id | id | qy | sum  |
+----------+----+----+------+
|      123 |  1 |  1 | 1.00 |
|      123 |  2 |  6 | 4.50 |
|      456 |  1 |  3 | 3.00 |
|      456 |  3 |  4 | 3.20 |
+----------+----+----+------+

The number of key-value pairs in line_items is arbitrary (there are many more than these three, but I only need to extract these three).

I was able to get the following UNNEST and SPLIT query working, but unfortunately I still have these key-value pairs...

This

SELECT
  order_id,
  line_items
FROM
  `myTable`,
  UNNEST(SPLIT(line_items,"|")) line_items

Brings me here:

+----------+------------+
| order_id | line_items |
+----------+------------+
|      123 | id:1       |
|      123 | qy:1       |
|      123 | sum:1.00   |
|      123 | id:2       |
|      123 | qy:6       |
|      123 | sum:4.50;  |
|      456 | id:1       |
|      456 | qy:3       |
|      456 | sum:3.00   |
|      456 | id:3       |
|      456 | qy:4       |
|      456 | sum:3.20   |
+----------+------------+

So I still can't figure out how to turn these keys into column headers and the values into the column contents.

I would highly appreciate it if someone could point me in the right direction.

Thanks a lot already!


Answer 1:


The following is for BigQuery Standard SQL:

#standardSQL
select order_id,
  -- for each line item, pick the value whose key matches the wanted column
  ( select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'id') id,
  ( select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'qy') qy,
  ( select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'sum') sum
from `project.dataset.table`,
-- drop the trailing ';' and produce one row per line item
unnest(split(trim(line_items, ';'), ';')) items,
-- split each line item into an array of key:value strings
unnest([struct(split(items,'|') as kvs)]) x
-- order by order_id

Applied to the sample data from your question, the output matches the desired result: one row per line item, with id, qy, and sum as columns.
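
To see what the first UNNEST step contributes, here is a minimal sketch that can be run without a real table. The CTE name sample_orders and the alias item are made up for illustration; the two rows are copied from the question. It stops after the split on ';', so each output row is one line item together with its array of key:value strings.

```sql
#standardSQL
-- sample_orders is a hypothetical inline stand-in for the real table
with sample_orders as (
  select 123 as order_id, 'id:1|qy:1|sum:1.00;id:2|qy:6|sum:4.50;' as line_items union all
  select 456, 'id:1|qy:3|sum:3.00;id:3|qy:4|sum:3.20;'
)
select
  order_id,
  item,                     -- one line item, e.g. 'id:1|qy:1|sum:1.00'
  split(item, '|') as kvs   -- the same line item as an array of key:value strings
from sample_orders,
-- trim removes the trailing ';' so that split does not yield an empty last item
unnest(split(trim(line_items, ';'), ';')) item
```

Splitting on ';' first is what keeps the pairs of one line item together; splitting the whole string on '|' alone would leave values such as 'sum:1.00;id:2' that straddle two items.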

The following variation of the first query above can be useful too:

#standardSQL
select order_id, 
  (select value from z.y where key = 'id') id,
  (select value from z.y where key = 'qy') qy,
  (select value from z.y where key = 'sum') sum
from `project.dataset.table`,
unnest(split(trim(line_items, ';'), ';')) items,
unnest([struct(split(items,'|') as kvs)]) x,
unnest([struct(array(
  select as struct 
    split(kv, ':')[offset(0)] as key, 
    split(kv, ':')[offset(1)] value 
  from x.kvs kv
) as y)]) z
-- order by order_id 
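
Both queries above return the extracted values as strings. If typed columns are wanted, one possible alternative (not part of the original answer, and only a sketch) is to pull each value out with REGEXP_EXTRACT and convert it with SAFE_CAST. The CTE sample_orders and the choice of INT64 and NUMERIC as target types are assumptions made for this example.

```sql
#standardSQL
-- sample_orders is a hypothetical inline stand-in for the real table
with sample_orders as (
  select 123 as order_id, 'id:1|qy:1|sum:1.00;id:2|qy:6|sum:4.50;' as line_items union all
  select 456, 'id:1|qy:3|sum:3.00;id:3|qy:4|sum:3.20;'
)
select
  order_id,
  -- (?:^|\|) anchors the key at the start of the item or right after a '|',
  -- so a key that merely ends in 'id' is not matched by mistake
  safe_cast(regexp_extract(item, r'(?:^|\|)id:([^|]*)') as int64) as id,
  safe_cast(regexp_extract(item, r'(?:^|\|)qy:([^|]*)') as int64) as qy,
  safe_cast(regexp_extract(item, r'(?:^|\|)sum:([^|]*)') as numeric) as sum
from sample_orders,
unnest(split(trim(line_items, ';'), ';')) item
```

REGEXP_EXTRACT returns NULL when a key is absent from a line item, and SAFE_CAST returns NULL instead of raising an error when a value cannot be converted, so malformed items show up as NULLs rather than failing the query.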


Source: https://stackoverflow.com/questions/64948819/split-key-value-pairs-to-columns-in-google-bigquery
