问题
I am quite new to Google BigQuery and definitely struggling.
My table has the following content:
+----------+----------------------------------------+
| order_id | line_items |
+----------+----------------------------------------+
| 123 | id:1|qy:1|sum:1.00;id:2|qy:6|sum:4.50; |
+----------+----------------------------------------+
| 456 | id:1|qy:3|sum:3.00;id:3|qy:4|sum:3.20; |
+----------+----------------------------------------+
I would need to look it like this:
+----------+----+----+------+
| order_id | id | qy | sum |
+----------+----+----+------+
| 123 | 1 | 1 | 1.00 |
| 123 | 2 | 6 | 4.50 |
| 456 | 1 | 3 | 3.00 |
| 456 | 3 | 4 | 3.20 |
+----------+----+----+------+
The amount of key value pairs I have in line_items is arbitrary (and there are much more than those 3, but I would need to extract those three).
I was able to get the following UNNEST and SPLIT query working, but unfortunately I still have these key-value pairs...
This
SELECT
order_id,
line_items
FROM
`myTable`,
UNNEST(SPLIT(line_items,"|")) line_items
Brings me here:
+----------+------------+
| order_id | line_items |
+----------+------------+
| 123 | id:1 |
| 123 | qy:1 |
| 123 | sum:1.00 |
| 123 | id:2 |
| 123 | qy:6 |
| 123 | sum:4.50; |
| 456 | id:1 |
| 456 | qy:3 |
| 456 | sum:3.00 |
| 456 | id:3 |
| 456 | qy:4 |
| 456 | sum:3.20 |
+----------+------------+
So I am still not really able, how to extract these keys to column headlines and the value to the column content.
I would highly appreciate if someone pointed me in the right direction.
Thanks a lot already!
回答1:
Below is for BigQuery Standard SQL
#standardSQL
select order_id,
( select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'id') id,
( select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'qy') qy,
( select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'sum') sum
from `project.dataset.table`,
unnest(split(trim(line_items, ';'), ';')) items,
unnest([struct(split(items,'|') as kvs)]) x
-- order by order_id
If to apply to sample data from your question - output is
Below variation of above can be useful too
#standardSQL
select order_id,
(select value from z.y where key = 'id') id,
(select value from z.y where key = 'qy') qy,
(select value from z.y where key = 'sum') sum
from `project.dataset.table`,
unnest(split(trim(line_items, ';'), ';')) items,
unnest([struct(split(items,'|') as kvs)]) x,
unnest([struct(array(
select as struct
split(kv, ':')[offset(0)] as key,
split(kv, ':')[offset(1)] value
from x.kvs kv
) as y)]) z
-- order by order_id
来源:https://stackoverflow.com/questions/64948819/split-key-value-pairs-to-columns-in-google-bigquery