Expand JSON data into new columns in a generic fashion in Redshift

Submitted by 纵饮孤独 on 2019-12-25 05:20:26

Question


I have a DB table like this:

SomeSchema

ID      Params
1234    {'normalized_CR': 1.111434628975265, 'Rating': 0.0, 'Rank': 1410}
1235    {'normalized_CR': 1.123142131, 'Rating': 1.0, 'Rank': 210}

How can I expand this data into individual columns with the same names in Redshift?

I have searched online, but most results are about json_extract_path, which can only get one key at a time.


Answer 1:


After much googling, it turns out that there is no simple way to do this as of now, and brute force is the way to go: extract each key explicitly. Also, the data above is not valid JSON (it uses ' instead of "), so the quotes have to be replaced first:

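select
  id,
  -- '''' is a string literal holding one single quote; swapping it for "
  -- turns the stored text into valid JSON before each key is extracted
  json_extract_path_text(REPLACE(Params, '''', '"'), 'normalized_CR') as normalized_CR,
  json_extract_path_text(REPLACE(Params, '''', '"'), 'Rating') as Rating,
  json_extract_path_text(REPLACE(Params, '''', '"'), 'Rank') as Rank
from
    DB.SomeSchema
order by
    id desc
limit 100;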
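
Since the query repeats the same expression for every key, the column list can be generated instead of typed by hand. Below is a minimal Python sketch, assuming the keys are known in advance. `build_expand_query` and its parameters are hypothetical names for illustration, not part of Redshift or any library. It only builds the SQL text; running it is left to whatever client you already use.

```python
def build_expand_query(table, json_column, keys, id_column="id"):
    """Build a SELECT that pulls each key out of a JSON-like text column.

    Hypothetical helper: names and defaults are illustrative assumptions.
    Keys are assumed to be trusted identifiers (no escaping is done here).
    """
    # Turn the single-quoted pseudo-JSON into valid JSON inside SQL.
    # '''' is a SQL string literal containing one single quote.
    fixed = "REPLACE({col}, '''', '\"')".format(col=json_column)
    columns = [id_column]
    for key in keys:
        columns.append(
            "json_extract_path_text({fixed}, '{key}') AS {key}".format(
                fixed=fixed, key=key
            )
        )
    return "SELECT\n  {cols}\nFROM\n  {table};".format(
        cols=",\n  ".join(columns), table=table
    )


query = build_expand_query(
    "DB.SomeSchema", "Params", ["normalized_CR", "Rating", "Rank"]
)
print(query)
```

This is still the brute-force approach: the keys must be listed somewhere, because the set of output columns has to be fixed when the query is written.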



Answer 2:


Using json_extract_path_text as described in the other answer is probably the most straightforward way to go.

If you need more flexibility, an alternative approach is to create a user-defined function and use Python's JSON parser to extract what you want.

Something like this (untested):

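CREATE FUNCTION extract_json(json_string VARCHAR, field VARCHAR)
RETURNS VARCHAR
IMMUTABLE AS $$
    import json
    # Parse the JSON text and look up the requested key;
    # return NULL (None) when the key is absent.
    value = json.loads(json_string).get(field)
    # The function is declared to return VARCHAR, so convert numbers to text.
    return None if value is None else str(value)
$$ LANGUAGE plpythonu;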

Use it like:

SELECT extract_json(Params, 'Rank')
FROM SomeSchema;

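
The Python logic of such a UDF can be tried locally before creating the function. Here is a minimal sketch, assuming the column holds the single-quoted form shown in the question. `extract_field` is a hypothetical name, and the fallback to `ast.literal_eval` for Python-style dict text is a swapped-in technique that the answer itself does not propose.

```python
import ast
import json


def extract_field(params_text, field):
    """Return one field of a JSON-like string as text, or None if absent."""
    try:
        data = json.loads(params_text)
    except ValueError:
        # Single-quoted text is not valid JSON, but it is a valid Python
        # dict literal, so parse it safely with ast.literal_eval.
        data = ast.literal_eval(params_text)
    value = data.get(field)
    return None if value is None else str(value)


row = "{'normalized_CR': 1.123142131, 'Rating': 1.0, 'Rank': 210}"
print(extract_field(row, "Rank"))
print(extract_field('{"Rank": 210}', "Rank"))
print(extract_field(row, "missing"))
```

Without a fallback like this, or the REPLACE call from the first answer, `json.loads` raises an error on the single-quoted rows.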




Answer 3:


An alternative approach (though it involves S3) is to use the UNLOAD command to export the data to a file in S3 and then load it back with the COPY command, using its 'COPY FROM JSON' option.

UNLOAD command

COPY FROM JSON command



Source: https://stackoverflow.com/questions/41979669/expand-a-json-data-into-new-columns-in-a-generic-fashion-in-redshift
