Expand JSON data into new columns in a generic fashion in Redshift

Question

I have a DB table like this:

SomeSchema

ID      Params
1234    {'normalized_CR': 1.111434628975265, 'Rating': 0.0, 'Rank': 1410}
1235    {'normalized_CR': 1.123142131, 'Rating': 1.0, 'Rank': 210}

How can I expand this data into individual columns with the same names as the keys in Redshift?

I have searched online, but most results point to json_extract_path, which extracts only one key at a time.

Answer 1:

After much googling, it turns out that there is currently no simple way to do this, so brute force is the way to go: call json_extract_path_text once per key. Also, the data above is not valid JSON (it uses ' instead of "), so the quotes have to be replaced first:

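-- Replace the single quotes in Params with double quotes so each value parses as JSON,
-- then pull every key out into its own column.
select
  id,
  json_extract_path_text(REPLACE(Params, '\'', '"'), 'normalized_CR') as normalized_CR,
  json_extract_path_text(REPLACE(Params, '\'', '"'), 'Rating') as Rating,
  json_extract_path_text(REPLACE(Params, '\'', '"'), 'Rank') as Rank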
from
    DB.SomeSchema
order by
    id desc
limit 100;
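
Since the query repeats the same expression for every key, the statement can be generated from a list of key names instead of being typed by hand. The following is only a minimal sketch of that idea: build_expand_query is a hypothetical helper, not part of Redshift or any library, and it assumes the key names are known in advance and are plain identifiers.

```python
import re

def build_expand_query(table, id_column, json_column, keys):
    """Build a SELECT that extracts each key in `keys` into its own column."""
    # '''' is the standard SQL spelling of a one-character string holding a single quote.
    fixed = "REPLACE({col}, '''', '\"')".format(col=json_column)
    columns = []
    for key in keys:
        # Keys are pasted into the SQL text, so only accept plain identifiers.
        if not re.fullmatch(r"\w+", key):
            raise ValueError("unsupported key name: %r" % key)
        columns.append(
            "  json_extract_path_text({fixed}, '{key}') AS {key}".format(fixed=fixed, key=key)
        )
    return "SELECT\n  {id},\n{cols}\nFROM {table};".format(
        id=id_column, cols=",\n".join(columns), table=table
    )

query = build_expand_query("DB.SomeSchema", "id", "Params", ["normalized_CR", "Rating", "Rank"])
print(query)
```

The generated text still has to be run against Redshift separately, and the list of keys has to come from somewhere, so this only removes the typing, not the need to know the keys.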

Answer 2:

Using json_extract_path_text as described in the other answer is probably the most straightforward way to go.

If you need more flexibility, an alternative approach is to create a user-defined function (UDF) and use Python's JSON parser to extract what you want.

Something like this (untested):

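CREATE FUNCTION extract_json(json_string VARCHAR, field VARCHAR)
RETURNS VARCHAR
IMMUTABLE AS $$
    import json
    # Params uses single quotes, so convert them to double quotes before parsing
    value = json.loads(json_string.replace("'", '"'))[field]
    # Serialize non-string values so the result matches the VARCHAR return type
    return value if isinstance(value, basestring) else json.dumps(value)
$$ LANGUAGE plpythonu;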

Use it like:

SELECT extract_json(Params, 'Rank')
FROM SomeSchema;
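
Because the function above is untested, it may help to try the parsing step locally before creating the UDF. The sketch below is a plain Python stand-in for that step, not the UDF itself. It assumes the Params strings look like the sample rows in the question; expand_params is a hypothetical helper added here to show all keys at once.

```python
import json

def extract_json(json_string, field):
    """Local stand-in for the UDF's parsing step: return one field from Params."""
    # json.loads rejects single-quoted keys, so swap the quotes first.
    # Caveat: this naive replace would corrupt values that contain apostrophes.
    return json.loads(json_string.replace("'", '"'))[field]

def expand_params(json_string):
    """Return every key/value pair in Params as a dict."""
    return json.loads(json_string.replace("'", '"'))

params = "{'normalized_CR': 1.123142131, 'Rating': 1.0, 'Rank': 210}"
print(extract_json(params, "Rank"))
print(sorted(expand_params(params)))
```

A missing key raises KeyError here, which is worth deciding how to handle before the same logic runs inside Redshift.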

Answer 3:

An alternative approach (though it involves S3) is to use the UNLOAD command to export the data to S3, and then load it back with the COPY command using its JSON option ('COPY FROM JSON').

UNLOAD command

COPY FROM JSON command
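
The answer does not say what happens between the two commands. One possible reading is that the unloaded rows are rewritten as one flat JSON object per line, which is a shape COPY can load as JSON. The sketch below shows only that rewriting step, under stated assumptions: to_json_lines is a hypothetical helper, the rows are assumed to have already been read from the unloaded file as (id, params) pairs, and keys are lowercased on the assumption that the target table's column names are lowercase.

```python
import json

def to_json_lines(rows):
    """Turn (id, params) pairs into newline-delimited JSON, one flat object per row."""
    lines = []
    for row_id, params in rows:
        record = {"id": row_id}
        # Params uses single quotes, so convert them before parsing.
        for key, value in json.loads(params.replace("'", '"')).items():
            record[key.lower()] = value
        lines.append(json.dumps(record, sort_keys=True))
    return "\n".join(lines)

rows = [
    (1234, "{'normalized_CR': 1.111434628975265, 'Rating': 0.0, 'Rank': 1410}"),
    (1235, "{'normalized_CR': 1.123142131, 'Rating': 1.0, 'Rank': 210}"),
]
output = to_json_lines(rows)
print(output)
```

Reading the unloaded file from S3 and writing the result back are left out, since they depend on the bucket layout and credentials in use.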

Source: https://stackoverflow.com/questions/41979669/expand-a-json-data-into-new-columns-in-a-generic-fashion-in-redshift

Tags

json

amazon-redshift