Question
I am running a simple query in Hive that produces the following output (along with a few additional columns).
|------|---------------------------------------------------|
| col1 | col2                                              |
|------|---------------------------------------------------|
| A    | {"variable1":123,"variable2":456,"variable3":789} |
|------|---------------------------------------------------|
| B    | {"variable1":222,"variable2":333,"variable3":444} |
|------|---------------------------------------------------|
I need to parse the JSON string and pull out the value for each key in the SELECT statement itself, so that I can add a WHERE clause and return only the parts of the string that are valuable to me.
So my ultimate output might look like this:
|------|-----------|-----------|-----------|
| col1 | variable1 | variable2 | variable3 |
|------|-----------|-----------|-----------|
| A    | 123       | 456       | 789       |
|------|-----------|-----------|-----------|
| B    | 222       | 333       | 444       |
|------|-----------|-----------|-----------|
I have tried various functions, including SPLIT and GET_JSON_OBJECT, using the argument structure specified in the examples, yet all return errors such as:
No matching method for class org.apache.hadoop.hive.ql.udf.UDFJson
with (struct<...>, string). Possible choices: _FUNC_(string, string)
Could someone please tell me whether what I am trying to do is feasible, or explain where I am going wrong?
Thanks in advance
Answer 1:
select col1, get_json_object(col2,'$.variable1') as variable1,
get_json_object(col2,'$.variable2') as variable2,
get_json_object(col2,'$.variable3') as variable3
from json_test
If you store your query output in a table (say, json_test), you can parse it this way. Alternatively, you can apply the same get_json_object calls directly in your original query to obtain these results.
Output:
col1 |variable1 |variable2 |variable3 |
-----|----------|----------|----------|
A    |123       |456       |789       |
B    |222       |333       |444       |
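The question also mentions filtering with a WHERE clause. A minimal sketch of that, assuming the same json_test table with col2 stored as a string (the threshold 200 is only an illustrative value): get_json_object returns a string, so the value is cast before a numeric comparison.

```sql
-- Assumes the json_test table used above, with col2 stored as STRING.
-- The threshold 200 is a made-up example value.
SELECT col1,
       get_json_object(col2, '$.variable1') AS variable1,
       get_json_object(col2, '$.variable2') AS variable2,
       get_json_object(col2, '$.variable3') AS variable3
FROM json_test
-- get_json_object returns a string, so cast it for a numeric comparison
WHERE CAST(get_json_object(col2, '$.variable1') AS INT) > 200;
```

Note that the error in the question reports the first argument as struct<...>, which suggests that col2 may not be a string in the original table. If so, get_json_object does not apply, and the fields would probably be read with dot notation instead, for example col2.variable1 (assuming the struct has a field of that name, which the elided error message does not show).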
Answer 2:
Step1:
Create a table in Hive with a single string column:
create table json_variable(variable string);
-- load the JSON data into this table
hive> select * from json_variable;
{"col1":"A","variable1":123,"variable2":456,"variable3":789}
{"col1":"B","variable1":222,"variable2":333,"variable3":444}
Step2:
create table json_variable1(col1 string,variable1 int,variable2 int,variable3 int);
Step3:
insert overwrite table json_variable1
select get_json_object(variable,'$.col1'),
get_json_object(variable,'$.variable1'),
get_json_object(variable,'$.variable2'),
get_json_object(variable,'$.variable3')
from json_variable;
hive> select * from json_variable1;
A 123 456 789
B 222 333 444
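As a possible alternative to repeated get_json_object calls, Hive's json_tuple function can extract several top-level keys in one pass when used with LATERAL VIEW. The following is only a sketch, assuming the json_variable table from Step1 with a single string column named variable; the aliases j and t are arbitrary.

```sql
-- Assumes json_variable(variable STRING) holding one JSON object per row.
-- json_tuple returns every value as a string, so numeric fields are cast.
SELECT t.col1,
       CAST(t.variable1 AS INT) AS variable1,
       CAST(t.variable2 AS INT) AS variable2,
       CAST(t.variable3 AS INT) AS variable3
FROM json_variable j
LATERAL VIEW json_tuple(j.variable, 'col1', 'variable1', 'variable2', 'variable3') t
  AS col1, variable1, variable2, variable3;
```

This form parses each JSON string once per row rather than once per extracted key, which may matter on large tables. It only reaches top-level keys, so nested values would still need get_json_object.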
Source: https://stackoverflow.com/questions/45514710/how-to-extract-selected-values-from-json-string-in-hive