How to extract selected values from json string in Hive

我们两清 posted on 2019-12-22 05:23:28

Question


I am running a simple query in Hive that produces the following output (along with a few additional columns).

|------|---------------------------------------------------|
| col1 | col2                                              |
|------|---------------------------------------------------|
| A    | {"variable1":123,"variable2":456,"variable3":789} |
|------|---------------------------------------------------|
| B    | {"variable1":222,"variable2":333,"variable3":444} |
|------|---------------------------------------------------|

I need to parse the JSON string and pull out the value for each key within the SELECT statement itself, so that I can add a WHERE clause and return only the parts of the string that matter to me.

So my ultimate output might look like this:

|------|-----------|-----------|-----------|
| col1 | variable1 | variable2 | variable3 |
|------|-----------|-----------|-----------|
| A    | 123       | 456       | 789       |
|------|-----------|-----------|-----------|
| B    | 222       | 333       | 444       |
|------|-----------|-----------|-----------|

I have tried various functions, including SPLIT and GET_JSON_OBJECT, using the argument structure specified in the examples, yet all return errors such as:

No matching method for class org.apache.hadoop.hive.ql.udf.UDFJson 
with (struct<...>, string). Possible choices: _FUNC_(string, string)

Could someone please tell me whether what I am trying to do is feasible, or explain where I am going wrong?

Thanks in advance


Answer 1:


select col1,
       get_json_object(col2, '$.variable1') as variable1,
       get_json_object(col2, '$.variable2') as variable2,
       get_json_object(col2, '$.variable3') as variable3
from json_test;

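If you put your output into a table (say, json_test), you can parse it this way. You can also adapt your own query to produce the same result. Note that get_json_object expects its first argument to be a string, as the "Possible choices: _FUNC_(string, string)" part of the error message indicates.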

Output:

col1 |variable1 |variable2 |variable3 |
-----|----------|----------|----------|
A    |123       |456       |789       |
B    |222       |333       |444       |
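
Since the question also asks about filtering, here is a minimal sketch of how a WHERE clause might be combined with the query above. It assumes col2 is a STRING column in the json_test table; the threshold of 200 is purely illustrative. The second statement covers a different assumption: the error in the question mentions a struct<...> argument, which suggests col2 may actually be a STRUCT in the asker's table, in which case the fields can be read with dot notation instead of get_json_object.

```sql
-- Case 1 (assumption: col2 is a STRING holding JSON text).
-- get_json_object returns a STRING, so the value is cast before the numeric comparison.
SELECT col1,
       get_json_object(col2, '$.variable1') AS variable1,
       get_json_object(col2, '$.variable2') AS variable2,
       get_json_object(col2, '$.variable3') AS variable3
FROM json_test
WHERE CAST(get_json_object(col2, '$.variable1') AS INT) > 200;

-- Case 2 (assumption: col2 is a STRUCT with fields variable1..variable3).
-- No JSON parsing is needed; struct fields are accessed with dot notation.
SELECT col1,
       col2.variable1 AS variable1,
       col2.variable2 AS variable2,
       col2.variable3 AS variable3
FROM json_test
WHERE col2.variable1 > 200;
```

With the two sample rows from the question, either form should return only row B. Running DESCRIBE json_test shows the column's type and so which case applies.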



Answer 2:


Step1:

Create a table in Hive:
create table json_variable(variable string);

Load the data into this table:
hive> select * from json_variable;
{"col1":"A","variable1":123,"variable2":456,"variable3":789}
{"col1":"B","variable1":222,"variable2":333,"variable3":444}

Step2:

create table  json_variable1(col1 string,variable1 int,variable2 int,variable3 int);

Step3:

insert overwrite table json_variable1 
select get_json_object(variable,'$.col1'),get_json_object(variable,'$.variable1'),get_json_object(variable,'$.variable2'),get_json_object(variable,'$.variable3') from json_variable;

hive> select * from json_variable1;
A       123     456     789
B       222     333     444
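
As a possible variation on Step 3, the sketch below uses Hive's json_tuple with LATERAL VIEW, which extracts several keys in a single call instead of calling get_json_object once per key. It assumes the source table is json_variable with a single STRING column named variable, and the target is the json_variable1 table from Step 2; the alias names t and j are only illustrative. json_tuple returns strings, so the numeric fields are cast explicitly rather than relying on implicit conversion.

```sql
-- Assumption: json_variable(variable STRING) holds one JSON object per row,
-- and json_variable1(col1 STRING, variable1 INT, variable2 INT, variable3 INT) already exists.
INSERT OVERWRITE TABLE json_variable1
SELECT j.col1,
       CAST(j.variable1 AS INT),
       CAST(j.variable2 AS INT),
       CAST(j.variable3 AS INT)
FROM json_variable t
LATERAL VIEW json_tuple(t.variable, 'col1', 'variable1', 'variable2', 'variable3') j
  AS col1, variable1, variable2, variable3;
```

json_tuple only reads top-level keys, so get_json_object with a path such as '$.a.b' remains the option for nested values.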


Source: https://stackoverflow.com/questions/45514710/how-to-extract-selected-values-from-json-string-in-hive
