AWS Athena json_extract query from string field returns empty values

Submitted by 天涯浪子 on 2019-12-10 19:06:49

Question


I have a table in Athena with this structure:

CREATE EXTERNAL TABLE `json_test`(
  `col0` string , 
  `col1` string , 
  `col2` string , 
  `col3` string , 
  `col4` string
  )
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
WITH SERDEPROPERTIES ( 
  'quoteChar'='\"', 
  'separatorChar'='\;') 

A JSON string like this is stored in "col4":

{'email': 'test_email@test_email.com', 'name': 'Andrew', 'surname': 'Test Test'}

I'm trying to run a json_extract query:

SELECT json_extract(col4 , '$.email') as email FROM "default"."json_test"

But the query returns empty values.

Any help would be appreciated.


Answer 1:


JSON requires double quotes (") around keys and string values. Single-quoted text is not valid JSON, so json_extract returns NULL for it.

Compare:

presto> SELECT json_extract('{"email": "test_email@test_email.com", "name": "Andrew"}' , '$.email');
            _col0
-----------------------------
 "test_email@test_email.com"

and

presto> SELECT json_extract('{''email'': ''test_email@test_email.com'', ''name'': ''Andrew''}', '$.email');
 _col0
-------
 NULL

(Note: '' inside a SQL varchar literal means a single ' in the constructed value, so the literal here has the same format as the one in the question.)

If your string value is "JSON with single quotes", you can try to fix it with replace(string, search, replace) → varchar, swapping each ' for " before passing the value to json_extract.
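
A minimal sketch of that replace approach, run against the table from the question. It assumes that no key or value in col4 contains an apostrophe of its own (a surname such as O'Brien would be turned into broken JSON). It also uses json_extract_scalar, which is not part of the original answer, to return the value as plain varchar instead of JSON-encoded text:

```sql
-- '''' is a SQL string literal holding one single-quote character.
-- replace() swaps every ' for " so that the text parses as JSON.
SELECT
  json_extract(replace(col4, '''', '"'), '$.email')        AS email_json,  -- JSON-encoded, keeps the surrounding quotes
  json_extract_scalar(replace(col4, '''', '"'), '$.email') AS email_text   -- plain varchar, no surrounding quotes
FROM "default"."json_test";
```

Fixing the quotes when the data is written is likely the more robust option, since a query-time rewrite cannot tell a quoting apostrophe from one that belongs to a value.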




Answer 2:


The problem was the single quote characters in the stored JSON string:

{'email': 'test_email@test_email.com', 'name': 'Andrew', 'surname': 'Test Test'}

Changing the string to use double quotes:

{"email": "test_email@test_email.com", "name": "Andrew", "surname": "Test Test"}

The Athena query then works properly:

SELECT json_extract(col4 , '$.email') as email FROM "default"."json_test"


Source: https://stackoverflow.com/questions/50906730/aws-athena-json-extract-query-from-string-field-returns-empty-values
