Pyspark: Parse a column of json strings

忘掉有多难 2020-11-27 15:25

I have a pyspark dataframe consisting of one column, called json, where each row is a unicode string of json. I'd like to parse each row and return a new dataframe where each row is the parsed json.
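
For concreteness, a minimal sketch of such a dataframe (the JSON values here are made up for illustration):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # One string column named json, each row holding a raw JSON document.
    df = spark.createDataFrame(
        [('{"a": 1.0, "b": 1}',), ('{"a": 0.0, "b": 2}',)],
        ["json"],
    )
    df.show(truncate=False)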

4 Answers
  •  刺人心 2020-11-27 15:46

    Here's a concise (Spark SQL) version of @nolan-conaway's parseJSONCols function.

    SELECT
    explode(
        from_json(
            concat('{"data":',
                   '[{"a": 1.0,"b": 1},{"a": 0.0,"b": 2}]',
                   '}'),
            'data array<struct<a:DOUBLE, b:INT>>'
        ).data) as data;
    

    PS. I've added the explode function as well :P

    You'll need to know some Hive SQL types.
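
    For comparison, here's a sketch of the same parse in the DataFrame API, with the array<struct> schema spelled out in Python types instead of a DDL string (the single-column dataframe from the question is assumed):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import (ArrayType, DoubleType, IntegerType,
                                   StructField, StructType)

    spark = SparkSession.builder.getOrCreate()

    # Schema matching the sample payload: an array of {a: double, b: int} objects.
    schema = ArrayType(StructType([
        StructField("a", DoubleType()),
        StructField("b", IntegerType()),
    ]))

    df = spark.createDataFrame(
        [('[{"a": 1.0,"b": 1},{"a": 0.0,"b": 2}]',)], ["json"]
    )

    parsed = (df
        .select(F.explode(F.from_json("json", schema)).alias("data"))
        .select("data.*"))  # flatten the struct into columns a and b
    parsed.show()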
