Building a row from a dict in pySpark


小蘑菇 2021-02-01 03:17

I'm trying to dynamically build a row in pySpark 1.6.1, then build it into a dataframe. The general idea is to extend the results of describe to include, for exam

2 answers

野性不改 (OP)

2021-02-01 04:09
You can use keyword arguments unpacking as follows:
```
Row(**row_dict)

## Row(C0=-1.1990072635132698, C3=0.12605772684660232, C4=0.5760856026559944, 
##     C5=0.1951877800894315, C6=24.72378589441825, summary='kurtosis')
```
Note that `Row` internally sorts the fields by name when it is built from keyword arguments. This works around older Python versions, which did not preserve the order of keyword arguments.

This behavior is likely to be removed in upcoming releases; see SPARK-29748 (Remove sorting of fields in PySpark SQL Row creation). Once it is removed, you'll have to ensure that the order of values in the dict is consistent across records.
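One way to avoid depending on either the sorting or the dict order is to fix the field order yourself. The following is a minimal sketch, not part of the original answer: `FIELDS`, `values_in_order`, and `dict_to_row` are hypothetical names, and the field list is an assumption based on the columns shown above. It relies on `Row("a", "b")` returning a row factory that can then be called with positional values.

```python
# Hypothetical, fixed field order shared by every record (an assumption).
FIELDS = ["summary", "C0", "C3"]


def values_in_order(row_dict, fields):
    """Return the dict's values as a tuple, in the order given by `fields`."""
    return tuple(row_dict[name] for name in fields)


def dict_to_row(row_dict, fields):
    """Build a Row whose fields follow `fields`, regardless of dict order."""
    # Imported lazily so the helper above can be used without pyspark installed.
    from pyspark.sql import Row

    # Row(*fields) creates a row factory with those field names;
    # calling it with positional values fills them in the same order.
    return Row(*fields)(*values_in_order(row_dict, fields))
```

With this, two dicts that hold the same keys in different orders yield rows with identical field order, for example `dict_to_row({"C0": -1.2, "summary": "kurtosis", "C3": 0.13}, FIELDS)`.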