PySpark: Map a SchemaRDD into a SchemaRDD

Asked by 猫巷女王i, 2021-01-07 10:21

I am loading a file of JSON objects as a PySpark SchemaRDD. I want to change the "shape" of the objects (basically, I'm flattening them) and then insert them into a Hive table.
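
For context, here is a minimal sketch of the kind of reshaping described, assuming Spark 1.x (where jsonFile returns a SchemaRDD); the file path, the field names, and the body of flatten_function are hypothetical:

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="flatten-json")        # hypothetical app name
    hive_context = HiveContext(sc)

    # Load the JSON objects; in Spark 1.x this returns a SchemaRDD
    # whose (nested) schema is inferred from the JSON structure.
    log_json = hive_context.jsonFile("/data/logs/events.json")   # hypothetical path

    def flatten_function(row):
        # Hypothetical flattening: pull nested fields up to the top level.
        # Returning a plain tuple keeps the field order explicit, which
        # matters when a schema is applied positionally later on.
        return (row.user.id, row.user.name, row.event)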

4 Answers
  •  粉色の甜心 · 2021-01-07 10:57

    The solution is applySchema:

    mapped = log_json.map(flatten_function)
    hive_context.applySchema(mapped, flat_schema).insertInto(name)
    

    Where flat_schema is a StructType describing the schema, in the same form you would get from log_json.schema() (but flattened, obviously).
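
    As a runnable sketch of the whole approach, assuming Spark 1.x (SchemaRDD and applySchema) and the hypothetical flattened fields user_id, user_name and event from the sketch in the question, with "flat_logs" as a stand-in for the target Hive table name:

    from pyspark.sql import StructType, StructField, StringType

    # Flat schema matching the tuples produced by flatten_function;
    # field order must line up with the tuple order, since applySchema
    # matches fields by position.
    flat_schema = StructType([
        StructField("user_id",   StringType(), True),
        StructField("user_name", StringType(), True),
        StructField("event",     StringType(), True),
    ])

    mapped = log_json.map(flatten_function)              # plain RDD of tuples
    hive_context.applySchema(mapped, flat_schema) \
                .insertInto("flat_logs")                 # hypothetical table name

    Note that applySchema was deprecated in Spark 1.3, where SchemaRDD became DataFrame; on newer versions hive_context.createDataFrame(mapped, flat_schema) plays the same role.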
