PySpark - Aggregate expression required for pivot, found 'pythonUDF'

Posted by 会有一股神秘感。 on 2020-07-23 06:36:05

Question


I am using Python 2.6.6 and Spark 1.6.0. I have a DataFrame df like this:

id | name      | number  |
--------------------------
1  | joe       | 148590  |
2  | bob       | 148590  |
2  | steve     | 279109  |
3  | sue       | 382901  |
3  | linda     | 148590  |

Whenever I try to run something like

df2 = df.groupBy('id','length','type').pivot('id').agg(F.collect_list('name'))

I get the following error:

pyspark.sql.utils.AnalysisException: u"Aggregate expression required for pivot, found 'pythonUDF#93';"

Why is this?


Answer 1:


Resolved. I had used SQLContext to create the original DataFrame; changing it to HiveContext fixed the error.
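
For anyone landing here, a minimal sketch of what that change might look like on Spark 1.6.x. The answer does not say why it works; one likely explanation is that in Spark 1.6 collect_list was backed by a Hive UDAF, so it needed a HiveContext. The helper name pivot_names and the ROWS constant are hypothetical, and because the question's 'length' and 'type' columns do not appear in the sample table, the sketch groups by 'number' instead (an assumption, not the asker's actual query).

```python
# Sample rows copied from the question's table: (id, name, number).
ROWS = [
    (1, "joe", 148590),
    (2, "bob", 148590),
    (2, "steve", 279109),
    (3, "sue", 382901),
    (3, "linda", 148590),
]


def pivot_names(sc, rows=ROWS):
    """Hypothetical helper: pivot names by id using a HiveContext.

    `sc` is an existing SparkContext (Spark 1.6.x is assumed).
    """
    # Imported inside the function so this module loads without Spark.
    from pyspark.sql import HiveContext
    import pyspark.sql.functions as F

    # The change described in the answer: HiveContext, not SQLContext.
    sql_context = HiveContext(sc)
    df = sql_context.createDataFrame(rows, ["id", "name", "number"])

    # Assumption: group by 'number', since 'length' and 'type' are not
    # in the sample data; pivot('id') is kept from the question.
    return df.groupBy("number").pivot("id").agg(F.collect_list("name"))
```

With a running SparkContext sc, pivot_names(sc).show() should print one row per number and one column per id. On Spark 2.0 and later, collect_list no longer depends on Hive, so this workaround should only matter for 1.x.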



Source: https://stackoverflow.com/questions/62684123/pyspark-aggregate-expression-required-for-pivot-found-pythonudf
