Question
I am using Python 2.6.6 and Spark 1.6.0. I have a DataFrame df like this:
id | name  | number |
---------------------
1  | joe   | 148590 |
2  | bob   | 148590 |
2  | steve | 279109 |
3  | sue   | 382901 |
3  | linda | 148590 |
Whenever I try to run something like
df2 = df.groupBy('id','length','type').pivot('id').agg(F.collect_list('name'))
I get the following error:
pyspark.sql.utils.AnalysisException: u"Aggregate expression required for pivot, found 'pythonUDF#93';"
Why is this?
Answer 1:
Resolved. I had created the original DataFrame with SQLContext. Creating it with HiveContext instead made the error go away.
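A minimal sketch of that change, assuming Spark 1.6 built with Hive support and an already running SparkContext. The helper name pivot_names and the choice to group by number are my own, for illustration only: the question's 'length' and 'type' columns are not in the sample data, so they are left out here. As far as I know, collect_list in Spark 1.6 is an alias for a Hive UDAF, which would explain why a plain SQLContext does not treat it as an aggregate expression, but the answer itself does not say so.

```python
# Sample rows copied from the question.
ROWS = [
    (1, "joe", 148590),
    (2, "bob", 148590),
    (2, "steve", 279109),
    (3, "sue", 382901),
    (3, "linda", 148590),
]
COLUMNS = ["id", "name", "number"]


def pivot_names(sc):
    """Build the DataFrame through a HiveContext and pivot it.

    `sc` is an existing SparkContext. The imports are local so this
    module can be loaded without Spark on the path.
    """
    from pyspark.sql import HiveContext
    from pyspark.sql import functions as F

    # The fix from the answer: HiveContext instead of SQLContext.
    sqlContext = HiveContext(sc)
    df = sqlContext.createDataFrame(ROWS, COLUMNS)

    # Assumption: group by 'number' and pivot on 'id'; adjust the
    # grouping columns to match your real schema.
    return df.groupBy("number").pivot("id").agg(F.collect_list("name"))
```

In a pyspark shell, where sc already exists, this could be tried with pivot_names(sc).show().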
Source: https://stackoverflow.com/questions/62684123/pyspark-aggregate-expression-required-for-pivot-found-pythonudf