I am writing a Spark application and want to combine a set of Key-Value pairs (K, V1), (K, V2), ..., (K, Vn) into one Key-Multivalue pair (K, [V1, V2, ..., Vn]).
The error message stems from the type of 'a' in your closure: list.append mutates the list in place and returns None, so after the first reduction step 'a' is no longer a list.
My_KMV = My_KV.reduce(lambda a, b: a.append([b]))
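You can see the problem outside Spark with plain Python (a minimal illustration, not Spark code):

a = ['V1']
print(a.append(['V2']))   # prints None: append returns nothing, so the next reduce step calls None.append(...) and fails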
Let PySpark evaluate 'a' explicitly as a list by turning each value into a one-element list first and concatenating per key. For instance,
# wrap each value in a list, then concatenate the lists per key
My_KMV = My_KV.mapValues(lambda v: [v]).reduceByKey(lambda a, b: a + b)
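As a minimal end-to-end sketch (assuming a local SparkContext and a made-up toy dataset):

from pyspark import SparkContext

sc = SparkContext("local", "kmv_example")  # hypothetical app name, for illustration only
My_KV = sc.parallelize([("K", "V1"), ("K", "V2"), ("K", "V3")])  # toy input pairs
My_KMV = My_KV.mapValues(lambda v: [v]).reduceByKey(lambda a, b: a + b)
print(My_KMV.collect())  # [('K', ['V1', 'V2', 'V3'])]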
In many cases, reduceByKey is preferable to groupByKey; see: http://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html
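For comparison, the groupByKey route on the same toy RDD would be a sketch like:

grouped = My_KV.groupByKey().mapValues(list)  # groupByKey yields an iterable per key; mapValues(list) materializes it

reduceByKey can combine values on each partition before the shuffle, whereas groupByKey ships every individual value across the network, which is the reason for the recommendation above.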