Why is the fold action necessary in Spark?
I have a silly question involving `fold` and `reduce` in PySpark. I understand the difference between the two methods, but if both require that the applied function be a commutative monoid, I cannot figure out an example in which `fold` cannot be substituted by `reduce`.

Besides, the PySpark implementation of `fold` applies `acc = op(obj, acc)`. Why is this operation order used instead of `acc = op(acc, obj)`? (The second order sounds closer to a leftFold to me.)

Cheers

Tomas

---

**Empty RDD**

It cannot be substituted when the RDD is empty:

```scala
val rdd = sc.emptyRDD[Int]
rdd.reduce(_ + _)
// java.lang.UnsupportedOperationException: empty collection
```
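By contrast, `fold` returns its zero value when there are no elements to combine, so it does not throw. A minimal sketch, assuming the same spark-shell session with `sc` as above:

```scala
// fold starts from the supplied zeroValue in each partition and when
// merging partition results, so an empty RDD simply yields the zeroValue
val rdd = sc.emptyRDD[Int]
rdd.fold(0)(_ + _)  // returns 0 instead of throwing
```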