I have this Scala Spark code that I would like to rewrite in PySpark.
val rdd = sparkContext.parallelize((1 to 20).map(x => ("key", x)), 4)
// reduceByKey returns a new RDD; capture it rather than discarding it
val summed = rdd.reduceByKey(_ + _)
summed