Running a simple word-count app in pyspark:
from operator import add  # reduceByKey(add) needs this import
f = sc.textFile("README.md")
wc = f.flatMap(lambda x: x.split(' ')).map(lambda x: (x, 1)).reduceByKey(add)
You can simply collect the entire RDD (which returns a list of (word, count) tuples to the driver, so it is best suited to small results) and print that list:
print(wc.collect())
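To see what each stage of the pipeline contributes, here is a rough plain-Python sketch that mimics flatMap, map and reduceByKey without Spark. The sample lines are a hypothetical stand-in for README.md, and the loop only imitates what reduceByKey does on a single machine; it is not how Spark implements it.

```python
from itertools import chain
from operator import add

# Hypothetical stand-in for the lines of README.md.
lines = ["spark is fast", "spark is simple"]

# flatMap: split every line and flatten the pieces into one sequence of words.
words = list(chain.from_iterable(line.split(' ') for line in lines))

# map: pair each word with an initial count of 1.
pairs = [(w, 1) for w in words]

# reduceByKey(add): merge the values that share a key using add.
counts = {}
for word, n in pairs:
    counts[word] = add(counts[word], n) if word in counts else n

# Sorted only to make the printed order stable.
result = sorted(counts.items())
print(result)  # [('fast', 1), ('is', 2), ('simple', 1), ('spark', 2)]
```

With the real RDD, the order of the tuples returned by collect() is not guaranteed, so sort the list if you need stable output.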