Running a simple app in pyspark.
f = sc.textFile(\"README.md\") wc = f.flatMap(lambda x: x.split(\' \')).map(lambda x: (x, 1)).reduceByKey(add)
In Spark 2.0 (I didn't tested with earlier versions). Simply:
print myRDD.take(n)
Where n is the number of lines and myRDD is wc in your case.