Comparing MapReduce and Spark, the major difference turns out to be that Spark requires less I/O because it maintains "narrow" lineage, i.e. taking the input o