I am going through Hadoop: The Definitive Guide, which clearly explains input splits. It goes like this:
Input splits don’t contain actual data, rath
The Hadoop framework's strength is its data locality. So whenever a client requests HDFS data, the framework always checks for locality first; otherwise it looks for the option with the least I/O utilization.
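
To make the idea concrete, here is a minimal toy sketch in Python of what I understand so far. It is not the Hadoop API: `ToySplit` and `pick_node` are hypothetical names I made up for illustration, and the real scheduler is certainly more involved. It only shows a split that carries metadata (file, offset, length, hosts) instead of data, and a locality-first choice of node with a fallback.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ToySplit:
    """Hypothetical stand-in for an input split: metadata only, no record data."""
    path: str                # file the split refers to
    start: int               # byte offset where the split begins
    length: int              # number of bytes the split covers
    hosts: Tuple[str, ...]   # nodes assumed to hold a replica of that byte range


def pick_node(split: ToySplit, free_nodes: List[str]) -> Tuple[Optional[str], bool]:
    """Return (node, is_local).

    Prefer a free node that already stores the split's data. If none is
    free, fall back to the first free node, which would have to read the
    data over the network. Return (None, False) when no node is free.
    """
    for node in free_nodes:
        if node in split.hosts:
            return node, True
    if free_nodes:
        return free_nodes[0], False
    return None, False


if __name__ == "__main__":
    split = ToySplit("/data/log.txt", 0, 128 * 1024 * 1024, ("node1", "node2", "node3"))
    print(pick_node(split, ["node4", "node2"]))  # a data-local node is available
    print(pick_node(split, ["node4", "node5"]))  # no local node free, so fall back
```

Note that the split object never holds the bytes of `/data/log.txt`; whichever node is picked still has to open the file and read from `start` for `length` bytes.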