This is a conceptual question involving Hadoop/HDFS. Let's say you have a file containing 1 billion lines. And for the sake of simplicity, let's consider that each line is of
FileInputFormat is the abstract class that defines how input files are read and split up. FileInputFormat provides the following functionality: 1. selects the files/objects that should be used as input 2. defines the InputSplits that break a file up into tasks.
As per Hadoop's basic functionality, if there are n splits then there will be n mappers.
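To make the splits-to-mappers relationship concrete, here is a back-of-the-envelope sketch (not Hadoop's actual InputFormat code). It assumes a hypothetical line length of 100 bytes and the 128 MB default HDFS block size of recent Hadoop versions; with those assumptions it estimates how many splits, and therefore mappers, FileInputFormat would produce for the 1-billion-line file in the question:

```java
public class SplitEstimate {
    // Ceiling division: the last, possibly partial, block still counts as a split.
    public static long numSplits(long fileSizeBytes, long splitSizeBytes) {
        return (fileSizeBytes + splitSizeBytes - 1) / splitSizeBytes;
    }

    public static void main(String[] args) {
        long lines = 1_000_000_000L;          // from the question
        long bytesPerLine = 100L;             // assumed average line length
        long blockSize = 128L * 1024 * 1024;  // default HDFS block size (assumed)
        long splits = numSplits(lines * bytesPerLine, blockSize);
        // Number of splits equals the number of map tasks launched.
        System.out.println(splits);
    }
}
```

So a ~100 GB file would be read by roughly 750 mappers, each processing one 128 MB split; the exact count depends on the real line length and the cluster's configured block/split size.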