How I can get the name of the input file within a mapper? I have multiple input files stored in the input directory, each mapper may read a different file, and I need to kno
This helped me:
String fileName = ((org.apache.hadoop.mapreduce.lib.input.FileSplit) context.getInputSplit()).getPath().getName();
Use this inside your mapper :
FileSplit fileSplit = (FileSplit)context.getInputSplit();
String filename = fileSplit.getPath().getName();
Edit :
Try this if you want to do it inside configure() through the old API :
String fileName = new String();
public void configure(JobConf job)
{
filename = job.get("map.input.file");
}
You have to first convert in to InputSplit by typecasting and then you need to type cast to FileSplit.
Example:
InputSplit inputSplit= (InputSplit)context.getInputSplit();
Path filePath = ((FileSplit) inputSplit).getPath();
String filePathString = ((FileSplit) context.getInputSplit()).getPath().toString()
First you need to get the input split, using the newer mapreduce API it would be done as follows:
context.getInputSplit();
But in order to get the file path and the file name you will need to first typecast the result into FileSplit.
So, in order to get the input file path you may do the following:
Path filePath = ((FileSplit) context.getInputSplit()).getPath();
String filePathString = ((FileSplit) context.getInputSplit()).getPath().toString();
Similarly, to get the file name, you may just call upon getName(), like this:
String fileName = ((FileSplit) context.getInputSplit()).getPath().getName();