How to get the input file name in the mapper in a Hadoop program?

粉色の甜心 2020-11-29 18:48

How can I get the name of the input file within a mapper? I have multiple input files stored in the input directory, each mapper may read a different file, and I need to kno

10 answers
  •  执念已碎
    2020-11-29 19:19

    On Hadoop 2.4 and later, I noticed that with the old API (org.apache.hadoop.mapred) this method produces a null value:

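    // Old API: read the input file path from the job configuration.
    private String fileName;

    @Override
    public void configure(JobConf job)
    {
       // Observed to return null on Hadoop 2.4 and later.
       fileName = job.get("map.input.file");
    }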
    

    Alternatively, you can use the Reporter object passed to your map function to get the InputSplit, then cast it to a FileSplit to retrieve the file name:

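    public void map(LongWritable offset, Text record,
            OutputCollector out, Reporter rptr)
            throws IOException {

        // Assumes a file-based input format whose splits are FileSplit instances.
        FileSplit fsplit = (FileSplit) rptr.getInputSplit();
        // getName() returns only the final path component, not the full path.
        String inputFileName = fsplit.getPath().getName();
        // ...
    }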
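
The two approaches above can be combined so the mapper still ends up with a file name when one source is missing. This is only a sketch in plain Java with no Hadoop dependency: the class `InputFileNameResolver` and its methods `baseName` and `resolve` are hypothetical names, and the sample paths are made up. In a real mapper the first argument would be something like `fsplit.getPath().toString()` and the second the value of `job.get("map.input.file")`, which may be null.

```java
public class InputFileNameResolver {

    /** Returns the last segment of a path or URI string, or null if the input is null or empty. */
    public static String baseName(String path) {
        if (path == null || path.isEmpty()) {
            return null;
        }
        // Drop one trailing slash so "dir/" yields "dir" rather than an empty string.
        String trimmed = path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
        return trimmed.substring(trimmed.lastIndexOf('/') + 1);
    }

    /**
     * Prefers the path taken from the input split and falls back to the
     * value of the job property, which may itself be null.
     */
    public static String resolve(String splitPath, String configuredPath) {
        String fromSplit = baseName(splitPath);
        return fromSplit != null ? fromSplit : baseName(configuredPath);
    }

    public static void main(String[] args) {
        // Hypothetical sample values, for illustration only.
        System.out.println(resolve("hdfs://namenode:8020/input/part-00001.txt", null));
        System.out.println(resolve(null, "/input/logs.csv"));
    }
}
```

If both sources are null, `resolve` returns null, so the caller still has to decide what to do in that case.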
    
