FileNotFoundException on Hadoop

Posted by ♀尐吖头ヾ on 2020-01-03 05:56:10

Question


Inside my map function, I am trying to read a file from the DistributedCache and load its contents into a hash map.

The stdout log of the MapReduce job prints the contents of the hash map. This shows that the task found the file, loaded it into the data structure, and iterated over the entries to print them, so the read itself succeeded.

However, I still get the error below after the MR job has been running for a few minutes:

13/01/27 18:44:21 INFO mapred.JobClient: Task Id : attempt_201301271841_0001_m_000001_2, Status : FAILED
java.io.FileNotFoundException: File does not exist: /app/hadoop/jobs/nw_single_pred_in/predict
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:67)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

Here's the portion that initializes the Path with the location of the file to be placed in the distributed cache:


    // inside main, surrounded by a try/catch block, yet no exception is thrown here
    Configuration conf = new Configuration();
    // rest of the stuff that relates to conf
    Path knowledgefilepath = new Path(args[3]); // args[3] = /app/hadoop/jobs/nw_single_pred_in/predict/knowledge.txt
    DistributedCache.addCacheFile(knowledgefilepath.toUri(), conf);
    job.setJarByClass(NBprediction.class);
    // rest of job settings
    job.waitForCompletion(true); // kick off load

This one is inside the map function:


    try {
        System.out.println("Inside try !!");
        // local copies of the files registered in the distributed cache
        Path[] files = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        Path cfile = new Path(files[0].toString()); // only one file
        System.out.println("File path : " + cfile.toString());
        CSVReader reader = new CSVReader(new FileReader(cfile.toString()), '\t');
        while ((nline = reader.readNext()) != null) {
            data.put(nline[0], Double.parseDouble(nline[1])); // load into a hashmap
        }
        reader.close();
    } catch (Exception e) {
        // handle exception
    }
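
To make the loading step concrete, here is a minimal standalone sketch of the same idea using only the JDK, so it can be run without Hadoop or CSVReader. The class name `KnowledgeLoader`, the `load` helper, and the sample keys are hypothetical and only for illustration; it assumes the file holds one tab-separated `key<TAB>value` pair per line, which matches how the snippet above reads `nline[0]` and `nline[1]`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class KnowledgeLoader {

    // Hypothetical helper: parses "key<TAB>value" lines into a map.
    // Blank lines and lines with fewer than two fields are skipped.
    static Map<String, Double> load(Reader source) throws IOException {
        Map<String, Double> data = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // ignore empty lines
                }
                String[] fields = line.split("\t");
                if (fields.length < 2) {
                    continue; // ignore malformed lines
                }
                data.put(fields[0], Double.parseDouble(fields[1].trim()));
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // In the real job the Reader would be a FileReader on the local cache path.
        Map<String, Double> data = load(new StringReader("spam\t0.75\nham\t0.25\n"));
        System.out.println(data.get("spam")); // prints 0.75
    }
}
```

In the mapper, the same helper could presumably be called once with `new FileReader(cfile.toString())` instead of being re-run for every record, but that placement is a design suggestion rather than something the code above shows.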

Help appreciated.

Cheers !


Answer 1:


I did a fresh installation of Hadoop and ran the job with the same jar, and the problem disappeared. It seems to have been a bug rather than a programming error.



Source: https://stackoverflow.com/questions/14553296/filenotfoundexception-on-hadoop
