Hadoop: FileNotFoundException when getting file from DistributedCache

Submitted by 拈花ヽ惹草 on 2019-12-13 19:04:15

Question


I have a 2-node cluster (v1.04), a master and a slave. On the master, in Tool.run(), we add two files to the DistributedCache using addCacheFile(). The files do exist in HDFS. In Mapper.setup() we want to retrieve those files from the cache using

FSDataInputStream fs = FileSystem.get(context.getConfiguration()).open(path);

The problem is that for one file a FileNotFoundException is thrown, although the file exists on the slave node:

attempt_201211211227_0020_m_000000_2: java.io.FileNotFoundException: File does not exist: /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv

ls -l on the slave:

[hduser@slave ~]$ ll /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
-rwxr-xr-x 1 hduser hadoop 42701 Nov 22 10:18 /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv

My questions are:

  1. Shouldn't all files exist on all nodes?
  2. What should be done to fix that?

Thanks.


Answer 1:


Solved. I should have used:

FileSystem.getLocal( conf ) 

Thanks to Harsh J from the Hadoop mailing list.
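
Why the change helps, as far as I understand it: FileSystem.get(conf) returns the cluster's default file system (HDFS here), so a scheme-less path is looked up in HDFS, whereas the DistributedCache copy sits on the task node's local disk, which FileSystem.getLocal(conf) addresses. The sketch below only models that path qualification with java.net.URI from the plain JDK; it does not call Hadoop. The class name CachePathDemo, the helper qualify(), and the address hdfs://master:9000/ are hypothetical and not taken from the post.

```java
import java.net.URI;

// Plain-JDK model of path qualification; it does not call Hadoop.
final class CachePathDemo {

    // Resolve a scheme-less absolute path against a file system's base URI,
    // similar in spirit to how a file system qualifies an unqualified path.
    static URI qualify(String fileSystemUri, String path) {
        return URI.create(fileSystemUri).resolve(path);
    }

    public static void main(String[] args) {
        // Hypothetical default file system address (an assumption).
        String defaultFs = "hdfs://master:9000/";
        // Stand-in for the local file system that FileSystem.getLocal(conf) selects.
        String localFs = "file:///";
        // Localized cache path copied from the exception message in the question.
        String cached = "/somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/"
                + "-7769715304990780/master/tmp/analytics/1.csv";

        // Same path string, two different locations depending on the file system.
        System.out.println("default FS looks up: " + qualify(defaultFs, cached));
        System.out.println("local FS looks up:   " + qualify(localFs, cached));
    }
}
```

In the mapper itself, the corresponding change is to open the localized path through FileSystem.getLocal(conf) instead of FileSystem.get(conf), as the answer says.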



Source: https://stackoverflow.com/questions/13508707/hadoop-filenotfoundexcepion-when-getting-file-from-distributedcache
