Getting “file does not exist” error when running an Amazon EMR job

Question

I have uploaded my data files, genotype1_large_ind_large.txt and phenotype1_large_ind_large_1.txt, to S3, and in the EMR UI I set the parameters as follows:

RunDear.run s3n://scalability/genotype1_large_ind_large.txt s3n://scalability/phenotype1_large_ind_large_1.txt s3n://scalability/output_1phe 33 10 4

In my class RunDear.run, I add the files genotype1_large_ind_large.txt and phenotype1_large_ind_large_1.txt to the distributed cache.

However, after running the EMR job, I get the following error: java.io.FileNotFoundException: File does not exist: /genotype1_large_ind_large.txt

Why is there a slash '/' in front of the file name, and how can I make this work?

I also tried the invocation below, but my program treats -cacheFile as an ordinary argument, so this does not work either:

RunDear.run -cacheFile s3n://scalability/genotype1_large_ind_large.txt#genotype.txt -cacheFile s3n://scalability/phenotype1_large_ind_large_1.txt#phenotype.txt s3n://scalability/output_1phe 33 280 4

Answer 1:

I finally realized that the problem was which filesystem the program was using, so I added a line like the following to get a FileSystem bound to the S3 bucket explicitly: FileSystem fs = FileSystem.get(URI.create("s3://scalability"), conf);
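A likely explanation for the leading slash, offered as an assumption rather than something confirmed in the question: Hadoop's FileSystem.get(conf) returns the cluster's default filesystem (typically HDFS), and when an S3 location is handed to that filesystem, only the path component of the URI is looked up. The sketch below uses nothing but the standard java.net.URI class to show what that path component is for the input from the question. The class name S3UriParts and its helper methods are hypothetical, made up for illustration.

```java
import java.net.URI;

public class S3UriParts {
    // Scheme of the location, e.g. "s3n". This is what selects the filesystem.
    public static String scheme(String location) {
        return URI.create(location).getScheme();
    }

    // Authority of the location, which for S3 URIs is the bucket name.
    public static String bucket(String location) {
        return URI.create(location).getAuthority();
    }

    // Path component only. This is all that remains if the scheme and
    // bucket are ignored, and it always begins with '/'.
    public static String pathOnly(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        String input = "s3n://scalability/genotype1_large_ind_large.txt";
        System.out.println(scheme(input));   // prints "s3n"
        System.out.println(bucket(input));   // prints "scalability"
        System.out.println(pathOnly(input)); // prints "/genotype1_large_ind_large.txt"
    }
}
```

The last printed value matches the name in the FileNotFoundException, which is consistent with the file being searched for on the default filesystem instead of in the bucket. Passing the bucket URI to FileSystem.get, as in the answer above, keeps the scheme and bucket attached to the lookup.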

Source: https://stackoverflow.com/questions/10315378/getting-file-does-not-exist-error-when-running-an-amazon-emr-job

Tags

amazon-web-services

amazon-emr

emr
