Hadoop MapReduce -archives not unpacking archives

Submitted by 风流意气都作罢 on 2019-12-06 05:11:58

The distributed cache doesn't unpack archive files into the local working directory of your task - there's a location on each task tracker for the job as a whole, and they're unpacked there.

You'll need to check the DistributedCache to find this location and look for the files there. The Javadocs for DistributedCache show an example mapper pulling this information.
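For illustration, here is a minimal sketch (not the Javadoc example itself) of a new-API mapper that reads the local, unpacked archive locations from the DistributedCache in its setup() method; the class and field names are my own:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ArchiveAwareMapper extends Mapper<LongWritable, Text, Text, Text> {

    private Path[] localArchives;

    @Override
    protected void setup(Context context) throws IOException {
        Configuration conf = context.getConfiguration();
        // Each entry is the task tracker's local directory where the
        // corresponding -archives entry was unpacked.
        localArchives = DistributedCache.getLocalCacheArchives(conf);
        if (localArchives != null) {
            for (Path p : localArchives) {
                System.err.println("Archive unpacked at: " + p);
            }
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // read files from under localArchives[i] as needed
    }
}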

You can make this easier with symbolic linking: attach a fragment name to the -files and -archives generic options, and a symlink with that name will be created in the local working directory of the map/reduce tasks:

hadoop jar myJar myClass -files ./etc/alice.txt#file1.txt \
    -archives ./etc/bob.zip#bob,./etc/claire.tar#claire

You can then use the fragment names in your mapper when opening files from the archive:

new File("bob").isDirectory() == true
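As a minimal end-to-end sketch (assuming the job was submitted with the exact -files/-archives options above; the file names read here are only placeholders), a mapper's setup() could read the side file and list the unpacked archive through the symlinks:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SymlinkMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context) throws IOException {
        // "file1.txt" is the fragment given to -files, so the side file is
        // available under that name in the task's working directory.
        try (BufferedReader reader = new BufferedReader(new FileReader("file1.txt"))) {
            System.err.println("first line of side file: " + reader.readLine());
        }

        // "bob" is the fragment given to -archives, so it is a symlink to the
        // directory where bob.zip was unpacked.
        File bobDir = new File("bob");
        if (bobDir.isDirectory()) {
            for (File f : bobDir.listFiles()) {
                System.err.println("bob/ contains: " + f.getName());
            }
        }
    }
}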