Result of hdfs dfs -ls command

只谈情不闲聊 submitted on 2019-12-12 03:48:09

Question


When I run the hdfs dfs -ls command, I would like to know whether the result is all the files stored in the cluster or just the partitions in the node where it is executed. I'm a newbie in Hadoop and I'm having some problems finding the partitions on each node.

Thank you


Answer 1:


Question: "...all the files stored in the cluster or..."

What you see from the ls command is all the files stored in the cluster. More specifically, what you see is a list of file paths and names. This information is part of the namespace, which is managed by the NameNode.

"...just the partitions in the node where it is executed.."

If you thought HDFS keeps some files on this node and other files on another node, that is a misunderstanding; there is no such thing. The NameNode keeps track of the namespace and the blocksMap. Files are composed of blocks, and the NameNode knows how many blocks each file has and on which DataNodes those blocks are kept. The NameNode decides where the blocks are placed, and this is transparent to the user. Each block has 3 replicas by default, and each replica is stored on a different DataNode. So if a file has 2 blocks, its replicas can be spread over at most 6 DataNodes, and it is possible that no DataNode holds the complete file. (That holds for this example; in another common case, a small file has only 1 block, so each replica is a complete copy of the file.)
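The arithmetic in the 2-block example can be checked with a small toy model. This is only a sketch in Python, not how HDFS is implemented: the function names, the DataNode names (dn1 to dn6), and the round-robin placement are all assumptions made for illustration. The real NameNode uses its own placement policy.

```python
import itertools

REPLICATION = 3  # default replication factor mentioned in the answer


def place_blocks(num_blocks, datanodes, replication=REPLICATION):
    """Toy stand-in for the NameNode's block map (hypothetical helper).

    Assigns each block's replicas to distinct DataNodes in round-robin
    order. Real HDFS placement is decided by the NameNode and differs.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    node_cycle = itertools.cycle(datanodes)
    # block id -> list of DataNodes holding a replica of that block
    return {
        block_id: [next(node_cycle) for _ in range(replication)]
        for block_id in range(num_blocks)
    }


def nodes_used(block_map):
    """All DataNodes that hold at least one replica of any block."""
    return {node for nodes in block_map.values() for node in nodes}


def nodes_with_complete_file(block_map):
    """DataNodes that hold a replica of every block of the file."""
    return {
        node
        for node in nodes_used(block_map)
        if all(node in nodes for nodes in block_map.values())
    }


if __name__ == "__main__":
    cluster = ["dn1", "dn2", "dn3", "dn4", "dn5", "dn6"]  # made-up names
    two_block_file = place_blocks(2, cluster)
    print(two_block_file)
    print(len(nodes_used(two_block_file)))
    print(nodes_with_complete_file(two_block_file))
```

With 2 blocks and 6 DataNodes this toy placement touches all 6 nodes and no single node holds the whole file, while a 1-block file ends up complete on each of its 3 replica nodes. On a real cluster, the actual block locations of a file can be inspected with hdfs fsck <path> -files -blocks -locations.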

For more information, take a look at the official HDFS design documentation.



Source: https://stackoverflow.com/questions/37381103/result-of-hdfs-dfs-ls-command
