How to explicitly define datanodes to store a particular given file in HDFS?

六眼飞鱼酱① submitted on 2019-12-31 02:41:06

Question


I want to write a script, or something like an .xml file, that explicitly defines which datanodes in a Hadoop cluster store the blocks of a particular file. For example, suppose there are 4 slave nodes and 1 master node (5 nodes in total in the Hadoop cluster), and two files: file01 (size = 120 MB) and file02 (size = 160 MB). The default block size is 64 MB.

Now I want to store one of the two blocks of file01 on slave node1 and the other on slave node2. Similarly, I want one of the three blocks of file02 on slave node1, the second on slave node3, and the third on slave node4. So my question is: how can I do this?

Actually, there is one method: change the conf/slaves file every time I store a file. But I don't want to do that. So, is there another solution? I hope I have made my point clear. Waiting for your kind response!
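The block counts in the example follow directly from the file sizes and the 64 MB block size. A minimal sketch of that arithmetic is shown below; the helper name `split_into_blocks` is made up for illustration and is not part of any Hadoop API.

```python
def split_into_blocks(file_size_mb, block_size_mb=64):
    """Return the sizes (in MB) of the blocks a file of this size occupies."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The last block only holds the leftover data, not a full 64 MB.
        sizes.append(remainder)
    return sizes


file01_blocks = split_into_blocks(120)  # file01 from the question
file02_blocks = split_into_blocks(160)  # file02 from the question

print(file01_blocks)  # [64, 56]
print(file02_blocks)  # [64, 64, 32]
```

So file01 occupies two blocks and file02 occupies three, which matches the placement described in the question.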


Answer 1:


There is no method to achieve what you are asking here. The NameNode replicates blocks to data nodes based on rack configuration, replication factor, and node availability, so even if you do manage to get a block onto two particular data nodes, if one of those nodes goes down, the NameNode will replicate the block to another node.

Your requirement also assumes a replication factor of 1, which gives you no data redundancy (a bad thing if you lose a data node).

Let the NameNode manage block assignments, and run the balancer periodically if you want to keep your cluster evenly distributed.
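To see why a hand-picked placement does not survive a node failure, here is a toy model of re-replication. It is only a sketch under simplifying assumptions: the function `re_replicate`, the node names, and the random choice of a new target are all invented for illustration, and the real NameNode also weighs rack topology and node load.

```python
import random


def re_replicate(block_locations, failed_node, live_nodes, replication=2, rng=random):
    """Toy model: bring each block back up to `replication` copies after a failure."""
    result = {}
    for block, nodes in block_locations.items():
        survivors = [n for n in nodes if n != failed_node]
        if not survivors:
            # No copy is left to read from, so the block cannot be restored.
            result[block] = []
            continue
        candidates = [n for n in live_nodes
                      if n not in survivors and n != failed_node]
        while len(survivors) < replication and candidates:
            pick = rng.choice(candidates)  # assumption: any live node may be chosen
            candidates.remove(pick)
            survivors.append(pick)
        result[block] = survivors
    return result


# One block deliberately placed on node1 and node2, another stored only on node2.
placement = {"file01_blk0": ["node1", "node2"], "file01_blk1": ["node2"]}
live = ["node1", "node3", "node4"]

after = re_replicate(placement, failed_node="node2", live_nodes=live)
print(after)
```

In this model the first block ends up on node1 plus some node the user never chose, and the second block, which had a single replica, is gone entirely.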




Answer 2:


The NameNode is the ultimate authority on block placement. There is a Jira issue about making this algorithm pluggable: https://issues.apache.org/jira/browse/HDFS-385
Unfortunately, it is in version 0.21, which is not a production release (although it works quite well).
I would suggest plugging your algorithm into 0.21 if you are at the research stage and then waiting for 0.23 to become production-ready, or backporting the code to 0.20 if you need it now.
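As a rough illustration of what a "pluggable" placement algorithm means, the sketch below models a policy interface with a default and a custom implementation. It is not the Hadoop API: the real extension point from HDFS-385 is Java code inside the NameNode, and every class and method name here (`PlacementPolicy`, `RoundRobinPolicy`, `PinnedPolicy`, `choose_targets`) is hypothetical.

```python
from abc import ABC, abstractmethod


class PlacementPolicy(ABC):
    """Hypothetical interface: choose the nodes that receive one block."""

    @abstractmethod
    def choose_targets(self, file_name, block_index, nodes, replication):
        ...


class RoundRobinPolicy(PlacementPolicy):
    """Stand-in default policy (not HDFS's real rack-aware logic)."""

    def choose_targets(self, file_name, block_index, nodes, replication):
        start = block_index % len(nodes)
        return [nodes[(start + i) % len(nodes)] for i in range(replication)]


class PinnedPolicy(PlacementPolicy):
    """Custom policy: honor explicit (file, block) -> node pins, else fall back."""

    def __init__(self, pins, fallback):
        self.pins = pins
        self.fallback = fallback

    def choose_targets(self, file_name, block_index, nodes, replication):
        pinned = self.pins.get((file_name, block_index))
        if pinned is None or pinned not in nodes:
            return self.fallback.choose_targets(file_name, block_index, nodes, replication)
        targets = [pinned]
        # Fill any remaining replicas from the other nodes, in list order.
        for node in nodes:
            if len(targets) >= replication:
                break
            if node not in targets:
                targets.append(node)
        return targets


# The layout requested in the question, with a replication factor of 1.
pins = {
    ("file01", 0): "node1", ("file01", 1): "node2",
    ("file02", 0): "node1", ("file02", 1): "node3", ("file02", 2): "node4",
}
nodes = ["node1", "node2", "node3", "node4"]
policy = PinnedPolicy(pins, fallback=RoundRobinPolicy())

for name, count in (("file01", 2), ("file02", 3)):
    for idx in range(count):
        print(name, idx, policy.choose_targets(name, idx, nodes, replication=1))
```

The point of the design is that the caller only depends on `choose_targets`, so a custom policy can be swapped in without changing the rest of the system; files without pins still go through the default policy.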



Source: https://stackoverflow.com/questions/10810845/how-to-explicilty-define-datanodes-to-store-a-particular-given-file-in-hdfs
