HDFS: can I specify the replication factor per file to increase availability?

Submitted by 倖福魔咒の on 2020-01-14 06:38:13

Question


I'm a newbie in HDFS, so sorry if my question is naive.

Suppose we store files in a Hadoop cluster. Some files are really popular and will be requested much more often than the others (but not often enough to keep them in memory). It would be worth keeping more copies (replicas) of those files.

Can I implement this in HDFS, or is there a best practice for tackling this task?


Answer 1:


Yes, you can set the replication factor for the entire cluster, for a directory, or for an individual file.

You can change the replication factor (say, to 3) on a per-file basis using the Hadoop FS shell:

[sys@localhost ~]$ hadoop fs -setrep -w 3 /my/file
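
To confirm that the change took effect, one simple check is to print the file's current replication factor with hadoop fs -stat, using the same example path as above; since -w makes setrep wait for re-replication to finish, this should report 3:

[sys@localhost ~]$ hadoop fs -stat %r /my/file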

Alternatively, you can change the replication factor (say, to 3) of all files under a directory:

[sys@localhost ~]$ hadoop fs -setrep -w 3 -R /my/dir

To change the replication factor of every file in HDFS to 1:

[sys@localhost ~]$ hadoop fs -setrep -w 1 -R /
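
Keep in mind that -setrep only affects files that already exist; files written afterwards still get the cluster default (dfs.replication). If you already know a file will be popular, you can also override the default at write time. A minimal sketch, where report.csv and the replication factor of 5 are just illustrative values:

[sys@localhost ~]$ hadoop fs -D dfs.replication=5 -put report.csv /my/file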

Note that the replication factor you set must lie between the dfs.replication.min and dfs.replication.max values configured for the cluster.
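
If you are unsure what limits your cluster enforces, you can query the configured values with hdfs getconf. A quick check, assuming the stock property names (on recent Hadoop versions the minimum is exposed as dfs.namenode.replication.min):

[sys@localhost ~]$ hdfs getconf -confKey dfs.replication.max
[sys@localhost ~]$ hdfs getconf -confKey dfs.namenode.replication.min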



Source: https://stackoverflow.com/questions/37111653/hdfs-can-i-specify-replication-factor-per-file-to-increase-avaliability
