Data lost after shutting down hadoop HDFS?

Submitted by ╄→尐↘猪︶ㄣ on 2020-01-11 07:05:40

Question


Hi, I'm learning Hadoop and I have a simple question: after I shut down HDFS (by calling hadoop_home/sbin/stop-dfs.sh), is the data on HDFS lost, or can I get it back?


Answer 1:


Data isn't lost when you stop HDFS, provided the NameNode and DataNode data are stored in persistent locations specified by these properties:

  • dfs.namenode.name.dir -> Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy. Default value: file://${hadoop.tmp.dir}/dfs/name
  • dfs.datanode.data.dir -> Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored. Default value: file://${hadoop.tmp.dir}/dfs/data

As you can see, the default values for both properties point to ${hadoop.tmp.dir}, which by default is /tmp. As you may already know, data under /tmp on Unix-based systems is cleared on reboot.

So, if you specify directory locations other than /tmp, the Hadoop HDFS daemons will be able to read the data back after a reboot, and there is no data loss even across cluster restarts.
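As a sketch, the two properties can be pointed at persistent directories in hdfs-site.xml. The /data/hdfs/... paths below are placeholders for illustration, not Hadoop defaults; use whatever durable location suits your machines:

```xml
<!-- hdfs-site.xml: example persistent storage locations (paths are placeholders) -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <!-- fsimage and edit logs survive reboots here -->
    <value>file:///data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <!-- HDFS block files survive reboots here -->
    <value>file:///data/hdfs/datanode</value>
  </property>
</configuration>
```

The directories must exist and be writable by the user running the daemons; restart the NameNode and DataNodes after changing these values.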




Answer 2:


Please make sure you are not deleting the metadata of the data stored in HDFS. You can ensure this simply by keeping dfs.namenode.name.dir and dfs.datanode.data.dir untouched, i.e., by not deleting the paths configured under these properties in your hdfs-site.xml file.
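To check which paths your cluster is actually using for these properties, the `hdfs getconf` utility (assuming a Hadoop installation with its bin directory on PATH) prints the effective value of a configuration key:

```shell
# Print the effective storage directories as resolved from hdfs-site.xml
# (requires a Hadoop installation; output is the configured path list)
hdfs getconf -confKey dfs.namenode.name.dir
hdfs getconf -confKey dfs.datanode.data.dir
```

If these print paths under /tmp, move them to persistent locations before relying on the cluster to survive a reboot.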



Source: https://stackoverflow.com/questions/28379048/data-lost-after-shutting-down-hadoop-hdfs
