Question
Please excuse the basic question, but I wonder why a MapReduce job doesn't get launched when we load a file larger than the HDFS block size.
Somewhere I learned that MapReduce takes care of loading datasets from the local file system (LFS) into HDFS. Why, then, am I not able to see any MapReduce logs on the console when I run the hadoop fs -put command?
Thanks in advance.
Answer 1:
You're thinking of hadoop distcp, which does spawn a MapReduce job.
https://hadoop.apache.org/docs/stable/hadoop-distcp/DistCp.html
DistCp Version 2 (distributed copy) is a tool used for large inter/intra-cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list.
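For comparison, a typical DistCp invocation looks like the following (the NameNode hostnames and paths here are placeholders, not from your cluster). Because this submits a MapReduce job, you would see job submission and progress output on the console:

hadoop distcp hdfs://nn1:8020/source/dir hdfs://nn2:8020/target/dir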
hadoop fs -put
or
hdfs dfs -put
are implemented entirely by the HDFS client, which splits the file into blocks itself and streams them to the DataNodes directly. No MapReduce job is involved, which is why no MapReduce logs appear on the console.
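As a quick illustration (the file and directory names are made up), the first command copies a local file into HDFS with no MapReduce output at all, and the second, a standard fsck check, shows that the client nonetheless split it into multiple blocks:

hadoop fs -put /tmp/bigfile.dat /user/hadoop/data/
hdfs fsck /user/hadoop/data/bigfile.dat -files -blocks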
Source: https://stackoverflow.com/questions/44471123/why-mapreduce-doesnt-get-launched-when-using-hadoop-fs-put-command