Change the size of random data generation on Hadoop

Submitted by 笑着哭i on 2021-01-29 17:18:23

Question


I am running the sort example on Hadoop using the RandomWriter function. This particular function writes 10 GB (by default) of random data per host to DFS using Map/Reduce.

bin/hadoop jar hadoop-*-examples.jar randomwriter <out-dir>.

Can anyone please tell me how I can change the default 10 GB size used by the RandomWriter function?


Answer 1:


That example has some configurable parameters, which are supplied to the jar through a config file. To use one, run it as (supplying a configuration file):

bin/hadoop jar hadoop-*-examples.jar randomwriter <out-dir> [<configuration file>]

or run it with the parameters on the command line:

bin/hadoop jar hadoop-*-examples.jar randomwriter \
 -Dtest.randomwrite.bytes_per_map=<value> \
 -Dtest.randomwriter.maps_per_host=<value> <out-dir> [<configuration file>]

For details about all configurable parameters, see: https://wiki.apache.org/hadoop/RandomWriter
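For illustration, a minimal configuration file using the property names from the command above might look like the following. The values are hypothetical (1 GiB per map, 2 maps per host); adjust them for your cluster:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- hypothetical value: 1 GiB of random data per map task -->
  <property>
    <name>test.randomwrite.bytes_per_map</name>
    <value>1073741824</value>
  </property>
  <!-- hypothetical value: run 2 map tasks per host -->
  <property>
    <name>test.randomwriter.maps_per_host</name>
    <value>2</value>
  </property>
</configuration>
```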




Answer 2:


On Hadoop 2 (at least as of version 2.7.2), the properties are now mapreduce.randomwriter.mapsperhost and mapreduce.randomwriter.bytespermap.

You can see them in the source at http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/RandomWriter.java?view=markup

So the correct answer on recent Hadoop 2 versions is:

bin/hadoop jar hadoop-*-examples.jar randomwriter \
 -Dmapreduce.randomwriter.bytespermap=<value> \
 -Dmapreduce.randomwriter.mapsperhost=<value> <out-dir> [<configuration file>]
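To pick concrete values, note that the total data written per host is bytespermap multiplied by mapsperhost. A small sketch, assuming the Hadoop 2 property names above and hypothetical target sizes:

```shell
# Target: 1 GiB of random data per host, split across 2 map tasks per host.
# (These sizes are hypothetical; adjust for your cluster.)
TARGET_PER_HOST=$((1024 * 1024 * 1024))   # 1 GiB in bytes
MAPS_PER_HOST=2
BYTES_PER_MAP=$((TARGET_PER_HOST / MAPS_PER_HOST))
echo "$BYTES_PER_MAP"   # bytes each map task should write

# Then pass the computed sizes to the example job, e.g.:
# bin/hadoop jar hadoop-*-examples.jar randomwriter \
#   -Dmapreduce.randomwriter.bytespermap="$BYTES_PER_MAP" \
#   -Dmapreduce.randomwriter.mapsperhost="$MAPS_PER_HOST" <out-dir>
```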


Source: https://stackoverflow.com/questions/22053594/change-the-size-of-random-data-generation-on-hadoop
