More than 120 counters in hadoop

Submitted by 戏子无情 on 2019-12-01 04:01:08

You can override that property in mapred-site.xml on your JobTracker, TaskTracker, and client nodes, but keep in mind that this is a system-wide modification:

<configuration>
  ...
  <property>
    <name>mapreduce.job.counters.limit</name>
    <value>500</value>
  </property>
  ...
</configuration>

Then restart the mapreduce service on your cluster.
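On a Hadoop 1.x cluster that restart might look like the following. The service names below are assumptions (they match a typical CDH-style packaged install); adjust them for your distribution:

```shell
# Restart the JobTracker and TaskTrackers so the new counter limit
# in mapred-site.xml is picked up. Service names vary by distribution.
sudo service hadoop-0.20-mapreduce-jobtracker restart    # on the JT node
sudo service hadoop-0.20-mapreduce-tasktracker restart   # on each TT node
```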

In Hadoop 2, this configuration parameter is called

mapreduce.job.counters.max

Setting it on the command line or in your Configuration object isn't enough, though. You need to call the static method

org.apache.hadoop.mapreduce.counters.Limits.init()

in the setup() method of your mapper or reducer to get the setting to take effect.
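A minimal sketch of what that looks like in a mapper. This is an illustration, not a complete job: the class name and the per-token counter are made up, and it assumes Hadoop 2.6+/2.7, where `Limits.init` accepts the job `Configuration` so limits passed via `-D` reach the task JVM:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.counters.Limits;

public class ManyCountersMapper
        extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void setup(Context context) {
        // Re-initialize the static counter limits from this job's
        // configuration, so -Dmapreduce.job.counters.max=... takes
        // effect in the task JVM (otherwise the static defaults win).
        Limits.init(context.getConfiguration());
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Hypothetical example of the pattern that hits the limit:
        // one dynamically named counter per distinct token.
        for (String token : value.toString().split("\\s+")) {
            context.getCounter("Tokens", token).increment(1);
        }
    }
}
```

The same `Limits.init(...)` call goes in a reducer's setup() if the reducer also creates counters.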

Tested with 2.6.0 and 2.7.1.

The parameter is set via the config file; the parameters below will then take effect:

mapreduce.job.counters.max=1000 
mapreduce.job.counters.groups.max=500 
mapreduce.job.counters.group.name.max=1000 
mapreduce.job.counters.counter.name.max=500 

Just adding this in case anyone else faces the same problem we did: increasing the counter limits from within MRJob.

To raise the number of counters, add emr_configurations to your mrjob.conf (or pass it to MRJob as a config parameter):

runners:
  emr:
    emr_configurations:
      - Classification: mapred-site
        Properties:
          mapreduce.job.counters.max: 1024
          mapreduce.job.counters.counter.name.max: 256
          mapreduce.job.counters.groups.max: 256
          mapreduce.job.counters.group.name.max: 256
Vijayanand

We can customize the limits as command-line options for specific jobs only, instead of changing mapred-site.xml:

-Dmapreduce.job.counters.limit=x
-Dmapreduce.job.counters.groups.max=y

NOTE: x and y are custom values based on your environment/requirement.
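For example, a launch command might look like this. The jar, driver class, and paths are placeholders, and the property names shown are the Hadoop 2 ones (`mapreduce.job.counters.limit` is the older Hadoop 1 name):

```shell
# Raise the counter limits for this job only; values are illustrative.
hadoop jar my-job.jar com.example.MyDriver \
  -Dmapreduce.job.counters.max=500 \
  -Dmapreduce.job.counters.groups.max=100 \
  /input/path /output/path
```

Note that the `-D` flags are only honoured if the driver implements `Tool` and is run through `ToolRunner`, which wires `GenericOptionsParser` output into the job's `Configuration`.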
