A Hive job can fail with the odd "Too many counters:" exception:
Ended Job = job_xxxxxx with exception 'org.apache.hadoop.mapreduce.counters.LimitExceededException(Too many counters: 201 max=200)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Intercepting System.exit(1)
To avoid this exception, set "mapreduce.job.counters.max" in mapred-site.xml to a value above 1,000: Hive can create up to 1,000 operator counters on a complex query, while most other MR jobs stay well below the limit. A value around 1120 leaves headroom for the framework's own counters.
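A minimal sketch of the change, assuming a Hadoop 2.x cluster (the property name differs on older releases):

<property>
  <name>mapreduce.job.counters.max</name>
  <value>1120</value>
</property>

The same property can also be set from the Hive CLI for the current session, though the counter limit is enforced cluster-side as well, so a session override alone may not get past the servers' configured limit (200 in the log above):

set mapreduce.job.counters.max=1120;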
Using "EXPLAIN EXTENDED" and "grep -ri operators | wc -l" print out the used numbers of operators. Use this value to tweak the MR settings carefully.
This happens when many operators are used in a query (Hive operators). Hive creates 4 counters per operator, up to a maximum of 1,000, plus a few additional counters such as file read/write, partition, and table counters. Hence the number of counters required depends on the query.
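As a rough worked example (the operator count is hypothetical): a plan with 45 operators needs 4 × 45 = 180 operator counters; add roughly 20 framework counters for file reads/writes, partitions, and tables, and the job sits near 200, exactly the default limit in the log above.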
Using "EXPLAIN EXTENDED" and "grep -ri operators | wc -l" print out the used numbers of operators. Use this value to tweak the MR settings carefully.
For more information on Hadoop counters, check the blog post I wrote some time back (http://www.thecloudavenue.com/2011/12/limiting-usage-counters-in-hadoop.html).
Also, there might be a (performance) reason why the number of counters is restricted in Hadoop. So I suggest not increasing it blindly, but keeping an eye on performance after the change.
Yes, that's a good point. Thanks for sharing this, Praveen :)