Hive query shows ERROR "too many counters"

A Hive job can fail with the odd "Too many counters" error, like this:

Ended Job = job_xxxxxx with exception 'org.apache.hadoop.mapreduce.counters.LimitExceededException(Too many counters: 201 max=200)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Intercepting System.exit(1)

This happens when a query uses many operators (Hive operators). Hive creates 4 counters per operator, up to a maximum of 1000, plus a few additional counters for things like file reads/writes, partitions and tables. The number of counters required therefore depends on the query.

To avoid this exception, set "mapreduce.job.counters.max" in mapred-site.xml to a value above 1000. Hive will fail once it reaches its own limit of 1,000 counters, but other MR jobs will not. A value around 1120 should be a good choice.
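
As a minimal sketch, the property entry in mapred-site.xml could look like the following. The value 1120 is the one suggested above; where the file lives and whether services have to be restarted depends on the installation and is not covered here.

```xml
<!-- Goes inside the <configuration> element of mapred-site.xml -->
<property>
  <name>mapreduce.job.counters.max</name>
  <!-- 1120 is the value suggested in this post; adjust it to your queries -->
  <value>1120</value>
</property>
```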

Running the query with "EXPLAIN EXTENDED" and piping the output through "grep -ri operators | wc -l" prints the number of operators used. Use this value to tune the MR settings carefully.
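
The counting step can be sketched as two small shell functions. This is only a rough helper, not an exact method: the function names are hypothetical, and it assumes that each operator appears in the plan on its own line containing the word "Operator". Header lines such as "Map Operator Tree:" match as well, while lines like "TableScan" do not, so treat the result as an approximation.

```shell
#!/bin/sh

# Count the lines of a query plan (read from stdin) that mention an operator.
# Assumption: one operator per line, each containing the word "Operator".
count_operators() {
    grep -ci 'operator'
}

# Rough estimate of the operator counters: 4 per operator, as described above.
# The additional file, partition and table counters are NOT included.
estimate_counters() {
    ops=$1
    echo $((ops * 4))
}
```

A possible use would be `hive -e "EXPLAIN EXTENDED <your query>" | count_operators`, followed by `estimate_counters <result>`, to get a feeling for how the estimate compares with the configured limit.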

Comments

Praveen Sripati, 02 December, 2012
For more information on Hadoop counters, see the blog post I wrote some time back (http://www.thecloudavenue.com/2011/12/limiting-usage-counters-in-hadoop.html).

Also, there might be a (performance) reason why the number of counters is restricted in Hadoop. So I suggest not increasing it blindly, and keeping an eye on performance after the change.
Anonymous, 02 December, 2012
Yes, that's a good point. Thanks for sharing this, Praveen :)