How to Locate an HBase Region for a Row Key and Trigger a Major Compaction

This guide explains how to inspect a row key, find the region responsible for that row, and run a targeted major compaction with current HBase shell commands. Region-level compaction is useful for routine maintenance, for skewed regions and for purging deleted data, but use it sparingly because of its I/O cost.

Inspecting a Row Key

To view a sample of rows from a table:

scan 'your_table', { LIMIT => 5 }

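To inspect a specific row, pass its key to get. Write binary keys in double quotes so that the shell interprets the \x escapes as raw bytes: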

get 'your_table', "\x00\x01"
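The quoting matters because the HBase shell is a JRuby interpreter and follows Ruby's string rules. The sketch below is plain Ruby and needs no cluster. The constant names and the printable helper are illustrative and not part of HBase; the helper only approximates how the shell displays binary keys.

```ruby
# Double quotes: Ruby turns \x00 and \x01 into single bytes.
BINARY_KEY = "\x00\x01"
# Single quotes: the backslashes stay literal, giving eight ASCII characters.
LITERAL_KEY = '\x00\x01'

# Hypothetical helper: show printable ASCII as-is and escape everything else
# (including the backslash itself) as \xHH.
def printable(key)
  key.bytes.map do |b|
    (32..126).cover?(b) && b != 92 ? b.chr : format('\\x%02X', b)
  end.join
end

puts BINARY_KEY.bytesize    # length of the two-byte binary key
puts LITERAL_KEY.bytesize   # length of the literal, unescaped text
puts printable(BINARY_KEY)  # escaped form of the binary key
```

If a get on a binary key unexpectedly returns no row, checking the byte length of the key string this way is a quick first test.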

Locating the Region for a Specific Row

Recent HBase shells can look up the region that holds a given row key directly:

locate_region 'your_table', "\x00\x01"

This returns the region name, start key, end key and hosting RegionServer. You can also list all regions for the table:

list_regions 'your_table'
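Conceptually, a row belongs to the region whose start key is less than or equal to the row key and whose end key is greater than it, compared byte by byte. The following plain-Ruby sketch illustrates that rule on a made-up region list. The region names, boundaries and the region_for helper are hypothetical; on a real cluster the lookup goes through the meta table rather than an in-memory array.

```ruby
# Hypothetical region boundaries. An empty start key marks the first region
# and an empty end key marks the last one.
REGIONS = [
  { name: 'region-a', start_key: ''.b,         end_key: "\x00\x10".b },
  { name: 'region-b', start_key: "\x00\x10".b, end_key: "\x80".b },
  { name: 'region-c', start_key: "\x80".b,     end_key: ''.b }
].freeze

# Return the region whose range [start_key, end_key) contains the row key.
# String comparison on binary strings is bytewise, like HBase key ordering.
def region_for(row_key, regions = REGIONS)
  key = row_key.b
  regions.find do |r|
    key >= r[:start_key] && (r[:end_key].empty? || key < r[:end_key])
  end
end

puts region_for("\x00\x01")[:name]
```

Note that the start key is inclusive and the end key is exclusive, so a row key equal to a boundary belongs to the region that starts there.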

Triggering a Major Compaction on a Region

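Once you know the region name (e.g. your_table,,1712087434000.abc123), pass it to major_compact. The name here is shortened for readability; real region names end in a 32-character encoded name followed by a period, so copy the name exactly as the shell prints it: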

major_compact 'your_table,,1712087434000.abc123'
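A region name packs several fields into one string: the table name, the region's start key, a region id (usually the creation timestamp in milliseconds) and an encoded name. The plain-Ruby sketch below splits a name into those parts, which can help when checking that a copied name is complete. The parse_region_name helper is hypothetical, and it assumes a primary-replica name of the form table,startkey,id.encoded with an optional trailing period.

```ruby
# Split a region name of the assumed form
#   <table>,<start key>,<region id>.<encoded name>[.]
# The start key may itself contain commas, so the table name ends at the
# first comma and the region id begins after the last one.
def parse_region_name(region_name)
  first = region_name.index(',')
  last = region_name.rindex(',')
  if first.nil? || first == last
    raise ArgumentError, "not a region name: #{region_name}"
  end

  region_id, encoded = region_name[(last + 1)..-1].split('.', 2)
  {
    table: region_name[0...first],
    start_key: region_name[(first + 1)...last],
    region_id: Integer(region_id),
    encoded_name: encoded.to_s.chomp('.')
  }
end

p parse_region_name('your_table,,1712087434000.abc123')
```

An empty start_key, as in this example, indicates the first region of the table.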

To compact every region of the table instead (considerably more expensive):

major_compact 'your_table'

Operational Notes

Major compaction is expensive: for each column family it rewrites all store files into a single file, drops deleted and expired cells, and generates heavy disk and network I/O while it runs.
Region-level compaction is safer than table-wide compaction for large datasets because it confines that I/O to a single region.
Never automate major compaction blindly; use it for targeted cleanup or operational debugging.
Region names can be found via locate_region, list_regions, status 'detailed' or the HBase Master UI.

These commands use the current HBase shell syntax and avoid the legacy HTable and HBaseAdmin client classes, which were deprecated in HBase 1.0 in favor of Connection, Table and Admin and can no longer be used as public APIs in recent releases.


novatechflow | Alexander Alten
