Oracle Hash Partitioning

Oracle Hash Partitioning : Benefits and Best Practices

Partitioning is one of the best features for database administrators to improve both manageability and performance at scale. While most DBAs are familiar with range and list partitioning, both of which align well with typical business needs like handling historical data or categorizing by region or status, there are scenarios where hash partitioning offers a more efficient and scalable solution.

Hash partitioning is particularly useful when you want to ensure an even distribution of data across a predetermined number of partitions or when addressing hot block issues. Hash partitioning evenly distributes rows across a fixed number of partitions using a hash function applied to one or more columns. Hash partitioning can be used for both table partitioning and index partitioning purposes.

One of the best use cases for hash partitioning is when a high number of concurrent inserts leads to excessive index block splits. This often occurs with right-growing indexes where the key is generated from a sequence, especially when the sequence is created with the NOCACHE and ORDER options. This pattern is commonly known as the 90-10 split problem. The resulting contention frequently appears in AWR reports under “Segment Statistics – Segments by Row Lock Waits” for index segments. In such cases, hash partitioning on these indexes can help distribute concurrent access more evenly and reduce contention.

To hash partition an existing index with minimal downtime, you can also take advantage of a feature introduced in Oracle 12c: the ability to create multiple indexes on the same column(s), provided only one is marked as visible. This allows you to build a new (partitioned) index in the background while the original remains active, then switch visibility when ready. You may refer to Connor McDonald’s blog post Partitioning an existing index for a detailed explanation of this approach.

However, it’s important to be aware that Oracle applies a function like ORA_HASH to the partitioning column. If your data distribution isn’t even and is limited to a small scope, it may not provide enough data to fill the hash buckets equally. This can result in skewed data across partitions, leading to uneven distribution. In such cases, choosing the right partitioning key is important. It plays a key role in ensuring that the partitions are evenly distributed, allowing the performance benefits of hash partitioning to be fully realized.

It’s up to you to decide how many partitions to create. The decision should be based on the desired size of each segment you want to manage and store, as well as ease of maintenance, such as during move operations and expected level of hot block concurrency.

Let’s say we have a table that stores data on Turkish football fans, where each record represents a fan and their support for a specific team. The total dataset consists of 1,000,000 records, but 90% of the fans support Beşiktaş, and the remaining 10% are distributed among other Turkish Super League teams.

In this scenario, if you apply hash partitioning based on the team_id column (where each team has a unique ID), Oracle would use a function like ORA_HASH(team_id) to decide how to distribute the records across the partitions. We will not advantage from the hash partitioning.

Insert sample data:

This is the distribution of the data and the number of rows for each partition:

As you can see, the number of Beşiktaş fans continues to grow day by day, every second, new supporters are joining. In such a scenario, you may end up with a highly uneven distribution of table segments, where some partitions (like the one for Beşiktaş – SYS_P4196 ) are heavily accessed with millions of records, while others remain mostly idle. This imbalance can be observed using the v$segment_statistics view by querying logical reads (statistic_name) . As the Beşiktaş partition grows disproportionately, maintenance operations on that segment will become more difficult.

In skewed hash partitioning scenarios, such as when 90% of the data belongs to Beşiktaş fans, certain partitions can become hot spots, leading to significant performance issues. Table Hash partitioning is often used to distribute I/O, especially in manually managed segment space (MSSM) tablespaces with single freelists in older Oracle releases, which are prone to buffer busy wait events. In such cases, AWR reports often list buffer busy waits or similar block-level contention among the top timed events, indicating that concurrency is being impacted at the block level. Additionally, hash partitioning can support partition elimination (e.g., partition-wise joins), improving query performance. These issues shows the importance of selecting an appropriate partitioning key to ensure even data distribution and to minimize contention, both in RAC and single-instance environments.

For partition pruning to occur in hash-partitioned tables, an equality (=) or IN list predicate must be applied directly to the partitioning key. Oracle needs to deterministically compute which partition(s) to access based on known values. Range predicates like BETWEEN, <, >, or even implicit data type conversions will disable pruning, resulting in a full partition scan. To fully benefit from hash partitioning and reduce unnecessary I/O, ensure that queries include explicit equality conditions on all partition key columns. I won’t repeat the content of Mohamed Houri’s blog post Partition Pruning with Hash Partitioned Tables, it’s clearly explained there, and I highly recommend referring to it for further details.

Oracle is capable of transforming IN list predicates into OR conditions and then into UNION ALL operations (via the OR_EXPAND transformation), which makes it possible to still take advantage of partition pruning in certain scenarios.

It may be surprising, but a query like team_id = team_id / team_id * 4 will not trigger partition pruning, whereas a simple team_id = 4 will. This is because Oracle requires the partition key to be used in explicit equality conditions, any runtime expressions or computations on the key can prevent it.

For equality predicates, we often observe the PARTITION HASH SINGLE operation, and for IN list predicates, the PARTITION HASH ITERATOR operation in the execution plan. However, partition pruning is not limited to these operations.

There is a common misconception that partition pruning only works with local indexes or full table scans (FTS) and not with global indexes. In reality, partition pruning can also be effective during TABLE ACCESS BY ROWID operations. Oracle is capable of determining the target partition even from the ROWID dynamically, and it automatically eliminates partitions that do not match the partition key condition. For a detailed explanation and practical examples, I highly recommend reading Sayan Malakshinov’s blog post: Partition Pruning and Global Indexes.

Oracle supports hash partitioning on multiple columns, but here’s the catch:
For partition pruning to work, the query must include all columns used in the hash partition key.
If only one of the columns is used in the WHERE clause, Oracle can’t determine which partition holds the data, and must scan all partitions.

Hash partition count should be a power of 2 for even distribution.
Choose a hash key with high cardinality, evenly distributed values.
Be cautious with multi-column hash partitioning – ensure full and explicit partitioning key usage in queries.
Always verify whether partition pruning is working as expected by using EXPLAIN PLAN with the GATHER_PLAN_STATISTICS hint and FORMAT=ALL or ADVANCED, or at least by enabling the +PARTITION display option.

For a high number of concurrent inserts that lead to excessive index block splits, especially with right-growing indexes built on sequence-generated keys, using a global hash-partitioned index can help mitigate contention and reduce hot spots.

For scenarios where range-based data management is needed alongside concurrency benefits, consider composite range-hash partitioning, which allows you to manage historical data efficiently while benefiting from hash-based distribution within each range partition.

Hope it helps.


Discover More from Osman DİNÇ


Comments

Leave your comment