
New Exachk Criterion: Hardware Assisted Resilient Data (HARD) Validation During ASM Relocation

Recently, while analyzing an Exachk report generated with AHF version 25.4.0, I came across a check that I wasn’t previously aware of: "Verify diskgroup attributes are set to enable Hardware Assisted Resilient Data capability".

I’m not sure exactly when it was introduced. Perhaps it has been there for a while and I’ve only just noticed it. Either way, it caught my attention and seems worth a deeper look.

The HARD capability is documented in High Availability Overview and Best Practices – Oracle Database 23ai:

The Exadata Hardware Assisted Resilient Data (HARD) checks include support for server parameter files, control files, log files, Oracle data files, and Oracle Data Guard broker files, when those files are stored in Exadata storage. This intelligent Exadata storage validation stops corrupted data from being written to disk when a HARD check fails, which eliminates a large class of failures that the database industry had previously been unable to prevent.

Examples of the Exadata HARD checks include:

  • Redo and block checksum
  • Correct log sequence
  • Block type validation
  • Block number validation
  • Oracle data structures, such as block magic number, block size, sequence number, and block header and tail data structures

Exadata HARD checks are initiated from Exadata storage software (cell services) and work transparently after enabling a database DB_BLOCK_CHECKSUM parameter, which is enabled by default in the cloud. Exadata is the only platform that currently supports the HARD initiative.

Furthermore, Oracle Exadata Storage Server provides non-intrusive, automatic hard disk scrub and repair. This feature periodically inspects and repairs hard disks during idle time. If bad sectors are detected on a hard disk, then Oracle Exadata Storage Server automatically sends a request to Oracle Automatic Storage Management (ASM) to repair the bad sectors by reading the data from another mirror copy.

Finally, Exadata and Oracle ASM can detect corruptions as data blocks are read into the buffer cache, and automatically repair data corruption with a good copy of the data block on a subsequent database write. This inherent intelligent data protection makes Exadata Database Machine and ExaDB-D the best data protection storage platform for Oracle databases.

In addition, an extent mismatch between partner disks can be detected during a rebalance operation. When a valid copy is found on any of the mirrors, it is used to repair the invalid copy, which keeps the data consistent. This behavior is controlled by ASM diskgroup attributes.

The correct diskgroup attribute values vary by software version level. Exachk runs the appropriate checks based upon the discovered environment configuration.

For grid version series 12 or 18:

Attribute Name               Value
content.check                FALSE
content_hardcheck.enabled    TRUE

For grid version 19 or higher:

Attribute Name               Value
content.check                TRUE
content_hardcheck.enabled    TRUE
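
The version-dependent expectations can be sketched as a small lookup plus a comparison. This is only an illustration of the rule stated here, not how Exachk itself is implemented. The function names are hypothetical, and the HARD attribute is spelled content_hardcheck.enabled, which is the name used in the attribute descriptions in this post.

```python
def expected_hard_attributes(grid_major: int) -> dict:
    """Return the attribute values expected for a Grid Infrastructure major version."""
    if grid_major >= 19:
        return {"content.check": "TRUE", "content_hardcheck.enabled": "TRUE"}
    if grid_major in (12, 18):
        return {"content.check": "FALSE", "content_hardcheck.enabled": "TRUE"}
    # The post lists no expectations for other versions, so refuse to guess.
    raise ValueError(f"no expected values listed for Grid version {grid_major}")


def find_mismatches(grid_major: int, actual: dict) -> dict:
    """Map each non-compliant attribute name to an (expected, actual) pair."""
    mismatches = {}
    for name, want in expected_hard_attributes(grid_major).items():
        # Compare case-insensitively; a missing attribute counts as a mismatch.
        got = actual.get(name, "<unset>").upper()
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


if __name__ == "__main__":
    # Hypothetical values read from one diskgroup on a Grid 19 system.
    current = {"content.check": "FALSE", "content_hardcheck.enabled": "TRUE"}
    print(find_mismatches(19, current))
```

An empty result means the diskgroup already matches the values listed for that version.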

The CONTENT.CHECK attribute enables or disables content checking when performing data relocation operations for a disk group. The attribute value can be set to true (enabled) or false (disabled).

When CONTENT.CHECK is enabled, an Oracle ASM relocation process (rebalance, resync, or resilver) detects corruptions during a data copy operation and performs automatic block corruption recovery by replacing these corruptions with an uncorrupted mirror copy if one is available.

The content check process detects and repairs corruptions for situations when the I/O operation is successful, but the blocks have invalid content. The process also performs a Hardware Assisted Resilient Data (HARD) check for all supported files and a block header check for data files.

The CONTENT_HARDCHECK.ENABLED attribute enables or disables Hardware Assisted Resilient Data (HARD) checking when performing data copy operations for rebalancing a disk group.

The attribute value can be set to true or false. This attribute can only be set when altering a disk group.

When the CONTENT.CHECK attribute is set to disabled (false) and the CONTENT_HARDCHECK.ENABLED attribute is set to disabled (false), no checking is performed.

When the CONTENT.CHECK disk group attribute is set to enabled (true), the setting of CONTENT_HARDCHECK.ENABLED is ignored and checking is done on the content of user data, including HARD checks.

When the CONTENT.CHECK attribute is set to disabled (false) and the CONTENT_HARDCHECK.ENABLED attribute is set to enabled (true), only HARD checking is performed.
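
The three rules above amount to a small decision table. A minimal sketch, with a hypothetical function name and hypothetical result labels, makes the precedence explicit: CONTENT.CHECK is evaluated first, and CONTENT_HARDCHECK.ENABLED only matters when CONTENT.CHECK is false.

```python
def checks_performed(content_check: bool, hardcheck_enabled: bool) -> str:
    """Describe which validation runs during relocation for a pair of settings."""
    if content_check:
        # CONTENT_HARDCHECK.ENABLED is ignored in this case; user data
        # content is checked, and that includes the HARD checks.
        return "content and HARD checks"
    if hardcheck_enabled:
        return "HARD checks only"
    return "no checking"


if __name__ == "__main__":
    # Print all four combinations of the two attributes.
    for content_check in (False, True):
        for hardcheck in (False, True):
            result = checks_performed(content_check, hardcheck)
            print(f"content.check={content_check}, "
                  f"content_hardcheck.enabled={hardcheck}: {result}")
```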

You may use the following query to list disk group attributes related to HARD capability:
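
A minimal sketch of such a query, assuming it is run against the ASM instance and that joining V$ASM_ATTRIBUTE to V$ASM_DISKGROUP on GROUP_NUMBER is sufficient. It is not necessarily the query Exachk uses. The SQL is held in a Python string so it can be handed to whatever driver or sqlplus wrapper you prefer; the constant, the helper function, and the sample diskgroup names are all hypothetical.

```python
# SQL to list the two HARD-related attributes for every mounted diskgroup.
HARD_ATTRIBUTE_QUERY = """
SELECT dg.name AS diskgroup, a.name AS attribute, a.value
FROM   v$asm_diskgroup dg
JOIN   v$asm_attribute a ON a.group_number = dg.group_number
WHERE  a.name IN ('content.check', 'content_hardcheck.enabled')
ORDER  BY dg.name, a.name
"""


def group_by_diskgroup(rows):
    """Turn (diskgroup, attribute, value) rows into {diskgroup: {attribute: value}}."""
    grouped = {}
    for diskgroup, attribute, value in rows:
        grouped.setdefault(diskgroup, {})[attribute] = value
    return grouped


if __name__ == "__main__":
    print(HARD_ATTRIBUTE_QUERY.strip())
    # Hypothetical rows, shaped like the result set of the query above.
    sample = [
        ("DATAC1", "content.check", "FALSE"),
        ("DATAC1", "content_hardcheck.enabled", "TRUE"),
        ("RECOC1", "content.check", "FALSE"),
    ]
    print(group_by_diskgroup(sample))
```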

Since our environment is running Oracle Grid Infrastructure version 19, I will proceed by setting the content.check attribute of the ASM diskgroups to TRUE.

There is no need to set the content_hardcheck.enabled attribute explicitly: when the content.check disk group attribute is set to TRUE, the value of content_hardcheck.enabled is ignored. In that case, Oracle performs full content verification, which includes the Hardware Assisted Resilient Data (HARD) checks.
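
The change itself is one ALTER DISKGROUP ... SET ATTRIBUTE statement per diskgroup, run on the ASM instance. As a sketch, the helper below only generates the statements so they can be reviewed before being executed. The function name and the diskgroup names DATAC1 and RECOC1 are hypothetical, and the name validation is a deliberately conservative assumption rather than Oracle's full naming rule.

```python
import re


def content_check_statements(diskgroups, value="TRUE"):
    """Build one ALTER DISKGROUP statement per diskgroup for content.check."""
    value = value.upper()
    if value not in ("TRUE", "FALSE"):
        raise ValueError("content.check accepts only TRUE or FALSE")
    statements = []
    for name in diskgroups:
        # Accept only simple identifiers so nothing unexpected ends up in the SQL.
        if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_$#]*", name):
            raise ValueError(f"unexpected diskgroup name: {name!r}")
        statements.append(
            f"ALTER DISKGROUP {name} SET ATTRIBUTE 'content.check' = '{value}';"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical diskgroup names; substitute the ones from your environment.
    for statement in content_check_statements(["DATAC1", "RECOC1"]):
        print(statement)
```

Re-running the attribute listing afterwards should show content.check as TRUE for each diskgroup that was altered.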

Additionally, this check has also been included in the Oracle Exadata Database Machine Setup/Configuration Best Practices (Doc ID 1274318.1) document on My Oracle Support.

Hope it helps.


Discover More from Osman DİNÇ

