Introduction:
In a traditional Oracle Real Application Clusters (RAC) setup on physical servers, we encountered a perplexing issue: node evictions (the LMON process terminated the instances with errors such as ORA-481 and ORA-29743) and database restarts caused by a loss of connection to the backbone network on one of the private interconnects. Despite having redundant private network interfaces, two dedicated subnets, and redundant ASM network resources in CRS, the nodes were still evicted whenever the connection failed on one interface. After analyzing the situation, we determined that the issue stemmed from how we had configured the cluster_interconnects parameter. This post details the cause of the issue, the troubleshooting steps we took, and the solution we applied.
The Setup: We had a two-node RAC on physical servers with two redundant private network interfaces (ens5f0 and ens5f1) on each node, using dedicated subnets (192.168.13.0/24 and 192.168.14.0/24) to provide a highly available cluster interconnect. There were two ASM network resources in CRS: ora.asmnet1.asmnetwork and ora.asmnet2.asmnetwork. However, when one of the private interconnects lost its connection to the backbone, the nodes were evicted and rebooted, even though the HAIP (Highly Available IP) failed over to the other interface, which was serving the other ASM network.

What does Oracle recommend for private interconnect configuration?
Private interface requirements depend on whether you use a single interface or multiple interfaces. For a single-interface setup, one private interface is needed on each node for the interconnect, and all of them must communicate within the same subnet. For example, if the private interfaces are configured with a subnet mask of 255.255.255.0, the private network range is 192.168.13.0–192.168.13.255, and every private IP address must fall within that range. Both IPv4 and IPv6 addresses can be used.
For redundant (multiple-interface) interconnect usage, you can specify multiple interfaces for the cluster’s private network without using bonding or similar technologies; in fact, Oracle recommends not using bonding for multiple private interconnect interfaces. This is detailed in “Highly Available IP (HAIP) FAQ for Release 11.2 (Doc ID 1664291.1).” Oracle’s best practice is to use unbonded, unteamed NICs without any additional layers of virtualization.
When you define multiple interfaces, Oracle Clusterware creates one to four highly available IP (HAIP) addresses. Oracle RAC and Oracle Automatic Storage Management (ASM) instances use these interface addresses to ensure highly available, load-balanced communication between nodes. In our case, we had two interfaces for the private interconnect on each node. By default, Oracle Grid Infrastructure software uses all HAIP addresses for private network communication, providing load balancing across the set of interfaces you identify for the private network. If a private interconnect interface fails or becomes non-communicative, Oracle Clusterware transparently moves the corresponding HAIP address to one of the remaining functional interfaces.
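You can check which interfaces Clusterware has classified for the private interconnect (and therefore where HAIPs can be placed) with the oifcfg utility from the Grid Infrastructure home. A minimal check:
[grid@blt01 ~]$ oifcfg getif
[grid@blt01 ~]$ oifcfg iflist -p -n
The first command lists each interface with its subnet and classification (public or cluster_interconnect/asm); the second lists all interfaces visible on the node with their subnets.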
Each private interface should be on a different subnet. Each cluster node must have an interface on each private interconnect subnet, and these subnets must be reachable by all nodes in the cluster. For example, you can use private networks on subnets 192.168.13.0/24 and 192.168.14.0/24, but each cluster node must have an interface connected to both the 192.168.13.0/24 and 192.168.14.0/24 subnets. Endpoints of all designated interconnect interfaces must be fully reachable across the network. No node should be disconnected from any private network interface. You can test if an interconnect interface is reachable by using the ping command.
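For example, a minimal reachability test from node 1 toward node 2’s private addresses in our setup looks like this:
[root@blt01 ~]# ping -c 3 192.168.13.2
[root@blt01 ~]# ping -c 3 192.168.14.2
Both commands should succeed from every node toward every other node’s private addresses.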
How did we troubleshoot the issue?
In our case, there was a planned blackout period for maintenance on one of the backbone devices. When the downtime of the backbone device began, the databases were terminated and restarted by CRS. The databases were terminated because they lost their connection to the storage (the ASM instances). Although Flex ASM was in use, the ASM instances were terminated as well; they entered a starting state but did not open successfully.
Since the downtime was brief and the database servers were standby databases, business operations were not affected. However, we began investigating the root cause of the issue meticulously. Here are the usual suspects.
Cross-connect issue? (Incorrect cabling)
Our first step was to check whether the private network interfaces that were meant to communicate with each other were connected to different switches, which could have caused communication failures between the nodes. We suspected a cabling issue in which the interfaces were not properly paired: the expected behavior was that communicating interfaces would be connected to the same backbone switch. If the interfaces were connected to different backbone switches, both nodes could lose one interface and be unable to communicate with each other.
We verified the cabling and switch configurations to ensure that both network interfaces on each node were indeed connected to separate switches. This would ensure high availability and redundancy in case one switch failed.
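At the OS level, a quick way to confirm that each private NIC has carrier (before physically tracing cables) is ethtool; “Link detected: yes” confirms the port sees the switch:
[root@blt01 ~]# ethtool ens5f0 | grep 'Link detected'
[root@blt01 ~]# ethtool ens5f1 | grep 'Link detected'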
We examined the Flex ASM and CRS logs to verify if there were any additional errors or anomalies that could point to why the ASM instances and databases were terminated. This included checking for network-related errors, configuration issues, or any timeouts related to the private interconnect.
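Alongside the trace files, it is worth checking the state of the ASM network resources in CRS; a minimal check, run from the Grid Infrastructure home:
[root@blt01 ~]# crsctl stat res ora.asmnet1.asmnetwork -t
[root@blt01 ~]# crsctl stat res ora.asmnet2.asmnetwork -t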
In /var/log/messages, the issue was clearly stated as follows:
...
Feb 5 08:45:18 blt01 kernel: igb 0000:18:00.0 ens5f0: igb: ens5f0 NIC Link is Down
In the alert.log file for the ASM instance, the disconnected interface was recorded as follows:
...
2025-02-21T08:45:28.837768+03:00
SKGXP: ospid 2313928: network interface with IP address 192.168.13.1 no longer running (check cable)
In the CRS log: (/u01/app/oracle/diag/crs/blt01/crs/trace/alert.log)
...
2025-02-21 08:35:20.365 [GIPCD(2310702)] CRS-42216: No interfaces are configured on the local node for interface definition ens5f0(:.*)?:192.168.13.0: available interface definitions are ...
2025-02-21 08:45:21.370 [ORAROOTAGENT(2310363)]CRS-5050: HAIP failover due to network interface ens5f0 not functioning
...
2025-02-21 08:45:50.850 [ORAROOTAGENT(2315182)]CRS-5017: The resource action "ora.asmnet1.asmnetwork start" encountered the following error:
2025-02-21 08:45:50.850+CRS-5006: Unable to automatically select a network interface which has subnet mask and subnet number 192.168.13.0
...
orarootagent log: (/u01/app/oracle/diag/crs/blt01/crs/trace/crsd_orarootagent_root.trc)
...
2025-02-21 08:45:19.665 : CLSDYNAM:3474949888: [ora.asmnet1.asmnetwork]{1:54476:2} [check] sIsIfRunnng ens5f0 is not RUNNING : 1003
2025-02-21 08:45:19.665 : CLSDYNAM:3474949888: [ora.asmnet1.asmnetwork]{1:54476:2} [check] sIsIfRunnng ens5f0:1 is not RUNNING : 1003
2025-02-21 08:45:20.571 : CLSDYNAM:3474949888: [ora.asmnet1.asmnetwork]{1:54476:2} [check] AsmNetworkAgent::checkInternal Ignoring Exception in init(actx)
2025-02-21 08:45:20.571 : CLSDYNAM:3474949888: [ora.asmnet1.asmnetwork]{1:54476:2} [check] got exception from NetInterface::checkNetInterface. if=,addrType=IPV4
2025-02-21 08:45:20.571 : CLSDYNAM:3474949888: [ora.asmnet1.asmnetwork]{1:54476:2} [check] AsmNetworkAgent::checkInternal ioctl Error
2025-02-21 08:45:20.571 : CLSDYNAM:3474949888: [ora.asmnet1.asmnetwork]{1:54476:2} [check] (null) category: -1, operation: failed system call, loc: ioctl, OS error: 19, other:
HAIP feature?
Up until now, we have observed from the logs that connectivity was lost on one interface. The floating HAIP successfully failed over to the other interface. At this point, network traffic should continue through the remaining interface, which is the intended function of HAIP.
The HAIP associated with the failed NIC will failover and be assigned to one of the remaining working physical NICs designated for the cluster interconnect. As a result, the HAIP will now reside on a different subnet, one that is operational across all nodes.
An ARP broadcast will be sent using the new MAC address for the NIC to which the HAIP has been attached. Since both ASM and RDBMS instances use the HAIP, they should transparently continue functioning through the NIC that the HAIP has failed over to.
An important point to note is that the HAIP will perform the same failover on every active cluster node. This ensures that the HAIP is attached to a NIC on the same subnet across all nodes, providing consistent communication within the cluster.
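You can observe this at the OS level: the 169.254.x.x HAIP addresses appear as secondary addresses (for example ens5f0:1) in the ip output, and after a failover the same HAIP shows up on the surviving interface. A minimal check, together with the state of the HAIP resource itself:
[root@blt01 ~]# ip addr show | grep 169.254
[root@blt01 ~]# crsctl stat res ora.cluster_interconnect.haip -init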
Although the related HAIP resource (ora.cluster_interconnect.haip) was in the ONLINE state in CRS, we still experienced a node eviction. I searched MOS for any reported bugs or additional information that could be helpful. I found Bug 36118795 – “DATABASE CRASH DURING ONE OF THE CLUSTER INTERCONNECT MAINTENANCE.” In that case, the net1 interface was down, and HAIP did not move the IP from the non-functioning net1 interface to the functioning net0 interface. However, in our case, HAIP (169.254.10.239) successfully failed over.
To better diagnose the issue, I tried to understand the behavior of HAIP and read more about the subject. The following documents were useful:
Highly Available IP (HAIP) FAQ for Release 11.2 (Doc ID 1664291.1)
How to Modify Private Network Information in Oracle Clusterware (Doc ID 283684.1)
Cluster Interconnect in Oracle RAC (Doc ID 787420.1)
After reading these documents, I queried the gv$cluster_interconnects view and observed that the HAIP addresses were not being used for interconnect communication.
Here is the output.
SYS@bltdb1> select * from gv$cluster_interconnects;
INST_ID NAME IP_ADDRESS CON_ID IS_PUBLIC SOURCE
------- -------- -------------- ------ --------- -------------------------------
1 ens5f0 192.168.13.1 0 NO cluster_interconnects parameter
1 ens5f1 192.168.14.1 0 NO cluster_interconnects parameter
2 ens5f0 192.168.13.2 0 NO cluster_interconnects parameter
2 ens5f1 192.168.14.2 0 NO cluster_interconnects parameter
CLUSTER_INTERCONNECTS parameter?
According to the output, the cluster interconnect IP addresses were sourced from the cluster_interconnects parameter. The parameter had been set as follows, reflecting an outdated best practice that the organization had followed for years, based on EXAchk recommendations. It was set on all RDBMS and ASM instances:
SYS@bltdb1> ALTER SYSTEM SET cluster_interconnects='192.168.13.1:192.168.14.1' sid='bltdb1' scope=spfile;
SYS@bltdb2> ALTER SYSTEM SET cluster_interconnects='192.168.13.2:192.168.14.2' sid='bltdb2' scope=spfile;
Here is the explanation of the parameter from the Oracle 19c Database Reference:
Use this parameter to override the default interconnect configured for database traffic, which is stored in the cluster registry. This parameter may also be useful with Data Warehouse systems that have reduced availability requirements and high interconnect bandwidth demands.
CLUSTER_INTERCONNECTS specifically overrides the following:
Network classifications stored by oifcfg in the OCR.
The default interconnect chosen by Oracle.
If you want to load-balance the interconnect, then Oracle recommends that you use link-bonding at the operating system level, even if you have two databases on the same server, so that multiple interconnects use the same address. Note that multiple private addresses provide load balancing, but do not provide failover unless bonded. If you specify multiple addresses in init.ora using CLUSTER_INTERCONNECTS, instead of bonding multiple addresses at the operating system level, then typically availability is reduced, because each network interface card failure will take down that instance.
Refer to your vendor documentation for information about bonding interfaces. Some vendor bonding architectures may require the use of this parameter.
This parameter takes precedence over the value stored in the OCR. You can define multiple IP addresses in CLUSTER_INTERCONNECTS, which causes all specified interfaces to be used; however, if one of those interfaces fails, the associated instance fails with it. The instance can only be restarted once all the interfaces are restored, or once it is started without the faulty interface. This was exactly our situation: we were overriding Oracle’s default interconnect choice and forcing the use of static private interconnect addresses rather than the HAIPs.
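Before changing anything, it is worth confirming where the value actually comes from on each instance. A minimal check from SQL*Plus:
SYS@bltdb1> show parameter cluster_interconnects
SYS@bltdb1> select sid, value from v$spparameter where name = 'cluster_interconnects';
The first command shows the value in effect; the second shows what is stored in the spfile for each SID.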
Additionally, the cluster interconnect communication configuration is recorded in the alert.log file during instance startup.
...
Cluster Communication is configured to use IPs from: CI
IP: 192.168.13.1 Subnet: 192.168.13.0
IP: 192.168.14.1 Subnet: 192.168.14.0
KSIPC Available Transports for cluster communication: UDP:TCP:RDSTCPBS
KSIPC: Client: KCL Transport: UDP
KSIPC: Client: DLM Transport: UDP
KSXP: ksxpsg_ipclwtrans: 2 UDP
cluster interconnect IPC version: [IPCLW over UDP(mode 3) ]
...
After gathering that information, the solution we applied was simply: STICKING TO THE DEFAULTS.
SYS@+ASM1> ALTER SYSTEM RESET cluster_interconnects sid='+ASM1' scope=SPFILE;
SYS@+ASM2> ALTER SYSTEM RESET cluster_interconnects sid='+ASM2' scope=SPFILE;
SYS@bltdb1> ALTER SYSTEM RESET cluster_interconnects sid='bltdb1' scope=SPFILE;
SYS@bltdb2> ALTER SYSTEM RESET cluster_interconnects sid='bltdb2' scope=SPFILE;
Now HAIP would be used for cluster interconnect communication. Since the parameter is not dynamic, we reset it in the spfile and then restarted CRS on each node, one node at a time.
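The rolling restart was roughly the following, run as root on one node at a time, waiting for the cluster to stabilize before moving on:
[root@blt01 ~]# crsctl stop crs
[root@blt01 ~]# crsctl start crs
[root@blt01 ~]# crsctl check cluster -all
After the instances were restarted, the output of gv$cluster_interconnects was as follows: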
SYS@bltdb1> select * from gv$cluster_interconnects;
INST_ID NAME IP_ADDRESS CON_ID IS_PUBLIC SOURCE
------- -------- -------------- ------ --------- -------------------------------
1 ens5f0:1 169.254.10.239 0 NO
1 ens5f1:1 169.254.31.210 0 NO
2 ens5f0:1 169.254.1.196 0 NO
2 ens5f1:1 169.254.27.156 0 NO
Additionally, the cluster interconnect communication configuration recorded in the alert.log file during instance startup has changed. Here it is:
...
Cluster Communication is configured to use IPs from: GPnP
IP: 169.254.10.239 Subnet: 169.254.0.0
IP: 169.254.31.210 Subnet: 169.254.16.0
KSIPC Available Transports for cluster communication: UDP:TCP
KSIPC: Client: KCL Transport: UDP
KSIPC: Client: DLM Transport: UDP
KSXP: ksxpsg_ipclwtrans: 2 UDP
cluster interconnect IPC version: [IPCLW over UDP(mode 3) ]
...
We can clearly see that the IP addresses used for cluster communication have changed to HAIP, and these IP addresses are now determined by the GPnP (Grid Plug and Play) profile ($GRID_HOME/gpnp/hostname/profiles/peer/profile.xml). Previously, they were defined by the CI (cluster_interconnects) parameter.
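If you want to see the network definitions that GPnP holds, you can dump the profile with gpnptool or simply grep the XML file; a minimal sketch (the exact attribute layout varies by version):
[grid@blt01 ~]$ gpnptool get 2>/dev/null | grep -i network
[grid@blt01 ~]$ grep -i network $GRID_HOME/gpnp/$(hostname -s)/profiles/peer/profile.xml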
What about Exadata environments?
Actually, our old habit of setting the CLUSTER_INTERCONNECTS parameter was based on EXAchk recommendations (Exachk – Database parameter CLUSTER_INTERCONNECTS is not set to the recommended value (Doc ID 2353240.1)).
exachk id=AA53138B0B7D3B86E040E50A1EC07003
The private network (either InfiniBand or RoCE) in an Oracle Exadata Database Machine provides superior performance and throughput characteristics that allow Oracle Clusterware, ASM, and databases to operate at optimal efficiency. The private interconnect addresses used by the ASM instances, visible in the gv$cluster_interconnects view, should match the IP addresses configured on the private network interfaces.
HAIP is not supported on Exadata and is not enabled by default. However, in some cases, particularly when Grid Infrastructure was installed manually outside of OEDA (prior to 12.1.2.4), HAIP could be inadvertently enabled. Setting the cluster_interconnects parameter explicitly ensures that the InfiniBand interfaces (ib0 and ib1) are used even if HAIP is accidentally enabled on Exadata (Bug 23627471 and Bug 24420871).
If HAIP is not enabled in your Exadata environment, there is no need to set the cluster_interconnects parameter, as the GPnP profile will already supply the private interconnect IP addresses. However, if HAIP is enabled in your Exadata environment, refer to HOWTO: Remove/Disable HAIP on Exadata (Doc ID 2524069.1) for guidance.
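A quick way to tell whether HAIP is enabled is to query the HAIP resource directly; if it is absent, the command simply reports that the resource does not exist:
[root@exadb01 ~]# crsctl stat res ora.cluster_interconnect.haip -init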
On Oracle Exadata X4-2 and later hardware, starting with Oracle Exadata System Software release 11.2.3.3.0, InfiniBand bonding was changed from BONDIB0 to IB0 and IB1, with no bonding for the interfaces. In an Exadata environment with no bonded private interconnect interfaces, the cluster communication configuration is recorded in the alert.log as follows:
...
Cluster Communication is configured to use IPs from: GPnP
IP: 192.168.11.1 Subnet: 192.168.8.0
IP: 192.168.11.2 Subnet: 192.168.8.0
..
KSIPC Available Transports for cluster communication: RC:RDS:UDP:TCP:XRC:TCP:UD_RDS
KSIPC: Client: KCL Transport: XRC
KSIPC: Client: DLM Transport: RDS
KSXP: ksxpsg_ipclwtrans: 2 UDP
cluster interconnect IPC version: [IPCLW over RDS(mode 2) ]
...
On traditional (non-engineered) machines, the UDP protocol is used for cluster communication, whereas Exadata uses RDS. IPCLW is the new lightweight IPC implementation introduced with the 12.2 database.
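You can verify which transport a given Oracle home is linked with using the skgxpinfo utility shipped in the home, which should report udp on traditional hardware and rds on Exadata:
[oracle@exadb01 ~]$ $ORACLE_HOME/bin/skgxpinfo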
The question is: given that neither bonding nor HAIP is used in Exadata environments, how does cluster communication continue without downtime when an InfiniBand switch is powered off and the related interface goes down at the OS level? If HAIP were in use, it would fail over to the other interface. The way Exadata handles this situation is explained by the RDMA modules. RDMAIP is a feature for RDMA connections in Oracle Linux. When this feature, also known as active-active bonding, is enabled, the resilient RDMAIP module creates an active bonding group among the ports of an adapter. If a network port is lost, the IPs on that port are moved to another port automatically, providing high availability for the application while utilizing the full available bandwidth in non-failure scenarios.
In simple terms, even though there is no bonding configuration under /etc/sysconfig/network-scripts/, RDMAIP itself manages this process. During maintenance, such as when one interface goes down (for example, during an InfiniBand switch upgrade), the IP address of the downed port fails over to the other port, similar to HAIP. Below is a sample output from /var/log/messages when an InfiniBand switch is powered off.
Oct 14 11:52:39 exabsmdb01 systemd-networkd: ib0 : lost carrier
Oct 14 11:52:39 exabsmdb01 kernel: rdmaip: NET-EVENT: NETDEV_DOWN, PORT m1x4_0/port_1/ib0 : port state transition to DOWN (portlayers 0x8)
Oct 14 11:52:39 exabsmdb01 kernel: rdmaip: IPv4 192.168.11.1 migrated from ib0 (port 1) to ib1:P01 (port 2)
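To confirm that the rdmaip module is present and loaded on a node, a minimal check (module and parameter names can vary between UEK releases):
[root@exabsmdb01 ~]# lsmod | grep rdmaip
[root@exabsmdb01 ~]# modinfo rdmaip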
After resetting the CLUSTER_INTERCONNECTS parameter on the ASM and RDBMS instances of Exadata, I ran the EXAchk check AA53138B0B7D3B86E040E50A1EC07003.
[root@exadb01 ~]# exachk -check AA53138B0B7D3B86E040E50A1EC07003
Status: ASM parameter CLUSTER_INTERCONNECTS is set to recommended value.
Cluster Interconnect Value from instance:
192.168.11.1 192.168.11.2
Network card name and IP address from oifcfg
ib0 = 192.168.11.1
ib1 = 192.168.11.2
Final words: Closeness to defaults wins every time.
Hope it helps.

