Oracle RAC Interconnects Test

A cluster database comprises two or more nodes that are linked by an interconnect. The interconnect serves as the communication path between the nodes in the cluster database. Each Oracle instance uses the interconnect for the messaging that synchronizes each instance’s use of shared resources. Oracle also uses the interconnect to transmit data blocks that the multiple instances share.

The non-availability of the interconnect on any cluster node can impair that node’s communication with other nodes in the cluster. As a result, fail-over operations will be hampered and the cluster service will be forced to distribute session/request load across the remaining nodes in the cluster; this in turn may overload those nodes. Mission-critical business services using the clustered resources may then experience prolonged outages or slowdowns, resulting in considerable loss of revenue and reputation.

To avoid this, administrators need to continuously monitor the availability of the cluster interconnect on each node, analyze how session/process load is distributed across the nodes via the interconnect, and proactively detect the following:

  • The sudden unavailability of the interconnect on a node
  • How the unavailability of an interconnect affects the load on the other nodes in the cluster

For this purpose, you can use the Oracle RAC Interconnects test. This test periodically verifies whether the nodes in the cluster are able to communicate via the cluster interconnect, and promptly reports the non-availability of the interconnect. The test also tracks the session and process load on each node in the cluster, thus revealing the impact of an unavailable cluster interconnect on the load and performance of the other nodes in the cluster.

Note:

This test is applicable to Oracle clusters with multitenancy, i.e., a CDB (Container Database) and PDB (Pluggable Database) configuration.

Target of the test : Oracle RAC

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each clusternodeID_<IP_address_used_for_internode_communication> in the Oracle cluster.

Configurable parameters for the test
Parameter Description

Test period

How often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port on which the server is listening.

SCAN Name

SCAN stands for Single Client Access Name. It is a feature used in Oracle RAC environments that provides a single name for clients to access any Oracle Database running in the cluster. You can provide the SCAN as an alternative to the IP/Host Name. If a value is provided for this parameter, it will be used for connectivity; otherwise, the IP/Host Name will be used.

Service Name

A Service Name exists for the entire Oracle RAC system. When clients connect to an Oracle cluster using the Service Name, the cluster routes the request to any available database instance in the cluster. By default, the Service Name is set to none. In this case, the test connects to the cluster using the ORASID and pulls metrics from the database instance that corresponds to that ORASID. If a valid Service Name is specified instead, the test connects to the cluster using that Service Name, and can pull metrics from any available database instance in the cluster.

To know the Service Name of a cluster, execute the following query on any node in the target cluster:

select name, value from v$parameter where name = 'service_names';

ORASID

The SID of the Oracle instance, i.e., the value of the ORACLE_SID variable.
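The way the Host, Port, SCAN Name, Service Name, and ORASID parameters combine can be sketched as follows. This is only an illustration, not the eG agent's actual logic: the helper names build_jdbc_url and _provided are hypothetical, treating an empty value or none as "not provided" is an assumption, and the JDBC thin URL is used simply as one familiar way of writing an Oracle connection target.

```python
def _provided(value):
    """Assumption: an empty string or 'none' (any case) means 'not provided'."""
    return value is not None and value.strip().lower() not in ("", "none")


def build_jdbc_url(host, port, orasid, scan_name="none", service_name="none"):
    """Build an Oracle JDBC thin URL from the test parameters (hypothetical helper)."""
    # SCAN Name, when provided, is used instead of the IP/host name.
    address = scan_name.strip() if _provided(scan_name) else host
    if _provided(service_name):
        # Service-name form: any available instance in the cluster may respond.
        return f"jdbc:oracle:thin:@//{address}:{port}/{service_name.strip()}"
    # SID form: connects to the single instance identified by ORASID.
    return f"jdbc:oracle:thin:@{address}:{port}:{orasid}"


if __name__ == "__main__":
    # Defaults: no SCAN, Service Name set to none, so the ORASID is used.
    print(build_jdbc_url("192.168.10.21", 1521, "orcl1"))
    # SCAN and Service Name both supplied (sample values, not from the document).
    print(build_jdbc_url("192.168.10.21", 1521, "orcl1",
                         scan_name="rac-scan", service_name="orclsvc"))
```

With the defaults, the sketch targets one specific instance; once a Service Name is supplied, the instance that answers is chosen by the cluster rather than by the test.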

Username

In order to monitor an Oracle database server, a special database user account has to be created in every Oracle database instance that requires monitoring. A Click here hyperlink is available in the test configuration page, using which a new Oracle database user can be created. Alternatively, you can manually create the special database user. When doing so, ensure that this user is granted the select_catalog_role and create session privileges.

The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is:

create user oraeg identified by oraeg;

create role oratest;

grant create session to oratest;

grant select_catalog_role to oratest;

grant oratest to oraeg;

The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is:

alter session set container=<Oracle_service_name>;

create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>;

Grant create session to <user_name>;

Grant select_catalog_role to <user_name>;

The name of this user has to be specified here.

Password

Specify the password of the specified database user.

Confirm Password

Confirm the Password by retyping it here.

ArchiveFilePath

By default, the eG agent auto-discovers the location of the Oracle archive log file. This is why the ArchiveFilePath parameter is set to none by default. If required, you can manually specify the path to the Oracle archive log file to be monitored. For example, /user/john/archive

SSL

By default, this flag is set to No, as the target Oracle cluster is not SSL-enabled by default. If the target cluster is SSL-enabled, then set this flag to Yes.

SSL Cipher

This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. A cipher suite is a set of cryptographic algorithms that are used before a client application and server exchange information over an SSL/TLS connection. It consists of sets of instructions on how to secure a network through SSL (Secure Sockets Layer) or TLS (Transport Layer Security). In this text box, provide a comma-separated list of cipher suites that are allowed for the SSL/TLS connection to the target cluster. By default, this parameter is set to none.

Truststore File

This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. The truststore is used to store certificates from Certificate Authorities (CAs) that verify and authenticate the certificate presented by the server in an SSL connection. Therefore, the eG agent should have access to the truststore where the certificates are stored, so that it can authenticate and connect with the target cluster and collect metrics. For this, first import the certificates into the following default location: <eG_INSTALL_DIR>/lib/security/mytruststore.jks. To know how to import the certificate into the truststore, refer to Pre-requisites for monitoring Oracle Cluster. Then, provide the truststore file name in this text box. For example: mytruststore.jks. By default, none is specified against this text box.

Truststore Type

This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. Specify the type of truststore that contains the certificates for server authentication in this text box. For example, JKS. By default, this parameter is set to none.

Truststore Password

This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. If a Truststore File name is provided, then provide the password that is used to obtain the associated certificate details from the Truststore File in this text box. By default, this parameter is set to none.

Keystore File

This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none.

The keystore contains the private keys for the certificates that the client can provide to the server upon request. The eG agent requires access to the keystore where the client certificate is stored, so that it can send the certificate to the server and the server can validate it against the one contained in its truststore. For this purpose, first create the client certificate in the following default location: <eG_INSTALL_DIR>/jre/lib/security/mykeystore.jks. Then, provide the keystore file name in this text box. For example: mykeystore.jks. By default, none is specified against this text box.

Keystore Password

This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none.

If a Keystore File name or file path is provided, then, in this text box, provide the password that is used to obtain the associated certificate details from the Keystore File.

Confirm Password

Confirm the Password for Keystore by retyping it here.

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Cluster interconnect percentage

Indicates whether or not the cluster interconnect is available on this node.

Percent

The value 0 for this measure indicates that this node is unable to communicate with other nodes in the cluster via the cluster interconnect. The value 100 indicates that the interconnect is available and is enabling this node to communicate with the other cluster nodes.

Logon rate

Indicates the rate at which user logons occurred on this node.

Logons/Sec

 

Processes running

Indicates the number of processes currently running on this cluster node.

Number

As long as the value of this measure is much lower than the value of the processes setting in the database parameter file, the node will be able to handle the process load.

Process utilization

Indicates the percentage of the maximum number of processes this node can handle that is currently active on this cluster node.

Percent

Ideally, the value of this measure should be low. If this measure value is close to 100%, it could mean that the node is about to exhaust its processing limit and may not be able to handle any more processes. If the value of this measure is consistently high for a cluster node, check the processes setting in the database parameter file to determine whether or not the node has been configured with adequate processing capability. If this check reveals that the node has been configured with fewer processes than it can handle, you may want to increase the processes setting to suit the node’s capacity.

Sessions used

Indicates the number of sessions that are currently active on this node.

Number

As long as the value of this measure is much lower than the value of the sessions setting in the database parameter file, the node will be able to handle the session load. If the value of this measure is unusually high for any cluster node, compare the value of this measure across nodes to determine whether or not load is uniformly distributed across all cluster nodes. If session load on most of the cluster nodes is high, the sudden increase in session load could be attributed to an unavailable cluster interconnect. Because of the unavailability, the cluster service may have been unable to contact the affected cluster node and may have been compelled to distribute the load amongst the remaining cluster nodes. This may have caused load on the other nodes to suddenly increase. To confirm this, check the value of the Cluster interconnect percentage measure for all nodes.

On the other hand, if no interconnect is unavailable, and if Session utilization is abnormally high on a particular node only, it could mean that that node is indeed overloaded.
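The cross-node check described for the Sessions used measure can be sketched as a small decision routine. This is a simplified illustration under stated assumptions, not part of the test itself: the function name diagnose_session_load, the input layout, the verdict labels, and the high_sessions threshold are all hypothetical, the sketch compares raw session counts rather than Session utilization, and the "cluster_wide_load" verdict is an added catch-all that the text above does not define.

```python
def diagnose_session_load(nodes, high_sessions):
    """Classify session load across cluster nodes (hypothetical helper).

    nodes: dict mapping node name -> {"interconnect": 0 or 100, "sessions": int}
    high_sessions: assumed threshold above which a node's session count is "high"
    Returns a (verdict, node_names) tuple.
    """
    # Nodes whose interconnect measure reports 0 cannot reach their peers.
    unavailable = sorted(n for n, m in nodes.items() if m["interconnect"] == 0)
    # Nodes whose session count is at or above the assumed threshold.
    busy = sorted(n for n, m in nodes.items() if m["sessions"] >= high_sessions)

    if not busy:
        return ("normal", [])
    if unavailable:
        # High load elsewhere plus a down interconnect: load was likely redistributed.
        return ("interconnect_unavailable", unavailable)
    if len(busy) == 1:
        # Every interconnect is up and only one node is busy: that node is overloaded.
        return ("node_overloaded", busy)
    # Not covered by the text above: several busy nodes, all interconnects up.
    return ("cluster_wide_load", busy)


if __name__ == "__main__":
    sample = {
        "node1": {"interconnect": 100, "sessions": 450},
        "node2": {"interconnect": 0, "sessions": 10},
        "node3": {"interconnect": 100, "sessions": 470},
    }
    print(diagnose_session_load(sample, high_sessions=400))
```

In the sample data, node2 reports an interconnect value of 0 while the other two nodes carry high session counts, so the sketch points at node2's interconnect rather than at the busy nodes.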

Session utilization

Indicates the percentage of the maximum number of sessions this node can handle that is currently active on this cluster node.

Percent

Ideally, the value of this measure should be low. If this measure value is close to 100%, it could mean that the node may not be able to handle any more sessions. If the value of this measure is consistently high for a cluster node, check the sessions setting in the database parameter file to determine whether or not the node has been configured with adequate session-handling capability. If this check reveals that the node has been configured with fewer sessions than it can handle, you may want to increase the sessions setting to suit the node’s true capacity.
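The arithmetic behind the Process utilization and Session utilization measures can be illustrated with a short sketch. It assumes that utilization is the current count expressed as a percentage of the configured limit (the processes or sessions setting); the function names, the 90% warning threshold, and the sample figures are hypothetical and are not taken from the test.

```python
def utilization_percent(current, limit):
    """Return current usage as a percentage of the configured limit."""
    if limit <= 0:
        raise ValueError("limit must be a positive number")
    return 100.0 * current / limit


def is_near_limit(percent, threshold=90.0):
    """Flag a node whose utilization is at or above an assumed warning threshold."""
    return percent >= threshold


if __name__ == "__main__":
    # Sample values only: 270 running processes against a processes setting of 300.
    process_util = utilization_percent(270, 300)
    print(process_util, is_near_limit(process_util))   # 90.0 True

    # Sample values only: 118 sessions in use against a sessions setting of 472.
    session_util = utilization_percent(118, 472)
    print(session_util, is_near_limit(session_util))   # 25.0 False
```

A node that stays near the threshold is a candidate for reviewing the corresponding setting in the database parameter file, as described above.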