Oracle RAC Latches Test

Latches are mechanisms for protecting and managing SGA data structures and database objects that are accessed concurrently. Unlike locks, latches provide exclusive access to protected data structures. Requests for latches are not queued, so if a request fails, the requesting process has to try again later. Typically, latches are used to protect resources that are needed only briefly.

An Oracle process can request a latch in one of the following two modes:

Willing-to-Wait Mode: If the requested latch is not immediately available, the process will wait. When an attempt to get a latch in a willing-to-wait mode fails, the process will spin and try again. If the number of attempts reaches the value of the SPIN_COUNT parameter, the process sleeps. Sleeping is more expensive than spinning.
Immediate Mode (no-wait mode): If the requested latch is not available, the process does not wait and continues its processing.
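The two request modes can be illustrated with a small simulation. This is only a toy model under stated assumptions, not Oracle's internal implementation: the Latch class, the max_sleeps bound (added so the sketch always terminates) and the sleep hook are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Latch:
    """Toy latch: a single 'held' flag and no queue of waiters."""
    held: bool = False

    def try_get(self) -> bool:
        # Succeeds only if nobody holds the latch right now.
        if self.held:
            return False
        self.held = True
        return True

    def release(self) -> None:
        self.held = False


def get_immediate(latch: Latch) -> bool:
    """Immediate (no-wait) mode: one attempt, then carry on either way."""
    return latch.try_get()


def get_willing_to_wait(latch, spin_count, max_sleeps, sleep=lambda: None):
    """Willing-to-wait mode: spin up to spin_count times, then sleep and retry.

    Returns (acquired, sleeps_taken). max_sleeps is a hypothetical bound
    that keeps this sketch from looping forever.
    """
    sleeps = 0
    while True:
        for _ in range(spin_count):
            if latch.try_get():
                return True, sleeps
        if sleeps >= max_sleeps:
            return False, sleeps
        sleep()  # the expensive step: give up the CPU before retrying
        sleeps += 1


if __name__ == "__main__":
    latch = Latch(held=True)  # some other process holds the latch
    print(get_immediate(latch))  # False: no waiting, processing continues
    # Simulate the holder releasing the latch while the requester sleeps.
    print(get_willing_to_wait(latch, spin_count=3, max_sleeps=2, sleep=latch.release))
```

In the usage example, the willing-to-wait request spins three times, sleeps once, and then acquires the latch on its next attempt.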

Latch contention has a significant impact on performance when:

Not enough latches are available
A latch is held for a relatively long time

The latch mechanisms most likely to suffer from contention are those that handle requests to write data into the redo log buffer. To serve the intended purpose, writes to the redo log buffer must be serialized. There are four different groupings applicable to redo buffer latches: redo allocation latches and redo copy latches, each with immediate and willing-to-wait priorities.

The Oracle RAC Latches test is used to monitor latches in the shared cluster storage.

Target of the test : Oracle Cluster

Agent deploying the test : An internal agent

Outputs of the test : One set of results for every node in the Oracle cluster monitored

Configurable parameters for the test
TEST PERIOD - How often should the test be executed.
Host - The host for which the test is to be configured.
Port - The port on which the server is listening.
SCAN Name - SCAN stands for Single Client Access Name. It is a feature used in Oracle RAC environments that provides a single name for clients to access any Oracle database running in the cluster. You can provide SCAN as an alternative to IP/Host Name. If this parameter value is provided, it will be used for connectivity; otherwise, IP/Hostname will be used.
orasid - The variable name of the Oracle instance.
service name - A ServiceName exists for the entire Oracle RAC system. When clients connect to an Oracle cluster using the ServiceName, the cluster routes the request to any available database instance in the cluster. By default, the service name is set to none. In this case, the test connects to the cluster using the orasid and pulls out the metrics from the database instance that corresponds to that orasid. If a valid service name is specified instead, the test will connect to the cluster using that service name, and will be able to pull out metrics from any available database instance in the cluster. To know the ServiceName of a cluster, execute the following query on any node in the target cluster:
select name, value from v$parameter where name = 'service_names'
User - In order to monitor an Oracle database server, a special database user account has to be created in every Oracle database instance that requires monitoring. A Click here hyperlink is available in the test configuration page, using which a new Oracle database user can be created. Alternatively, you can manually create the special database user. When doing so, ensure that this user is vested with the select_catalog_role and create session privileges. The name of this user has to be specified here.
The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is:
create user oraeg identified by oraeg;
create role oratest;
grant create session to oratest;
grant select_catalog_role to oratest;
grant oratest to oraeg;
The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is:
alter session set container=<Oracle_service_name>;
create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>;
Grant create session to <user_name>;
Grant select_catalog_role to <user_name>;
Password - Password of the specified database user.
Confirm password - Confirm the password by retyping it here.
SSL - By default, this flag is set to No, as the target Oracle cluster is not SSL-enabled by default. If the target cluster is SSL-enabled, then set this flag to Yes.
SSL Cipher - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. A cipher suite is a set of cryptographic algorithms that are used before a client application and server exchange information over an SSL/TLS connection. It consists of sets of instructions on how to secure a network through SSL (Secure Sockets Layer) or TLS (Transport Layer Security). In this text box, provide a comma-separated list of cipher suites that are allowed for SSL/TLS connection to the target cluster. By default, this parameter is set to none.
TRUSTSTORE FILE - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. TrustStore is used to store certificates from Certificate Authorities (CA) that verify and authenticate the certificate presented by the server in an SSL connection. Therefore, the eG agent should have access to the truststore where the certificates are stored to authenticate and connect with the target cluster and collect metrics. For this, first import the certificates into the following default location: <eG_INSTALL_DIR>/lib/security/mytruststore.jks. To know how to import the certificate into the truststore, refer to Pre-requisites for monitoring Oracle Cluster. Then, provide the truststore file name in this text box. For example: mytruststore.jks. By default, none is specified against this text box.
TRUSTSTORE TYPE - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. Specify the type of truststore that contains the certificates for server authentication in this text box. For example, JKS. By default, this parameter is set to the value none.
TRUSTSTORE PASSWORD - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. If a Truststore File name is provided, then, in this text box, provide the password that is used to obtain the associated certificate details from the Truststore File. By default, this parameter is set to none.
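The precedence among the SCAN Name, Host, service name and orasid parameters can be sketched as a small function. This is an illustration only: the function name pick_connection_target is hypothetical, and the assumption that the connection is expressed as an Oracle JDBC thin URL is ours, since the text does not say how the eG agent actually builds its connection.

```python
def pick_connection_target(host, port, orasid, scan_name="none", service_name="none"):
    """Return a JDBC thin-style URL following the precedence described above.

    Assumption: 'none' (in any letter case) or an empty value means 'not set'.
    """
    def is_set(value):
        return value is not None and value.strip().lower() not in ("", "none")

    # SCAN name, when provided, is used instead of the IP/host name.
    address = scan_name.strip() if is_set(scan_name) else host

    if is_set(service_name):
        # Service name form: the cluster may route to any available instance.
        return f"jdbc:oracle:thin:@//{address}:{port}/{service_name.strip()}"
    # SID form: metrics come from the instance matching orasid.
    return f"jdbc:oracle:thin:@{address}:{port}:{orasid}"


if __name__ == "__main__":
    print(pick_connection_target("10.0.0.5", 1521, "orcl1"))
    print(pick_connection_target("10.0.0.5", 1521, "orcl1",
                                 scan_name="rac-scan", service_name="racsvc"))
```

The host, port, SID, SCAN and service values in the usage lines are made-up examples.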

Measurements made by the test
Measurement	Description	Measurement Unit	Interpretation
Willing-to-wait misses:	This measures the latch contention for requests that were willing to wait to acquire a latch. The value of this metric represents the ratio of the number of requests that could not acquire a latch, to those that could acquire a latch.	Percent	Both willing-to-wait misses and immediate misses should be 1% or less. For redo allocation latches, if the Willing-to-wait misses value is high, consider decreasing the LOG_SMALL_ENTRY_MAX_SIZE parameter in the INIT.ORA file. By making the maximum size for a redo allocation latch smaller, more redo log buffer writes qualify for a redo copy latch instead, thus making better use of multiple CPUs for the redo log buffer writes. Even though memory structure manipulation times are measured in nanoseconds, a larger write still takes longer than a smaller write. If the size of the remaining writes done via redo allocation latches is small enough, they can be completed with little or no redo allocation latch contention. On a single CPU node, all log buffer writes are done via redo allocation latches. If log buffer latches are a significant bottleneck, performance can benefit from additional CPUs (thus enabling redo copy latches) even if CPU utilization is not an operating system level bottleneck. If the value for redo copy latches exceeds 1%, consider increasing the LOG_SIMULTANEOUS_COPIES parameter in the INIT.ORA file. This initialization parameter is the number of redo copy latches available. It defaults to the number of CPUs (assuming a multiple CPU node). Oracle recommends setting it as large as 2 times the number of CPUs on the particular node, although quite a bit of experimentation may be required to tune the value for any particular instance's workload. Depending on CPU capability and utilization, it may be beneficial to set this initialization parameter smaller or larger than 2 times the number of CPUs. Note that the LOG_SIMULTANEOUS_COPIES parameter is obsolete from Oracle 8i onwards. Hence, if you are monitoring Oracle 8i (or higher), use the hidden parameter _LOG_SIMULTANEOUS_COPIES instead. Recall that the assignment of log buffer writes to either redo allocation latches or redo copy latches is controlled by the maximum log buffer write size allowed for a redo allocation latch, which is specified in the LOG_SMALL_ENTRY_MAX_SIZE initialization parameter. Recall also that redo copy latches apply only to multiple CPU hosts. Note that the LOG_SMALL_ENTRY_MAX_SIZE parameter is not supported from Oracle 9i onwards.
Immediate misses:	This metric measures the latch contention for requests that were not willing to wait to acquire a latch. The value of this metric represents the percentage of “not willing to wait” latch requests that failed. In other words: the number of “not willing to wait” request misses / the total number of “not willing to wait” requests	Percent
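As a rough illustration of the two measures, the sketch below computes both percentages from raw counters and checks them against the 1% guideline. It assumes counters shaped like the GETS, MISSES, IMMEDIATE_GETS and IMMEDIATE_MISSES columns of V$LATCH, and it assumes the willing-to-wait figure is misses divided by gets; the exact formula used by the test is not stated here, and the function names are hypothetical.

```python
def miss_percentages(gets, misses, immediate_gets, immediate_misses):
    """Return (willing_to_wait_pct, immediate_pct).

    Assumptions: 'gets' counts willing-to-wait requests, and
    'immediate_gets' counts only the successful no-wait requests.
    """
    willing_to_wait_pct = 100.0 * misses / gets if gets else 0.0
    immediate_total = immediate_gets + immediate_misses
    immediate_pct = 100.0 * immediate_misses / immediate_total if immediate_total else 0.0
    return willing_to_wait_pct, immediate_pct


def exceeds_guideline(pct, threshold=1.0):
    """True when a miss percentage is above the 1% guideline."""
    return pct > threshold


if __name__ == "__main__":
    # Made-up counter values for a single latch.
    wtw, imm = miss_percentages(gets=200000, misses=500,
                                immediate_gets=9800, immediate_misses=200)
    print(wtw, exceeds_guideline(wtw))  # 0.25 False
    print(imm, exceeds_guideline(imm))  # 2.0 True
```

In this made-up example, only the immediate misses figure would call for a closer look at the latch.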