Oracle RAC User Waits Test

When Oracle executes an SQL statement, it is not constantly executing. Sometimes it has to wait for a specific event to happen before it can proceed. For example, if Oracle (or the SQL statement) wants to modify data, and the corresponding database block is not currently in the SGA, Oracle waits for this block to be available for modification. Every such wait event belongs to a class of wait events. The following list describes each of the wait classes.

Wait Class Description

Administrative

Waits resulting from DBA commands that cause users to wait (for example, an index rebuild)

Application

Waits resulting from user application code (for example, lock waits caused by row level locking or explicit lock commands)

Cluster

Waits related to Real Application Clusters resources (for example, global cache resources such as ‘gc cr block busy’)

Commit

This wait class only comprises one wait event - wait for redo log write confirmation after a commit (that is, ‘log file sync’)

Concurrency

Waits for internal database resources (for example, latches)

Configuration

Waits caused by inadequate configuration of database or instance resources (for example, undersized log files or an undersized shared pool)

Idle

Waits that signify the session is inactive, waiting for work (for example, ‘SQL*Net message from client’)

Network

Waits related to network messaging (for example, ‘SQL*Net more data to dblink’)

Other

Waits which should not typically occur on a system (for example, ‘wait for EMON to spawn’)

Scheduler

Resource Manager related waits (for example, ‘resmgr: become active’)

System I/O

Waits for background process IO (for example, DBWR wait for ‘db file parallel write’)

User I/O

Waits for user IO (for example, ‘db file sequential read’)
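
To see how the wait events on a given instance map to the wait classes listed above, a query along the following lines can be run by a user who can read the dynamic performance views. This is only an illustrative sketch: it assumes the v$event_name view and its wait_class column (present in Oracle 10g and later), and the events and counts it returns vary from one Oracle version to another.

```sql
-- Count the wait events that Oracle assigns to each wait class.
SELECT wait_class,
       COUNT(*) AS event_count
FROM   v$event_name
GROUP  BY wait_class
ORDER  BY wait_class;

-- List the individual events of one class, for example 'Cluster'.
SELECT name
FROM   v$event_name
WHERE  wait_class = 'Cluster'
ORDER  BY name;
```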

Since wait events are resource-drains and serious performance degraders, administrators need to keep a close eye on these wait classes, figure out how much time the Oracle cluster actually spends waiting for each class, and rapidly decipher why, so that measures can be initiated to minimize these events. To achieve this, you can use the Oracle RAC User Waits test. This test reports the time spent by the nodes in the cluster waiting for events of each wait class, helps identify those wait classes with wait events that have remained active for a long time, and also reveals the number of sessions that have been impacted by the waiting. With the help of the detailed diagnostics of this test, you can also zoom into these sessions and identify the queries that they executed that may have caused wait events to occur; this way, inefficient queries can be isolated.

Target of the test : Oracle Cluster

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each wait class active on the cluster nodes of the Oracle cluster being monitored

Configurable parameters for the test
  1. TEST PERIOD - How often should the test be executed
  2. Host – The host for which the test is to be configured
  3. Port - The port on which the server is listening
  4. orasid - The variable name of the Oracle instance
  5. service name - A ServiceName exists for the entire Oracle RAC system. When clients connect to an Oracle cluster using the ServiceName, then the cluster routes the request to any available database instance in the cluster. By default, the service name is set to none. In this case, the test connects to the cluster using the orasid and pulls out the metrics from that database instance which corresponds to that orasid. If a valid service name is specified instead, then, the test will connect to the cluster using that service name, and will be able to pull out metrics from any available database instance in the cluster.

    To know the ServiceName of a cluster, execute the following query on any node in the target cluster:

    select name, value from v$parameter where name = 'service_names';

  6. User – In order to monitor an Oracle RAC, a special database user account has to be created in every Oracle database instance that requires monitoring. A Click here hyperlink is available in the test configuration page, using which a new Oracle database user can be created. Alternatively, you can manually create the special database user. When doing so, ensure that this user is vested with the select_catalog_role and create session privileges.

    The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is:

    create user oraeg identified by oraeg;

    create role oratest;

    grant create session to oratest;

    grant select_catalog_role to oratest;

    grant oratest to oraeg;

    The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is:

    alter session set container=<Oracle_service_name>;

    create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>;

    Grant create session to <user_name>;                                 

    Grant select_catalog_role to <user_name>;

    The name of this user has to be specified here.

  7. Password – Password of the specified database user
  8. Confirm password – Confirm the password by retyping it here.
  9. ISPASSIVE – If the value chosen is yes, then the Oracle server under consideration is a passive server in an Oracle cluster. No alerts will be generated if the server is not running. Measures will be reported as "Not applicable" by the agent if the server is not up.
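
If the monitoring user described under the User parameter was created manually, its privileges can be checked before the test is configured. The sketch below assumes the pre-12c sample names used above (user oraeg, role oratest) and a session that is allowed to query the DBA_ROLE_PRIVS and DBA_SYS_PRIVS data dictionary views; substitute your own names, in upper case, if they differ.

```sql
-- Roles granted directly to the monitoring user (ORATEST is expected here).
SELECT grantee, granted_role
FROM   dba_role_privs
WHERE  grantee = 'ORAEG';

-- System privileges and roles held by the ORATEST role
-- (CREATE SESSION and SELECT_CATALOG_ROLE are expected here).
SELECT grantee, privilege AS granted
FROM   dba_sys_privs
WHERE  grantee = 'ORATEST'
UNION ALL
SELECT grantee, granted_role
FROM   dba_role_privs
WHERE  grantee = 'ORATEST';
```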
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Active sessions:

Indicates the number of sessions in which events of this wait class are currently active.

Number

A high value indicates that too many sessions are waiting owing to the events of a particular wait class. To know more about these sessions, the wait events that each session triggered, and which query triggered the events, use the detailed diagnosis of this measure. With the help of the detailed metrics, you can quickly isolate the queries that require optimization.  
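
For a manual cross-check of the sessions behind this measure, the waiting sessions on every cluster node can be listed directly from the database. The query below is a hedged sketch rather than the query the test itself runs: it assumes the gv$session view with its wait_class, event, sql_id and seconds_in_wait columns, and it excludes the Idle class by choice.

```sql
-- Sessions on all RAC instances that are waiting right now,
-- with the wait class, the event and the SQL they are running.
SELECT inst_id,
       sid,
       username,
       wait_class,
       event,
       sql_id,
       seconds_in_wait
FROM   gv$session
WHERE  state = 'WAITING'
AND    wait_class <> 'Idle'
ORDER  BY wait_class, seconds_in_wait DESC;
```

The sql_id values returned can then be looked up in gv$sql to retrieve the text of the statements involved.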

Max wait time:

Indicates the maximum time for which the Oracle server has waited for events of this wait class.

Secs

A high value is indicative of the following:

  • An increase in load (either more users, more calls, or larger transactions)
  • I/O performance degradation (I/O time increases and wait time increases, so DB time increases)
  • Application performance degradation
  • CPU-bound host (foregrounds accumulate active run-queue time, wait event times are artificially inflated)

Compare the value of this measure across wait classes to identify which wait class has caused the Oracle database server to wait for the maximum time. You can then use the detailed diagnostics reported by the Active sessions measure to identify which sessions were impacted, and what queries were executed by those sessions to increase wait time. Inefficient queries can thus be identified and optimized to ensure that waiting is eliminated or at least minimized.
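
A similar comparison across wait classes can be made by hand with the query sketched below. It assumes the gv$system_wait_class view, whose time_waited column is expressed in centiseconds and accumulates from instance startup, so the figures it returns are cumulative totals per node and will not necessarily match the values this test reports.

```sql
-- Cumulative time waited per wait class on each RAC instance,
-- converted from centiseconds to seconds, largest first.
SELECT inst_id,
       wait_class,
       total_waits,
       ROUND(time_waited / 100, 2) AS time_waited_secs
FROM   gv$system_wait_class
WHERE  wait_class <> 'Idle'
ORDER  BY time_waited DESC;
```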