Oracle RAC Waits Response Test

This test reports the key performance statistics pertaining to the following wait events in each Oracle instance:

  • log file parallel write: This event occurs when redo records are written from the log buffer to the redo log files.
  • db file parallel write: This event occurs in the DBWR. It indicates that the DBWR is performing a parallel write to files and blocks. When the last I/O has gone to disk, the wait ends.
  • log file sync: When a user session commits, the session's redo information needs to be flushed to the redo log file. The user session posts the LGWR to write the log buffer to the redo log file. When the LGWR has finished writing, it posts the user session.
  • db file sequential read: The session waits while a sequential read from the database is performed. This event is also used for rebuilding the control file, dumping datafile headers, and getting the database file headers.

Effective wait analysis helps determine on which wait event the instance spends most of its time, and which current connections are responsible for the above-mentioned wait events.
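
As a rough illustration of the analysis described above, the sketch below picks, for each instance, the wait event that accounts for the largest share of wait time. The rows are made-up sample data shaped as (instance ID, wait event, seconds waited), with one row per instance and event; the function name and the data are hypothetical, and this is not a description of how the eG agent itself works.

```python
from collections import defaultdict

# Hypothetical sample rows: (instance_id, wait_event, seconds_waited).
SAMPLE_ROWS = [
    (1, "log file sync", 12.5),
    (1, "db file sequential read", 40.0),
    (1, "log file parallel write", 7.5),
    (2, "log file sync", 30.0),
    (2, "db file parallel write", 10.0),
]


def dominant_wait_per_instance(rows):
    """Return {instance_id: (event, share)} for the costliest event per instance.

    share is that event's fraction of the instance's total wait time.
    Assumes each (instance, event) pair appears at most once in rows.
    """
    totals = defaultdict(float)   # total seconds waited per instance
    worst = {}                    # instance_id -> (event, seconds)
    for inst_id, event, seconds in rows:
        totals[inst_id] += seconds
        if inst_id not in worst or seconds > worst[inst_id][1]:
            worst[inst_id] = (event, seconds)

    result = {}
    for inst_id, (event, seconds) in worst.items():
        total = totals[inst_id]
        result[inst_id] = (event, seconds / total if total else 0.0)
    return result


if __name__ == "__main__":
    for inst_id, (event, share) in sorted(dominant_wait_per_instance(SAMPLE_ROWS).items()):
        print(f"instance {inst_id}: {event} ({share:.0%} of wait time)")
```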

Target of the test : Oracle RAC

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for every wait event type captured on every instance in the monitored Oracle RAC.

Configurable parameters for the test
  1. TEST PERIOD - How often should the test be executed.
  2. Host – The host for which the test is to be configured.
  3. Port - The port on which the server is listening.
  4. orasid - The variable name of the Oracle instance.
  5. service name - A ServiceName exists for the entire Oracle RAC system. When clients connect to an Oracle cluster using the ServiceName, the cluster routes the request to any available database instance in the cluster. By default, the service name is set to none. In this case, the test connects to the cluster using the orasid and pulls metrics from the database instance that corresponds to that orasid. If a valid service name is specified instead, the test connects to the cluster using that service name and can pull metrics from any available database instance in the cluster.

    To know the ServiceName of a cluster, execute the following query on any node in the target cluster:

    select name, value from v$parameter where name = 'service_names';

  6. User – To monitor an Oracle database server, a special database user account has to be created in every Oracle database instance that requires monitoring. A Click here hyperlink is available in the test configuration page, using which a new Oracle database user can be created. Alternatively, you can create the special database user manually. When doing so, ensure that this user is granted the select_catalog_role and create session privileges.

    The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is:

    create user oraeg identified by oraeg ;

    create role oratest;

    grant create session to oratest;

    grant select_catalog_role to oratest;

    grant oratest to oraeg;

    The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is:

    alter session set container=<Oracle_service_name>;

    create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>;

    Grant create session to <user_name>;

    Grant select_catalog_role to <user_name>;

    The name of this user has to be specified here.

  7. Password – Password of the specified database user
  8. Confirm password – Confirm the password by retyping it here.
  9. ISPASSIVE – If the value chosen is yes, then the Oracle server under consideration is a passive server in an Oracle cluster. No alerts will be generated if the server is not running. Measures will be reported as "Not applicable" by the agent if the server is not up.
  10. DETAILED DIAGNOSIS - To make diagnosis more efficient and accurate, the eG Enterprise suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

    The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

    • The eG manager license should allow the detailed diagnosis capability
    • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
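
The difference between connecting by orasid and connecting by service name (parameters 4 and 5 above) can be sketched with an Oracle Net connect descriptor, which identifies the target either with SID or with SERVICE_NAME in its CONNECT_DATA section. The helper below is hypothetical, as are the host and instance names in the usage lines; how the test actually builds its connection is not stated here, so treat this as an assumption for illustration only.

```python
def build_connect_descriptor(host, port, orasid, service_name="none"):
    """Build an Oracle Net connect descriptor string.

    If service_name is empty or 'none' (the default described above), the
    descriptor targets one specific instance by SID. Otherwise it targets
    the cluster-wide service, leaving the choice of instance to the cluster.
    """
    cleaned = (service_name or "").strip()
    if cleaned and cleaned.lower() != "none":
        connect_data = f"(SERVICE_NAME={cleaned})"
    else:
        connect_data = f"(SID={orasid})"
    return (
        "(DESCRIPTION="
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"
        f"(CONNECT_DATA={connect_data}))"
    )


if __name__ == "__main__":
    # Hypothetical host, SID, and service name.
    print(build_connect_descriptor("rac-node1", 1521, "ORCL1"))
    print(build_connect_descriptor("rac-scan", 1521, "ORCL1", "racsvc"))
```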

Measurements made by the test

  Total waits
    Description: Indicates the total number of times this wait event has occurred since the last measurement period.
    Measurement Unit: Number
    Interpretation: If the value of this measure is very high, then you can drill down further using the detailed diagnosis capability (if enabled) of the eG Enterprise suite to figure out which current connections may be responsible for this. The detailed diagnosis of this measure reveals the session IDs of the sessions that caused the wait events to occur, the users who initiated the sessions, and the total number of waits, wait time, and the maximum wait time for every session.

  Time waited
    Description: Indicates the total time that sessions spent waiting on events of this type on this instance.
    Measurement Unit: Seconds
    Interpretation: Ideally, the value of this measure should be low.

  Average wait time
    Description: Indicates the average duration of this wait event since the last measurement period.
    Measurement Unit: Seconds
    Interpretation: Ideally, the value of this measure should be low. A very high value or a consistent increase in this value is indicative of a problem condition which requires further investigation. Use the detailed diagnosis capability to zoom into the session that has contributed to the abnormal increase in wait time.
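
Oracle's wait statistics in views such as V$SYSTEM_EVENT are cumulative since instance startup, so values "since the last measurement period" have to be derived by differencing two successive readings. The sketch below shows one way this could be done; the WaitSnapshot class and interval_measures function are hypothetical names, the microsecond unit is an assumption modelled on the TIME_WAITED_MICRO column, and none of this is a statement of how the eG agent actually computes its measures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaitSnapshot:
    """Cumulative counters for one wait event on one instance (hypothetical shape)."""
    total_waits: int
    time_waited_micro: int  # cumulative wait time, assumed to be in microseconds


def interval_measures(previous, current):
    """Return (total_waits, time_waited_secs, avg_wait_secs) for one interval."""
    waits = current.total_waits - previous.total_waits
    micros = current.time_waited_micro - previous.time_waited_micro
    if waits < 0 or micros < 0:
        # Counters went backwards, most likely because the instance restarted;
        # fall back to the current reading as the value for the interval.
        waits, micros = current.total_waits, current.time_waited_micro
    seconds = micros / 1_000_000
    average = seconds / waits if waits else 0.0
    return waits, seconds, average


if __name__ == "__main__":
    before = WaitSnapshot(total_waits=1000, time_waited_micro=5_000_000)
    after = WaitSnapshot(total_waits=1500, time_waited_micro=7_500_000)
    print(interval_measures(before, after))
```

Guarding against a zero wait count avoids a division error in quiet intervals, and the restart check keeps a counter reset from being reported as a negative number of waits.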