Oracle RAC SQL Network Test

Using the JDBC API, this test reports the availability and responsiveness of each node in the cluster, and collects statistics pertaining to the traffic into and out of every node.

Target of the test : Oracle Cluster

Agent deploying the test : An external agent; if you are running this test using the external agent on the eG manager box, then make sure that this external agent is able to communicate with the port on which the target Oracle server is listening. Alternatively, you can deploy the external agent that will be running this test on a host that can access the port on which the target Oracle server is listening.

Outputs of the test : One set of results for each node in the Oracle cluster being monitored

Configurable parameters for the test
  1. TEST PERIOD - How often should the test be executed
  2. Host – The host for which the test is to be configured
  3. Port - The port on which the server is listening
  4. orasid - The variable name of the Oracle instance
  5. service name - A ServiceName exists for the entire Oracle RAC system. When clients connect to an Oracle cluster using the ServiceName, the cluster routes the request to any available database instance in the cluster. By default, the service name is set to none. In this case, the test connects to the cluster using the orasid and pulls metrics from the database instance that corresponds to that orasid. If a valid service name is specified instead, the test connects to the cluster using that service name and can pull metrics from any available database instance in the cluster.

    To know the ServiceName of a cluster, execute the following query on any node in the target cluster:

    select name, value from v$parameter where name = 'service_names'

  6. User – In order to monitor an Oracle database server, a special database user account has to be created in every Oracle database instance that requires monitoring. A Click here hyperlink is available in the test configuration page, using which a new Oracle database user can be created. Alternatively, you can manually create the special database user. When doing so, ensure that this user is vested with the select_catalog_role and create session privileges.

    The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is:

    create user oraeg identified by oraeg;

    create role oratest;

    grant create session to oratest;

    grant select_catalog_role to oratest;

    grant oratest to oraeg;

    The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is:

    alter session set container=<Oracle_service_name>;

    create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>;

    Grant create session to <user_name>;

    Grant select_catalog_role to <user_name>;

    The name of this user has to be specified here.

  7. Password – Password of the specified database user
  8. Confirm password – Confirm the password by retyping it here.
  9. individual node – By default, this flag is set to Yes, indicating that this test will report metrics for every node in the cluster. You can set this flag to No to ensure that the test reports the availability and responsiveness of the cluster service as a whole, and not the individual cluster nodes.
  10. Detailed Diagnosis - To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

    The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

    • The eG manager license should allow the detailed diagnosis capability
    • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
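
The choice between orasid and service name described for the service name parameter can be pictured with a small sketch. It assumes the connection is made with the Oracle thin JDBC driver and its two common URL forms (SID-based and service-name-based); this page does not state which URL form the eG agent builds, and the function name and host names below are hypothetical.

```python
def build_jdbc_url(host: str, port: int, orasid: str, service_name: str = "none") -> str:
    """Return an Oracle thin-driver JDBC URL for the given test parameters.

    Assumption: 'none' (the documented default) or an empty value means
    "connect to a single instance by its SID".
    """
    name = service_name.strip()
    if not name or name.lower() == "none":
        # SID form: reaches the one instance identified by orasid.
        return f"jdbc:oracle:thin:@{host}:{port}:{orasid}"
    # Service-name form: the cluster may route to any available instance.
    return f"jdbc:oracle:thin:@//{host}:{port}/{name}"


# Hypothetical hosts and names, for illustration only.
print(build_jdbc_url("rac-node1.example.com", 1521, "orcl1"))
print(build_jdbc_url("rac-scan.example.com", 1521, "orcl1", "racsvc"))
```

With the default of none, the URL targets one instance, so the metrics come from that instance alone; with a service name, the instance that answers can differ from run to run.
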
Measurements made by the test
Each entry below lists the Measurement, its Description, the Measurement Unit, and the Interpretation, in that order.

Oracle cluster node availability:

Whether the cluster node is responding to requests.

Percent

The availability is 100% when a cluster node is responding to requests and 0% when it is not. Availability problems may be caused by a misconfigured or malfunctioning node, or by the use of an invalid user account. This measure also reports the node as unavailable if a connection to the node cannot be established or if a query to the node fails. In such cases, check the values of the Cluster node connection availability and Query processor availability measures to determine whether the node is unresponsive because of a connection failure or a query failure.
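
As a rough illustration of how the three availability measures relate, the sketch below derives them from the outcome of a connection attempt and a query attempt. The function name is hypothetical, and the rule that a failed connection also counts as a failed query is an assumption; this page does not say what the query measure reports when no connection exists.

```python
def availability_measures(connected: bool, query_ok: bool) -> dict:
    """Map probe outcomes to the 0/100 availability measures."""
    # Assumption: a query cannot succeed without a connection.
    query_ok = connected and query_ok
    return {
        "Cluster node connection availability": 100 if connected else 0,
        "Query processor availability": 100 if query_ok else 0,
        # The node counts as available only if both steps succeeded.
        "Oracle cluster node availability": 100 if (connected and query_ok) else 0,
    }


# Connection succeeded but the query failed: node reported unavailable.
print(availability_measures(connected=True, query_ok=False))
```
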

Total response time:

The time taken by this node to respond to a user query. This is the sum total of the connection time and query execution time.

Secs

A sudden increase in response time is indicative of a bottleneck at the node. This could be owing to a connection delay and/or long-running queries to the node. Whenever the value of this measure is high, compare the values of the Connection time to cluster node and Query execution time measures for the node to zero in on the root cause of its poor responsiveness: connectivity issues or inefficient queries.
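
The relationship between the three timing measures can be sketched as follows. The function names are hypothetical, and the simple "larger component wins" comparison is an illustrative heuristic, not the product's diagnostic logic.

```python
def total_response_time(connection_secs: float, query_secs: float) -> float:
    """Total response time is the sum of connection and query execution time."""
    return connection_secs + query_secs


def dominant_component(connection_secs: float, query_secs: float) -> str:
    """Name the component that contributes more to the total response time."""
    if query_secs > connection_secs:
        return "query execution"
    if connection_secs > query_secs:
        return "connection"
    return "neither"


# Example: a slow query dominates a fast connection.
print(total_response_time(0.25, 0.5), dominant_component(0.25, 0.5))
```
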

Data transmit rate:

The rate of data being transmitted by this node in response to client requests.

KB/Sec

The data transmission rate reflects the workload on the node.

Data receive rate:

The rate of data received by this node from clients over SQL*Net.

KB/Sec

This measure also characterizes the workload on a node. As the data rate to a node increases, consider tuning the Session Data Unit (SDU) and the Transport Data Unit (TDU) in the tnsnames.ora and listener.ora files to optimize packet transfers across the network.
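
A KB/Sec rate such as the two data rates above is typically derived from two samples of a cumulative byte counter. The sketch below assumes such counters are available (for instance, Oracle statistics like "bytes sent via SQL*Net to client"), that 1 KB is 1024 bytes, and that a counter that goes backwards has been reset; none of this is confirmed by this page, and the function name is hypothetical.

```python
def rate_kb_per_sec(prev_bytes: int, curr_bytes: int, interval_secs: float) -> float:
    """Convert two cumulative byte-counter samples into a KB/sec rate."""
    if interval_secs <= 0:
        raise ValueError("interval_secs must be positive")
    delta = curr_bytes - prev_bytes
    if delta < 0:
        # Assumption: the counter was reset (e.g. instance restart),
        # so only the bytes counted since the reset are known.
        delta = curr_bytes
    return delta / 1024 / interval_secs


# 307200 bytes transferred over a 300-second test period.
print(rate_kb_per_sec(0, 307200, 300))
```
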

Cluster node connection availability:

Indicates whether the database connection to this node is available or not.

Percent

If this measure reports the value 100, it indicates that the database connection is available. The value 0, on the other hand, indicates that the database connection is unavailable. A connection to the database may be unavailable if the database is down, if the database is listening on a port other than the one configured for it in the eG manager, or owing to a poor network link. If the Oracle cluster node availability measure reports the value 0, you can check the value of this measure to determine whether it is due to the unavailability of a connection to the node.

Connection time to cluster node:

Indicates the time taken to connect to the cluster node.

Secs

A high value could indicate a connection bottleneck. Whenever the value of the Total response time measure soars, you may want to check the value of this measure to determine whether connection latency is causing the poor responsiveness of the node.

Query processor availability:

Indicates whether the query to this node is executed successfully or not.

Percent

If this measure reports the value 100, it indicates that the query executed successfully. The value 0, on the other hand, indicates that the query failed. In the event that the Oracle cluster node availability measure reports the value 0, check the value of this measure to figure out whether a failed query is the reason why that measure reported the node as unavailable.

Query execution time:

Indicates the time taken for query execution.

Secs

A high value could indicate that one or more queries to the node are taking too long to execute. Inefficient or badly designed queries to the database often take too long to execute. If the value of this measure is higher than that of the Connection time to cluster node measure, you can rest assured that long-running queries are causing the node to respond slowly to requests.