Oracle RAC SGA Test

The System Global Area (SGA) is the most important memory structure in Oracle. The SGA holds several memory components that are designed to obtain data for user queries as quickly as possible while also maximizing the number of concurrent users that can access the Oracle instance. The main components of the SGA are:

  • The buffer cache: This area of memory allows for selective performance gains on obtaining and changing data. The buffer cache stores data blocks that contain row data that has been selected or updated recently. When the user wants to select data from a table, Oracle looks in the buffer cache to see if the data block that contains the row has already been loaded. If it has, then the buffer cache has achieved its selective performance improvement by not having to look for the data block on disk. If not, then Oracle must locate the data block that contains the row, load it into memory, and present the selected output to the user.
  • The shared pool: The two main components of the shared pool are the shared SQL library cache and the data dictionary cache. The shared SQL library cache is designed to store parse information for SQL statements executing against the database. Parse information includes the set of database operations that the SQL execution mechanism will perform in order to obtain data requested by the user processes. This information is treated as a shared resource in the library cache. If another user process comes along wanting to run the same query that Oracle has already parsed for another user, the database will recognize the opportunity for reuse and let the user process utilize the parse information already available in the shared pool. The other component of the shared pool is the data dictionary cache, also referred to by many DBAs as the “row” cache. This memory structure is designed to store the data from the Oracle data dictionary in order to improve response time on data dictionary queries. Since all user processes and the Oracle database internal processes use the data dictionary, the database as a whole benefits in terms of performance from the presence of cached dictionary data in memory.
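
The buffer cache lookup described above can be pictured with a toy model. This is only a sketch: the class name, the capacity, and the least-recently-used eviction rule are assumptions made for illustration, and Oracle's real buffer replacement logic is more elaborate.

```python
from collections import OrderedDict

class ToyBufferCache:
    """Toy block cache; LRU eviction is an illustrative assumption."""

    def __init__(self, capacity, read_block_from_disk):
        self.capacity = capacity
        self.read_block_from_disk = read_block_from_disk  # hypothetical disk reader
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Block already in memory: no disk access needed.
            self.hits += 1
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Block not cached: load it from disk, then keep it for later requests.
        self.misses += 1
        data = self.read_block_from_disk(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data

cache = ToyBufferCache(capacity=2, read_block_from_disk=lambda b: f"block-{b}")
for block_id in (1, 1, 2, 3, 1):
    cache.get(block_id)
print(cache.hits, cache.misses)  # prints: 1 4
```

Only the second request for block 1 is a hit; by the time block 1 is requested again it has been evicted to make room for block 3.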

An Oracle database server brings data into the SGA before performing any operation on it. It is therefore critical to monitor the various structures inside the SGA to ensure optimal database performance. The Oracle RAC SGA test collects a variety of statistics relating to the various SGA components on each node in an Oracle cluster.

Target of the test : Oracle Cluster

Agent deploying the test : An internal agent

Outputs of the test : One set of results for every node in the Oracle cluster

Configurable parameters for the test
  1. TEST PERIOD - How often should the test be executed
  2. Host – The host for which the test is to be configured
  3. Port - The port on which the server is listening
  4. orasid - The variable name of the oracle instance
  5. service name - A ServiceName exists for the entire Oracle RAC system. When clients connect to an Oracle cluster using the ServiceName, then the cluster routes the request to any available database instance in the cluster. By default, the service name is set to none. In this case, the test connects to the cluster using the orasid and pulls out the metrics from that database instance which corresponds to that orasid. If a valid service name is specified instead, then, the test will connect to the cluster using that service name, and will be able to pull out metrics from any available database instance in the cluster.

    To know the ServiceName of a cluster, execute the following query on any node in the target cluster:

    select name, value from v$parameter where name = 'service_names';

  6. User – In order to monitor an Oracle RAC, a special database user account has to be created in every Oracle database instance that requires monitoring. A Click here hyperlink is available in the test configuration page, using which a new Oracle database user can be created. Alternatively, you can manually create the special database user. When doing so, ensure that this user is vested with the select_catalog_role and create session privileges.

    The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is:

    create user oraeg identified by oraeg;

    create role oratest;

    grant create session to oratest;

    grant select_catalog_role to oratest;

    grant oratest to oraeg;

    The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is:

    alter session set container=<Oracle_service_name>;

    create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>;

    grant create session to <user_name>;

    grant select_catalog_role to <user_name>;

    The name of this user has to be specified here.

  7. Password – Password of the specified database user
  8. Confirm password – Confirm the password by retyping it here.
  9. SSL - By default, this flag is set to No, as the target Oracle cluster is not SSL-enabled by default. If the target cluster is SSL-enabled, then set this flag to Yes.
  10. SSL Cipher - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. A cipher suite is a set of cryptographic algorithms that are used before a client application and server exchange information over an SSL/TLS connection. It consists of sets of instructions on how to secure a network through SSL (Secure Sockets Layer) or TLS (Transport Layer Security). In this text box, provide a comma-separated list of cipher suites that are allowed for SSL/TLS connection to the target cluster. By default, this parameter is set to none.
  11. TRUSTSTORE FILE - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. TrustStore is used to store certificates from Certificate Authorities (CA) that verify and authenticate the certificate presented by the server in an SSL connection. Therefore, the eG agent should have access to the truststore where the certificates are stored to authenticate and connect with the target cluster and collect metrics. For this, first import the certificates into the following default location: <eG_INSTALL_DIR>/lib/security/mytruststore.jks. To know how to import the certificate into the truststore, refer to Pre-requisites for monitoring Oracle Cluster. Then, provide the truststore file name in this text box. For example: mytruststore.jks. By default, none is specified against this text box.
  12. TRUSTSTORE TYPE - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. Specify the type of truststore that contains the certificates for server authentication in this text box. For example, JKS. By default, this parameter is set to the value none.
  13. TRUSTSTORE PASSWORD - This parameter is applicable only if the target Oracle Cluster is SSL-enabled; if not, set this parameter to none. If a Truststore File name is provided, then, in this text box, provide the password that is used to obtain the associated certificate details from the Truststore File. By default, this parameter is set to none.
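
The choice between the orasid and service name parameters above corresponds to two different connection targets. The sketch below shows that choice using the two standard Oracle JDBC thin URL forms. The function name, host, SID, and service name are hypothetical, and this is not a description of how the eG agent actually builds its connection.

```python
def oracle_thin_url(host: str, port: int, orasid: str, service_name: str = "none") -> str:
    """Build a JDBC thin-style URL, preferring the service name when one is set."""
    if service_name and service_name.lower() != "none":
        # Service-name form: the cluster may route the request to any available instance.
        return f"jdbc:oracle:thin:@//{host}:{port}/{service_name}"
    # SID form: connects to the single instance identified by orasid.
    return f"jdbc:oracle:thin:@{host}:{port}:{orasid}"

# Hypothetical values, for illustration only.
print(oracle_thin_url("rac-node1.example.com", 1521, "orcl1"))
print(oracle_thin_url("rac-node1.example.com", 1521, "orcl1", "racsvc"))
```

With the service name left at none, the URL targets the one instance named by orasid; with a service name, the instance that answers is chosen by the cluster.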

Measurements made by the test

Each measurement is listed below with its description, its measurement unit, and its interpretation.

Library cache hit ratio:

The library cache is a buffer that contains the shared SQL and PL/SQL areas. The library cache hit ratio indicates the percentage of requests for shared SQL statements that this cluster node was able to satisfy from the library cache, without having to reparse the statements.

Percent

For a well-tuned database, this ratio is 90% or more. A lower hit ratio may indicate that the memory allocation to the library cache is insufficient. A low value can significantly degrade the database performance. Increasing the value of the SHARED_POOL_SIZE initialization parameter will help in improving the hit ratio.
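
As a rough illustration, a library cache hit ratio is conventionally derived from the PINS and PINHITS columns of the V$LIBRARYCACHE view. The sketch below assumes that convention; the function name and sample counts are made up, and the exact query the test runs is not documented here.

```python
def library_cache_hit_ratio(pins: int, pinhits: int) -> float:
    """Percentage of library cache pin requests that found the object in memory."""
    if pins == 0:
        return 100.0  # no requests yet, so nothing has missed
    return 100.0 * pinhits / pins

# Illustrative counts only.
ratio = library_cache_hit_ratio(pins=20_000, pinhits=18_600)
print(f"{ratio:.1f}%")  # prints: 93.0%
print("below the 90% guideline" if ratio < 90.0 else "meets the 90% guideline")
```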

Data buffer cache hit ratio:

Indicates the percentage of time that this cluster node is able to satisfy a request with information that is already available in the memory.

Percent

Physical I/O takes a significant amount of time, and also increases the CPU resources required. The database configuration should be tuned to ensure that a required block will most likely be in memory. The extent to which this is achieved is measured using the buffer cache hit ratio. For a well-tuned database, this ratio should be 80% or higher. A lower value indicates insufficient memory allocation to the database buffer cache. Increasing the value of the DB_BLOCK_BUFFERS initialization parameter will help in improving the hit ratio. Note that DB_BLOCK_BUFFERS is deprecated in Oracle 9i and above. If you are monitoring Oracle 9i or higher, it is therefore recommended that you use the equivalent DB_CACHE_SIZE parameter instead.
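
The classic way to compute a buffer cache hit ratio uses three V$SYSSTAT statistics: physical reads, db block gets, and consistent gets. The sketch below assumes that formula; the function name and sample values are illustrative, and the test may compute the measure differently.

```python
def buffer_cache_hit_ratio(physical_reads: int, db_block_gets: int, consistent_gets: int) -> float:
    """Percentage of logical reads that did not require a physical read."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return 100.0  # no reads yet
    return 100.0 * (1 - physical_reads / logical_reads)

# Illustrative counter values only.
ratio = buffer_cache_hit_ratio(physical_reads=15_000, db_block_gets=20_000, consistent_gets=80_000)
print(f"{ratio:.1f}%")  # prints: 85.0%
print("OK" if ratio >= 80.0 else "consider a larger buffer cache")
```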

Dictionary cache hit ratio:

Indicates the percentage of data dictionary information pertaining to the database, file space availability and object privileges being readily available in this cluster node’s memory.

Percent

As in the case of the library cache, the dictionary cache hit ratio should be at least 90%. A lower value may be due to insufficient memory allocation to the dictionary cache. Increasing the value of the SHARED_POOL_SIZE parameter will help in improving the hit ratio.
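
A dictionary cache hit ratio is commonly derived by summing the GETS and GETMISSES columns across all rows of the V$ROWCACHE view. The sketch below assumes that approach; the function name and the per-row counts are illustrative.

```python
def dictionary_cache_hit_ratio(rows) -> float:
    """rows: iterable of (gets, getmisses) pairs, one per dictionary cache entry type."""
    rows = list(rows)
    total_gets = sum(gets for gets, _ in rows)
    total_misses = sum(misses for _, misses in rows)
    if total_gets == 0:
        return 100.0  # no dictionary lookups yet
    return 100.0 * (1 - total_misses / total_gets)

# Illustrative (gets, getmisses) pairs only.
sample_rows = [(30_000, 1_500), (20_000, 1_000)]
print(f"{dictionary_cache_hit_ratio(sample_rows):.1f}%")  # prints: 95.0%
```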

Redo log buffer misses:

Indicates the percentage of requests to this cluster node that had to wait before space in the redo log buffer was allocated to them.

Percent

Before the changes made by a transaction are written to the redo log files, the redo entries that describe those changes are stored in the redo log buffer.

It is crucial that the redo log buffer be available to transactions immediately, without any wait, as this improves overall performance. This measure indicates the percentage of times a request had to wait for space in the redo log buffer to be allocated. The value can be reduced by increasing the LOG_BUFFER parameter.
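
One common way to express redo log buffer misses uses two V$SYSSTAT statistics: redo buffer allocation retries and redo entries. The sketch below assumes that formulation, which may not be the exact one used by the test; the function name and sample values are illustrative.

```python
def redo_log_buffer_miss_pct(allocation_retries: int, redo_entries: int) -> float:
    """Percentage of redo entries that had to retry buffer space allocation."""
    if redo_entries == 0:
        return 0.0  # no redo generated yet, so no waits
    return 100.0 * allocation_retries / redo_entries

# Illustrative counter values only.
pct = redo_log_buffer_miss_pct(allocation_retries=50, redo_entries=100_000)
print(f"{pct:.2f}%")  # prints: 0.05%
```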

Sorts on disk:

Indicates the percentage of sorts on this cluster node that are performed on disk rather than in memory.

Percent

For best performance, most sorts should occur in memory; sorts written to disk adversely affect performance. If more than 10% of sorts happen on disk, the database performance could degrade. To improve the sorting performance of a database, consider tuning the parameters SORT_AREA_SIZE and SORT_AREA_RETAINED_SIZE. The dynamically modifiable initialization parameter called SORT_AREA_SIZE specifies the maximum amount of memory to use for each sort. If a significant number of sorts require disk I/O to temporary segments, an application’s performance may benefit from increasing the size of the sort area. Oracle 9i (or above) supports the SORT_AREA_SIZE and the SORT_AREA_RETAINED_SIZE parameters only to ensure backward compatibility with previous versions of Oracle. Therefore, while monitoring Oracle 9i or higher, it is recommended that you use the equivalent PGA_AGGREGATE_TARGET parameter instead.
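
The 10% guideline above can be checked against the V$SYSSTAT statistics sorts (memory) and sorts (disk). The sketch below assumes the measure is the share of disk sorts among all sorts; the function name and sample values are illustrative.

```python
def sorts_on_disk_pct(sorts_memory: int, sorts_disk: int) -> float:
    """Percentage of all sorts that spilled to disk."""
    total_sorts = sorts_memory + sorts_disk
    if total_sorts == 0:
        return 0.0  # no sorts performed yet
    return 100.0 * sorts_disk / total_sorts

# Illustrative counter values only.
pct = sorts_on_disk_pct(sorts_memory=9_500, sorts_disk=500)
print(f"{pct:.1f}%")  # prints: 5.0%
print("sort tuning advisable" if pct > 10.0 else "within the 10% guideline")
```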

Current size:

Indicates the amount of space allocated to the SGA that is currently in use.

MB

A consistent and significant increase in the value of this measure is a cause for concern, as it indicates that SGA components are over-utilizing the available memory resources.

In such a scenario, you can use the detailed diagnosis of the Current size measure to know the memory usage of the individual SGA components. In the process, you can identify the exact SGA component that is over-utilizing the memory resources.

Buffer nowait:

Indicates the percentage of requests made by server processes for a specific buffer in which the buffer was available immediately.

Percent

If this ratio falls below 90%, it indicates that server processes have to wait for something before obtaining the buffer. In this case, determine which type of block is being contended for by examining the Buffer Waits section of the Statspack/AWR report.
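
A Statspack-style approximation of this ratio compares the total number of buffer waits (for example, the summed COUNT column of V$WAITSTAT) with the number of logical reads. The sketch below assumes that approximation; the function name and sample values are illustrative, and the test's own formula is not stated here.

```python
def buffer_nowait_pct(buffer_waits: int, logical_reads: int) -> float:
    """Percentage of buffer requests that were satisfied without waiting."""
    if logical_reads == 0:
        return 100.0  # no buffer requests yet
    return 100.0 * (1 - buffer_waits / logical_reads)

# Illustrative values only.
pct = buffer_nowait_pct(buffer_waits=200, logical_reads=100_000)
print(f"{pct:.1f}%")  # prints: 99.8%
```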

Soft parse:

Indicates the percentage of parse requests for which the cursor was already in the cursor cache, relative to the total number of parses.

Percent

A soft parse is recorded when the Oracle Server checks the shared pool for a SQL statement and finds a version of the statement that it can reuse.

If the value of this measure falls below 90%, it indicates that very often server processes are unable to find SQL statements in the shared pool and are forced to perform hard parses for these statements.

Soft parses consume fewer resources than hard parses, so the larger the value of this measure, the better.
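
Soft and hard parse percentages are two views of the same counters. The sketch below assumes they are derived from the V$SYSSTAT statistics parse count (total) and parse count (hard); the function name and sample values are illustrative.

```python
def parse_percentages(total_parses: int, hard_parses: int):
    """Return (soft_parse_pct, hard_parse_pct) for the given parse counts."""
    if total_parses == 0:
        return 100.0, 0.0  # nothing parsed yet
    hard_pct = 100.0 * hard_parses / total_parses
    return 100.0 - hard_pct, hard_pct

# Illustrative counter values only.
soft_pct, hard_pct = parse_percentages(total_parses=10_000, hard_parses=400)
print(f"soft {soft_pct:.1f}%, hard {hard_pct:.1f}%")  # prints: soft 96.0%, hard 4.0%
```

Because the two percentages always add up to 100, a soft parse value below 90% is the same condition as a hard parse ratio above 10%.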

Execute to parse:

Indicates how many times SQL statements are executed relative to how many times they are parsed.

Percent

If this value is too low, it indicates that an application is parsing statements far more often than it should, relative to the number of times it executes them. This could result in excessive CPU usage, increased shared pool latch contention, and serious performance degradation in the Oracle database server.

The execute to parse ratio takes a hit when an application does not use shareable SQL or if the database has sub-optimal parameters that are reducing the effectiveness of cursor sharing. A problem like excessive parsing is likely to manifest itself as additional network traffic between the application server and clients. The additional parse activity may also show up as a marked increase in CPU consumption on the database server.
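
The execute to parse ratio is usually computed from the V$SYSSTAT statistics parse count (total) and execute count. The sketch below assumes that usual formula; the function name and sample values are illustrative.

```python
def execute_to_parse_pct(total_parses: int, executions: int) -> float:
    """Percentage of executions that did not need a separate parse call."""
    if executions == 0:
        return 0.0  # nothing executed yet
    return 100.0 * (1 - total_parses / executions)

# Illustrative counter values only.
print(f"{execute_to_parse_pct(total_parses=2_000, executions=10_000):.1f}%")  # prints: 80.0%
# More parses than executions pushes the value below zero.
print(f"{execute_to_parse_pct(total_parses=1_200, executions=1_000):.1f}%")   # prints: -20.0%
```

Under this formula the value turns negative when statements are parsed more often than they are executed, which is the pattern an application without shareable SQL tends to produce.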

Parse CPU to parse elapsed:

Indicates the percentage of CPU time used when parsing.

Percent

Parse CPU time is the amount of CPU time used for parsing. Parse elapsed time is the amount of clock time used for parsing; it is the sum of Parse CPU time and Parse wait time.

The Parse CPU to parse elapsed ratio is calculated using the formula:

(Parse CPU time /Parse elapsed time)*100

Ideally, Parse elapsed time must be equal to Parse CPU time, i.e., only CPU time should be used for parsing. In that case, the ratio will be 100%. However, the greater the wait time, the lower this ratio will be.

A low value for this ratio is an indicator of latching problems. Investigate the latch sections of the AWR and Statspack reports for contention on library cache and shared pool latches.
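
The formula above, together with the statement that parse elapsed time is the sum of parse CPU time and parse wait time, can be worked through as follows. The function name and the sample times are illustrative; both inputs only need to be in the same time unit.

```python
def parse_cpu_to_elapsed_pct(parse_cpu_time: float, parse_wait_time: float) -> float:
    """(Parse CPU time / Parse elapsed time) * 100, with elapsed = CPU + wait."""
    parse_elapsed_time = parse_cpu_time + parse_wait_time
    if parse_elapsed_time == 0:
        return 100.0  # no parsing has taken place
    return 100.0 * parse_cpu_time / parse_elapsed_time

# Illustrative times only, in arbitrary but identical units.
print(f"{parse_cpu_to_elapsed_pct(450, 50):.1f}%")  # prints: 90.0%
print(f"{parse_cpu_to_elapsed_pct(450, 0):.1f}%")   # prints: 100.0%
```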

CPU to non-parse:

Indicates the percentage of CPU time spent on activities other than parsing SQL statements.

Percent

The closer the value of this measure is to 100, the better the performance of the server will be. This is because such a value means that the CPU is spending its time executing queries instead of parsing them.
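
A non-parse CPU percentage is typically derived from the V$SYSSTAT statistics parse time cpu and CPU used by this session. The sketch below assumes that derivation; the function name and sample values are illustrative.

```python
def non_parse_cpu_pct(parse_cpu_time: float, total_cpu_time: float) -> float:
    """Percentage of CPU time spent on work other than parsing."""
    if total_cpu_time == 0:
        return 100.0  # no CPU time recorded yet
    return 100.0 * (1 - parse_cpu_time / total_cpu_time)

# Illustrative times only, in identical units.
print(f"{non_parse_cpu_pct(parse_cpu_time=300, total_cpu_time=6_000):.1f}%")  # prints: 95.0%
```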

Hard parse ratio:

Indicates the percentage of hard parses.

Percent

Hard parsing happens when the Oracle server parses a query and cannot find an exact match for the query in the library cache. A hard parse is a very expensive operation, both in terms of the CPU used and the number of latch acquisitions it requires. This is why the value of this measure should be very low.

One of the common reasons for high hard parse ratio is the inefficient sharing of SQL statements.

SGA usage:

Indicates the percentage of the target SGA size that is in use currently.

Percent

The SGA_TARGET initialization parameter specifies the total size of all SGA components. You can use this measure to know how much of the configured target SGA size is being used.

If this value is close to 100%, it is a cause for concern, as it indicates that the SGA is about to run out of memory. This in turn can slow down user accesses and query execution. In such a scenario, you can use the detailed diagnosis of the Current size measure to know the memory usage of the individual SGA components. In the process, you can identify the exact SGA component that is over-utilizing the memory resources.
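
As a simple worked example, SGA usage can be expressed as the Current size measure divided by the configured target size. The function name, the sample sizes, and the 95% warning threshold below are assumptions for illustration, not values prescribed by the test.

```python
def sga_usage_pct(current_size_mb: float, target_size_mb: float) -> float:
    """Percentage of the configured target SGA size that is currently in use."""
    if target_size_mb <= 0:
        raise ValueError("target SGA size must be positive")
    return 100.0 * current_size_mb / target_size_mb

WARN_THRESHOLD_PCT = 95.0  # hypothetical alert threshold

# Illustrative sizes only.
usage = sga_usage_pct(current_size_mb=3_072, target_size_mb=4_096)
print(f"{usage:.1f}%")  # prints: 75.0%
print("SGA nearly full" if usage >= WARN_THRESHOLD_PCT else "SGA has headroom")
```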