Oracle RMAN Job Details Test
Oracle Recovery Manager (RMAN) provides a comprehensive foundation for efficiently backing up and recovering an Oracle database. It is designed to work closely with the server, providing block-level corruption detection during backup and restore. It offers a common interface, via the command line and Enterprise Manager, for backup tasks across different host operating systems, and provides features not available through user-managed methods, such as parallelization of backup/restore data streams, backup file retention policies, and a detailed history of all backups. Since errors in backup/recovery jobs can result in the loss of critical data, it is essential to keep a close watch on RMAN's activities. Using the OraRmanJobTest, you can monitor the status of the backup/recovery jobs executed by RMAN, so that you are forewarned of issues in these critical processes.
This test is disabled by default. To enable the test, go to the enable / disable tests page using the menu sequence: Agents -> Tests -> Enable/Disable, pick Oracle Database as the Component type and Performance as the Test type, choose this test from the DISABLED TESTS list, and click the << button to move the test to the ENABLED TESTS list. Finally, click the Update button.
Note:
This test is applicable to Oracle databases with a multi-tenant configuration, i.e., CDB (Container Database) and PDB (Pluggable Database).
Target of the test : An Oracle server
Agent deploying the test : An internal agent
Outputs of the test : One set of results for every Oracle server.
| Parameter | Description |
|---|---|
| Test period | How often the test should be executed. |
| Host | The host for which the test is to be configured. |
| Port | The port on which the server is listening. |
| Username | In order to monitor an Oracle database server, a special database user account has to be created in every Oracle database instance that requires monitoring. A Click here hyperlink is available in the test configuration page, using which a new Oracle database user can be created. Alternatively, you can manually create the special database user. When doing so, ensure that this user is vested with the select_catalog_role and create session privileges. The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is: create user oraeg identified by oraeg; create role oratest; grant create session to oratest; grant select_catalog_role to oratest; grant oratest to oraeg; The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is: alter session set container=<Oracle_service_name>; create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>; grant create session to <user_name>; grant select_catalog_role to <user_name>; The name of this user has to be specified here. |
| Password | Specify the password of the specified database user. |
| Confirm Password | Confirm the Password by retyping it here. |
| Elapsed Time | This test reports a Jobs that exceeded time limits measure, which reveals the number of jobs that have been running beyond a time limit (in minutes) that is configured in the Elapsed Time text box. For instance, if Elapsed Time is set to 5 minutes, then the measure will report the count of jobs that have been running for over 5 minutes. |
| IsPassive | If the value chosen is Yes, then the Oracle server under consideration is a passive server in an Oracle cluster. No alerts will be generated if the server is not running. Measures will be reported as "Not applicable" by the agent if the server is not up. |
| SSL | By default, this flag is set to No, as the target Oracle database is not SSL-enabled by default. If the target database is SSL-enabled, then set this flag to Yes. |
| SSL Cipher | This parameter is applicable only if the target Oracle database is SSL-enabled; if not, set this parameter to none. A cipher suite is a set of cryptographic algorithms that are used when a client application and a server exchange information over an SSL/TLS connection. It consists of sets of instructions on how to secure a network through SSL (Secure Sockets Layer) or TLS (Transport Layer Security). In this text box, provide a comma-separated list of cipher suites that are allowed for the SSL/TLS connection to the target database. By default, this parameter is set to none. |
| Truststore File | This parameter is applicable only if the target Oracle database is SSL-enabled; if not, set this parameter to none. The truststore is used to store certificates from Certificate Authorities (CA) that verify and authenticate the certificate presented by the server in an SSL connection. Therefore, the eG agent should have access to the truststore where the certificates are stored, so that it can authenticate and connect with the target database and collect metrics. For this, first import the certificates into the following default location: <eG_INSTALL_DIR>/lib/security/mytruststore.jks. To know how to import the certificate into the truststore, refer to Pre-requisites for monitoring Oracle Cluster. Then, provide the truststore file name in this text box. For example: mytruststore.jks. By default, none is specified against this text box. |
| Truststore Type | This parameter is applicable only if the target Oracle database is SSL-enabled; if not, set this parameter to none. Specify the type of truststore that contains the certificates for server authentication in this text box. For example, JKS. By default, this parameter is set to none. |
| Truststore Password | This parameter is applicable only if the target Oracle database is SSL-enabled; if not, set this parameter to none. If a Truststore File name is provided, then, in this text box, provide the password that is used to obtain the associated certificate details from the Truststore File. By default, this parameter is set to none. |
| Keystore File | This parameter is applicable only if the target Oracle database is SSL-enabled; if not, set this parameter to none. The keystore contains the private keys for the certificates that the client can provide to the server upon request. The eG agent requires access to the keystore where the client certificate is stored, so that it can send the certificate to the server and the server can validate it against the one contained in its truststore. For this purpose, first create the client certificate in the following default location: /opt/egurkha/jre/lib/security/egmqsslstore.jks. |
| Keystore Password | This parameter is applicable only if the target Oracle database is SSL-enabled; if not, set this parameter to none. If a Keystore File name or file path is provided, then, in this text box, provide the password that is used to obtain the associated certificate details from the Keystore File. |
| Confirm Password | Confirm the Password for the Keystore by retyping it here. |
| DD Frequency | Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD Frequency. |
| Detailed Diagnosis | To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: |
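The Elapsed Time comparison described above can be illustrated with a minimal sketch. It assumes that the start times of the running jobs are already available as Python `datetime` values; the function name `count_jobs_over_limit` and the sample data are hypothetical and are not part of the eG agent.

```python
from datetime import datetime, timedelta


def count_jobs_over_limit(start_times, now, elapsed_limit_minutes):
    """Count running jobs whose run time is strictly over the limit (minutes)."""
    limit = timedelta(minutes=elapsed_limit_minutes)
    return sum(1 for started in start_times if now - started > limit)


# Hypothetical sample: three jobs that are still running at 12:00.
now = datetime(2024, 1, 1, 12, 0)
running_job_starts = [
    datetime(2024, 1, 1, 11, 58),  # running for 2 minutes
    datetime(2024, 1, 1, 11, 50),  # running for 10 minutes
    datetime(2024, 1, 1, 11, 30),  # running for 30 minutes
]

# With Elapsed Time set to 5 minutes, two of the three jobs are over the limit.
over_limit = count_jobs_over_limit(running_job_starts, now, elapsed_limit_minutes=5)
print(over_limit)
```

In this sketch a job that has been running for exactly the configured limit is not counted, which matches the "running for over 5 minutes" wording; whether the agent treats the boundary the same way is an assumption.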
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Completed jobs | Indicates the number of jobs completed during the last measurement period. | Number | Use the detailed diagnosis of this measure to view the details of the completed jobs. |
| Failed jobs | Indicates the number of jobs that failed during the last measurement period. | Number | Ideally, the value of this measure should be 0. If a non-zero value is reported, use the detailed diagnosis of this measure to determine which jobs failed and at what time. |
| Running jobs | Indicates the number of jobs that were running during the last measurement period. | Number | Use the detailed diagnosis of this measure to view the details of the jobs that were running. |
| Jobs running with errors | Indicates the number of jobs that were running during the last measurement period, but with errors. | Number | Ideally, this value should be low. If the value is high, check the detailed diagnosis of this measure to know which jobs are running with errors. |
| Jobs running with warnings | Indicates the number of jobs that were running during the last measurement period, but with warnings. | Number | Ideally, this value should be low. If the value is high, check the detailed diagnosis of this measure to know which jobs are running with warnings. |
| Jobs completed with errors | Indicates the number of jobs that were completed during the last measurement period, but with errors. | Number | Ideally, this value should be low. If the value is high, check the detailed diagnosis of this measure to know which jobs completed with errors. |
| Jobs completed with warnings | Indicates the number of jobs that were completed during the last measurement period, but with warnings. | Number | Ideally, this value should be low. If the value is high, check the detailed diagnosis of this measure to know which jobs completed with warnings. |
| Jobs that exceeded time limits | Indicates the number of jobs that have been running for longer than the configured Elapsed Time. | Number | If this measure reports a non-zero value, it indicates that one or more jobs are taking too long to complete. Since such jobs could drain the server of resources, it is imperative that you determine why the jobs are taking so much time to execute, and fix the problem. A possible reason could be that these jobs are waiting for objects that have been locked by other sessions; if these sessions are less critical, you may want to terminate them so that the jobs can use the locked resources and resume execution. To know which jobs are taking too long, use the detailed diagnosis of this measure. |
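To make the relationship between job statuses and the first seven measures concrete, here is a minimal sketch that tallies a list of status strings into per-measure counts. The status strings are an assumption modeled on the values Oracle documents for the STATUS column of the V$RMAN_BACKUP_JOB_DETAILS view; this page does not state how the test actually obtains or classifies job statuses, and the names `STATUS_TO_MEASURE` and `tally_job_statuses` are hypothetical.

```python
# Assumed mapping from RMAN job status strings to the measures in the table.
STATUS_TO_MEASURE = {
    "COMPLETED": "Completed jobs",
    "FAILED": "Failed jobs",
    "RUNNING": "Running jobs",
    "RUNNING WITH ERRORS": "Jobs running with errors",
    "RUNNING WITH WARNINGS": "Jobs running with warnings",
    "COMPLETED WITH ERRORS": "Jobs completed with errors",
    "COMPLETED WITH WARNINGS": "Jobs completed with warnings",
}


def tally_job_statuses(statuses):
    """Return a count per measure; statuses not in the mapping are ignored."""
    counts = {measure: 0 for measure in STATUS_TO_MEASURE.values()}
    for status in statuses:
        measure = STATUS_TO_MEASURE.get(status.strip().upper())
        if measure is not None:
            counts[measure] += 1
    return counts


# Hypothetical statuses collected for one measurement period.
sample_statuses = [
    "COMPLETED",
    "COMPLETED",
    "FAILED",
    "RUNNING WITH WARNINGS",
    "COMPLETED WITH ERRORS",
]
measures = tally_job_statuses(sample_statuses)
for name, value in measures.items():
    print(f"{name}: {value}")
```

Each status is counted under exactly one measure in this sketch, so a job that completed with errors does not also increase Completed jobs; whether the test counts jobs this way is not stated here.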