Oracle Dead Kill Processes Test

If one or more sessions or processes on the Oracle server are obstructing the execution of other sessions/processes, it is quite natural for administrators to want to kill the blocking sessions/processes to ensure the smooth execution of critical database transactions. Typically, these killed (‘dead’) sessions/processes continue to consume resources until the PMON process automatically cleans them up. If cleanup is delayed, the Oracle instance cannot release the objects and resources that the dead sessions/processes have locked, sometimes for long periods. In such situations, administrators often resort to killing these dead sessions/processes at the operating system level, so as to hasten the release of valuable resources. Before attempting the OS-level kill, administrators should first determine which sessions/processes are currently ‘dead’ and how long they have been ‘dead’. This can be ascertained using the Oracle Dead Kill Processes test.

This test auto-discovers the dead processes/sessions and reports the current cleanup state of each process/session. In addition, the test reveals how long each process/session has remained dead and the count of processes that are being blocked by that dead process/session. This way, administrators can determine whether or not cleanup is occurring on schedule and, if not, how badly the delay in cleanup is affecting other processes. Administrators can also use this information to judge whether an OS-level process kill is justified.

Target of the test : An Oracle 12c server

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each deadprocessaddress_deadsessionaddress pair on the Oracle instance monitored.

Configurable parameters for the test
  1. TEST PERIOD - How often should the test be executed
  2. Host – The host for which the test is to be configured
  3. Port - The port on which the server is listening
  4. User – In order to monitor an Oracle database server, a special database user account has to be created in every Oracle database instance that requires monitoring. The test configuration page provides a Click here hyperlink that can be used to create a new Oracle database user. Alternatively, you can create the special database user manually. When doing so, ensure that this user is granted the select_catalog_role and create session privileges.

    The sample script we recommend for user creation (in Oracle database server versions before 12c) for eG monitoring is:

    create user oraeg identified by oraeg;

    create role oratest;

    grant create session to oratest;

    grant select_catalog_role to oratest;

    grant oratest to oraeg;

    The sample script we recommend for user creation (in Oracle database server 12c) for eG monitoring is:

    alter session set container=<Oracle_service_name>;

    create user <user_name> identified by <user_password> container=current default tablespace <name_of_default_tablespace> temporary tablespace <name_of_temporary_tablespace>;

    Grant create session to <user_name>;

    Grant select_catalog_role to <user_name>;

    The name of this user has to be specified here.

  5. Password – Password of the specified database user

    This login information is required to query Oracle’s internal dynamic views, so as to fetch the current status / health of the various database components.

  6. Confirm password – Confirm the password by retyping it here.
  7. listener name – Specify the Oracle listener name. By default, this will be the same as the Oracle SID.
  8. ISPASSIVE – If the value chosen is yes, then the Oracle server under consideration is a passive server in an Oracle cluster. No alerts will be generated if the server is not running. Measures will be reported as “Not applicable” by the agent if the server is not up.
Measurements made by the test
Each measurement below is listed with its description, its measurement unit (where one applies), and its interpretation.

Process state:
    Description: Indicates the current cleanup state of this process.
    Interpretation: The values that this measure can report, and their corresponding numeric values, are listed below:

    Measure Value (Numeric Value): Description

      unsafe to attempt (1): Occurs for a killed session that has not been moved, so no cleanup can occur on it yet
      cleanup pending (2): Occurs for a dead process / killed session that can be cleaned up, but PMON has not yet made an attempt
      resources freed (3): Occurs for a dead process / killed session where all children have been freed, but the process / killed session itself is not yet freed
      resources freed – pending ack (4): Occurs for a killed session where all children have been freed, but the session itself cannot be freed until the owner has acknowledged it
      partial cleanup (5): Occurs if some of the children have been cleaned up

    Note: By default, this measure reports the Measure Values listed above to indicate the current cleanup state of a dead process. In the graph of this measure, however, the states are represented by their numeric equivalents only.
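The relationship between the reported state names and the numeric values plotted in the graph can be sketched as a simple lookup. This is only an illustration of the mapping listed above, not the test's actual implementation; the names PROCESS_STATE_VALUES and state_to_numeric are hypothetical.

```python
# Numeric equivalents of the cleanup states, as listed for the Process state measure.
# The dictionary and function names are illustrative assumptions.
PROCESS_STATE_VALUES = {
    "unsafe to attempt": 1,
    "cleanup pending": 2,
    "resources freed": 3,
    "resources freed - pending ack": 4,
    "partial cleanup": 5,
}


def state_to_numeric(state: str) -> int:
    """Return the numeric graph value for a reported cleanup state.

    Raises KeyError if the state is not one of the five documented values.
    """
    # Normalize case, repeated whitespace, and the en dash used in the documentation.
    key = " ".join(state.replace("\u2013", "-").lower().split())
    return PROCESS_STATE_VALUES[key]


if __name__ == "__main__":
    for name in ("cleanup pending", "resources freed \u2013 pending ack"):
        print(name, "->", state_to_numeric(name))
```

A lookup of this kind makes it easier to read the graph: a value that stays at 2, for example, corresponds to a process that is still waiting for PMON to attempt cleanup.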

Dead time:
    Description: Indicates how long it has been since this process was marked dead or this session was marked killed.
    Measurement unit: Secs
    Interpretation: A consistent increase in the value of this measure is a cause for concern, as it indicates that auto-cleanup has not occurred. The dead process/session can then continue to consume resources and block objects, thereby degrading server performance.

Number blocked:
    Description: Indicates the count of processes that are blocked by this process.
    Measurement unit: Number
    Interpretation: A high value indicates that the dead process is impeding the execution of many other processes, some of which may be mission-critical. If the Dead time of such a process is also very high, it is a matter of great concern and must be looked into immediately. In such circumstances, you may want to consider killing the process at the OS level. On a Unix system, you can issue the kill -9 <PID> command at the shell prompt to kill the process at that level.
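The guidance above combines two measures: a dead process deserves immediate attention when both its Dead time and its Number blocked are high. A minimal sketch of that check follows. The thresholds, the DeadProcess class, and the sample values are all hypothetical; suitable limits depend on the environment and are not prescribed by the test.

```python
from dataclasses import dataclass


@dataclass
class DeadProcess:
    """One descriptor reported by the test (hypothetical container type)."""
    descriptor: str        # deadprocessaddress_deadsessionaddress
    dead_time_secs: float  # value of the Dead time measure
    number_blocked: int    # value of the Number blocked measure


# Assumed thresholds, chosen only for illustration.
DEAD_TIME_LIMIT_SECS = 600
BLOCKED_LIMIT = 5


def needs_attention(proc: DeadProcess,
                    dead_time_limit: float = DEAD_TIME_LIMIT_SECS,
                    blocked_limit: int = BLOCKED_LIMIT) -> bool:
    """True when the process has been dead for long AND blocks many others."""
    return (proc.dead_time_secs >= dead_time_limit
            and proc.number_blocked >= blocked_limit)


if __name__ == "__main__":
    # Made-up sample readings, not real test output.
    samples = [
        DeadProcess("addr1_addr2", dead_time_secs=900, number_blocked=8),
        DeadProcess("addr3_addr4", dead_time_secs=30, number_blocked=12),
    ]
    for p in samples:
        verdict = "review for OS-level kill" if needs_attention(p) else "keep watching"
        print(p.descriptor, "->", verdict)
```

Requiring both conditions reflects the interpretation above: a recently killed session that blocks many processes may still be cleaned up by PMON shortly, so a high Number blocked alone does not necessarily justify an OS-level kill.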