Out of Memory Process Kills Test
In Linux systems, out-of-memory (OOM) kills occur when the kernel exhausts available physical memory and swap space and activates the Linux OOM Killer to reclaim memory by forcibly terminating one or more running processes. This typically happens due to memory leaks, unbounded application memory usage, insufficient RAM or swap configuration, aggressive workload spikes, or improper resource limits. OOM kill events can have serious consequences, as they may abruptly terminate critical services, background daemons, or user processes, leading to application crashes, service interruptions, data inconsistencies, and degraded system reliability. Frequent OOM kills are strong indicators of sustained memory pressure or misbehaving applications and, if not addressed promptly, can result in recurring outages and unstable system behavior. Monitoring OOM kill activity in Linux environments provides early visibility into memory exhaustion scenarios and helps identify which processes are being targeted by the kernel based on their memory footprint and OOM score.
The Out of Memory (OOM) Kills test tracks the number of processes terminated by the Linux OOM Killer and reports detailed diagnostic information such as process name, process ID, memory usage breakdown, user ownership, and OOM score adjustments. This enables administrators to quickly diagnose root causes, optimize application memory usage, fine-tune system limits, and proactively scale memory resources to maintain stable and predictable Linux system performance.
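As an illustration of how the number of OOM kills can be tracked between polls, the following minimal sketch reads the cumulative `oom_kill` counter that Linux kernels (4.13 and later) expose in `/proc/vmstat`. This is an assumption made for illustration only and is not necessarily how the eG agent collects the measure; the function names `read_oom_kill_count` and `kills_since` are hypothetical.

```python
def read_oom_kill_count(vmstat_text: str) -> int:
    """Return the cumulative oom_kill counter from /proc/vmstat-style text."""
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line has the form "<counter_name> <value>".
        if len(parts) == 2 and parts[0] == "oom_kill":
            return int(parts[1])
    raise KeyError("oom_kill counter not found in the supplied text")


def kills_since(previous: int, current: int) -> int:
    """Return the kills between two polls.

    The counter is cumulative since boot, so a current value lower than
    the previous one is treated as a reboot and returned as-is.
    """
    return current - previous if current >= previous else current
```

On a live system, the text passed to `read_oom_kill_count` would be the contents of `/proc/vmstat`, read once per test period; the difference between consecutive readings approximates the number of processes killed in that interval.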
Target of the test : A Linux system
Agent deploying the test : An internal agent
Outputs of the test : One set of results for the target host system being monitored.
| Parameter | Description |
|---|---|
| Test Period | How often the test should be executed. |
| Host | The host for which the test is to be configured. |
| Port | The port on which the specified host listens. By default, this is NULL. |
| DD Frequency | The frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency if required. To disable the detailed diagnosis capability for this test, specify none against DD Frequency. |
| Detailed Diagnosis | To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: |
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Processes killed due to OS out-of-memory | Indicates the number of processes that were terminated by the operating system due to out-of-memory (OOM) conditions. | Number | This measure helps identify memory pressure events where the OOM Killer was triggered to free up memory. The detailed diagnosis of this measure provides the timestamp, process ID, process name, total virtual memory (MB), anonymous memory (MB), file-backed memory (KB), shared memory usage (KB), user ID, page tables (MB), and OOM score adjustment. |
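
The fields listed in the detailed diagnosis correspond closely to what recent Linux kernels write to the kernel log when the OOM Killer terminates a process. The following minimal sketch parses such a log line into a record. It assumes the `Killed process ...` message format emitted by recent kernels (older kernels print fewer fields and will not match), and it is not a description of how the eG agent gathers this data. The function name `parse_oom_kill_line` and the record keys are hypothetical.

```python
import re
from typing import Optional

# Assumed message format of recent kernels, for example:
# "Out of memory: Killed process 4242 (java) total-vm:8123456kB,
#  anon-rss:6012345kB, file-rss:1024kB, shmem-rss:0kB, UID:1001
#  pgtables:12345kB oom_score_adj:0"
_OOM_KILL_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>.*?)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, "
    r"anon-rss:(?P<anon_rss_kb>\d+)kB, "
    r"file-rss:(?P<file_rss_kb>\d+)kB, "
    r"shmem-rss:(?P<shmem_rss_kb>\d+)kB, "
    r"UID:(?P<uid>\d+) "
    r"pgtables:(?P<pgtables_kb>\d+)kB "
    r"oom_score_adj:(?P<oom_score_adj>-?\d+)"
)


def parse_oom_kill_line(line: str) -> Optional[dict]:
    """Parse one kernel log line; return None if it is not an OOM kill record."""
    match = _OOM_KILL_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    name = fields.pop("name")
    # Every remaining field is numeric; sizes stay in kB as the kernel reports them.
    record = {key: int(value) for key, value in fields.items()}
    record["name"] = name
    return record
```

The kernel reports all sizes in kB, so converting them to the units shown in the detailed diagnosis would be a separate step that this sketch leaves out.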