Uptime - OS Test
In most environments, it is essential to monitor the uptime of the Linux systems that host popular applications in the infrastructure. By tracking the uptime of each Linux system, administrators can determine the percentage of time for which that system has been up. By comparing this value with service level targets, administrators can identify the most trouble-prone areas of the infrastructure.
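The comparison against a service level target can be sketched as follows. This is only an illustration and not part of the eG agent: the function names, the sample system names, and the 99.9% target are all assumptions.

```python
def uptime_percentage(up_seconds: float, window_seconds: float) -> float:
    """Return the share of the window during which the system was up, in percent."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Clamp so that small clock or sampling errors cannot push the result past 100%.
    up = min(max(up_seconds, 0.0), window_seconds)
    return 100.0 * up / window_seconds


def below_target(systems: dict, window_seconds: float, target_pct: float) -> list:
    """List the systems whose uptime percentage falls short of the target, worst first."""
    scored = {name: uptime_percentage(up, window_seconds) for name, up in systems.items()}
    failing = [name for name, pct in scored.items() if pct < target_pct]
    return sorted(failing, key=lambda name: scored[name])


if __name__ == "__main__":
    week = 7 * 24 * 3600
    # Hypothetical observations: seconds of uptime per system over one week.
    observed = {"web01": week, "db01": week - 7200, "app01": week - 300}
    print(below_target(observed, week, target_pct=99.9))
```

With these sample values, only the system that was down for two hours misses the assumed target.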
In some environments, administrators may schedule periodic reboots of their Linux systems. If a specific Linux system has been up for an unusually long time, the administrator can infer that the scheduled reboot task is not working on that system.
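A check for missed scheduled reboots might look like the sketch below. It assumes a fixed reboot interval and a small grace period; the function name, the grace value, and the sample data are hypothetical.

```python
from datetime import timedelta


def overdue_for_reboot(uptimes: dict, reboot_interval: timedelta,
                       grace: timedelta = timedelta(hours=1)) -> list:
    """Return the names of systems that have been up longer than the schedule allows."""
    limit = reboot_interval + grace
    # A system that has been up longer than one full interval (plus grace)
    # must have skipped at least one scheduled reboot.
    return sorted(name for name, up in uptimes.items() if up > limit)


if __name__ == "__main__":
    # Hypothetical uptimes for systems that are supposed to reboot weekly.
    uptimes = {"web01": timedelta(days=3), "db01": timedelta(days=12)}
    print(overdue_for_reboot(uptimes, reboot_interval=timedelta(days=7)))
```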
This test, included in the eG agent, monitors the uptime of each Linux system.
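For orientation, a Linux system exposes its uptime through `/proc/uptime`, whose first field is the number of seconds since boot. The sketch below reads that value. It is an assumption-level illustration only; how the eG agent actually collects uptime is not described here, and the function names are hypothetical.

```python
def parse_proc_uptime(text: str) -> float:
    """Extract the seconds-since-boot value from the contents of /proc/uptime."""
    fields = text.split()
    if not fields:
        raise ValueError("empty uptime text")
    # The first field is the uptime in seconds; the second is aggregate idle time.
    return float(fields[0])


def read_system_uptime(path: str = "/proc/uptime") -> float:
    """Read the current uptime in seconds from the local Linux system."""
    with open(path, "r", encoding="ascii") as handle:
        return parse_proc_uptime(handle.read())
```

On a Linux host, calling `read_system_uptime()` returns the current uptime in seconds.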
Target of the test : A Linux Systems Group
Agent deploying the test : A remote agent
Outputs of the test : One set of results for every Linux system
Parameter | Description |
---|---|
Test Period | How often should the test be executed. |
Host | The nick name of the target component for which this test is to be configured. |
Port | Refers to the port on which the specified host listens. By default, this is NULL. |
Inside View Using | To obtain the 'inside view' of the performance of the systems (i.e., to measure the internal performance of the systems), this test uses a light-weight eG VM Agent software deployed on each of the systems. Accordingly, this parameter is set to eG VM Agent by default. |
Report By User | This flag is set to No by default. This implies that the Linux systems in the environment will always be identified using the system name. In other words, this test will, by default, report measures for every systemname. If you want this test to report measures for every user on a system instead, set this flag to Yes. In that case, this test will report measures for every username_on_systemname. |
Report Powered OS | By default, this flag is set to Yes. In this case, the 'inside view' tests will report measures even for those Linux systems that do not have any users logged in currently. Such systems will be identified by their name and not by username_on_systemname. If this flag is set to No, this test will not report measures for systems to which no users are currently logged in. |
Is Cloud VMs? | This flag is set to Yes by default. The value of this flag cannot be changed. This implies that the cloud-based Linux systems in the environment will always be identified using the login name of the user. In other words, in cloud environments, this test will, by default, report measures for every username_on_systemname. |
Report Manager Time | By default, this flag is set to Yes, indicating that the detailed diagnosis of this test, if enabled, will report the shutdown and reboot times of the Linux systems in the manager's time zone. If this flag is set to No, the shutdown and reboot times are shown in the time zone of the system on which the remote agent is running. |
DD Frequency | Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency if you wish. To disable the detailed diagnosis capability for this test, specify none against DD Frequency. |
Detailed Diagnosis | To make diagnosis more efficient and accurate, the eG suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: |
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Has the system been rebooted? | Indicates whether or not the system has been rebooted during the last measurement period. | Boolean | If this measure shows 1, it means that the system was rebooted during the last measurement period. By checking the time periods when this metric changes from 0 to 1, an administrator can determine the times when the system was rebooted. |
Uptime of system during the last measure period | Indicates how long the system has been up since the last time this test ran. | Secs | If the system has not been rebooted during the last measurement period and the agent has been running continuously, this value will be equal to the measurement period. If the system was rebooted during the last measurement period, this value will be less than the measurement period of the test. For example, if the measurement period is 300 secs and the system was rebooted 120 secs ago, this metric will report a value of 120 secs. The accuracy of this metric depends on the measurement period: the smaller the measurement period, the greater the accuracy. |
Total uptime of the system | Indicates the total time that the system has been up since its last reboot. | | This measure displays the number of years, months, days, hours, minutes and seconds since the last reboot. Administrators may wish to be alerted if a system has been running without a reboot for a very long period. Setting a threshold for this metric allows administrators to detect such conditions. |
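
The relationship between the three measures can be sketched as below, under the assumption that the only input is the system's total uptime in seconds at the time the test runs. The class and function names are hypothetical, and the duration formatter is simplified to days, hours, minutes and seconds rather than the years and months the test reports.

```python
from dataclasses import dataclass


@dataclass
class UptimeMeasures:
    rebooted: int            # 1 if a reboot happened within the last period, else 0
    uptime_in_period: float  # seconds the system was up during the last period
    total_uptime: float      # seconds since the last reboot


def compute_measures(total_uptime_secs: float, period_secs: float) -> UptimeMeasures:
    """Derive the per-period measures from the total uptime and the measurement period."""
    if total_uptime_secs < 0 or period_secs <= 0:
        raise ValueError("uptime must be >= 0 and period must be > 0")
    # If the system has been up for less than one period, it rebooted within that period.
    rebooted = 1 if total_uptime_secs < period_secs else 0
    # Uptime within the period can never exceed the period itself.
    uptime_in_period = min(total_uptime_secs, period_secs)
    return UptimeMeasures(rebooted, uptime_in_period, total_uptime_secs)


def format_uptime(total_secs: int) -> str:
    """Render a duration in seconds as days, hours, minutes and seconds."""
    days, rest = divmod(int(total_secs), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {seconds}s"


if __name__ == "__main__":
    # The example from the table: a 300-second period, rebooted 120 seconds ago.
    print(compute_measures(120, 300))
    print(format_uptime(93784))
```

The first call mirrors the example in the table: with a 300-second measurement period and a reboot 120 seconds ago, the reboot flag is 1 and the uptime during the period is 120 seconds.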