Event Log Test

This test reports statistical information about the events generated by various applications, Windows services, and drivers on the target system. This test is disabled by default. To enable the test, go to the enable / disable tests page using the menu sequence: Agents -> Tests -> Enable/Disable, pick Eventlog as the Component type, set Performance as the Test type, choose the test from the DISABLED TESTS list, and click the < button to move the test to the ENABLED TESTS list. Finally, click the Update button.

Target of the test : Any host system

Agent deploying the test : An internal agent

Outputs of the test : One set of results for the server being monitored.

Configurable parameters for the test
Parameter Description

Test period

How often should the test be executed.

Host

The IP address of the host for which this test is to be configured.

Port

Refers to the port used by the EventLog Service. For this test, it is null.

EventHost

This is the same as the Host.

EventSrc

Enter the specific event sources to be monitored in the EventSrc text box. The name of an event source can be obtained from the Event Viewer window, which appears on following the menu sequence: Start -> Programs -> Administrative Tools -> Event Viewer. (If the Programs menu does not contain the Administrative Tools option, check Start -> Settings -> Control Panel instead.) The value that appears in the Source column of this window should be used to specify the EventSrc parameter.

By default, "All" will be displayed against EventSrc indicating that all events will be monitored by default. While specifying multiple events, make sure that they are separated by commas (,).

Note:

The EventSrc specified should be exactly the same as that which appears in the Event Viewer window. 

Excludedsrc

If specific events are to be excluded from monitoring, then specify the events to be excluded in the Excludedsrc text box, as a comma-separated list.

UseWMI

The eG agent can either use WMI to extract event log statistics or directly parse the event logs using event log APIs. If the UseWMI flag is Yes, then WMI is used. If not, the event log APIs are used. This option is provided because on some Windows systems (especially ones with service pack 3 or lower), using WMI to access event logs can cause the CPU usage of the WinMgmt process to shoot up. On such systems, set the UseWMI parameter value to No. On the other hand, when monitoring systems that are operating on any other flavor of Windows (say, Windows 2012 or above), the UseWMI flag should always be set to Yes.

Stateless Alerts

Typically, the eG manager generates email alerts only when the state of a specific measurement changes. A state change typically occurs only when the threshold of a measure is violated a configured number of times within a specified time window. While this ensures that the eG manager raises alarms only when the problem is severe enough, in some cases it may cause one or more problems to go unnoticed, just because they did not result in a state change. For example, take the case of the EventLog test. When this test captures an error event for the very first time, the eG manager will send out a CRITICAL email alert with the details of the error event to configured recipients. The next time the test runs, if a different error event is captured, the eG manager will keep the state of the measure as CRITICAL, but will not send out the details of this error event to the user; thus, the second issue will remain hidden from the user. To make sure that administrators do not miss or overlook critical issues, the eG Enterprise monitoring solution provides the stateless alerting capability. To enable this capability for this test, set the Stateless Alerts flag to Yes. This will ensure that email alerts are generated for this test, regardless of whether or not the state of the measures reported by this test changes.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD Frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
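
The EventSrc and Excludedsrc parameters described above together decide which event sources the test considers. The following Python sketch illustrates one plausible reading of that filtering; it is not the eG agent's implementation. The function names, the whitespace trimming, and the rule that an exclusion wins over an inclusion are assumptions made for illustration. The keywords all and none are taken from the sample test specification shown later in this document.

```python
def parse_sources(value):
    """Split a comma-separated parameter value into a set of source names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def is_monitored(source, event_src="all", excluded_src="none"):
    """Return True if events from `source` would be counted (illustrative only)."""
    # Assumption: "none" (in any case) means that nothing is excluded.
    if excluded_src.strip().lower() != "none":
        # Assumption: an excluded source is dropped even if it is also included.
        if source in parse_sources(excluded_src):
            return False
    # "All" (the default) means that every source is monitored.
    if event_src.strip().lower() == "all":
        return True
    # Names must match the Source column of the Event Viewer exactly.
    return source in parse_sources(event_src)
```

With this reading, is_monitored("Disk", event_src="DCOM, Disk") is true, while is_monitored("disk", event_src="DCOM, Disk") is false because the name does not match exactly.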
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Application errors

This refers to the number of application error events that were generated. 

Number

A very low value (zero) indicates that the system is in a healthy state and all applications are running smoothly without any potential problems.

An increasing trend or high value indicates the existence of problems like loss of functionality or data in one or more applications.

Please check the Application Logs in the Event Log Viewer for more details.

Application information messages

This refers to the number of application information events generated when the test was last executed.

Number

A change in the value of this measure may indicate infrequent but successful operations performed by one or more applications.

Please check the Application Logs in the Event Log Viewer for more details.

Application warnings

This refers to the number of warnings that were generated when the test was last executed.

Number

A high value of this measure indicates application problems that may not have an immediate impact, but may cause future problems in one or more applications.

Please check the Application Logs in the Event Log Viewer for more details.

Application critical errors

Indicates the number of critical events that were generated when the test was last executed.

Number

A critical event is one that an application or a component cannot automatically recover from.

A very low value (zero) indicates that the system is in a healthy state and all applications are running smoothly without any potential problems.

An increasing trend or high value indicates the existence of fatal/irreparable problems in one or more applications.

The detailed diagnosis of this measure describes all the critical application events that were generated during the last measurement period.

Please check the Application Logs in the Event Log Viewer for more details.

Application verbose

Indicates the number of verbose events that were generated when the test was last executed.

Number

Verbose logging provides more details in the log entry, which will enable you to troubleshoot issues better.

The detailed diagnosis of this measure describes all the verbose events that were generated during the last measurement period.

Please check the Application Logs in the Event Log Viewer for more details.

System errors

This refers to the number of system error events generated during the last execution of the test.

Number

A very low value (zero) indicates that the system is in a healthy state and all Windows services and low level drivers are running without any potential problems.

An increasing trend or a high value indicates the existence of problems such as loss of functionality or data in one or more Windows services and low level drivers.

Please check the System Logs in the Event Log Viewer for more details.

System information messages

This refers to the number of service-related and driver-related information events that were generated during the test's last execution.

Number

A change in value of this measure may indicate infrequent but successful operations performed by one or more applications.

Please check the System Logs in the Event Log Viewer for more details.

System warnings

This refers to the number of service-related and driver-related warnings generated during the test's last execution.

Number

A high value of this measure indicates problems that may not have an immediate impact, but may cause future problems in one or more Windows services and low level drivers.

Please check the System Logs in the Event Log Viewer for more details.

System critical errors

Indicates the number of critical events that were generated when the test was last executed.

Number

A critical event is one that a system cannot automatically recover from.

A very low value (zero) indicates that the system is in a healthy state and is running smoothly without any potential problems.

An increasing trend or high value indicates the existence of fatal/irreparable problems in the system.

The detailed diagnosis of this measure describes all the critical system events that were generated during the last measurement period.

Please check the System Logs in the Event Log Viewer for more details.

System verbose

Indicates the number of verbose events that were generated when the test was last executed.

Number

Verbose logging provides more details in the log entry, which will enable you to troubleshoot issues better.

The detailed diagnosis of this measure describes all the verbose events that were generated during the last measurement period.

Please check the System Logs in the Event Log Viewer for more details.
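
As a rough illustration of how the measures above relate to raw events, the Python sketch below tallies a list of (log, level) pairs into the measure names used in this section. The function name, the mapping table, and the sample events are hypothetical, and the level names are assumed to follow those shown in the Event Viewer. The sketch does not read a real event log and is not how the eG agent collects its data.

```python
from collections import Counter

# Assumed mapping from an event level to the suffix of the measure name.
MEASURE_SUFFIX = {
    "Error": "errors",
    "Information": "information messages",
    "Warning": "warnings",
    "Critical": "critical errors",
    "Verbose": "verbose",
}


def tally_measures(events):
    """Count events per measure for one measurement period.

    `events` is an iterable of (log, level) pairs, for example
    ("Application", "Error"). Unknown logs or levels are ignored.
    """
    counts = Counter()
    for log, level in events:
        suffix = MEASURE_SUFFIX.get(level)
        if log in ("Application", "System") and suffix is not None:
            counts[f"{log} {suffix}"] += 1
    return counts


# Hypothetical events captured during one measurement period.
sample = [
    ("Application", "Error"),
    ("Application", "Error"),
    ("System", "Warning"),
    ("System", "Critical"),
    ("Security", "Information"),  # not a log covered by this test's measures
]
measures = tally_measures(sample)
```

Because a Counter returns 0 for missing keys, a measure with no events in the period, such as measures["System errors"], reads as zero, which matches the healthy state described above.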

Note:

The Stateless Alerting capability is currently available for the following tests alone, by default:

  • EventLog test
  • Application EventLog test
  • System EventLog test
  • Application Events test
  • System Events test
  • Security Log test
  • Account Management Events test

If need be, you can enable the stateless alerting capability for other tests. To achieve this, follow the steps given below:

  • Log in to the eG manager host.
  • Edit the eg_specs.ini file in the <EG_INSTALL_DIR>\manager\config directory.
  • Locate the test for which the statelessAlerts flag has to be enabled.
  • Insert the entry, -statelessAlerts yes, into the test specification as depicted below:

    EventLogTest::$hostName:$portNo=$hostName, -auto, -host $hostName -port $portNo -eventhost $hostIp -eventsrc all -excludedSrc none -useWmi yes -statelessAlerts yes -ddFreq 1:1 -rptName $hostName, 300

  • Finally, save the file.
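
The edit described in the steps above can be sketched in Python as a small string transformation on one test specification line. This is only an illustration under stated assumptions: the function name is hypothetical, the flag is placed just before -ddFreq because that is where the sample entry above shows it, and a line without a -ddFreq flag is left untouched rather than guessed at. Take a backup of eg_specs.ini before applying anything similar to a real file.

```python
import re

# Matches an existing flag such as "-statelessAlerts no".
FLAG = re.compile(r"-statelessAlerts\s+[A-Za-z]+")


def enable_stateless_alerts(spec_line):
    """Return the specification line with '-statelessAlerts yes' set."""
    if FLAG.search(spec_line):
        # The flag is already present: force its value to yes.
        return FLAG.sub("-statelessAlerts yes", spec_line, count=1)
    if " -ddFreq " in spec_line:
        # Mirror the sample entry: place the flag just before -ddFreq.
        return spec_line.replace(" -ddFreq ", " -statelessAlerts yes -ddFreq ", 1)
    # Unknown layout: do not modify the line.
    return spec_line


original = (
    "EventLogTest::$hostName:$portNo=$hostName, -auto, -host $hostName "
    "-port $portNo -eventhost $hostIp -eventsrc all -excludedSrc none "
    "-useWmi yes -ddFreq 1:1 -rptName $hostName, 300"
)
updated = enable_stateless_alerts(original)
```

Applying the function to a line that already carries -statelessAlerts yes returns the line unchanged, so the edit is safe to repeat.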

If need be, you can change the status of the statelessAlerts flag by reconfiguring the test in the eG administrative interface.

Once the stateless alerting capability is enabled for a test (as discussed above), you will find that every time the test reports a problem, the eG manager does the following:

  • Closes the alarm that pre-exists for that problem;
  • Sends out a normal alert indicating the closure of the old problem;
  • Opens a new alarm and assigns a new alarm ID to it;
  • Sends out a fresh email alert to the configured users, notifying them of the new issue.
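
The difference between the default (stateful) behaviour and the stateless sequence listed above can be modelled with a small Python sketch. The class, its fields, and the message formats are hypothetical and greatly simplified (for instance, thresholds and time windows are ignored); the sketch only mirrors the four steps above and is not the eG manager's implementation.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional


@dataclass
class AlarmModel:
    """Toy model of alerting for a single measure."""
    stateless: bool
    open_alarm_id: Optional[int] = None
    emails: List[str] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def report_problem(self, details: str) -> None:
        if self.open_alarm_id is not None:
            if not self.stateless:
                # Stateful: the measure is already CRITICAL, so the state
                # does not change and no new email is sent.
                return
            # Stateless: close the pre-existing alarm and announce the closure.
            self.emails.append(f"NORMAL: alarm {self.open_alarm_id} closed")
            self.open_alarm_id = None
        # Open a new alarm with a new ID and send a fresh alert.
        self.open_alarm_id = next(self._ids)
        self.emails.append(f"CRITICAL: alarm {self.open_alarm_id}: {details}")


stateful = AlarmModel(stateless=False)
stateless = AlarmModel(stateless=True)
for model in (stateful, stateless):
    model.report_problem("disk error")      # first measurement period
    model.report_problem("service crash")   # next measurement period
```

In this model the stateful instance ends up with a single email for the first error, while the stateless instance records three: the first alert, the closure of alarm 1, and a new alert for alarm 2 that carries the details of the second error.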

In a redundant manager setup, the secondary manager automatically downloads the updated eg_specs.ini file from the primary manager, and determines whether the stateless alerting capability has been enabled for any of the tests reporting metrics to it. If so, every time a threshold violation is detected by such a test, the secondary manager will perform the tasks discussed above for the problem reported by that test. Similarly, the primary manager will check whether the stateless alert flag has been switched on for any of the tests reporting to it, and if so, will automatically perform the above-mentioned tasks whenever those tests report a deviation from the norm.

Note:

  • Since alerts will be closed after every measurement period, alarm escalation will no longer be relevant for tests that have statelessAlerts set to yes.
  • For tests with statelessAlerts set to yes, statelessAlerts will apply for all measurements of that test (i.e., it will not be possible to only have one of the measurements with stateless alerts and others without).
  • If statelessAlerts is set to yes for a test, an alarm will be opened during one measurement period (if a threshold violation happens) and will be closed prior to the next measurement period. This way, if a threshold violation happens in successive measurement periods, there will be one alarm per measurement period. This will be reflected in all the corresponding places in the eG Enterprise system. For example, multiple alerts in successive measurement periods will result in multiple trouble tickets being opened (one for each measurement period). Likewise, the alarm history will also show alarms being opened during a measurement period and closed during the next measurement period.