- TEST PERIOD - How often should the test be executed
- Host - The host for which the test is to be configured
- Port - The port on which the specified host listens
- Process - In the Process text box, enter a comma-separated list of processName:processPattern pairs that identify the process(es) associated with the server being considered. processName is a string that will be used for display purposes only. processPattern is an expression of the form *expr*, expr, *expr, expr*, *expr1*expr2*..., expr1*expr2, and so on. A leading '*' signifies any number of leading characters, while a trailing '*' signifies any number of trailing characters. The pattern(s) used vary from one application to another and must be configured per application. For example, for an iPlanet application server (Nas_server), there are three processes named kcs, kjs, and kxs associated with the application server. For this server type, in the Process text box, enter "kcsProcess:*kcs*, kjsProcess:*kjs*, kxsProcess:*kxs*", where * denotes zero or more characters. Other special characters such as slashes (\) can also be used while defining the process pattern. For example, if a server’s root directory is /home/egurkha/apache and the server executable named httpd exists in the bin directory, then the process pattern is “*/home/egurkha/apache/bin/httpd*”.
The process parameter supports process patterns containing the ~ character.
To determine the process pattern to use for your application, on Windows environments, look for the process name(s) in the Task Manager -> Processes selection. To determine the process pattern to use on Unix environments, use the ps command (e.g., the command "ps -e -o pid,args" can be used to determine the processes running on the target system; from this, choose the processes of interest to you.)
Also, while monitoring processes on Windows, if the wide parameter of this test is set to true, then your process patterns can include the full path to the process and/or the arguments supported by the process. For instance, your processpattern specification can be as follows:
Terminal:C:\WINDOWS\System32\svchost -k DcomLaunch,Remote:C:\WINDOWS\system32\svchost.exe -k netsvcs
Also, note that the process parameter is case-sensitive in Unix environments.
To save the time and effort involved in such manual process specification, eG Enterprise offers an easy-to-use auto-configure option in the form of a View/Configure button that is available next to the PROCESS text box. Refer to Auto-configuring the Process Patterns to be Monitored to know how to use the auto-configure option.
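The pattern rules described for the PROCESS parameter can be illustrated with a short Python sketch. This is not the eG agent's implementation. It only assumes that '*' stands for zero or more characters, that a pattern without a leading or trailing '*' is anchored at that end, and that pairs are separated by commas and split at the first colon. The function names and the sample command line /opt/nas/bin/kcs -x are hypothetical.

```python
import re

def parse_process_spec(spec):
    """Split 'name:pattern,name:pattern' into {display name: pattern}."""
    pairs = {}
    for item in spec.split(","):
        # Split at the first colon only, so Windows paths such as C:\... survive.
        name, pattern = item.strip().split(":", 1)
        pairs[name.strip()] = pattern.strip()
    return pairs

def pattern_to_regex(pattern):
    """Translate a pattern in which '*' means zero or more characters."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts), re.DOTALL)

def matches(pattern, command_line):
    # fullmatch anchors both ends: 'expr*' must start with expr,
    # '*expr' must end with it. Matching is case-sensitive.
    return pattern_to_regex(pattern).fullmatch(command_line) is not None

patterns = parse_process_spec("kcsProcess:*kcs*, kjsProcess:*kjs*, kxsProcess:*kxs*")
kcs_running = matches(patterns["kcsProcess"], "/opt/nas/bin/kcs -x")
```

A pattern can be tried out this way against the output of ps -e -o pid,args before it is entered in the Process text box. Note that this sketch would split a pattern at any comma, so it does not handle process arguments that themselves contain commas.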
- user - The USER parameter works only on Unix platforms, not on Windows. By default, this parameter is set to "none", which means that the test does not look for the process(es) of a specific user. If the value of the "user" parameter is not "none", then the Processes test searches for all processes of the specified user.
- CORRECT - Increased uptime and lower mean time to repair are critical to ensuring that IT infrastructures deliver a high quality of service to users. Towards this end, eG Enterprise embeds an optional auto-correction capability that enables eG agents to automatically correct problems in the environment as soon as they occur. With this capability, as and when an abnormal situation is detected, an eG agent can initiate corrective actions automatically to resolve the problem. Automatic correction without the need for manual intervention by IT operations staff reduces service downtime and improves operational efficiency. By default, the auto-correction capability is available in eG Enterprise for the Processes running measure of the Processes test, and the Service availability measure of the WindowsServices test. eG Enterprise includes a default auto-correction script for the Processes test.
When a process that has been configured for monitoring stops, this script automatically executes and starts the process. To enable the auto-correction capability for the Processes test, first, select the TRUE option against the CORRECT parameter in this page (by default, FALSE will be selected here).
- ALARMTYPE - Upon selecting the True option, three new parameters, namely, ALARMTYPE, USERPARAMS, and CORRECTIVESCRIPT will appear. You can set the corrective script to execute when a specific type of alarm is generated, by selecting an option from the ALARMTYPE list box. For example, if the Critical option is chosen from the ALARMTYPE list box, then the corrective script will run only when a critical alarm for the Processes test is generated. Similarly, if the Critical/Major option is chosen, then the corrective script will execute only when the eG Enterprise system generates critical or major alarms for the Processes test. In order to ensure that the corrective script executes regardless of the alarm type, select the Critical/Major/Minor option.
- USERPARAMS - The user-defined parameters that are to be passed to the corrective script are specified in the USERPARAMS text box. One of the following formats can be applied to the USERPARAMS specification:
exec@processName:command: In this specification, processName is the display name of the process pattern specified against the PROCESS parameter, and command is the command to be executed by the default script when the process(es) represented by the processName stops. For example, assume that the PROCESS parameter of Processes test has been configured in the following manner: Apache:*/opt/egurkha/manager/apache/bin/httpd*,Tomcat:*java*tomcat*, where Apache and Tomcat are the processNames or display names of the configured patterns. If auto-correction is enabled for these processes, then the USERPARAMS specification can be as follows:
exec@Apache:/opt/egurkha/manager/apache/bin/apachectl start,Tomcat: /opt/tomcat/bin/catalina.sh start
This indicates that if the processes configured under the processName "Apache" stop (i.e. */opt/egurkha/manager/apache/bin/httpd*), then the script will automatically execute the command "/opt/egurkha/manager/apache/bin/apachectl start" to start the processes. Similarly, if the "Tomcat" processes (i.e. *java*tomcat*) stop, the script will execute the command "/opt/tomcat/bin/catalina.sh start" to start the processes.
command: In this specification, command signifies the command to be executed when any of the processes configured for monitoring stops. Such a format best suits situations where only a single process has been configured for monitoring, or where a single command is capable of starting all the configured processes. For example, assume that the PROCESS parameter has been configured to monitor IISWebSrv:*inetinfo*. Since only one process requires monitoring, the first format need not be used for configuring the USERPARAMS. Therefore, simply specify the command, "net start World Wide Web Publishing Service".
- The USERPARAMS specification should be placed within double quotes if this value includes one or more blank spaces (e.g., "Apache:/opt/egurkha/bin/apachectl start").
- Note that if a processName configured in the PROCESS parameter does not have a corresponding entry in USERPARAMS (as discussed in format 1), then the auto-correction capability will not be enabled for these processes.
- CORRECTIVESCRIPT - Specify none in the CORRECTIVESCRIPT text box to use the default auto-correction script. Administrators can build new auto-correction capabilities to address probable issues with other tests, by writing their own corrective scripts. To know how to create custom auto-correction scripts, refer to the eG User Manual.
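How ALARMTYPE and the two USERPARAMS formats combine can be sketched in Python as follows. This is an illustration under stated assumptions, not the default corrective script shipped with eG Enterprise. The function names are hypothetical, the alarm severity is assumed to arrive as one of the words Critical, Major, or Minor, and commands are assumed not to contain commas.

```python
def parse_userparams(userparams):
    """Return {processName: command} for the exec@ format,
    or {None: command} when a single bare command is given."""
    value = userparams.strip().strip('"')
    if not value.startswith("exec@"):
        return {None: value}
    commands = {}
    for item in value[len("exec@"):].split(","):
        # Split at the first colon: display name on the left, command on the right.
        name, command = item.split(":", 1)
        commands[name.strip()] = command.strip()
    return commands

def alarm_triggers_script(alarmtype, severity):
    """alarmtype is 'Critical', 'Critical/Major' or 'Critical/Major/Minor'."""
    return severity.lower() in [s.lower() for s in alarmtype.split("/")]

def corrective_command(userparams, alarmtype, process_name, severity):
    """Return the command to run, or None if no correction applies."""
    if not alarm_triggers_script(alarmtype, severity):
        return None
    commands = parse_userparams(userparams)
    if None in commands:
        return commands[None]          # one command covers every process
    return commands.get(process_name)  # no entry means no auto-correction

USERPARAMS = ("exec@Apache:/opt/egurkha/manager/apache/bin/apachectl start,"
              "Tomcat: /opt/tomcat/bin/catalina.sh start")
tomcat_fix = corrective_command(USERPARAMS, "Critical/Major", "Tomcat", "Major")
```

With ALARMTYPE set to Critical, the same call for a Major alarm yields no command, which mirrors the rule that the script runs only for the selected alarm types.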
- wide - This parameter is valid on Solaris and Windows systems only.
On Solaris systems (before v11), if the value of the wide parameter is Yes, the eG agent will use /usr/ucb/ps instead of /usr/bin/ps to search for processes executing on the host. In Solaris 11, the eG agent uses the /usr/bin/ps auxwww command to perform the process search. The /usr/ucb/ps and the /usr/bin/ps auxwww commands provide a long output (> 80 characters), whereas /usr/bin/ps only outputs the first 80 characters of the process path and its arguments. However, some Solaris systems are configured with tightened security, which prevents the /usr/ucb/ps and/or the /usr/bin/ps auxwww command from being executed by just any user of the system; in other words, only pre-designated users will be allowed to execute this command. The sudo (superuser do) utility (see http://www.gratisoft.us/sudo/) can be used to allow designated users to execute this command. If your system uses sudo to restrict access to the commands that return a long output, then set wide to Yes and specify the value sudo for the keonizedservercmd parameter. This ensures that the agent not only uses the /usr/ucb/ps and/or the /usr/bin/ps auxwww command (as the case may be) to monitor processes (as it would if the wide parameter were set to Yes), but also uses sudo to execute this command.
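The command selection described above can be summarized in a small Python sketch. It is a reading of this paragraph rather than the agent's actual logic. The function name is hypothetical, and the arguments the agent passes to /usr/ucb/ps are not stated here, so they are left out. The sketch also assumes that Solaris 11 uses /usr/bin/ps auxwww only when wide is Yes, and that a keonizedservercmd value other than none is simply prefixed to the command.

```python
def process_listing_command(solaris_major, wide, keonizedservercmd="none"):
    """Return the process-listing command as an argument list."""
    if wide and solaris_major >= 11:
        command = ["/usr/bin/ps", "auxwww"]   # long output on Solaris 11
    elif wide:
        command = ["/usr/ucb/ps"]             # long output before v11
    else:
        command = ["/usr/bin/ps"]             # output truncated at 80 characters
    if keonizedservercmd.lower() != "none":
        # e.g. "sudo", or the full path to the sudo command
        command = [keonizedservercmd] + command
    return command

solaris11_with_sudo = process_listing_command(11, True, "sudo")
```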
If the Processes test on Solaris 11 fails, then do the following:
- Check whether the wide parameter is set to Yes.
- If so, then make sure that the keonizedservercmd parameter is set to sudo.
If the test still fails, then look for the following error in the error_log file (that resides in the /opt/egurkha/agent/logs directory) on the eG agent host:
ERROR ProcessTest: ProcessTest failed to execute [sudo: pam_authenticate: Conversation failure]
The aforesaid error occurs if the sudo command prompts for a password at runtime. If you find such an error in the error_log file, then open the sudoers file on the target host and append to it a Defaults entry that disables authentication for the eG install user. For instance, if eguser is the eG install user, then your entry will be:
Defaults:eguser !authenticate
This entry will make sure that you are no longer prompted for a password.
Save the file and restart the eG agent.
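The log check in this procedure can be scripted. The following Python sketch assumes only what is stated above: the error text, the error_log location, and the form of the sudoers entry. The function names are hypothetical, and the sketch merely reports what to add; it does not edit the sudoers file.

```python
PAM_ERROR = "sudo: pam_authenticate: Conversation failure"

def sudo_prompted_for_password(log_lines):
    """True if any ProcessTest error line carries the sudo password-prompt failure."""
    return any("ProcessTest" in line and PAM_ERROR in line for line in log_lines)

def sudoers_entry(install_user):
    """Build the sudoers line that stops sudo from prompting this user."""
    return f"Defaults:{install_user} !authenticate"

sample_log = [
    "ERROR ProcessTest: ProcessTest failed to execute "
    "[sudo: pam_authenticate: Conversation failure]",
]
if sudo_prompted_for_password(sample_log):
    suggestion = sudoers_entry("eguser")

# On an agent host, the lines would come from the real log instead, e.g.:
# with open("/opt/egurkha/agent/logs/error_log") as handle:
#     found = sudo_prompted_for_password(handle)
```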
On Windows environments, by default, the eG agent uses perfmon to search for the processes that match the configured patterns. Accordingly, the wide parameter is set to false by default. Typically, a process definition in Windows includes the full path to the process, the process name, and process arguments (if any). Perfmon however scans the system only for process names that match the configured patterns – in other words, the process path and arguments are ignored by perfmon. This implies that if multiple processes on a Windows host have the same name as specified against processpattern, then perfmon will only be able to report the overall resource usage across all these processes; it will not provide any pointers to the exact process that is eroding the host’s resources. To understand this better, consider the following example. Typically, Windows represents any Java application executing on it as java.exe. Say, two Java applications are executing on a Windows host, but from different locations.
If java.exe has been configured for monitoring, then by default, perfmon will report the availability and average resource usage of both the Java applications executing on the host. If say, one Java application goes down, then perfmon will not be able to indicate accurately which of the two Java applications is currently inaccessible. Therefore, to enable administrators to easily differentiate between processes with the same name, and to accurately determine which process is unavailable or resource-hungry, the eG agent should be configured to perform its process searches based on the process path and/or process arguments, and not just on the process name – in other words, the eG agent should be configured not to use perfmon.
To achieve this, first set the wide parameter to Yes. This will instruct the eG agent to not use perfmon to search for the configured process patterns. Once this is done, you can proceed to configure a processpattern that includes the process arguments and/or the process path, in addition to the process name. For instance, if both the Remote Access Connection Manager service and the Terminal Services service on a Windows host, which share the same process name (svchost), are to be monitored as two different processes, then your processpattern specification should be as follows:
Terminal:C:\WINDOWS\System32\svchost -k DcomLaunch,Remote:C:\WINDOWS\system32\svchost.exe -k netsvcs
You can also use wildcard characters, wherever required. For instance, in the above case, your processpattern can also be:
Terminal:*svchost -k DcomLaunch,Remote:*svchost.exe -k netsvcs
Similarly, to distinctly monitor two processes having the same name, but operating from different locations, your specification can be:
- Before including process paths and/or arguments in your processpattern configuration, make sure that the wide parameter is set to Yes. If not, the test will not work.
- If your processpattern configuration includes a process path that refers to the Program Files directory, then make sure that you do not include a ~ (tilde) while specifying this directory name. For instance, your processpattern specification should not be, say, Adobe:C:\Progra~1\Adobe\AcroRd32.exe.
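The difference between name-only matching and matching on the full command line can be seen in a short Python sketch. It does not use perfmon or any eG API. It applies the '*' wildcard rule to the two svchost command lines from the example above, and the helper names are hypothetical.

```python
import ntpath
import re
from collections import Counter

def matches(pattern, text):
    """'*' means zero or more characters; everything else is literal."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, re.DOTALL) is not None

def image_name(command_line):
    """Process name without path, arguments or .exe (the name-only view)."""
    name = ntpath.basename(command_line.split(" ", 1)[0])
    return name[:-4] if name.lower().endswith(".exe") else name

running = [
    r"C:\WINDOWS\System32\svchost -k DcomLaunch",
    r"C:\WINDOWS\system32\svchost.exe -k netsvcs",
]
patterns = {
    "Terminal": "*svchost -k DcomLaunch",
    "Remote": "*svchost.exe -k netsvcs",
}

# Name-only view: both processes collapse into a single 'svchost' entry.
by_name = Counter(image_name(c) for c in running)

# Full command-line view: each display name matches exactly one process.
by_pattern = {name: sum(matches(p, c) for c in running)
              for name, p in patterns.items()}
```

In the name-only view the two services cannot be told apart, which is why path and argument patterns require the wide parameter to be set to Yes.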
- keonizedservercmd - On Solaris hosts, this test takes an additional KEONIZEDSERVERCMD parameter. Keon is a security mechanism that can be used with a multitude of operating systems to provide a centralized base for user account and password management, user access and inactivity control, system integrity checking, and auditing. If the Keon security model is in use on the Solaris host being monitored, then this test may require special user privileges for executing the operating system commands. In such a case, specify the exact command that the test is permitted to execute in the KEONIZEDSERVERCMD text box. For example, if the Keon command to be executed by the test is sudo, specify sudo in the KEONIZEDSERVERCMD text box. Alternatively, you can specify the full path to the sudo command in the KEONIZEDSERVERCMD text box. On the other hand, if a Keon security model is not in place, then set the KEONIZEDSERVERCMD parameter to none.
- useps - This flag is applicable only for AIX LPARs. By default, on AIX LPARs, this test uses the tprof command to compute CPU usage of the processes on the LPARs. Accordingly, the useps flag is set to No by default. On some AIX LPARs however, the tprof command may not function properly (this is an AIX issue). While monitoring such AIX LPARs therefore, you can configure the test to use the ps command instead for metrics collection. To do so, set the useps flag to Yes.
Alternatively, you can set the AIXusePS flag in the [AGENT_SETTINGS] section of the eg_tests.ini file (in the <EG_INSTALL_DIR>\manager\config directory) to yes (default: no) to enable the eG agent to use the ps command for CPU usage computations on AIX LPARs. If this global flag and the useps flag for a specific component are both set to no, then the test will use the default tprof command to compute CPU usage of processes executing on AIX LPARs. If either of these flags is set to yes, then the ps command will perform the CPU usage computations for such processes.
In some high-security environments, the tprof command may require special privileges to execute on an AIX LPAR (e.g., sudo may need to be used to run tprof). In such cases, you can prefix the tprof command with another command (like sudo) or the full path to a script that grants the required privileges to tprof. To achieve this, edit the eg_tests.ini file (in the <EG_INSTALL_DIR>\manager\config directory), and provide the prefix of your choice against the AixTprofPrefix parameter in the [AGENT_SETTINGS] section. Finally, save the file. For instance, if you set the AixTprofPrefix parameter to sudo, then the eG agent will call the tprof command as sudo tprof.
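The interaction between the useps flag, the global AIXusePS flag, and AixTprofPrefix can be sketched in Python as follows. This is a minimal illustration, not the agent's code. The function name and the sample file content are hypothetical, and it is an assumption that the relevant part of eg_tests.ini can be read with Python's configparser.

```python
import configparser

# Hypothetical excerpt of eg_tests.ini, limited to the settings discussed here.
SAMPLE_INI = """
[AGENT_SETTINGS]
AIXusePS=no
AixTprofPrefix=sudo
"""

def cpu_usage_command(ini_text, useps_flag):
    """Return the command used for CPU computations, as an argument list."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    global_flag = config.get("AGENT_SETTINGS", "AIXusePS", fallback="no")
    prefix = config.get("AGENT_SETTINGS", "AixTprofPrefix", fallback="").strip()
    # Either flag set to yes switches the computation to ps.
    if "yes" in (global_flag.strip().lower(), useps_flag.strip().lower()):
        return ["ps"]
    # Both flags are no: use tprof, with the optional prefix in front.
    return ([prefix] if prefix else []) + ["tprof"]

default_command = cpu_usage_command(SAMPLE_INI, "No")
```

With both flags set to no and the prefix set to sudo, the sketch yields sudo tprof, matching the example in the text.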
- ISPASSIVE - If the value chosen is Yes, then the server under consideration is a passive server in a cluster. No alerts will be generated if the server is not running. Measures will be reported as “Not applicable” by the agent if the server is not up.