Processes Test

Application processes can be identified based on specific regular expression patterns. For example, web server processes can be identified by the pattern *httpd*, while DNS server processes can be specified by the pattern *named* where * denotes zero or more characters. For each such pattern, the process test reports a variety of CPU and memory statistics.

Target of the test : Any application server

Agent deploying the test : An internal agent

Outputs of the test : One set of results per process pattern specified

Parameter Description

Test Period

How often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port at which the specified host listens.

Process

In the Process text box, enter a comma-separated list of processName:processPattern pairs that identify the process(es) associated with the server being considered. processName is a string that will be used for display purposes only. processPattern is an expression of the form *expr*, expr, *expr, expr*, *expr1*expr2*..., expr1*expr2, etc. A leading '*' signifies any number of leading characters, while a trailing '*' signifies any number of trailing characters. The pattern(s) used vary from one application to another and must be configured per application. For example, for an iPlanet application server (Nas_server), there are three processes named kcs, kjs, and kxs associated with the application server. For this server type, in the Process text box, enter "kcsProcess:*kcs*, kjsProcess:*kjs*, kxsProcess:*kxs*", where * denotes zero or more characters. Other special characters such as slashes (/ or \) can also be used while defining the process pattern. For example, if a server’s root directory is /home/egurkha/apache and the server executable named httpd exists in the bin directory, then the process pattern is “*/home/egurkha/apache/bin/httpd*”.
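The wildcard semantics described above can be illustrated with a short Python sketch. This is not the eG agent's implementation; it only assumes that '*' stands for zero or more characters and that a pattern must account for the whole process string. The function names and the sample command line /opt/nas/bin/kjs are hypothetical.

```python
import re


def parse_process_spec(spec):
    """Split 'name:pattern,name:pattern' into (name, pattern) pairs."""
    pairs = []
    for item in spec.split(","):
        # Split on the first colon only, since a pattern may itself contain ':'.
        name, pattern = item.strip().split(":", 1)
        pairs.append((name.strip(), pattern.strip()))
    return pairs


def pattern_to_regex(pattern):
    """Translate a '*' wildcard pattern into a regular expression."""
    # Escape every literal piece, then join the pieces with '.*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body)


def matching_names(spec, command_line):
    """Return the display names whose pattern matches the whole command line."""
    return [
        name
        for name, pattern in parse_process_spec(spec)
        if pattern_to_regex(pattern).fullmatch(command_line)
    ]


spec = "kcsProcess:*kcs*, kjsProcess:*kjs*, kxsProcess:*kxs*"
# Hypothetical command line, used only for illustration.
print(matching_names(spec, "/opt/nas/bin/kjs -engine 1"))
```

Under these assumptions, a pattern without a leading '*' (for example, httpd) would not match /usr/sbin/httpd, which is why path-qualified processes are usually configured as *httpd*.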

Note:

The process parameter supports process patterns containing the ~ character.

To determine the process pattern to use for your application, on Windows environments, look for the process name(s) in the Task Manager -> Processes selection. To determine the process pattern to use on Unix environments, use the ps command (e.g., the command "ps -e -o pid,args" can be used to determine the processes running on the target system; from this, choose the processes of interest to you.)

Also, while monitoring processes on Windows, if the wide parameter of this test is set to true, then your process patterns can include the full path to the process and/or the arguments supported by the process. For instance, your processpattern specification can be as follows:

Terminal:C:\WINDOWS\System32\svchost -k DcomLaunch,Remote:C:\WINDOWS\system32\svchost.exe -k netsvcs

To save the time and effort involved in such manual process specification, eG Enterprise offers an easy-to-use auto-configure option in the form of a View/Configure button that is available next to the PROCESS text box. Refer to Processes Test to know how to use the auto-configure option.

Ignore case

This parameter is applicable to Unix environments alone. By default, this parameter is set to Yes, indicating that the test will monitor the process names/patterns configured against the process parameter in a case-insensitive manner. In other words, the test will report the count and resource usage of all processes that match the configured process name/pattern, even if their cases do not match. For instance, if the process parameter is configured with Apache:*apache*, then the test will monitor the process named apache and the one named APACHE by default. If, on the other hand, you want process monitoring to be performed in a case-sensitive manner, then set this flag to No.
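The effect of this flag can be sketched in Python as follows. The function name and the mapping of Yes/No to a boolean are assumptions made for illustration; the sketch does not describe how the agent is implemented.

```python
import re


def matches(pattern, process_name, ignore_case=True):
    """Match a '*' wildcard pattern against a process name.

    ignore_case mirrors the Yes/No setting described above
    (assumed mapping: Yes -> True, No -> False).
    """
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(regex, process_name, flags) is not None


print(matches("*apache*", "APACHE"))                     # True
print(matches("*apache*", "APACHE", ignore_case=False))  # False
```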

User

By default, this parameter has a value "none"; this means that the test monitors all processes that match the configured patterns, regardless of the user executing them. If you want the test to monitor the processes for specific users alone, then, on Unix platforms, specify a comma-separated list of users to be monitored in the user text box. For instance: john,elvis,sydney

While monitoring Windows hosts, on the other hand, your user configuration should be a comma-separated list of "domain name-user name" pairs, where every pair is expressed in the following format: Domainname\Username. For example, to monitor the processes of users john and elvis, who belong to domain mas, your user specification should be: mas\john,mas\elvis. Also, on a Windows host, you will find system processes running under the following user accounts: SYSTEM, LOCAL SERVICE, and NETWORK SERVICE. While configuring these user accounts, make sure the Domainname is always NT AUTHORITY. In this case, therefore, your user specification will be: NT AUTHORITY\SYSTEM,NT AUTHORITY\LOCAL SERVICE,NT AUTHORITY\NETWORK SERVICE.

If multiple processes are configured for monitoring and multiple users are also configured, then the test will check whether the first process is run by the first user, the second process by the second user, and so on. For instance, if the processes configured are java:java.exe,apache:*httpd* and the users configured are john,elvis, then the test will check whether user john is running the process java, and user elvis is running the process apache. Similarly, if multiple processes are configured, but a single user alone is configured, then the test will check whether the specified user runs each of the configured processes. However, if you want to check whether a single process, say java.exe, is run by multiple users - say, james and jane - then, you have to do the following:

  • Your user specification should be: james,jane (if the target host is a Unix host), or <Domainname>\james,<Domainname>\jane (if the target host is a Windows host)

  • Your process configuration should be: Process1:java.exe,Process2:java.exe. The number of processes in this case should match the number of users.

  • Such a configuration will ensure that the test checks for the java.exe process for both the users, james and jane.  
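The pairing rules above (first process with first user, a single user applied to every process, and none meaning any user) can be sketched in Python. The function name is hypothetical, and the behaviour on a count mismatch is an assumption, since the text does not describe that case.

```python
def pair_processes_with_users(process_spec, user_spec="none"):
    """Return (display_name, pattern, user) triples; user is None for 'any user'."""
    processes = []
    for item in process_spec.split(","):
        # Split on the first colon only; patterns may contain ':' (e.g. C:\...).
        name, pattern = item.strip().split(":", 1)
        processes.append((name, pattern))

    if user_spec.strip().lower() == "none":
        users = [None] * len(processes)
    else:
        users = [u.strip() for u in user_spec.split(",")]
        if len(users) == 1:
            # A single user is checked against every configured process.
            users = users * len(processes)

    if len(users) != len(processes):
        # Assumption: the source does not say what happens on a mismatch.
        raise ValueError("number of users must be 1 or match the number of processes")

    return [(name, pattern, user) for (name, pattern), user in zip(processes, users)]


# One process, two users: repeat the process once per user.
print(pair_processes_with_users("Process1:java.exe,Process2:java.exe", "james,jane"))
```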

Correct

Increased uptime and lower mean time to repair are critical to ensuring that IT infrastructures deliver a high quality of service to users. Towards this end, the eG Enterprise embeds an optional auto-correction capability that enables eG agents to automatically correct problems in the environment, as soon as they occur. With this capability, as and when an abnormal situation is detected, an eG agent can initiate corrective actions automatically to resolve the problem. Automatic correction without the need for manual intervention by IT operations staff reduces service downtime and improves operational efficiency. By default, the auto-correction capability is available in the eG Enterprise for the Processes running measure of Processes test, and the Service availability measure of WindowsServices test. The eG Enterprise includes a default auto-correction script for Processes test.

When a process that has been configured for monitoring stops, this script automatically executes and starts the process. To enable the auto-correction capability for the Processes test, first, select the TRUE option against the CORRECT parameter in this page (by default, FALSE will be selected here).

ALARMTYPE

Upon selecting the true option, three new parameters, namely, ALARMTYPE, USERPARAMS, and CORRECTIVESCRIPT will appear. You can set the corrective script to execute when a specific type of alarm is generated, by selecting an option from the ALARMTYPE list box. For example, if the Critical option is chosen from the ALARMTYPE list box, then the corrective script will run only when a critical alarm for the Processes test is generated. Similarly, if the Critical/Major option is chosen, then the corrective script will execute only when the eG Enterprise system generates critical or major alarms for the Processes test. In order to ensure that the corrective script executes regardless of the alarm type, select the Critical/Major/Minor option.
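As a rough illustration of how the ALARMTYPE selection gates the corrective script, the following Python sketch treats the chosen option as a slash-separated set of severities. The function name is hypothetical and the logic is only an assumption about the behaviour described above.

```python
def script_should_run(alarmtype_option, alarm_severity):
    """Return True if an alarm of this severity triggers the corrective script."""
    allowed = {part.strip().lower() for part in alarmtype_option.split("/")}
    return alarm_severity.strip().lower() in allowed


print(script_should_run("Critical/Major", "Major"))  # True
print(script_should_run("Critical/Major", "Minor"))  # False
```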

USERPARAMS

The user-defined parameters that are to be passed to the corrective script are specified in the USERPARAMS text box. One of the following formats can be applied to the USERPARAMS specification:

exec@processName:command: In this specification, processName is the display name of the process pattern specified against the PROCESS parameter, and command is the command to be executed by the default script when the process(es) represented by the processName stops. For example, assume that the PROCESS parameter of Processes test has been configured in the following manner: Apache:*/opt/egurkha/manager/apache/bin/httpd*,Tomcat:*java*tomcat*, where Apache and Tomcat are the processNames or display names of the configured patterns. If auto-correction is enabled for these processes, then the USERPARAMS specification can be as follows:

exec@Apache:/opt/egurkha/manager/apache/bin/apachectl start,Tomcat: /opt/tomcat/bin/catalina.sh start

This indicates that if the processes configured under the processName "Apache" stop (i.e. */opt/egurkha/manager/apache/bin/httpd*), then the script will automatically execute the command "/opt/egurkha/manager/apache/bin/apachectl start" to start the processes. Similarly, if the "Tomcat" processes (i.e. *java*tomcat*) stop, the script will execute the command "/opt/tomcat/bin/catalina.sh start" to start the processes.

command: In this specification, command signifies the command to be executed when any of the processes configured for monitoring stops. Such a format best suits situations where only a single process has been configured for monitoring, or a single command is capable of starting all the configured processes. For example, assume that the PROCESS parameter has been configured to monitor IISWebSrv:*inetinfo*. Since only one process requires monitoring, the first format need not be used for configuring the USERPARAMS. Therefore, simply specify the command, "net start World Wide Web Publishing Service".

Note:

  • The USERPARAMS specification should be placed within double quotes if this value includes one or more blank spaces (e.g., "Apache:/opt/egurkha/bin/apachectl start").

  • Note that if a processName configured in the PROCESS parameter does not have a corresponding entry in USERPARAMS (as discussed in format 1), then the auto-correction capability will not be enabled for these processes.
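The two USERPARAMS formats can be illustrated with a small Python parser. This is a sketch of the syntax only, not the default corrective script; the function names are hypothetical, and it assumes that commands themselves contain no commas.

```python
def parse_userparams(userparams):
    """Return {processName: command} for format 1, or {None: command} for format 2."""
    value = userparams.strip().strip('"')
    if not value.startswith("exec@"):
        # Format 2: one command for every monitored process.
        return {None: value}

    commands = {}
    # Assumption: commands contain no commas, so ',' separates entries.
    for entry in value[len("exec@"):].split(","):
        name, command = entry.split(":", 1)
        commands[name.strip()] = command.strip()
    return commands


def command_for(userparams, process_name):
    """Look up the restart command for a stopped process, or None if not configured."""
    commands = parse_userparams(userparams)
    if None in commands:
        return commands[None]
    return commands.get(process_name)


params = ("exec@Apache:/opt/egurkha/manager/apache/bin/apachectl start,"
          "Tomcat: /opt/tomcat/bin/catalina.sh start")
print(command_for(params, "Tomcat"))
```

A processName with no entry in a format 1 specification yields None here, which corresponds to auto-correction not being enabled for that process.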

Corrective Script

Specify none in the CORRECTIVESCRIPT text box to use the default auto-correction script. Administrators can build new auto-correction capabilities to address probable issues with other tests, by writing their own corrective scripts. To know how to create custom auto-correction scripts, refer to the eG User Manual.

Wide

This parameter is valid on Solaris, Windows, and Linux systems only.

On Solaris systems (before v11), if the value of the wide parameter is Yes, the eG agent will use /usr/ucb/ps instead of /usr/bin/ps to search for processes executing on the host. In Solaris 11, the eG agent uses the /usr/bin/ps auxwww command to perform the process search. The /usr/ucb/ps and the /usr/bin/ps auxwww commands provide a long output (> 80 characters), whereas /usr/bin/ps only outputs the first 80 characters of the process path and its arguments. However, some Solaris systems are configured with tightened security, which prevents the /usr/ucb/ps and/or the /usr/bin/ps auxwww command from being executed by any and every user of the system - in other words, only pre-designated users will be allowed to execute this command. The sudo (superuser do) utility (see http://www.gratisoft.us/sudo/) can be used to allow designated users to execute this command. If your system uses sudo to restrict access to the commands that return a long output, then set wide to Yes and then specify the value sudo for the keonizedservercmd parameter. This will ensure that the agent not only uses the /usr/ucb/ps and/or the /usr/bin/ps auxwww command (as the case may be) to monitor processes (as it would if the wide parameter were set to Yes), but also uses sudo to execute this command.

Note:

If the Processes test on Solaris 11 fails, then do the following:

  • Check whether the wide parameter is set to Yes.

  • If so, then make sure that the keonizedservercmd parameter is set to sudo.

  • If the test still fails, then look for the following error in the error_log file (that resides in the /opt/egurkha/agent/logs directory) on the eG agent host:

    ERROR ProcessTest: ProcessTest failed to execute [sudo: pam_authenticate: Conversation failure]

  • The aforesaid error occurs if the sudo command prompts for a password at runtime. If you find such an error in the error_log file, then, open the sudoers file on the target host and append an entry of the following format to it:

    Defaults:<eG_Install_Username> !authenticate

    For instance, if eguser is the eG install user, then your entry will be: Defaults:eguser !authenticate

    This entry will make sure that you are no longer prompted for a password.

  • Save the file and restart the eG agent.

On Windows environments, by default, the eG agent uses perfmon to search for the processes that match the configured patterns. Accordingly, the wide parameter is set to false by default. Typically, a process definition in Windows includes the full path to the process, the process name, and process arguments (if any). Perfmon however scans the system only for process names that match the configured patterns – in other words, the process path and arguments are ignored by perfmon. This implies that if multiple processes on a Windows host have the same name as specified against processpattern, then perfmon will only be able to report the overall resource usage across all these processes; it will not provide any pointers to the exact process that is eroding the host’s resources. To understand this better, consider the following example. Typically, Windows represents any Java application executing on it as java.exe. Say, two Java applications are executing on a Windows host, but from different locations.

If java.exe has been configured for monitoring, then by default, perfmon will report the availability and average resource usage of both the Java applications executing on the host. If say, one Java application goes down, then perfmon will not be able to indicate accurately which of the two Java applications is currently inaccessible. Therefore, to enable administrators to easily differentiate between processes with the same name, and to accurately determine which process is unavailable or resource-hungry, the eG agent should be configured to perform its process searches based on the process path and/or process arguments, and not just on the process name – in other words, the eG agent should be configured not to use perfmon.

To achieve this, first, set the wide parameter to Yes. This will instruct the eG agent to not use perfmon to search for the configured process patterns. Once this is done, you can proceed to configure a processpattern that includes the process arguments and/or the process path, in addition to the process name. For instance, if both the Remote Access Connection Manager service and the Terminal Services service on a Windows host, which share the same name (svchost), are to be monitored as two different processes, then your processpattern specification should be as follows:

Terminal:C:\WINDOWS\System32\svchost -k DcomLaunch,Remote:C:\WINDOWS\system32\svchost.exe -k netsvcs 

You can also use wildcard characters, wherever required. For instance, in the above case, your processpattern can also be:

Terminal:*svchost -k DcomLaunch,Remote:*svchost.exe -k netsvcs

Similarly, to distinctly monitor two processes having the same name, but operating from different locations, your specification can be:

JavaC:c:\javaapp\java.exe,JavaD:d:\app\java.exe

Note:

  • Before including process paths and/or arguments in your processpattern configuration, make sure that the wide parameter is set to Yes. If not, the test will not work.

  • If your processpattern configuration includes a process path that refers to the Program Files directory, then make sure that you do not include a ~ (tilde) while specifying this directory name. For instance, your processpattern specification should not be, say, Adobe:C:\Progra~1\Adobe\AcroRd32.exe.
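The difference between name-only matching and matching on the full path and arguments can be sketched in Python. This simulates the behaviour described above and does not call perfmon or any Windows API; the function names are hypothetical, and case-insensitive comparison is an assumption made because the examples above mix System32 and system32.

```python
import re
from collections import Counter


def wildcard_match(pattern, text):
    """Case-insensitive '*' wildcard match against the whole text."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, text, re.IGNORECASE) is not None


def count_matches(spec, running, wide):
    """Count running processes per display name.

    running: list of (image_name, full_command_line) tuples.
    wide=False compares patterns with the image name only;
    wide=True compares them with the full command line.
    """
    counts = Counter()
    for item in spec.split(","):
        # Split on the first colon only; the pattern may contain a drive letter.
        name, pattern = item.split(":", 1)
        for image_name, command_line in running:
            target = command_line if wide else image_name
            if wildcard_match(pattern, target):
                counts[name] += 1
    return dict(counts)


running = [
    ("java.exe", r"c:\javaapp\java.exe"),
    ("java.exe", r"d:\app\java.exe"),
]
# Name-only matching lumps both Java applications together.
print(count_matches("Java:java.exe", running, wide=False))
# Path-based matching reports them separately.
print(count_matches(r"JavaC:c:\javaapp\java.exe,JavaD:d:\app\java.exe", running, wide=True))
```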

Keonized Server cmd

On Solaris hosts, this test takes an additional KEONIZEDSERVERCMD parameter. Keon is a security mechanism that can be used with a multitude of operating systems to provide a centralized base for user account and password management, user access and inactivity control, system integrity checking, and auditing. If the Keon security model is in use on the Solaris host being monitored, then this test may require special user privileges for executing the operating system commands. In such a case, specify the exact command that the test is permitted to execute, in the KEONIZEDSERVERCMD text box. For example, if the keon command to be executed by the test is sudo, specify sudo in the KEONIZEDSERVERCMD text box. Alternatively, you can even specify the full path to the sudo command in the KEONIZEDSERVERCMD text box. On the other hand, if a Keon security model is not in place, then set the KEONIZEDSERVERCMD parameter to none.

Useglance

This flag applies only to HP-UX systems. HP GlancePlus/UX is Hewlett-Packard’s online performance monitoring and diagnostic utility for HP-UX based computers. There are two user interfaces of GlancePlus/UX: Glance is character-based, and gpm is Motif-based. Each contains graphical and tabular displays that depict how primary system resources are being utilized. In environments where Glance is run, the eG agent can be configured to integrate with Glance to pull out the process status and resource usage metrics from the HP-UX systems that are being monitored. By default, this integration is disabled, which is why the useglance flag is set to No by default. You can enable the integration by setting the flag to Yes. If this is done, then the test polls the Glance interface of the HP GlancePlus/UX utility to pull out the desired metrics.

Useps

This flag is applicable only for AIX LPARs. By default, on AIX LPARs, this test uses the tprof command to compute CPU usage of the processes on the LPARs. Accordingly, the useps flag is set to No by default. On some AIX LPARs however, the tprof command may not function properly (this is an AIX issue). While monitoring such AIX LPARs therefore, you can configure the test to use the ps command instead for metrics collection. To do so, set the useps flag to Yes.

Note:

Alternatively, you can set the AIXusePS flag in the [AGENT_SETTINGS] section of the eg_tests.ini file (in the <EG_INSTALL_DIR>\manager\config directory) to yes (default: no) to enable the eG agent to use the ps command for CPU usage computations on AIX LPARs. If this global flag and the useps flag for a specific component are both set to no, then the test will use the default tprof command to compute CPU usage of processes executing on AIX LPARs. If either of these flags is set to yes, then the ps command will perform the CPU usage computations for such processes.  

In some high-security environments, the tprof command may require some special privileges to execute on an AIX LPAR (e.g., sudo may need to be used to run tprof). In such cases, you can prefix the tprof command with another command (like sudo) or the full path to a script that grants the required privileges to tprof. To achieve this, edit the eg_tests.ini file (in the <EG_INSTALL_DIR>\manager\config directory), and provide the prefix of your choice against the AixTprofPrefix parameter in the [AGENT_SETTINGS] section. Finally, save the file. For instance, if you set the AixTprofPrefix parameter to sudo, then the eG agent will call the tprof command as sudo tprof.
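The way the two flags and the prefix combine can be summarized in a short Python sketch. The function and argument names are hypothetical, an empty prefix is assumed to mean "no prefix", and only the command names are returned, since the actual arguments the agent passes to ps and tprof are not documented here.

```python
def cpu_usage_command(global_aix_use_ps="no", useps="No", tprof_prefix=""):
    """Return the command name (with any prefix) used for CPU usage collection."""
    # ps is used if either the global AIXusePS flag or the per-test useps flag is yes.
    use_ps = "yes" in (global_aix_use_ps.strip().lower(), useps.strip().lower())
    if use_ps:
        return "ps"
    # Otherwise tprof is used, optionally prefixed (e.g. with sudo).
    prefix = tprof_prefix.strip()
    return f"{prefix} tprof" if prefix else "tprof"


print(cpu_usage_command())                     # tprof
print(cpu_usage_command(useps="Yes"))          # ps
print(cpu_usage_command(tprof_prefix="sudo"))  # sudo tprof
```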

ISPASSIVE

If the value chosen is Yes, then the server under consideration is a passive server in a cluster. No alerts will be generated if the server is not running. Measures will be reported as "Not applicable" by the agent if the server is not up.

Measurements made by the test

Measurement: Processes running
Description: Number of instances of a process(es) currently executing on a host.
Measurement Unit: Number
Interpretation: This value indicates if too many or too few processes corresponding to an application are executing on the host.

Measurement: CPU utilization
Description: Percentage of CPU used by executing process(es) corresponding to the pattern specified.
Measurement Unit: Percent
Interpretation: A very high value could indicate that processes corresponding to the specified pattern are consuming excessive CPU resources.

Measurement: Memory utilization
Description: For one or more processes corresponding to a specified set of patterns, this value represents the ratio of the resident set size of the processes to the physical memory of the host system, expressed as a percentage.
Measurement Unit: Percent
Interpretation: A sudden increase in memory utilization for a process(es) may be indicative of memory leaks in the application.
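
The Memory utilization measure can be illustrated with a small worked computation in Python. The sketch assumes that the resident set sizes of all matching processes are summed before being divided by physical memory; the function name and the sample figures are hypothetical.

```python
def memory_utilization_percent(rss_kb_per_process, physical_memory_kb):
    """Combined resident set size as a percentage of physical memory."""
    if physical_memory_kb <= 0:
        raise ValueError("physical memory must be positive")
    return 100.0 * sum(rss_kb_per_process) / physical_memory_kb


# Three matching processes on a host with 8 GiB (8 * 1024 * 1024 KB) of RAM.
print(memory_utilization_percent([204800, 102400, 102400], 8 * 1024 * 1024))
```

With these sample values, the three processes hold 400 MiB in total, which works out to a little under 5 percent of the host's physical memory.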