Cluster Resources Status - Linux Test

A cluster resource is a resource that must remain highly available to the business. Cluster resources can be either moved or replicated to one or more nodes within a cluster. A resource group is a convenient way of keeping related resources together, and it typically consists of more than one resource. The resources in a group need to be located together, start sequentially, and stop in the reverse order. These cluster resources make the service highly available across all the nodes in the cluster. A Linux cluster can maintain multiple resource groups. A resource can be migrated from one resource group to another, but it cannot exist in more than one group at a time. Often, the failure of a resource group makes the resources in that group inaccessible. Moreover, if a resource is inaccessible or stopped, users accessing the resource may face unexpected delays. To avoid this, it is necessary to monitor the state of the resources round the clock. The Cluster Resources Status - Linux test helps administrators in this regard.
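The start and stop ordering of a resource group described above can be illustrated with a minimal Python sketch. The group and its member names (virtual_ip, shared_fs, web_server) are hypothetical and are not taken from any real cluster configuration; the functions only model the ordering rule and do not talk to a cluster.

```python
from typing import List


def start_order(group: List[str]) -> List[str]:
    """Members start sequentially, in the order they are listed in the group."""
    return list(group)


def stop_order(group: List[str]) -> List[str]:
    """Members stop in the reverse of the start order."""
    return list(reversed(group))


# Hypothetical resource group: an IP address, then a filesystem, then a web server.
web_group = ["virtual_ip", "shared_fs", "web_server"]

print("start:", start_order(web_group))
print("stop: ", stop_order(web_group))
```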

This test auto-discovers the resources in the target cluster and reports the current status of each resource. Using this test, administrators can identify the resources that have been stopped frequently. The detailed diagnosis of this test reveals the resource group name, resource name, resource type, and node name of each resource.

Target of the test : A Linux cluster

Agent deploying the test : An internal agent

Outputs of the test : One set of results for the Linux cluster being monitored.

Job Name Description

Test Period

How often should the test be executed.

Host

The host for which the test is to be configured.

Port

The port at which the specified host listens. By default, this is Null.

Report by Owner Node Only

If this flag is set to Yes, this test will report metrics only for the owner node and not for the other nodes in the cluster. If the flag is set to No, the test will report metrics for all the nodes in the cluster. By default, this flag is set to No.

Use SUDO

By default, this flag is set to Yes, indicating that the test uses the sudo command to collect the daemon-related metrics. If this flag is set to No, the test will not use the sudo command to collect the metrics.

SUDO Path

This parameter is relevant only when the Use SUDO parameter is set to Yes. By default, the SUDO Path is set to none. This implies that the sudo command is in its default location - i.e., in the /usr/bin or /usr/sbin folder of the target host. In this case, once the Use SUDO flag is set to Yes, the eG agent automatically runs the sudo command from its default location to allow access to the daemon process. However, if the sudo command is available in a different location in your environment, you will have to explicitly specify the full path to the sudo command in this text box to enable the eG agent to run the sudo command.
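The way the Use SUDO and SUDO Path parameters interact can be sketched as follows. This is only an illustration of the rule stated above, not the eG agent's actual implementation: the function name resolve_sudo and the exists argument are hypothetical, and the sketch assumes that the binary in the default folders is named sudo.

```python
import os
from typing import Callable, Optional

# Default locations named in the SUDO Path description.
DEFAULT_SUDO_DIRS = ("/usr/bin", "/usr/sbin")


def resolve_sudo(
    use_sudo: bool,
    sudo_path: str = "none",
    exists: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Return the sudo binary to run, or None if sudo is not used or not found."""
    if not use_sudo:
        # SUDO Path is relevant only when Use SUDO is Yes.
        return None
    if sudo_path.strip().lower() != "none":
        # An explicit full path overrides the default locations.
        return sudo_path
    for directory in DEFAULT_SUDO_DIRS:
        candidate = directory + "/sudo"
        if exists(candidate):
            return candidate
    return None


print(resolve_sudo(True, "/opt/tools/bin/sudo"))
```

The exists argument is there only so that the lookup can be exercised without touching the real filesystem.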

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
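As a rough illustration, a DD Frequency value could be interpreted as shown in the Python sketch below. The function name parse_dd_frequency is hypothetical, and the sketch assumes that the first number applies to normal runs and the second to runs in which a problem is detected, which is how the 1:1 default is described above.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Parse 'normal:abnormal' into a pair of integers.

    Returns None when the value is 'none', i.e. detailed diagnosis is disabled.
    """
    text = value.strip().lower()
    if text == "none":
        return None
    normal, abnormal = text.split(":")
    return int(normal), int(abnormal)


print(parse_dd_frequency("1:1"))
print(parse_dd_frequency("none"))
```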

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measurements made by the test

Measurement    Description    Measurement Unit    Interpretation

Resource status

Indicates the current status of this resource.

The values that this measure can report and their corresponding numeric values are listed in the table below:

Measure Value    Numeric Value
Started          100
Stopped          0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current state of a resource. In the graph of this measure, however, the state is represented using the numeric equivalents only, i.e., 0 or 100.
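The mapping in the table above, and the idea of spotting resources that are stopped frequently, can be sketched in Python as follows. The history list is made-up sample data for illustration only, and count_stops is a hypothetical helper rather than a function of the product.

```python
from typing import Dict, List

# Measure values and their numeric equivalents, as listed in the table.
STATUS_TO_NUMERIC: Dict[str, int] = {"Started": 100, "Stopped": 0}
NUMERIC_TO_STATUS: Dict[int, str] = {v: k for k, v in STATUS_TO_NUMERIC.items()}


def count_stops(samples: List[int]) -> int:
    """Count transitions from Started (100) to Stopped (0) in a series of samples."""
    stops = 0
    for previous, current in zip(samples, samples[1:]):
        if previous == 100 and current == 0:
            stops += 1
    return stops


# Made-up series of numeric values, as they might appear in the measure's graph.
history = [100, 100, 0, 100, 0, 0, 100]

print("latest status:", NUMERIC_TO_STATUS[history[-1]])
print("times stopped:", count_stops(history))
```

Counting transitions rather than zero-valued samples keeps a single long outage from being counted as several stops.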

The detailed diagnosis of this measure gives the resource group name, resource name, resource class, provider, resource type, and node name.