Failover Cluster Storage Summary Test

Storage is one of the most important aspects to plan for before configuring a failover cluster. Sufficient storage space must be available to the cluster resources at all times, so that these critical resources do not fail for want of free space in the cluster storage. Administrators should therefore periodically track space usage in the cluster storage, check whether the cluster disks are used effectively, determine how much free space is available in the used and unused cluster disks, and figure out whether the available space is sufficient to handle the current and future workload of the cluster. The Failover Cluster Storage Summary test helps administrators monitor space usage in the cluster storage and make informed storage management decisions.

This test monitors the cluster storage and presents a quick summary of space usage across the used and unused cluster disks that are part of the storage. In the process, the test reveals how much free space is available in the used and unused disks; using these metrics, administrators can figure out whether the cluster has enough free space to meet current and future demands. If not, administrators can use the pointers provided by this test to decide what needs to be done to avert resource failures: Should more physical disk resources be added to the cluster to handle the current and anticipated load? Should space be cleared in the used cluster disks to make room for more data? Can better management of unused disks help conserve storage space?

Target of the test : A node in a Windows cluster

Agent deploying the test : An internal agent

Outputs of the test : One set of results for every cluster that has been created

Parameter | Description

Test Period

How often should the test be executed.

Host

The host for which the test is to be configured.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1, which means that detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency if required. To disable the detailed diagnosis capability for this test, specify none against DD Frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
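
The DD Frequency setting described above can be pictured with a short Python sketch. It assumes that the value is written as normal:abnormal (so the default 1:1 carries a normal frequency of 1 and an abnormal frequency of 1) and that none disables detailed diagnosis. The function name parse_dd_frequency is hypothetical and is not part of eG Enterprise; this is an illustration of the setting, not how the eG agent reads it.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Parse a DD frequency setting such as "1:1".

    Returns (normal, abnormal), or None when the setting is "none".
    The "normal:abnormal" layout is an assumption made for this sketch.
    """
    text = value.strip().lower()
    if text == "none":
        return None  # detailed diagnosis disabled for this test
    normal, abnormal = text.split(":")
    return int(normal), int(abnormal)


print(parse_dd_frequency("1:1"))   # the default setting
print(parse_dd_frequency("none"))  # detailed diagnosis disabled
```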
Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

Total disk count:

Indicates the total number of disks in the cluster storage.

Number

The detailed diagnosis of this measure, if enabled, lists the disks in the cluster storage, and the current state, path, and usage of each cluster disk. This way, disks that are running out of space can be isolated, so that efforts to increase the capacity of such disks can be initiated.

Unused cluster disks:

Indicates the number of cluster disks that are not currently used by any cluster resource (i.e., service/application).

Number

If the number of Unused cluster disks is more than the number of Used disks in cluster, it could indicate over-utilization of a few disks. In such a situation, compare the value of the Percentage of space free in used cluster disks measure with that of the Percentage of space free in unused cluster disks measure. If this comparison reveals that the used disks have very little free space as opposed to unused disks, it is a clear indicator that the storage resources have not been properly managed. You may want to consider reducing the load on some of the used disks by assigning the unused disks to services/applications that generate more data and hence consume more space.

To know which disks in the cluster storage are currently not used, use the detailed diagnosis of the Unused cluster disks measure.

To know which disks in the cluster storage are currently in use, use the detailed diagnosis of the Used disks in cluster measure.

Used disks in cluster:

Indicates the number of cluster disks that are currently used by a cluster resource.

Number

Total capacity of used cluster disks:

Indicates the total capacity of all the used disks in the cluster.

MB

 

Capacity of unused cluster disks:

Indicates the total capacity of all the unused disks in the cluster.

MB

 

Total free space in used cluster disks:

Indicates the total amount of space in the used cluster disks that is currently available for use.

MB

 

Free space in unused cluster disks:

Indicates the total amount of space in the unused cluster disks that is currently available for use.

MB

 

Percentage of space free in used cluster disks:

Indicates the percentage of space that is free in used cluster disks.

Percent

For optimal cluster performance, the values of both the Percentage of space free in used cluster disks and the Percentage of space free in unused cluster disks measures should be high. If both are low, the cluster is critically low on space; if the situation persists or worsens, the clustered resources will fail. To prevent this, you can clear space on both the used and unused disks. If many disks are unused, you can also map data-intensive services/applications to these disks, so that the load on the used disks is reduced. You may also want to consider adding more physical disk resources to the cluster to increase its total storage capacity.

Percentage of space free in unused cluster disks:

Indicates the percentage of space that is free in unused cluster disks.

Percent

Disks with online status:

Indicates the number of cluster disks that are currently online.

Number

 

Disks with offline status:

Indicates the number of cluster disks that are currently offline.

Number

 

Disks with failed status:

Indicates the number of cluster disks that are currently in the failed state.

Number

 

Disks with pending status:

Indicates the number of cluster disks that are currently in the pending state.

Number
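
To see how the measures above relate to one another, the following Python sketch rolls a list of per-disk figures up into the summary values and then applies the interpretation guidance given for the unused-disk and percentage measures. It is only an illustration under stated assumptions: the ClusterDisk record, the summarize and advise functions, and the 20 percent threshold are all hypothetical, and the sketch does not reflect how the eG agent actually collects or evaluates these measures.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterDisk:
    # Hypothetical record; field names are illustrative only.
    name: str
    state: str        # "online", "offline", "failed" or "pending"
    in_use: bool      # True if a cluster resource currently uses the disk
    capacity_mb: float
    free_mb: float


def percent_free(free_mb: float, capacity_mb: float) -> float:
    """Free space as a percentage of capacity (0.0 if there is no capacity)."""
    return 100.0 * free_mb / capacity_mb if capacity_mb else 0.0


def summarize(disks: List[ClusterDisk]) -> Dict[str, float]:
    """Roll per-disk figures up into the measures described above."""
    used = [d for d in disks if d.in_use]
    unused = [d for d in disks if not d.in_use]
    states = Counter(d.state.lower() for d in disks)
    used_cap = sum(d.capacity_mb for d in used)
    unused_cap = sum(d.capacity_mb for d in unused)
    used_free = sum(d.free_mb for d in used)
    unused_free = sum(d.free_mb for d in unused)
    return {
        "Total disk count": len(disks),
        "Unused cluster disks": len(unused),
        "Used disks in cluster": len(used),
        "Total capacity of used cluster disks": used_cap,
        "Capacity of unused cluster disks": unused_cap,
        "Total free space in used cluster disks": used_free,
        "Free space in unused cluster disks": unused_free,
        "Percentage of space free in used cluster disks": percent_free(used_free, used_cap),
        "Percentage of space free in unused cluster disks": percent_free(unused_free, unused_cap),
        "Disks with online status": states["online"],
        "Disks with offline status": states["offline"],
        "Disks with failed status": states["failed"],
        "Disks with pending status": states["pending"],
    }


def advise(summary: Dict[str, float], low_pct: float = 20.0) -> List[str]:
    """Apply the interpretation guidance; low_pct is an assumed threshold."""
    used_n = summary["Used disks in cluster"]
    unused_n = summary["Unused cluster disks"]
    low_used = used_n > 0 and summary["Percentage of space free in used cluster disks"] < low_pct
    low_unused = unused_n > 0 and summary["Percentage of space free in unused cluster disks"] < low_pct
    notes = []
    if low_used and low_unused:
        notes.append("Cluster is low on space: clear space or add physical disk resources.")
    elif low_used and unused_n > used_n:
        notes.append("Used disks are nearly full: assign unused disks to data-intensive services/applications.")
    return notes


disks = [
    ClusterDisk("Cluster Disk 1", "online", True, 1000, 100),
    ClusterDisk("Cluster Disk 2", "online", False, 2000, 1800),
    ClusterDisk("Cluster Disk 3", "offline", False, 1000, 900),
]
summary = summarize(disks)
for measure, value in summary.items():
    print(f"{measure}: {value}")
print(advise(summary))
```

In this sample, the single used disk has 10 percent of its space free while the two unused disks together have 90 percent free, so the sketch suggests moving load to the unused disks rather than adding capacity.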

 

The detailed diagnosis of the Total disk count measure, if enabled, lists the disks in the cluster storage, and the current state, path, and usage of each cluster disk. This way, disks that are running out of space can be isolated, so that efforts to increase the capacity of such disks can be initiated.

Figure 1 : The detailed diagnosis of the Total disk count measure
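
If the per-disk rows shown in this detailed diagnosis are exported for further analysis, the disks that are running out of space could be isolated along the following lines. This is a minimal Python sketch: the row keys (name, state, path, capacity_mb, free_mb), the low_space_disks function, and the 10 percent threshold are assumptions made for illustration and do not correspond to a documented eG Enterprise export format.

```python
from typing import Dict, List, Tuple


def low_space_disks(rows: List[Dict], threshold_pct: float = 10.0) -> List[Tuple[str, float]]:
    """Return (disk name, percent free) for disks below the threshold, fullest first."""
    flagged = []
    for row in rows:
        if row["capacity_mb"] <= 0:
            continue  # skip disks with no reported capacity
        pct_free = 100.0 * row["free_mb"] / row["capacity_mb"]
        if pct_free < threshold_pct:
            flagged.append((row["name"], round(pct_free, 1)))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical rows modelled on the columns named in the text: state, path, usage.
rows = [
    {"name": "Cluster Disk 1", "state": "Online", "path": "E:", "capacity_mb": 1000, "free_mb": 50},
    {"name": "Cluster Disk 2", "state": "Online", "path": "F:", "capacity_mb": 2000, "free_mb": 1500},
    {"name": "Cluster Disk 3", "state": "Online", "path": "G:", "capacity_mb": 500, "free_mb": 10},
]
print(low_space_disks(rows))
```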

To know which disks in the cluster storage are currently in use, use the detailed diagnosis of the Used disks in cluster measure.

Figure 2 : The detailed diagnosis of the Used disks in cluster measure