Disk State Test

This test reports the current state of each disk in the NetApp Cluster. Using this test, you can easily identify the disks that are currently offline and the disks that are currently in the Replacing, Reconstructing, or Failed state.

Target of the test : A NetApp Cluster

Agent deploying the test : An external/remote agent

Outputs of the test : One set of results for each disk on the NetApp Cluster being monitored.

Configurable parameters for the test
Parameters Description

Test Period

Specify how often the test should be executed.

Host

The IP address of the storage controller cluster.

Port

In the Port text box, specify the port at which the specified host listens. By default, this is NULL.

User

Specify the name of a user who has the readonly role. If no such user exists, you can create a special user for this purpose using the steps detailed in Creating a New User with the Role Required for Monitoring the NetApp Cluster.

Password

Specify the password that corresponds to the above-mentioned User.

Confirm Password

Confirm the Password by retyping it here.

Authentication Mechanism

To collect metrics from the NetApp Cluster, the eG agent connects to the ONTAP management APIs over HTTP or HTTPS. By default, this connection is authenticated using the LOGIN_PASSWORD authentication mechanism, which is why LOGIN_PASSWORD is displayed as the default Authentication Mechanism.

Use SSL

Set the Use SSL flag to Yes if SSL (Secure Sockets Layer) is to be used to connect to the NetApp Cluster, and to No if it is not.

API Port

By default, in most environments, the NetApp Cluster listens only on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled). Accordingly, if the Use SSL flag above is set to No, the eG agent connects to the NetApp Cluster on port 80 by default; if the Use SSL flag is set to Yes, the agent connects on port 443 by default. This is why the API Port parameter is set to default by default.

In some environments, however, the default ports 80 and 443 might not apply. In such cases, specify against the API Port parameter the exact port at which the NetApp Cluster in your environment listens, so that the eG agent communicates with that port to collect metrics from the NetApp Cluster.
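The port selection described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the eG agent's actual code: the function name `resolve_api_port` and its arguments are hypothetical, and the sketch assumes the API Port parameter holds either the literal string default or a port number.

```python
def resolve_api_port(use_ssl: bool, api_port: str = "default") -> int:
    """Return the port to connect to (hypothetical helper, for illustration)."""
    value = api_port.strip()
    if value.lower() == "default":
        # Default ports named in the text: 443 with SSL, 80 without.
        return 443 if use_ssl else 80
    port = int(value)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


print(resolve_api_port(use_ssl=True))           # 443
print(resolve_api_port(use_ssl=False))          # 80
print(resolve_api_port(use_ssl=True, api_port="8443"))  # 8443
```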

Exclude Aggregates

If you wish to exclude certain aggregates from the scope of monitoring, specify a list of comma-separated aggregates in this text box. By default, none will be displayed here.
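As a rough illustration of how a comma-separated exclusion list could be applied, consider the Python sketch below. The helper names and the aggregate names are hypothetical, and treating an empty value or the word none as "exclude nothing" is an assumption made for this example rather than documented agent behavior.

```python
from typing import List, Set


def parse_exclude_list(raw: str) -> Set[str]:
    """Split a comma-separated Exclude Aggregates value into a set of names."""
    # Assumption: an empty value or the literal "none" excludes nothing.
    if raw.strip().lower() in ("", "none"):
        return set()
    return {name.strip() for name in raw.split(",") if name.strip()}


def filter_aggregates(all_aggregates: List[str], raw_exclude: str) -> List[str]:
    """Keep only the aggregates that are not named in the exclusion list."""
    excluded = parse_exclude_list(raw_exclude)
    return [name for name in all_aggregates if name not in excluded]


# Hypothetical aggregate names, for illustration only.
print(filter_aggregates(["aggr0", "aggr1", "aggr2"], "aggr1, aggr2"))  # ['aggr0']
```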

Records Per Call

By default, the eG agent executes API commands to query the aggregates in the target environment. In critical infrastructures spanning a large number of aggregates, a single execution by the eG agent may query (or download) a sizeable amount of monitoring data, thereby adding to the cluster load. To avoid this, you can tweak the Records Per Call parameter so that the eG agent obtains monitoring data iteratively in chunks, instead of retrieving all of it in a single go. For example, if the eG agent is required to query 1000 aggregates, then specifying the value 100 in this text box will make the eG agent query 100 aggregates at a time, 10 times over, to obtain monitoring data from all the aggregates. By default, the value of this parameter is 10.
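The arithmetic in the example above (1000 aggregates, 100 records per call, 10 calls) can be reproduced with a small chunking sketch in Python. The function `iter_chunks` and the generated aggregate names are hypothetical; the sketch only shows the batching idea and does not call any NetApp or eG API.

```python
from typing import Iterator, List, Sequence


def iter_chunks(items: Sequence[str], records_per_call: int = 10) -> Iterator[List[str]]:
    """Yield successive batches of at most records_per_call items."""
    if records_per_call < 1:
        raise ValueError("records_per_call must be at least 1")
    for start in range(0, len(items), records_per_call):
        yield list(items[start:start + records_per_call])


# 1000 hypothetical aggregates queried 100 at a time.
aggregates = [f"aggr{i}" for i in range(1000)]
calls = list(iter_chunks(aggregates, records_per_call=100))
print(len(calls), len(calls[0]))  # 10 100
```

With the default of 10 records per call, the same 1000 aggregates would take 100 calls, so a larger value means fewer but heavier requests.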

Timeout

Specify the duration (in seconds) beyond which the test will time out if no response is received from the device. The default is 120 seconds.

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Is failed?

Indicates whether or not the status of this disk is Failed.

This measure reports the value Yes if the status of the disk is Failed, and No otherwise.

The values that this measure can report and their corresponding numeric values have been listed in the table below.

Measure Value    Numeric Value
Yes              1
No               0

Note:

By default, this measure reports the Measure Values listed above to indicate whether or not the status of this disk is Failed. In the graph of this measure, however, these values are represented by their numeric equivalents, i.e., 0 or 1.
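The Yes/No to 1/0 mapping used by this measure, and by the other measures of this test, can be expressed as a simple lookup. The Python sketch below is illustrative only: the names `MEASURE_VALUES`, `is_in_state`, and `to_numeric` are hypothetical, and comparing a plain status string such as "Failed" is an assumption about how the disk state might be represented.

```python
# Measure Value -> Numeric Value, as listed in the table above.
MEASURE_VALUES = {"Yes": 1, "No": 0}


def is_in_state(disk_status: str, state: str) -> str:
    """Return "Yes" if the disk's status matches the given state, else "No"."""
    # Assumption: status is a plain string compared case-insensitively.
    return "Yes" if disk_status.strip().lower() == state.lower() else "No"


def to_numeric(measure_value: str) -> int:
    """Convert a reported Measure Value into the number plotted in graphs."""
    return MEASURE_VALUES[measure_value]


value = is_in_state("Failed", "Failed")
print(value, to_numeric(value))  # Yes 1
```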

Is offline?

Indicates whether or not the status of this disk is Offline.

This measure reports the value Yes if the disk is Offline, and No otherwise.

The values that this measure can report and their corresponding numeric values have been listed in the table below.

Measure Value    Numeric Value
Yes              1
No               0

Note:

By default, this measure reports the Measure Values listed above to indicate whether or not the status of this disk is Offline. In the graph of this measure, however, these values are represented by their numeric equivalents, i.e., 0 or 1.

Is prefailed?

Indicates whether or not the status of this disk is Prefailed.

Disks that are manually failed due to excessive error logging are termed Prefailed disks. The contents of these disks are copied to suitable replacement disks, i.e., the spare disks available in the storage system.

This measure reports the value Yes if the status of the disk is Prefailed, and No otherwise.

The values that this measure can report and their corresponding numeric values have been listed in the table below.

Measure Value    Numeric Value
Yes              1
No               0

Note:

By default, this measure reports the Measure Values listed above to indicate whether or not the status of this disk is Prefailed. In the graph of this measure, however, these values are represented by their numeric equivalents, i.e., 0 or 1.

Is reconstructing?

Indicates whether or not the status of this disk is Reconstructing.

This measure reports the value Yes if the status of the disk is Reconstructing, and No otherwise.

The values that this measure can report and their corresponding numeric values have been listed in the table below.

Measure Value    Numeric Value
Yes              1
No               0

Note:

By default, this measure reports the Measure Values listed above to indicate whether or not the status of this disk is Reconstructing. In the graph of this measure, however, these values are represented by their numeric equivalents, i.e., 0 or 1.

Is replacing?

Indicates whether or not the status of this disk is Replacing.

Mismatched disks that are part of an aggregate can be replaced with a more suitable spare disk without disrupting the data service. This operation uses the Rapid RAID Recovery process to copy the data from the disk being replaced to a specified spare disk. Replacing disks frequently degrades the system, so frequent replacement should be avoided through proper initial configuration.

This measure reports the value Yes if the status of the disk is Replacing, and No otherwise.

The values that this measure can report and their corresponding numeric values have been listed in the table below.

Measure Value    Numeric Value
Yes              1
No               0

Note:

By default, this measure reports the Measure Values listed above to indicate whether or not the status of this disk is Replacing. In the graph of this measure, however, these values are represented by their numeric equivalents, i.e., 0 or 1.