NetApp Volume Details Test

Volumes contain file systems that hold user data that is accessible using one or more of the access protocols supported by Data ONTAP, including NFS, CIFS, HTTP, FTP, FC, and iSCSI.

For users to read from and write to volumes quickly, the volumes must have adequate free space and must process I/O requests rapidly. Slowdowns in data storage or retrieval can be attributed to space contention on one or more volumes or to I/O processing bottlenecks. In the event of such slowdowns, administrators need to swiftly determine the following:

Which volumes are over-utilized?
Which volumes are overloaded?
Which volumes are experiencing serious latencies?
When were these latencies observed most frequently – while reading or writing?
What type of operations registered the maximum latency – CIFS, NFS, or iSCSI?

The NetApp Volume Details test provides accurate answers to these questions. With the help of these answers, you can quickly diagnose the root-cause of slowdowns when reading from/writing into a volume.

Target of the test : A NetApp Unified Storage

Agent deploying the test : An external/remote agent

Outputs of the test : One set of results for each volume on the NetApp storage system being monitored.

Configurable parameters for the test
Parameters	Description
Test Period	How often should the test be executed.
Host	The host for which the test is to be configured.
Port	Specify the port at which the specified host listens in the Port text box. By default, this is NULL.
User	Here, specify the name of the user who possesses the following privileges: login-http-admin,api-aggr-check-spare-low,api-aggr-list-info,api-aggr-mediascrub-list-info,api-aggr-scrub-list-info,api-cifs-status,api-clone-list-status,api-disk-list-info,api-fcp-adapter-list-info,api-fcp-adapter-stats-list-info,api-fcp-service-status,api-file-get-file-info,api-file-read-file,api-iscsi-connection-list-info,api-iscsi-initiator-list-info,api-iscsi-service-status,api-iscsi-session-list-info,api-iscsi-stats-list-info,api-lun-config-check-alua-conflicts-info,api-lun-config-check-cfmode-info,api-lun-config-check-info,api-lun-config-check-single-image-info,api-lun-list-info,api-nfs-status,api-perf-object-get-instances-iter,api-perf-object-instance-list-info,api-quota-report-iter,api-snapshot-list-info,api-vfiler-list-info,api-volume-list-info-iter*. If no such user exists, you can create a special user for this purpose using the steps detailed in Creating a New User with the Privileges Required for Monitoring the NetApp Unified Storage.
Password	Specify the password that corresponds to the above-mentioned User.
Confirm Password	Confirm the Password by retyping it here.
Authentication Mechanism	To collect metrics from the NetApp Unified Storage system, the eG agent connects to the ONTAP management APIs over HTTP or HTTPS. By default, this connection is authenticated using the LOGIN_PASSWORD authentication mechanism. This is why LOGIN_PASSWORD is displayed as the default authentication mechanism.
Use SSL	Set the Use SSL flag to Yes if SSL (Secure Sockets Layer) is to be used to connect to the NetApp Unified Storage system, and to No if it is not.
API Port	By default, in most environments, the NetApp Unified Storage system listens on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled). Accordingly, the API Port parameter is set to default: if the Use SSL flag above is set to No, the eG agent connects to the NetApp Unified Storage system using port 80, and if the Use SSL flag is set to Yes, the agent connects using port 443. In some environments, however, the default ports 80 and 443 might not apply. In such a case, specify against the API Port parameter the exact port at which the NetApp Unified Storage system in your environment listens, so that the eG agent communicates with that port for collecting metrics.
vFilerName	A vFiler is a virtual storage system you create using MultiStore, which enables you to partition the storage and network resources of a single storage system so that it appears as multiple storage systems on the network. If the NetApp Unified Storage system is partitioned to accommodate a set of vFilers, specify the name of the vFiler that you wish to monitor in the vFilerName text box. In some environments, the NetApp Unified Storage system may not be partitioned at all. In such a case, the NetApp Unified Storage system is monitored as a single vFiler and hence the default value of none is displayed in this text box.
Timeout	Specify the duration (in seconds) beyond which the test will timeout if no response is received from the device. The default is 120 seconds.
Used Percentage Threshold	This test not only reports a set of metrics for each volume on the storage device, but also reports metrics for the following descriptors: Busy volumes, Slow volumes, and Highly utilized volumes. By default, the Highly utilized volumes descriptor will report metrics for those volumes in which over 80% of space has already been utilized. This is why the Used Percentage Threshold is set to 80 by default. You can change this threshold by specifying a different percentage value against Used Percentage Threshold. This parameter is deprecated in v5.6.5 (and above).
Operations Threshold	This test not only reports a set of metrics for each volume on the storage device, but also reports metrics for the following descriptors: Busy volumes, Slow volumes, and Highly utilized volumes. The Operations Threshold value (in operations/sec) you set determines which volumes will be counted as Busy volumes by this test. Typically, if the rate of operations to a volume exceeds the rate specified against Operations Threshold, then the test will consider such a volume to be a Busy volume. This parameter is deprecated in v5.6.5 (and above).
Avg Latency Threshold	This test not only reports a set of metrics for each volume on the storage device, but also reports metrics for the following descriptors: Busy volumes, Slow volumes, and Highly utilized volumes. The Avg Latency Threshold value (in milliseconds) you set determines which volumes will be counted as Slow volumes by this test. Typically, if the latency registered by a volume exceeds the Avg Latency Threshold you specify, then the test will consider such a volume to be a Slow volume. This parameter is deprecated in v5.6.5 (and above).
DD Frequency	Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: (a) the eG manager license should allow the detailed diagnosis capability; (b) both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
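
The port selection described for the Use SSL and API Port parameters can be illustrated with a short Python sketch. This is only an illustration of the documented behavior, not the eG agent's actual code; the function name resolve_api_port and its argument types are assumptions made for this example.

```python
from typing import Union


def resolve_api_port(api_port: Union[str, int], use_ssl: bool) -> int:
    """Return the port to connect to (hypothetical helper, not an eG API).

    'default' maps to 443 when Use SSL is Yes and to 80 when it is No,
    as described for the API Port parameter. Any other value is treated
    as an explicit port number.
    """
    if isinstance(api_port, str) and api_port.strip().lower() == "default":
        return 443 if use_ssl else 80

    port = int(api_port)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```

For example, resolve_api_port("default", use_ssl=True) yields 443, while resolve_api_port("8443", use_ssl=True) yields the explicitly configured port 8443.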

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Number of volumes

Indicates the number of volumes that are currently highly utilized/slow/busy.

Number

This measure appears only for the Highly utilized, Slow, and Busy volumes descriptors. In the case of Highly utilized volumes, the detailed diagnosis of this measure, if enabled, lists the names of the highly utilized volumes and the percentage of space that is utilized in each volume.
In the case of Slow volumes, the detailed diagnosis of this measure, if enabled, lists the names of the slow volumes and the average latency, i.e., the time taken to perform read/write operations on each volume.
In the case of Busy volumes, the detailed diagnosis of this measure, if enabled, lists the names of the busy volumes and the rate at which operations were performed on each volume.
With the help of the detailed diagnosis information therefore, you can quickly identify the highly utilized, slow, and busy volumes.

This measure is deprecated in v5.6.5 (and above).
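
The way the three thresholds group volumes into the Highly utilized, Busy, and Slow descriptors can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the VolumeSample type, its field names, and the classify_volumes function are hypothetical, and only the 80 percent default comes from this document (the other two thresholds have no documented default here, so they must be supplied).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VolumeSample:
    """One reading for a volume (hypothetical structure for illustration)."""
    name: str
    used_pct: float        # percentage of space used
    ops_per_sec: float     # rate of operations
    avg_latency_ms: float  # average latency in milliseconds


def classify_volumes(
    samples: List[VolumeSample],
    ops_threshold: float,
    latency_threshold: float,
    used_pct_threshold: float = 80.0,  # documented default
) -> Dict[str, List[str]]:
    """Group volume names under the three descriptors.

    A volume is counted only when its value exceeds the threshold,
    matching the 'over 80%' and 'exceeds' wording of the parameters.
    """
    result: Dict[str, List[str]] = {
        "Highly utilized volumes": [],
        "Busy volumes": [],
        "Slow volumes": [],
    }
    for s in samples:
        if s.used_pct > used_pct_threshold:
            result["Highly utilized volumes"].append(s.name)
        if s.ops_per_sec > ops_threshold:
            result["Busy volumes"].append(s.name)
        if s.avg_latency_ms > latency_threshold:
            result["Slow volumes"].append(s.name)
    return result
```

The length of each list corresponds to the Number of volumes measure for that descriptor, and the names in each list correspond to what the detailed diagnosis would show.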

State

Indicates the current state of this volume.

The values that this measure can report and their corresponding numeric equivalents are shown in the table below:

Measure Value	Numeric Value
Online	0
Creating	1
Restricted	2
Offline	3
Partial	4
Unknown	5
Failed	6

Note:

By default, this measure reports the above-mentioned Measure Values while indicating the current state of a volume. However, in the graph of this measure, states will be represented using the corresponding numeric equivalents only.
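
If you post-process the numeric values plotted in the graph of this measure, the mapping in the table above can be captured as a small Python enumeration. This is a sketch for illustration; the names VolumeState and state_label are assumptions and are not part of the product.

```python
from enum import IntEnum


class VolumeState(IntEnum):
    """Numeric equivalents of the State measure, as listed in the table."""
    ONLINE = 0
    CREATING = 1
    RESTRICTED = 2
    OFFLINE = 3
    PARTIAL = 4
    UNKNOWN = 5
    FAILED = 6


def state_label(value: int) -> str:
    """Translate a graphed numeric value back to its Measure Value."""
    # VolumeState(value) raises ValueError for numbers outside 0-6.
    return VolumeState(value).name.capitalize()
```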

Is volume in error?

Indicates whether/not this volume is error-prone.

Generally, errors may be caused when the volume is inconsistent, unrecoverable, or invalid. A volume is considered to be inconsistent if there are known inconsistencies in the associated file system. An increase in the inconsistencies will render the volume unrecoverable. Unrecoverable volumes cannot be accessed. If mirroring has been enabled, Data ONTAP will automatically access the mirrored data of the unrecoverable volume. A volume is said to be invalid if a vol copy or SnapMirror initial transfer has been aborted. Such invalid volumes are generally partially created and cannot be recovered fully. Operation errors are taken into account if this volume is a Single Instance Storage (SIS) volume.

This measure reports the value Yes if a volume is error-prone and the value No if it is error-free.

The numeric values that correspond to the above-mentioned values are represented in the table below:

Measure Value	Numeric Value
Yes	1
No	0

Note:

By default, this measure reports the above-mentioned Measure Values while indicating whether/not this volume is error-prone. However, in the graph of this measure, these values will be represented using the corresponding numeric equivalents only.

The detailed diagnosis capability of this measure, if enabled, lists the type of the error. In the case of an SIS operation error, the actual SIS error message will also be displayed as part of the detailed diagnosis.

This measure is applicable only to individual volumes.

Used space percentage

Indicates the percentage of space that is utilized in this volume.

Percent

Ideally, the value of this measure should be low. A high value or a consistent increase in the value of this measure is indicative of excessive space usage in a volume.
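
As a rough illustration of how a used space percentage is typically derived and how a consistent increase might be spotted, consider the Python sketch below. The formula (used space divided by total space, times 100) is the conventional one and is an assumption here, since this document does not state how the test computes the measure; both function names are hypothetical.

```python
from typing import Sequence


def used_space_percentage(used_bytes: float, total_bytes: float) -> float:
    """Conventional used-space percentage (assumed formula)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def is_consistently_increasing(values: Sequence[float]) -> bool:
    """True if every reading is higher than the one before it."""
    if len(values) < 2:
        return False
    return all(later > earlier for earlier, later in zip(values, values[1:]))
```

A series of readings for which is_consistently_increasing returns True is the kind of trend that the interpretation above flags as a sign of excessive space usage.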