Volume Performance Test

Volumes are provisioned on an aggregate on a cluster node, and the combination of all the volumes constitutes the entire namespace or resource pool for LUNs. Volumes contain file systems that hold user data that is accessible using one or more of the access protocols supported by clustered Data ONTAP, including NFS, CIFS, HTTP, FTP, FC, and iSCSI.

For users to read from or write to volumes quickly, the volumes must process I/O requests rapidly. Slowdowns in data retrieval can be attributed to I/O processing bottlenecks. In the event of such slowdowns, administrators need to swiftly isolate the following:

  • Which volumes are over-utilized?
  • Which volumes are overloaded?
  • Which volumes are experiencing serious latencies?
  • When were these latencies observed most frequently – while reading or writing?
  • What type of operations registered the maximum latency – CIFS, NFS, or iSCSI?

The Volume Performance test provides accurate answers to these questions. With the help of these answers, you can quickly diagnose the root cause of slowdowns when reading from or writing to a volume.

Target of the test : A NetApp Cluster

Agent deploying the test : An external/remote agent

Outputs of the test : One set of results for each volume configured on the NetApp Cluster being monitored.

Configurable parameters for the test
Parameter | Description

Test Period

How often should the test be executed.

Host

The IP address of the storage controller cluster.

Port

In the Port text box, specify the port at which the specified host listens. By default, this is NULL.

User

Here, specify the name of a user who possesses the readonly role. If no such user exists, you can create a special user for this purpose using the steps detailed in Creating a New User with the Role Required for Monitoring the NetApp Cluster.

Password

Specify the password that corresponds to the above-mentioned User.

Confirm Password

Confirm the Password by retyping it here.

Authentication Mechanism

To collect metrics from the NetApp Cluster, the eG agent connects to the ONTAP management APIs over HTTP or HTTPS. By default, this connection is authenticated using the LOGIN_PASSWORD authentication mechanism. This is why LOGIN_PASSWORD is displayed as the default Authentication Mechanism.

Use SSL

Set the Use SSL flag to Yes if SSL (Secure Sockets Layer) is to be used to connect to the NetApp Cluster, and to No if it is not.

API Port

By default, in most environments, the NetApp Cluster listens only on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled). Accordingly, while monitoring the NetApp Cluster, the eG agent connects to port 80 if the Use SSL flag above is set to No, and to port 443 if the Use SSL flag is set to Yes. This is why the API Port parameter is set to default by default.

In some environments, however, the default ports 80 and 443 might not apply. In such a case, against the API Port parameter, you can specify the exact port at which the NetApp Cluster in your environment listens, so that the eG agent communicates with that port for collecting metrics from the NetApp Cluster.
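
The port selection rule described for the API Port parameter can be sketched as follows. This is only an illustration of the rule, not the eG agent's actual code; the function name resolve_api_port and its arguments are hypothetical.

```python
from typing import Union


def resolve_api_port(api_port: Union[int, str], use_ssl: bool) -> int:
    """Return the port to connect to (hypothetical helper, for illustration).

    api_port: the value of the API Port parameter, either the literal
              string "default" or an explicit port number.
    use_ssl:  True if the Use SSL flag is set to Yes, False if set to No.
    """
    if isinstance(api_port, str) and api_port.strip().lower() == "default":
        # Default behaviour: 443 when SSL is used, 80 otherwise.
        return 443 if use_ssl else 80
    # An explicitly configured port overrides the defaults.
    return int(api_port)
```

For instance, resolve_api_port("default", True) yields 443, while resolve_api_port(8443, True) yields the explicitly configured 8443.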

Exclude Aggregates

If you wish to exclude certain aggregates from the scope of monitoring, specify a list of comma-separated aggregates in this text box. By default, none will be displayed here.

Records Per Call

By default, the eG agent executes API commands to query the aggregates in the target environment. In critical infrastructures spanning a large number of aggregates, a single execution by the eG agent may query (or download) a sizeable amount of monitoring data, thereby adding to the cluster load. To avoid this, you can tweak the Records Per Call parameter so that the eG agent obtains monitoring data iteratively in chunks instead of retrieving all of it in a single go. For example, if the eG agent is required to query 1000 aggregates, then specifying the value 100 in this text box will make the eG agent query 100 aggregates at a time, 10 times over, to obtain monitoring data from all the aggregates. By default, the value of this parameter is 10.
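
The arithmetic behind Records Per Call can be illustrated with a short sketch. It assumes nothing about how the eG agent or the NetApp API implements paging; the helper names calls_needed and in_chunks, and the aggregate names, are hypothetical.

```python
import math
from typing import Iterator, List, Sequence


def calls_needed(total_records: int, records_per_call: int) -> int:
    """Number of API calls required to fetch all records in chunks."""
    if records_per_call <= 0:
        raise ValueError("records_per_call must be a positive integer")
    return math.ceil(total_records / records_per_call)


def in_chunks(records: Sequence[str], records_per_call: int) -> Iterator[List[str]]:
    """Yield successive chunks of at most records_per_call records."""
    if records_per_call <= 0:
        raise ValueError("records_per_call must be a positive integer")
    for start in range(0, len(records), records_per_call):
        yield list(records[start:start + records_per_call])


# Hypothetical list of 1000 aggregate names, matching the example above.
aggregates = [f"aggr{i}" for i in range(1000)]
batches = list(in_chunks(aggregates, 100))   # 10 batches of 100 each
default_calls = calls_needed(1000, 10)       # 100 calls at the default of 10
```

A larger value means fewer calls but more data per call; a smaller value spreads the same data over more, lighter calls.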

Timeout

Specify the duration (in seconds) beyond which the test will time out if no response is received from the device. The default is 120 seconds.

Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

Total operations

Indicates the rate at which operations (including read and write) were performed on this volume.

Ops/Sec

This measure is a good indicator of how busy the volume is.

Comparing the value of this measure across volumes will enable you to quickly detect load-balancing irregularities (if any).

Write operations

Indicates the rate at which write operations were performed on this volume.

Ops/Sec

 

Read operations

Indicates the rate at which read operations were performed on this volume.

Ops/Sec

 

Average latency

Indicates the average time taken by the WAFL filesystem to process all the operations performed on this volume.

Milliseconds

The value of this measure excludes the request processing time and the network communication time of the volume.

A high value of this measure is a cause for concern, as it indicates a processing bottleneck.

Read latency

Indicates the average time taken by the WAFL filesystem to process the read requests of this volume.

Milliseconds

The values of the Read latency and Write latency measures exclude the request processing time and the network communication time of the volume.

If the Average latency of a volume is high, compare the values of these measures for that volume to determine whether the latency occurred while reading or while writing.

Write latency

Indicates the average time taken by the WAFL filesystem to process the write requests made to this volume.

Milliseconds

Data read

Indicates the rate at which data bytes were read from this volume.

MB/Sec

 

Data written

Indicates the rate at which data bytes were written to this volume.

MB/Sec

 

CIFS operations

Indicates the rate at which the CIFS operations were performed on this volume.

Ops/Sec

This measure is inclusive of all the CIFS operations, i.e., read, write, and other miscellaneous CIFS operations.

By comparing the value of this measure with that of the NFS operations and SAN operations measures for a volume, you can figure out which type of operation imposed the maximum load on that volume.

NFS operations

Indicates the rate at which the NFS operations were performed on this volume.

Ops/Sec

This measure is inclusive of all the NFS operations, i.e., read, write, and other miscellaneous NFS operations.

By comparing the value of this measure with that of the CIFS operations and SAN operations measures for a volume, you can figure out which type of operation imposed the maximum load on that volume.

SAN operations

Indicates the rate at which the SAN operations were performed on this volume.

Ops/Sec

This measure is inclusive of all the SAN operations, i.e., read, write, and other miscellaneous SAN operations.

By comparing the value of this measure with that of the CIFS operations and NFS operations measures for a volume, you can figure out which type of operation imposed the maximum load on that volume.

CIFS latency

Indicates the average time taken for performing the CIFS operations (including read, write, and other miscellaneous CIFS operations) on this volume.

Milliseconds

The values of the CIFS latency, NFS latency, and SAN latency measures exclude the request processing time and the network communication time of the volume.

Ideally, the values of these measures should be low. If the Average latency of a volume is very high, compare the values of these measures for that volume to determine whether the latency stems from processing bottlenecks experienced by CIFS operations, NFS operations, or SAN operations.

NFS latency

Indicates the average time taken for performing the NFS operations (including read, write and other miscellaneous NFS operations) on this volume.

Milliseconds

SAN latency

Indicates the average time taken for performing the block protocol operations (including read, write, and other miscellaneous block protocol operations) on this volume.

Milliseconds
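
The comparisons suggested in the Interpretation column can be sketched as a small triage routine. This is a minimal illustration only: the dictionary keys mirror the measure names above, but the function names, the latency threshold, and the sample values are all hypothetical and are not part of the test.

```python
from typing import Dict, Iterable, Optional


def highest(measures: Dict[str, float], names: Iterable[str]) -> str:
    """Return the name of the measure with the largest value."""
    return max(names, key=lambda name: measures[name])


def diagnose_volume(measures: Dict[str, float],
                    latency_threshold_ms: float) -> Optional[Dict[str, str]]:
    """Point to the likely source of latency on one volume.

    Returns None if Average latency is within the (hypothetical) threshold.
    """
    if measures["Average latency"] <= latency_threshold_ms:
        return None
    return {
        # Did the latency occur while reading or while writing?
        "direction": highest(measures, ("Read latency", "Write latency")),
        # Which protocol's operations were slowest?
        "slowest_protocol": highest(
            measures, ("CIFS latency", "NFS latency", "SAN latency")),
        # Which protocol imposed the maximum load?
        "busiest_protocol": highest(
            measures, ("CIFS operations", "NFS operations", "SAN operations")),
    }


# Illustrative values only; not taken from a real cluster.
sample = {
    "Average latency": 35.0, "Read latency": 12.0, "Write latency": 48.0,
    "CIFS latency": 9.0, "NFS latency": 41.0, "SAN latency": 15.0,
    "CIFS operations": 120.0, "NFS operations": 950.0, "SAN operations": 300.0,
}
report = diagnose_volume(sample, latency_threshold_ms=20.0)
```

With these sample values, the report points to Write latency, NFS latency, and NFS operations, which would suggest looking at NFS write traffic on that volume first.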