Disk Performance Test

Disks are the basic storage devices in NetApp storage systems. Depending on the storage system model, ATA, Fibre Channel, SCSI, SAS, or SATA disks are used.

Data ONTAP assigns and makes use of four different disk categories to support data storage, parity protection, and disk replacement. The disk category can be one of the following types:

  • Data disk - Holds data stored on behalf of clients within RAID groups (and any system management data)
  • Global hot spare disk - Does not hold usable data, but is available to be added to a RAID group in an aggregate. Any functioning disk that is not assigned to an aggregate acts as a hot spare disk.
  • Parity disk - Stores information required for data reconstruction within RAID groups.
  • Double-parity disk - Stores double-parity information within RAID groups, if RAID-DP is used.

Administrators should closely monitor the level of I/O activity on each of these disks, so that they can proactively detect I/O latencies and receive early warning of inconsistencies in load-balancing across disks. The Disk Performance test aids administrators in this endeavor. This test auto-discovers the disks used by the NetApp Cluster and reports how well every disk processes I/O requests. This way, potential I/O latencies can be isolated, and slow disks can be identified. In the process, the test turns the spotlight on irregularities in load-balancing.

Target of the test : A NetApp Cluster

Agent deploying the test : An external/remote agent

Outputs of the test : One set of results for each aggregate on the NetApp Cluster being monitored.

Configurable parameters for the test
Parameters Description

Test Period

Specify how often the test should be executed.

Host

The IP address of the storage controller cluster.

Port

In the Port text box, specify the port at which the specified host listens. By default, this is NULL.

User

Specify the name of a user who has been assigned the readonly role. If no such user exists, you can create a special user for this purpose using the steps detailed in Creating a New User with the Role Required for Monitoring the NetApp Cluster.

Password

Specify the password that corresponds to the above-mentioned User.

Confirm Password

Confirm the Password by retyping it here.

Authentication Mechanism

To collect metrics from the NetApp Cluster, the eG agent connects to the ONTAP management APIs over HTTP or HTTPS. By default, this connection is authenticated using the LOGIN_PASSWORD authentication mechanism. This is why LOGIN_PASSWORD is displayed as the default Authentication Mechanism.

Use SSL

Set the Use SSL flag to Yes if SSL (Secure Sockets Layer) is to be used to connect to the NetApp Cluster, and to No if it is not.

API Port

By default, in most environments, the NetApp Cluster listens only on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled). Accordingly, when monitoring the NetApp Cluster, the eG agent connects to port 80 if the Use SSL flag above is set to No, and to port 443 if the Use SSL flag is set to Yes. This is why the value of the API Port parameter is default unless you change it.

In some environments, however, the default ports 80 and 443 might not apply. In such a case, specify against the API Port parameter the exact port at which the NetApp Cluster in your environment listens, so that the eG agent communicates with that port when collecting metrics from the NetApp Cluster.
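The port selection rule described above can be sketched as follows. This is only an illustration, not the eG agent's actual code: the function name resolve_api_port and the case-insensitive handling of the value default are assumptions made for the example.

```python
def resolve_api_port(use_ssl: bool, api_port: str = "default") -> int:
    """Return the port to connect to, following the API Port rule above.

    Hypothetical helper: the real agent's logic is not documented here.
    """
    # "default" means: pick 443 when SSL is in use, otherwise 80.
    if api_port.strip().lower() == "default":
        return 443 if use_ssl else 80

    # Any other value is treated as an explicit port number.
    port = int(api_port)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


if __name__ == "__main__":
    print(resolve_api_port(use_ssl=False))          # 80
    print(resolve_api_port(use_ssl=True))           # 443
    print(resolve_api_port(use_ssl=True, api_port="8443"))  # 8443
```

Keeping the keyword default separate from explicit port numbers means that changing the Use SSL flag automatically changes the port, unless a port has been set explicitly.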

Exclude Aggregates

If you wish to exclude certain aggregates from the scope of monitoring, specify a comma-separated list of aggregates in this text box. By default, none is displayed here.
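As an illustration of how such a comma-separated value might be applied, the sketch below parses the text box value and filters a list of discovered aggregates. The function names, the exact-match comparison, and the aggregate names are assumptions for the example. The source does not state whether the agent matches names exactly or supports patterns.

```python
from typing import Iterable, List, Set


def parse_exclude_list(raw: str) -> Set[str]:
    """Turn the comma-separated text box value into a set of names."""
    # Assumption: the default value "none" means nothing is excluded.
    if raw.strip().lower() == "none":
        return set()
    # Ignore surrounding whitespace and empty entries such as "a,,b".
    return {name.strip() for name in raw.split(",") if name.strip()}


def filter_aggregates(discovered: Iterable[str], raw_exclude: str) -> List[str]:
    """Keep only the aggregates that are not listed for exclusion."""
    excluded = parse_exclude_list(raw_exclude)
    return [name for name in discovered if name not in excluded]


if __name__ == "__main__":
    found = ["aggr0", "aggr1", "aggr2"]  # hypothetical aggregate names
    print(filter_aggregates(found, "aggr1, aggr2"))  # ['aggr0']
    print(filter_aggregates(found, "none"))          # ['aggr0', 'aggr1', 'aggr2']
```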

Records Per Call

By default, the eG agent executes API commands to query the aggregates in the target environment. In critical infrastructures spanning a large number of aggregates, a single execution by the eG agent may query (or download) a sizeable amount of monitoring data, thereby adding to the cluster load. To avoid this, you can tweak the Records Per Call parameter so that the eG agent obtains monitoring data iteratively, in chunks, instead of retrieving all of it in a single call. For example, if the eG agent is required to query 1000 aggregates, then specifying the value 100 in this text box makes the eG agent query 100 aggregates at a time, 10 times over, to obtain monitoring data from all the aggregates. By default, the value of this parameter is 10.
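The chunked retrieval described above can be pictured with the following sketch. It is not the eG agent's implementation: the helper name in_batches and the aggregate names are hypothetical, and the actual API call is left out.

```python
from typing import Iterator, List, Sequence


def in_batches(items: Sequence[str], records_per_call: int = 10) -> Iterator[List[str]]:
    """Yield successive slices of at most records_per_call items."""
    if records_per_call < 1:
        raise ValueError("records_per_call must be at least 1")
    for start in range(0, len(items), records_per_call):
        yield list(items[start:start + records_per_call])


if __name__ == "__main__":
    # Hypothetical environment with 1000 aggregates.
    aggregates = [f"aggr{i}" for i in range(1000)]
    batches = list(in_batches(aggregates, records_per_call=100))
    # One API call would be issued per batch: 10 calls of 100 aggregates each.
    print(len(batches), len(batches[0]))  # 10 100
```

A smaller Records Per Call value means more calls, each carrying less data. A larger value means fewer calls, each carrying more.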

Timeout

Specify the duration (in seconds) beyond which the test will time out if no response is received from the device. The default is 120 seconds.

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Consistency point reads

Indicates the rate at which read requests from the user are serviced during a Consistency Point (CP) operation on this disk.

Reads/Sec

A consistent decrease in the value of this measure could indicate that CP operations are slowing down the processing of read requests.

Consistency point read latency

Indicates the time taken to retrieve data or metadata associated with user requests during a Consistency Point operation on this disk.

Secs

Disk busy

Indicates the percentage of time during which there was at least one outstanding request (i.e., read or write) to this disk.

Percent

A value greater than 70% is a cause for concern, as it indicates that the performance of the disk is degrading.

By comparing the percentage of time that different disks are busy, an administrator can determine whether the application load is properly balanced across the disks.
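The two checks described for the Disk busy measure can be sketched as below. The 70% threshold comes from the interpretation above. The function names, the disk names, and the use of the max-minus-min spread as a rough load-balance indicator are assumptions made for this example.

```python
from typing import Dict, List

BUSY_THRESHOLD = 70.0  # percent, as stated in the interpretation above


def overloaded_disks(disk_busy: Dict[str, float],
                     threshold: float = BUSY_THRESHOLD) -> List[str]:
    """Return the names of disks whose busy percentage exceeds the threshold."""
    return sorted(name for name, pct in disk_busy.items() if pct > threshold)


def busy_spread(disk_busy: Dict[str, float]) -> float:
    """Difference between the busiest and the least busy disk, in percent.

    A large spread hints that load is unevenly distributed (assumed heuristic).
    """
    values = list(disk_busy.values())
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical Disk busy readings, keyed by made-up disk names.
    readings = {"1.0.1": 82.5, "1.0.2": 35.0, "1.0.3": 41.0}
    print(overloaded_disks(readings))  # ['1.0.1']
    print(busy_spread(readings))       # 47.5
```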

Average IO request pending

Indicates the average number of I/O requests to this disk that were pending processing.

Number

A low value is desired for this measure. A gradual or sudden increase in the value of this measure may be due to performance degradation of the disk, network congestion, or a request on the disk that is taking too long to complete.

Average IO request queued

Indicates the average number of I/O requests that are queued but are yet to be issued to this disk.

Number

Total transfers

Indicates the rate at which data transfers are initiated from this disk.

Transfers/Sec

User read blocks

Indicates the rate at which the blocks are read from this disk upon a user request.

Blocks/Sec

A consistent decrease in the value of this measure could indicate a bottleneck when processing read requests. Compare the value of this measure across disks to know which disks service block read requests slowly.

User reads

Indicates the rate at which the read requests from the user are serviced by this disk.

Reads/Sec

A consistent decrease in the value of this measure could indicate a bottleneck when processing read requests. Compare the value of this measure across the disks to know which disks service read requests slowly.
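Several interpretations above refer to a "consistent decrease" in a measure. The source does not define this term precisely. One possible reading, sketched below, is that each of the last few samples is lower than the one before it. The function name, the window size of 5, and the sample values are assumptions for illustration only.

```python
from typing import Sequence


def is_consistently_decreasing(samples: Sequence[float], window: int = 5) -> bool:
    """True if each of the last `window` samples is lower than its predecessor.

    Assumed definition of "consistent decrease"; adjust window to taste.
    """
    recent = list(samples[-window:])
    if len(recent) < window:
        return False  # not enough history to judge
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


if __name__ == "__main__":
    # Hypothetical User reads values (Reads/Sec) from successive test runs.
    print(is_consistently_decreasing([120, 110, 95, 90, 70]))   # True
    print(is_consistently_decreasing([120, 110, 115, 90, 70]))  # False
```

The same check can be applied per disk, and the results compared across disks, to single out the disks that service read requests slowly.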

User read latency

Indicates the average time taken to read a block from this disk upon a user request.

Secs

User write blocks

Indicates the rate at which the blocks are written to this disk upon a user request.

Blocks/Sec

A consistent decrease in the value of this measure could indicate a bottleneck when processing write requests. Compare the value of this measure across disks to know which disks are servicing block write requests slowly.

User writes

Indicates the rate at which the write requests from the user are serviced by this disk.

Writes/Sec

A consistent decrease in the value of this measure could indicate a bottleneck when processing write requests. Compare the value of this measure across disks to know which disks are servicing write requests slowly.

User write latency

Indicates the average time taken to write a block to this disk upon a user request.

Secs