Isilon Node Disk Throughput Test

The nodes in the storage system contain a number of disk drives that are held in drive bays. The drive bays make it easy to remove failed drives and insert replacements. If users complain that the disk drives on the nodes are slow, administrators should first check the traffic flowing in and out of each disk drive. Tracking this traffic helps administrators determine how well read/write requests are being serviced by the disk drives and spot delays in I/O operations. This is exactly what the Isilon Node Disk Throughput test does.

This test auto-discovers the disk drives on the nodes and reports the rate at which data was read from and written to each disk drive, as well as the rate at which read and write operations were performed on it. In the process, the test also reveals the time taken by each disk drive to perform read and write operations. This way, the test promptly alerts administrators to any abnormal increase in total traffic to the disk drives and to any delay in processing I/O requests.

Target of the test : An EMC Isilon Storage System

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each node:disk drive combination on the target storage system being monitored.

Configurable parameters for the test
Parameter | Description

Test period

How often the test should be executed.

Host

The IP address of the host for which this test is to be configured.

SNMPPort

The port at which the monitored target exposes its SNMP MIB; the default is 161.

SNMPVersion

By default, the eG agent supports SNMP version 1. Accordingly, the default selection in the SNMPVersion list is v1. However, if a different SNMP framework is in use in your environment, say SNMP v2 or v3, then select the corresponding option from this list.

SNMPCommunity

The SNMP community name that the test uses to communicate with the target storage system. This parameter is specific to SNMP v1 and v2 only. Therefore, if the SNMPVersion chosen is v3, then this parameter will not appear.

Username

This parameter appears only when v3 is selected as the SNMPVersion. SNMP version 3 (SNMPv3) is an extensible SNMP framework that supplements the SNMPv2 framework by additionally supporting message security, access control, and remote SNMP configuration capabilities. To extract performance statistics from the MIB using the highly secure SNMP v3 protocol, the eG agent has to be configured with the required access privileges; in other words, the eG agent should connect to the MIB using the credentials of a user with access permissions to the MIB. Therefore, specify the name of such a user against this parameter.

Context

This parameter appears only when v3 is selected as the SNMPVersion. An SNMP context is a collection of management information accessible by an SNMP entity. An item of management information may exist in more than one context, and an SNMP entity potentially has access to many contexts. A context is identified by the SNMPEngineID value of the entity hosting the management information (also called a contextEngineID) and a context name that identifies the specific context (also called a contextName). If the Username provided is associated with a context name, then the eG agent will be able to poll the MIB and collect metrics only if it is configured with the context name as well. In such cases, specify the context name of the Username in the Context text box. By default, this parameter is set to none.

AuthPass

Specify the password that corresponds to the above-mentioned Username. This parameter also appears only if the SNMPVersion selected is v3.

Confirm Password

Confirm the AuthPass by retyping it here.

AuthType

This parameter also appears only if v3 is selected as the SNMPVersion. From the AuthType list box, choose the authentication algorithm that SNMP v3 uses to convert the specified username and password into a hashed format, so as to ensure the security of SNMP transactions. You can choose between the following options:

  • MD5 - Message Digest Algorithm
  • SHA - Secure Hash Algorithm
  • SHA224 - Secure Hash Algorithm 224 bit
  • SHA256 - Secure Hash Algorithm 256 bit
  • SHA384 - Secure Hash Algorithm 384 bit
  • SHA512 - Secure Hash Algorithm 512 bit

EncryptFlag

This flag appears only when v3 is selected as the SNMPVersion. By default, the eG agent does not encrypt SNMP requests. Accordingly, this flag is set to No by default. To ensure that SNMP requests sent by the eG agent are encrypted, select the Yes option.

EncryptType

If the EncryptFlag is set to Yes, then you will have to mention the encryption type by selecting an option from the EncryptType list. SNMP v3 supports the following encryption types:

  • DES - Data Encryption Standard
  • 3DES - Triple Data Encryption Standard
  • AES - Advanced Encryption Standard
  • AES128 - Advanced Encryption Standard 128 bit
  • AES192 - Advanced Encryption Standard 192 bit
  • AES256 - Advanced Encryption Standard 256 bit

EncryptPassword

Specify the encryption password here.

Confirm Password

Confirm the encryption password by retyping it here.
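The dependencies among the SNMP parameters described above (community string for v1/v2, user credentials for v3, encryption settings only when EncryptFlag is Yes) can be summarized in a short validation sketch. This is only an illustration of the rules as documented here, not the eG agent's implementation: the SnmpSettings class, its field names, and the problems() method are hypothetical, and the community string and user name in the usage lines are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional

# Options listed for the AuthType and EncryptType parameters.
AUTH_TYPES = {"MD5", "SHA", "SHA224", "SHA256", "SHA384", "SHA512"}
ENCRYPT_TYPES = {"DES", "3DES", "AES", "AES128", "AES192", "AES256"}


@dataclass
class SnmpSettings:
    """Hypothetical container for the SNMP parameters of this test."""
    version: str = "v1"                      # SNMPVersion: "v1", "v2" or "v3"
    port: int = 161                          # SNMPPort
    community: Optional[str] = None          # SNMPCommunity (v1/v2 only)
    username: Optional[str] = None           # Username (v3 only)
    context: str = "none"                    # Context (v3 only)
    auth_type: Optional[str] = None          # AuthType (v3 only)
    encrypt_flag: bool = False               # EncryptFlag (v3 only)
    encrypt_type: Optional[str] = None       # EncryptType (when EncryptFlag is Yes)
    encrypt_password: Optional[str] = None   # EncryptPassword

    def problems(self) -> List[str]:
        """Return a list of configuration problems; empty if none were found."""
        issues: List[str] = []
        if self.version in ("v1", "v2"):
            if not self.community:
                issues.append("SNMPCommunity is required for v1 and v2")
        elif self.version == "v3":
            if not self.username:
                issues.append("Username is required for v3")
            if self.auth_type not in AUTH_TYPES:
                issues.append("AuthType must be one of the listed algorithms")
            if self.encrypt_flag:
                if self.encrypt_type not in ENCRYPT_TYPES:
                    issues.append("EncryptType must be set when EncryptFlag is Yes")
                if not self.encrypt_password:
                    issues.append("EncryptPassword must be set when EncryptFlag is Yes")
        else:
            issues.append(f"Unknown SNMPVersion: {self.version}")
        return issues


# Placeholder values, for illustration only.
print(SnmpSettings(version="v2", community="public").problems())
print(SnmpSettings(version="v3", username="monitor", auth_type="SHA256",
                   encrypt_flag=True).problems())
```

The first call reports no problems; the second reports the two encryption settings that are missing even though EncryptFlag is enabled.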

Timeout

Specify, in seconds, the duration after which the SNMP query executed by this test should time out. The default is 10 seconds.

Data Over TCP

By default, SNMP data transmission occurs over UDP. Some environments, however, may be specifically configured to offload a fraction of the data traffic (for instance, certain types of data traffic or traffic pertaining to specific components) to other protocols like TCP, so as to prevent UDP overloads. In such environments, you can instruct the eG agent to conduct the SNMP data traffic related to the monitored target over TCP (and not UDP). For this, set this flag to Yes. By default, this flag is set to No.

EngineId

This parameter appears only when v3 is selected as the SNMPVersion. Sometimes, the test may not report metrics when AES192 or AES256 is chosen as the EncryptType. To ensure that the test reports metrics consistently, administrators need to set this flag to Yes. By default, this parameter is set to No.

Use Sudo

By default, this parameter is set to No. This indicates that, by default, the eG agent will not require any special permissions to execute the commands. However, in some highly secure environments, the commands cannot be executed directly, because the eG agent install user is different from the root user who has the privileges to run all commands on the target storage system. In such cases, create a sudo user using the steps discussed in the Pre-requisites for Monitoring the EMC Isilon Storage System. The credentials of such a user should be specified in the Username and Password text boxes in the COMPONENTS page.

Then, set the Use Sudo parameter to Yes. This will enable the eG agent install user to execute the commands.

Include Nodes

By default, this parameter is set to all, indicating that the eG agent reports performance metrics for all nodes in the cluster. Sometimes, administrators may want to monitor the performance of only those nodes that are very critical. In such a case, the eG agent can be configured to include only those nodes for monitoring. To achieve this, provide a comma-separated list of nodes in the Include Nodes text box. The specification can be in any of the following formats:

  • a list of node numbers - 2,4,6,8,10,12,14
  • a list of node number ranges - 1-5,7-11,21-25,27-32,34-42 or
  • a combination of node numbers and node number ranges - 2, 6, 8, 10-18, 24, 29, 40-48.

This way, administrators can make sure that the eG agent collects metrics only for a configured set of nodes.
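To make the accepted formats concrete, the following minimal sketch expands such a specification into individual node numbers. The function name parse_node_spec is hypothetical, and the sketch only mirrors the formats listed above; how the eG agent parses the value internally is not documented here.

```python
from typing import Set


def parse_node_spec(spec: str) -> Set[int]:
    """Expand a comma-separated list of node numbers and ranges into a set."""
    nodes: Set[int] = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas and surrounding spaces
        if "-" in token:
            start_text, end_text = token.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid node range: {token!r}")
            nodes.update(range(start, end + 1))  # ranges are inclusive
        else:
            nodes.add(int(token))
    return nodes


selected = parse_node_spec("2, 6, 8, 10-18, 24, 29, 40-48")
print(sorted(selected))
```

With the combined specification from the list above, the result contains nodes 2, 6, 8, 24 and 29, plus every node from 10 to 18 and from 40 to 48.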

Exclude Nodes

By default, this parameter is set to none, indicating that no node is excluded and the eG agent reports performance metrics for all nodes in the cluster. In some environments, administrators may not want to monitor some of the less-critical nodes. In such a case, the eG agent can be configured to exclude such nodes from monitoring. To achieve this, provide a comma-separated list of nodes in the Exclude Nodes text box. The specification can be in any of the following formats:

  • a list of node numbers - 1,2,3,7,10,18
  • a list of node number ranges - 6-9,12-15,21-29,47-52 or
  • a combination of node numbers and node number ranges - 2,6,8,10-18,24,29,40-48.

This way, administrators can make sure the eG agent stops collecting metrics for a configured set of nodes.
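The combined effect of the Include Nodes and Exclude Nodes parameters, together with their defaults of all and none, can be sketched as follows. The function names expand and nodes_to_monitor are hypothetical. The sketch also assumes that a node named in both lists is excluded; this precedence is an assumption and is not stated in this document.

```python
from typing import Iterable, List, Set


def expand(spec: str) -> Set[int]:
    """Expand a specification such as '1,3,5-7' into {1, 3, 5, 6, 7}."""
    result: Set[int] = set()
    for part in (p.strip() for p in spec.split(",")):
        if not part:
            continue
        low, _, high = part.partition("-")
        # A single number has no upper bound, so it acts as its own range.
        result.update(range(int(low), int(high or low) + 1))
    return result


def nodes_to_monitor(cluster_nodes: Iterable[int],
                     include: str = "all",
                     exclude: str = "none") -> List[int]:
    """Return the sorted node numbers that remain after both filters."""
    cluster = set(cluster_nodes)
    included = cluster if include.strip().lower() == "all" else expand(include)
    excluded = set() if exclude.strip().lower() == "none" else expand(exclude)
    # Assumption: exclusion wins when a node appears in both lists.
    return [n for n in sorted(cluster) if n in included and n not in excluded]


# A hypothetical 10-node cluster.
print(nodes_to_monitor(range(1, 11)))
print(nodes_to_monitor(range(1, 11), include="2,4,6-8", exclude="7"))
```

With the defaults, every node in the cluster is returned; in the second call, only nodes 2, 4, 6 and 8 remain.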

Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

Data reads

Indicates the rate at which the data was read from this disk drive.

MB/sec

Comparing the value of these measures across the disk drives will clearly indicate which disk drive is the busiest in terms of the rate at which data is read and written - it could also shed light on irregularities in load balancing across the disk drives.

Data writes

Indicates the rate at which the data was written on this disk drive.

MB/sec

Disk reads

Indicates the rate at which the read operations were performed on this disk drive during the last measurement period.

Reads/sec

Ideally, the value of this measure should be high. A steady dip in this measure value could indicate a potential reading bottleneck.

Disk writes

Indicates the rate at which the write operations were performed on this disk drive during the last measurement period.

Writes/sec

Ideally, the value of this measure should be high. A steady dip in this measure value could indicate a potential writing bottleneck.

Disk I/O operations

Indicates the total number of I/O operations that were performed on this disk drive per second.

IOPS

 

Disk latency

Indicates the time taken by this disk drive to perform read and write operations.

Milliseconds

Ideally, this value should be low. A high value could indicate that read and write operations are slowing down for some reason.

Disk busy

Indicates the percentage of time that this disk drive was busy processing the requests.

Percent

Compare the value of this measure across the disk drives to identify the busiest and the least busy disk drive. If the gap between the two is very high, it indicates serious irregularities in load-balancing across the disk drives.
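The comparison suggested for the Disk busy measure can be sketched as below. This is an illustration only: the busy_gap function, the node:disk descriptor names, the sample values, and the 40 percent threshold are all hypothetical and are not taken from the product.

```python
from typing import Dict, Tuple


def busy_gap(disk_busy: Dict[str, float]) -> Tuple[str, str, float]:
    """Return the busiest drive, the least busy drive, and the gap between them."""
    busiest = max(disk_busy, key=disk_busy.get)
    least_busy = min(disk_busy, key=disk_busy.get)
    return busiest, least_busy, disk_busy[busiest] - disk_busy[least_busy]


# Hypothetical Disk busy values (percent), keyed by node:disk descriptor.
samples = {"1:bay1": 82.0, "1:bay2": 35.5, "2:bay1": 40.0}

busiest, least_busy, gap = busy_gap(samples)
GAP_THRESHOLD = 40.0  # illustrative threshold, not a product default
if gap > GAP_THRESHOLD:
    print(f"Possible load imbalance: {busiest} vs {least_busy} ({gap:.1f} points apart)")
```

A large gap by itself does not identify the cause; it only points to the drives whose Data reads, Data writes and Disk latency measures are worth comparing next.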