Vnx Disks Test
This test monitors the current state, overall health, and I/O activity levels of each disk in the EMC VNX Unified storage system. With the help of this test, administrators can identify not only failed disks but also error-prone disks that may fail at any time, and so act to avert a potential disk failure. The test also points administrators to disks that are busy processing I/O requests almost all the time. In doing so, it sheds light on irregularities in the distribution of I/O load across disks and prompts administrators to fine-tune the load-balancing algorithm to prevent potential delays in data access. Finally, the test proactively alerts administrators to probable space contention on disks and excessive bandwidth consumption by disks, enabling them to take pre-emptive action.
Target of the test : An EMC VNX Unified Storage system
Agent deploying the test : A remote agent
Outputs of the test : One set of results for each disk on the EMC VNX Unified Storage system.
Parameter | Description |
---|---|
Test Period | How often the test should be executed. |
Host | The IP address of the storage device for which this test is to be configured. |
Port | The port number at which the storage device listens. The default is NULL. |
Controller IP | Specify the IP address of the storage controller on the block-only storage system in the Controller IP text box. By default, the IP address of the Host is assigned in the Controller IP text box. |
NaviseccliPath | The eG agent uses the command-line utility, NaviSecCli.exe, which is part of the NaviSphere Management Suite, to communicate with and monitor the storage device. To enable the eG agent to invoke the CLI, configure the full path to the CLI in the NaviseccliPath text box. |
User Name and Password | Provide the credentials of a user with Administrator rights to the storage controller in the User Name and Password text boxes. |
Confirm Password | Confirm the password by retyping it here. |
User Scope | To use the NaviSphere CLI, the eG agent needs to be configured with a User Scope. Scope defines the access radius of the user account (User Name and Password) that you have configured for this test. Set User Scope to Local if the user account applies to the monitored storage system only. Set User Scope to Global if the user account applies to all the storage systems within a domain. |
Timeout | Indicate the duration (in seconds) for which this test should wait for a response from the storage device. By default, this is set to 120 seconds. Note that the Timeout value must always be between 3 and 600 seconds. |
Ignore Disabled Disks | If you do not wish to monitor the disks that are disabled in the target environment, set the Ignore Disabled Disks flag to Yes. By default, this flag is set to No. |
Exclude Disks | Specify a comma-separated list of disks that you wish to exclude from the scope of monitoring in the Exclude Disks text box. By default, none is displayed here. |
DD Frequency | Refers to the frequency with which detailed diagnosis measures are to be generated for this test. For instance, if you set it to 1:1, detailed measures will be generated every time this test runs, and also every time the test detects a problem. By default, the DD Frequency is set to 4:1. |
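The constraints stated in the parameter descriptions (Controller IP falling back to Host, the two User Scope values, the 3 to 600 second Timeout range, the comma-separated Exclude Disks list, and the `n:m` form of DD Frequency) can be illustrated with a minimal sketch. The class `VnxDisksTestConfig`, its field names, and the disk names in the usage example are hypothetical and are not part of the eG agent. The DD Frequency is read here as two integers, one for normal runs and one for problem conditions, which is an interpretation of the description above rather than a documented format.

```python
from dataclasses import dataclass


@dataclass
class VnxDisksTestConfig:
    """Hypothetical container for some of the parameters described above."""
    host: str
    controller_ip: str = ""              # empty means "use the Host address"
    user_scope: str = "Local"            # "Local" or "Global"
    timeout_secs: int = 120              # documented default
    ignore_disabled_disks: bool = False  # documented default is No
    exclude_disks: str = "none"          # comma-separated list; "none" by default
    dd_frequency: str = "4:1"            # documented default

    def __post_init__(self):
        # By default, the Controller IP takes the IP address of the Host.
        if not self.controller_ip:
            self.controller_ip = self.host
        if self.user_scope not in ("Local", "Global"):
            raise ValueError("User Scope must be Local or Global")
        # The Timeout value must lie between 3 and 600 seconds.
        if not 3 <= self.timeout_secs <= 600:
            raise ValueError("Timeout must be between 3 and 600 seconds")

    def excluded_disk_names(self):
        """Split the comma-separated Exclude Disks value into a clean list."""
        if self.exclude_disks.strip().lower() == "none":
            return []
        return [d.strip() for d in self.exclude_disks.split(",") if d.strip()]

    def dd_frequency_parts(self):
        """Return the two numbers of an 'n:m' DD Frequency as integers."""
        normal, problem = self.dd_frequency.split(":")
        return int(normal), int(problem)


# Usage with made-up values; the disk names are placeholders.
cfg = VnxDisksTestConfig(host="192.0.2.10", exclude_disks="disk-a, disk-b")
print(cfg.controller_ip, cfg.excluded_disk_names(), cfg.dd_frequency_parts())
```

Validating the values in one place, before the test runs, keeps an out-of-range Timeout or a misspelled User Scope from surfacing later as an unexplained CLI failure.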
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Busy ticks | Indicates the percentage of time for which this disk was busy. | Percent | A value close to 100% is a cause for concern, as it indicates a potential I/O overload on the disk. If the problem persists, it is a sign that serious load-balancing irregularities exist and need to be looked into. |
Total capacity | Indicates the total size of this disk. | GB | |
Data reads | Indicates the rate at which data is read from this disk. | MB/Sec | Compare the values of the Data reads and Data writes measures across disks to identify the slowest disk in terms of servicing read and write requests, respectively. |
Data writes | Indicates the rate at which data is written to this disk. | MB/Sec | See the interpretation for Data reads. |
Hard read errors | Indicates the number of hard read errors in this disk. | Number | An increase in the value of any of the four error measures (hard and soft, read and write) indicates that the disk is nearing the end of its life and may fail. By comparing the values of these measures across disks, you can identify the disk that will potentially fail. |
Hard write errors | Indicates the number of hard write errors in this disk. | Number | See the interpretation for Hard read errors. |
Soft read errors | Indicates the number of soft read errors in this disk. | Number | See the interpretation for Hard read errors. |
Soft write errors | Indicates the number of soft write errors in this disk. | Number | See the interpretation for Hard read errors. |
Read requests | Indicates the rate at which read requests were made to this disk. | Reqs/sec | Compare the values of the Read requests and Write requests measures across disks to isolate overloaded disks. This will also reveal irregularities in load balancing across disks. |
Write requests | Indicates the rate at which write requests were made to this disk. | Reqs/sec | See the interpretation for Read requests. |
LUNs | Indicates the number of LUNs that are sharing this disk. | Number | Use the detailed diagnosis of this measure to know which LUNs are sharing this disk. |
Read retries | Indicates the number of times read requests to this disk were retried. | Number | A low value is desired for this measure. |
Remapped sectors | Indicates the number of sectors on this disk that were remapped to new locations on the disk due to read/write errors. | Number | A low value is desired for this measure. |
Request service time | Indicates the time taken by this disk to service requests. | Secs | A high value is typically indicative of an I/O processing bottleneck in the disk. Compare the value of this measure across disks to know which disks are experiencing significant latencies. |
State | Indicates the current state of the disk. | | The values that this measure can report and their corresponding numeric values are indicated in the table below: Note: By default, this measure reports any of the above-mentioned Measure Values while indicating the status of the disk. However, in the graph of this measure, the same will be represented using their numeric equivalents only - i.e., 0 to 19. |
Total bandwidth | Indicates the sum of data reads and data writes to this disk. | MB/Sec | Compare the value of this measure across disks to identify the disk that is consuming the maximum bandwidth. |
Usage | Indicates the percentage of space in this disk that is currently utilized. | Percent | Ideally, the value of this measure should be low. A consistent increase in this value could indicate a gradual but steady erosion of space in the disk. A value close to 100% indicates that the disk is rapidly running out of space. |
User capacity | Indicates the amount of space on this disk that is assigned to bound LUNs. | GB | |
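Several of the interpretations above come down to simple arithmetic and cross-disk comparison: Total bandwidth is the sum of Data reads and Data writes, a Busy ticks value close to 100% signals overload, and a rising error count hints at a disk that may fail. The sketch below shows one way to apply these rules to values that have already been collected. The `DiskSample` type, the function names, the 90% threshold, and the sample figures are all assumptions made for illustration; the test itself does not expose such an API.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    """Hypothetical snapshot of a few measures for one disk."""
    name: str
    busy_pct: float          # Busy ticks (Percent)
    data_reads_mbps: float   # Data reads (MB/Sec)
    data_writes_mbps: float  # Data writes (MB/Sec)
    hard_read_errors: int
    hard_write_errors: int

    @property
    def total_bandwidth_mbps(self):
        # Total bandwidth is the sum of data reads and data writes.
        return self.data_reads_mbps + self.data_writes_mbps


def overloaded_disks(samples, busy_threshold_pct=90.0):
    """Names of disks whose busy percentage is at or above the threshold.

    The 90% default is an assumed stand-in for 'close to 100%'.
    """
    return [s.name for s in samples if s.busy_pct >= busy_threshold_pct]


def top_bandwidth_disk(samples):
    """Name of the disk consuming the most bandwidth."""
    return max(samples, key=lambda s: s.total_bandwidth_mbps).name


def disks_with_growing_errors(previous, current):
    """Names of disks whose hard error count rose between two snapshots.

    A disk absent from the previous snapshot is compared against zero.
    """
    before = {s.name: s.hard_read_errors + s.hard_write_errors for s in previous}
    return [
        s.name
        for s in current
        if s.hard_read_errors + s.hard_write_errors > before.get(s.name, 0)
    ]


# Made-up values for two placeholder disks across two measurement periods.
previous = [
    DiskSample("disk-a", 35.0, 12.0, 8.0, 0, 0),
    DiskSample("disk-b", 96.5, 40.0, 22.5, 1, 0),
]
current = [
    DiskSample("disk-a", 38.0, 10.0, 9.0, 0, 0),
    DiskSample("disk-b", 97.0, 41.0, 23.0, 3, 0),
]
print(overloaded_disks(current))
print(top_bandwidth_disk(current))
print(disks_with_growing_errors(previous, current))
```

Comparing snapshots rather than looking at a single absolute count follows the table's emphasis on an increase in error values, since a disk with a stable, non-zero count is a weaker signal than one whose count keeps climbing.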