Dell Compellent LUNs Test

A logical unit number (LUN) is a unique identifier used to designate an individual hard disk device, or a collection of such devices, for addressing by a protocol associated with a SCSI, iSCSI, Fibre Channel (FC), or similar interface. LUNs are central to the management of storage arrays shared over a storage area network (SAN). LUN errors, poor LUN cache usage, and abnormal I/O activity on the LUNs can significantly degrade the performance of the storage array if they are not promptly detected and resolved. It is therefore important to monitor LUN performance continuously. The Dell Compellent LUNs test does this: it auto-discovers the LUNs in the Storage Center, reports the current state of each LUN, captures LUN errors, and measures the level of I/O activity on every LUN, so that administrators are notified of LUN-related problems well before they impact Storage Center performance.

Target of the test : A Dell Compellent Storage Center

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each LUN on the Dell Compellent Storage Center.

Configurable parameters for the test
Parameter Description

Test period

How often the test should be executed.

Host

The storage device for which the test is to be configured.

Port

The port number on which the specified storage device listens. By default, this is NULL.

User and Password

Specify the credentials of a user who has the right to execute API commands on the Dell Compellent Storage Center and pull out metrics. The specified user is the Pegasus CIM User, who should possess Administrator privileges and should be associated with the Logon as a Service policy.

Confirm Password

Confirm the password by retyping it here.

SSL

Set this flag to Yes if the storage device being monitored is SSL-enabled.

CIM Server Port

The SMI-S Provider of the Dell Compellent Storage Center provides access for monitoring and management via the HTTP and HTTPS protocols for CIM API request/response semantics. To enable the eG agent to access the SMI-S Provider, invoke the CIM API commands, and collect the required metrics, specify in the CIM Server Port text box the service port on which the SMI-S Provider listens for HTTP/HTTPS requests. By default, this is port 5988. If the service port on the SMI-S Provider listens only for HTTPS requests, specify the port as 5989.
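
The port defaults described above can be sketched as a small helper. This is a minimal illustration, not part of the eG agent: the function name `cim_server_url` and the host name `sc01.example.com` are hypothetical, and tying the default port to the SSL flag is an assumption based on the 5988/5989 values mentioned in the description.

```python
from typing import Optional

# Service ports named in the CIM Server Port description.
DEFAULT_HTTP_PORT = 5988
DEFAULT_HTTPS_PORT = 5989


def cim_server_url(host: str, use_ssl: bool, port: Optional[int] = None) -> str:
    """Build the base URL for CIM API requests (hypothetical helper)."""
    scheme = "https" if use_ssl else "http"
    if port is None:
        # Assumption: fall back to 5989 for HTTPS and 5988 otherwise.
        port = DEFAULT_HTTPS_PORT if use_ssl else DEFAULT_HTTP_PORT
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return f"{scheme}://{host}:{port}"


print(cim_server_url("sc01.example.com", use_ssl=False))
print(cim_server_url("sc01.example.com", use_ssl=True))
```

An explicitly supplied port always takes precedence, which matches the case where the provider has been configured to listen on a non-default port.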

IsEmbedded

If this flag is set to True, it indicates that the SMI-S Provider is embedded on the Storage Center. On the other hand, if this flag is set to False, it indicates that the SMI-S Provider has been implemented as a proxy.

SerialNumber

An SMI-S Provider that has been implemented as a proxy can be configured to manage multiple Storage Centers. Therefore, if the IsEmbedded flag is set to False, you have to explicitly specify which Storage Center you want the eG agent to monitor. Since each Storage Center is uniquely identified by a serial number, specify that serial number in the SerialNumber text box.
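
The dependency between the IsEmbedded and SerialNumber parameters can be expressed as a simple validation rule. The sketch below is illustrative only; the function name `resolve_target_serial` is hypothetical, and ignoring the serial number for an embedded provider is an assumption, since the description only says when the serial number is required.

```python
from typing import Optional


def resolve_target_serial(is_embedded: bool, serial_number: Optional[str]) -> Optional[str]:
    """Return the serial number to monitor, or None for an embedded provider."""
    if is_embedded:
        # Assumption: an embedded provider serves only its own Storage Center,
        # so no serial number is needed to pick a target.
        return None
    if serial_number is None or not serial_number.strip():
        raise ValueError("SerialNumber is required when IsEmbedded is False")
    return serial_number.strip()
```

With a proxy provider, a missing or blank serial number is rejected early rather than leaving the target Storage Center ambiguous.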

Detailed Diagnosis

To make diagnosis more efficient and accurate, the eG Enterprise system embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option against Detailed Diagnosis. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
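
The two conditions above amount to a single boolean check. The following sketch only restates them; the function and parameter names are hypothetical and do not correspond to any eG Enterprise API.

```python
def detailed_diagnosis_configurable(
    license_allows_dd: bool,
    normal_frequency: int,
    abnormal_frequency: int,
) -> bool:
    """True only when the license permits it and neither frequency is 0."""
    return license_allows_dd and normal_frequency != 0 and abnormal_frequency != 0
```
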
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Health state

Indicates how healthy this LUN currently is.

 

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Numeric Value Measure Value
0 OK
1 Unknown
2 Degraded/Warning
3 Minor failure
4 Major failure
5 Critical failure
6 Non-recoverable error

Note:

By default, this measure reports the Measure Values discussed above to indicate the state of a LUN. In the graph of this measure, however, states are represented using the numeric equivalents only.

The detailed diagnosis of this measure, if enabled, lists the capacity of the LUN.
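
Because graphs show only the numeric equivalents, a lookup table is handy when reading graph data or exported values. The mapping below is copied from the Health state table above; the function name `health_label` and the fallback text for unlisted values are assumptions made for illustration.

```python
# Numeric value -> Measure Value, as listed in the Health state table.
HEALTH_STATES = {
    0: "OK",
    1: "Unknown",
    2: "Degraded/Warning",
    3: "Minor failure",
    4: "Major failure",
    5: "Critical failure",
    6: "Non-recoverable error",
}


def health_label(value: int) -> str:
    """Translate a graphed numeric value back to its Measure Value."""
    return HEALTH_STATES.get(value, f"Unrecognized ({value})")


print(health_label(2))  # Degraded/Warning
```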

Operational status

Indicates the current operational state of this LUN.

 

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Numeric Value Measure Value
0 OK
1 In Service
2 Power Mode
3 Completed
4 Starting
5 Dormant
6 Other
7 Unknown
8 Stopping
9 Stressed
10 Stopped
11 Supporting Entity in Error
12 Degraded or Predicted Failure
13 Predictive Failure
14 Lost Communication
15 No Contact
16 Aborted
17 Error
18 Non-Recoverable Error

Note:

By default, this measure reports the Measure Values discussed above to indicate the operational state of a LUN. In the graph of this measure, however, operational states are represented using the numeric equivalents only.

Detailed operational state

Describes the current operational state of this LUN.

 

Typically, the detailed state will describe why the LUN is in a particular operational state. For instance, if the Operational status measure reports the value Stopping for a LUN, then this measure will explain why that LUN is being stopped.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Numeric Value Measure Value
0 Online
1 Success
2 Power Saving Mode
3 Write Protected
4 Write Disabled
5 Not Ready
6 Removed
7 Rebooting
8 Offline
9 Failure

Note:

By default, this measure reports the Measure Values discussed above to indicate the detailed operational state of a LUN. In the graph of this measure, however, detailed operational states are represented using the numeric equivalents only.
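
Since the detailed state explains the operational status, the two numeric values are most useful when read together. The sketch below pairs them into one description, using the values from the Operational status and Detailed operational state tables above. The function name `describe_lun_state` and the output format are hypothetical.

```python
# Numeric value -> Measure Value, from the Operational status table.
OPERATIONAL_STATUS = {
    0: "OK", 1: "In Service", 2: "Power Mode", 3: "Completed",
    4: "Starting", 5: "Dormant", 6: "Other", 7: "Unknown",
    8: "Stopping", 9: "Stressed", 10: "Stopped",
    11: "Supporting Entity in Error",
    12: "Degraded or Predicted Failure",
    13: "Predictive Failure", 14: "Lost Communication",
    15: "No Contact", 16: "Aborted", 17: "Error",
    18: "Non-Recoverable Error",
}

# Numeric value -> Measure Value, from the Detailed operational state table.
DETAILED_STATE = {
    0: "Online", 1: "Success", 2: "Power Saving Mode",
    3: "Write Protected", 4: "Write Disabled", 5: "Not Ready",
    6: "Removed", 7: "Rebooting", 8: "Offline", 9: "Failure",
}


def describe_lun_state(status_value: int, detail_value: int) -> str:
    """Combine both numeric values into a single readable description."""
    status = OPERATIONAL_STATUS.get(status_value, f"Unrecognized status {status_value}")
    detail = DETAILED_STATE.get(detail_value, f"Unrecognized detail {detail_value}")
    return f"{status} ({detail})"


print(describe_lun_state(8, 6))  # Stopping (Removed)
```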

Data transmitted

Indicates the rate at which data was transmitted by this LUN.

MB/Sec

 

IOPS

Indicates the rate at which I/O operations were performed on this LUN.

IOPS

Compare the value of this measure across LUNs to know which LUN handled the maximum number of I/O requests and which handled the least. If the gap between the two is very high, then it indicates serious irregularities in load-balancing across LUNs.

You may then want to take a look at the Reads and Writes measures to understand what to fine-tune: the load-balancing algorithm for read requests or the one for write requests.
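
The cross-LUN comparison described above can be sketched as follows. This is an illustration under stated assumptions: the LUN names and sample values are made up, the function names are hypothetical, and the 50% share used to flag an imbalance is an arbitrary threshold, not a value recommended by the product.

```python
from typing import Dict, Tuple


def iops_spread(iops_by_lun: Dict[str, float]) -> Tuple[str, str, float]:
    """Return (busiest LUN, least busy LUN, IOPS gap between the two)."""
    if not iops_by_lun:
        raise ValueError("no LUN measurements supplied")
    busiest = max(iops_by_lun, key=iops_by_lun.get)
    quietest = min(iops_by_lun, key=iops_by_lun.get)
    return busiest, quietest, iops_by_lun[busiest] - iops_by_lun[quietest]


def is_imbalanced(iops_by_lun: Dict[str, float], max_share: float = 0.5) -> bool:
    """Flag an imbalance when one LUN carries more than max_share of all IOPS."""
    total = sum(iops_by_lun.values())
    if total == 0:
        return False  # an idle array cannot be imbalanced
    busiest, _, _ = iops_spread(iops_by_lun)
    return iops_by_lun[busiest] / total > max_share


# Made-up sample values for three hypothetical LUNs.
sample = {"lun-a": 1200.0, "lun-b": 300.0, "lun-c": 100.0}
print(iops_spread(sample))    # ('lun-a', 'lun-c', 1100.0)
print(is_imbalanced(sample))  # True
```

Running the same functions on the Reads and Writes values separately shows whether the skew comes from read traffic, write traffic, or both.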

Reads

Indicates the rate at which read operations were performed on this LUN.

Reads/Sec

Compare the value of this measure across LUNs to know which LUN handled the maximum number of read requests and which handled the least.

Writes

Indicates the rate at which write operations were performed on this LUN.

Writes/Sec

Compare the value of this measure across LUNs to know which LUN handled the maximum number of write requests and which handled the least.

Data reads

Indicates the rate at which data is read from this LUN.

MB/Sec

Compare the values of the Data reads and Data writes measures across LUNs to identify the slowest LUN in terms of servicing read and write requests, respectively.

Data writes

Indicates the rate at which data is written to this LUN.

MB/Sec

LUNs busy

Indicates the percentage of time this LUN was busy processing requests.

Percent

Compare the value of this measure across LUNs to know which LUN was the busiest and which was the least busy. If the gap between the two is very high, then it indicates serious irregularities in load-balancing across LUNs.

Average read size

Indicates the amount of data read from this LUN per I/O operation.

MB/Op

Compare the values of the Average read size and Average write size measures across LUNs to identify the slowest LUN in terms of servicing read and write requests, respectively.

Average write size

Indicates the amount of data written to this LUN per I/O operation.

MB/Op
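
The MB/Op unit follows dimensionally from dividing a throughput in MB/Sec by an operation rate in operations per second. The sketch below shows that arithmetic. Whether the test derives the average sizes this way or reads them directly from the SMI-S Provider is not stated here, so treat the formula as an assumption. The function name is hypothetical.

```python
def average_io_size_mb(throughput_mb_per_sec: float, ops_per_sec: float) -> float:
    """Average data per I/O operation: (MB/Sec) / (Ops/Sec) = MB/Op."""
    if ops_per_sec <= 0:
        return 0.0  # no operations in the interval, so no meaningful size
    return throughput_mb_per_sec / ops_per_sec


# 64 MB/Sec of reads at 512 Reads/Sec is 0.125 MB per read.
print(average_io_size_mb(64.0, 512.0))  # 0.125
```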

Read hits

Indicates the percentage of read requests that were serviced by the cache of this LUN.

Percent

A high value is desired for this measure. A very low value is a cause for concern, as it indicates that cache usage is very poor; this in turn implies that direct LUN accesses, which are expensive operations, are high.

Write hits

Indicates the percentage of write requests that were serviced by the cache of this LUN.

Percent

A high value is desired for this measure. A very low value is a cause for concern, as it indicates that cache usage is very poor; this in turn implies that direct LUN accesses, which are expensive operations, are high.
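
A hit percentage of this kind is conventionally the share of requests served from cache. The following sketch shows that calculation and a simple low-value check. The function names, the raw counters, and the 50% threshold are assumptions for illustration; the test itself reports the percentage directly.

```python
def hit_percentage(cache_hits: int, total_requests: int) -> float:
    """Percentage of requests that were serviced by the cache."""
    if total_requests == 0:
        return 0.0  # no requests in the interval, so nothing was served from cache
    return 100.0 * cache_hits / total_requests


def is_cache_usage_poor(hit_pct: float, threshold_pct: float = 50.0) -> bool:
    """Flag values below a threshold (the 50% default is arbitrary)."""
    return hit_pct < threshold_pct


pct = hit_percentage(900, 1000)
print(pct, is_cache_usage_poor(pct))  # 90.0 False
```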

Average response time

Indicates the time taken by this LUN to respond to I/O requests.

Microsecs

Ideally, this value should be low. A high value implies that the LUN is slow to respond to I/O requests.

Queue depth

Indicates the number of requests that are in queue for this LUN.

Number

A consistent increase in this value indicates a potential processing bottleneck with the LUN.
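
One way to interpret "a consistent increase" is that each of the most recent samples is higher than the one before it. The sketch below applies that interpretation. The function name, the window of five samples, and the strict-increase rule are all assumptions, and other definitions (for example, a rising moving average) would be equally reasonable.

```python
from typing import Sequence


def is_consistently_increasing(samples: Sequence[float], window: int = 5) -> bool:
    """True when the last `window` samples rise strictly from one to the next."""
    if window < 2:
        raise ValueError("window must be at least 2")
    if len(samples) < window:
        return False  # not enough history to judge a trend
    recent = samples[-window:]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Made-up queue depth samples, oldest first.
print(is_consistently_increasing([1, 2, 3, 5, 8, 13]))  # True
print(is_consistently_increasing([1, 2, 3, 5, 4, 13]))  # False
```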