Clariion LUNs Test

This test reports the current state of each LUN on a storage system, and measures the level of I/O activity on the LUNs.

Target of the test : An EMC CLARiiON storage device

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each LUN that is monitored on the storage system.

Configurable parameters for the test
Parameter Description

Test Period

Specify how often the test should be executed.

Host

The IP address of the storage device for which this test is to be configured.

Port

The port number at which the storage device listens. The default is NULL.

CLARiiON IP

By default, the host IP is displayed here. If the eG agent has also been configured to use the SMI-S provider for metrics collection, then the IP address of the host on which the SMI-S provider is installed is displayed here by default. In this case, change the value of this parameter to the IP address of the EMC CLARiiON storage device. However, if the eG agent uses only the NaviSphere CLI for monitoring, the default setting can remain.

NaviseccliPath

The eG agent uses the command-line utility, NaviSecCli.exe, which is part of the NaviSphere Management Suite, to communicate with and monitor the storage device. To enable the eG agent to invoke the CLI, configure the full path to the CLI in the NaviseccliPath text box.
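To give a rough idea of what invoking the CLI at the configured path could look like, the sketch below assembles a NaviSecCLI command line that lists LUN properties. The executable path, IP address, and credentials are hypothetical placeholders. The use of the getlun subcommand with the -h, -User, -Password, and -Scope switches is an assumption about how such a query might be issued, not a description of the command the eG agent actually runs.

```python
def build_getlun_command(cli_path, sp_ip, user, password, scope=0):
    """Return an argument list for a NaviSecCLI 'getlun' query (assumed syntax)."""
    return [
        cli_path,              # value of the NaviseccliPath parameter
        "-h", sp_ip,           # address of the storage device (CLARiiON IP)
        "-User", user,
        "-Password", password,
        "-Scope", str(scope),  # 0 is taken to mean a global account (assumption)
        "getlun",
    ]


# Hypothetical values, for illustration only.
cmd = build_getlun_command(
    r"C:\Program Files\EMC\Navisphere CLI\NaviSECCli.exe",
    "192.168.10.20",
    "monitor",
    "secret",
)
print(" ".join(cmd))
```

The function only builds the argument list; passing it to a process launcher is left out so that the sketch can be read without access to a storage device.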

User Name and Password

Provide the credentials of a user who is authorized to access the storage device in the User Name and Password text boxes.

Confirm Password

Confirm the password by retyping it here.

Ignore Disabled LUNs

By default, this flag is set to No, indicating that the test monitors all LUNs. Set this flag to Yes if you want the test to monitor only the 'enabled' LUNs.

Exclude LUNs

Provide a comma-separated list of LUNs that you want to exclude from the monitoring scope of this test. By default, this is set to none, indicating that no LUNs are excluded.
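The effect of an exclusion list can be pictured with a minimal sketch like the one below. The function names and LUN names are hypothetical, and exact name matching is an assumption; the test itself may match LUNs differently.

```python
def parse_exclude_luns(value):
    """Turn an Exclude LUNs setting into the set of LUN names to skip."""
    text = value.strip()
    if not text or text.lower() == "none":
        return set()  # 'none' means nothing is excluded
    return {item.strip() for item in text.split(",") if item.strip()}


def luns_to_monitor(all_luns, exclude_value):
    """Keep only the LUNs that are not named in the exclusion list."""
    excluded = parse_exclude_luns(exclude_value)
    return [lun for lun in all_luns if lun not in excluded]


# Hypothetical LUN names, for illustration only.
print(luns_to_monitor(["LUN 0", "LUN 1", "LUN 2"], "LUN 1, LUN 2"))
print(luns_to_monitor(["LUN 0", "LUN 1"], "none"))
```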

Timeout

Indicate the duration (in seconds) for which this test should wait for a response from the storage device. By default, this is set to 120 seconds. Note that the 'Timeout' value must be between 3 and 600 seconds.
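The permitted range can be expressed as a simple check. This is a minimal sketch with hypothetical names; treating 3 and 600 as allowed values is an assumption, since the text does not say whether the bounds are inclusive.

```python
DEFAULT_TIMEOUT = 120               # seconds, the documented default
MIN_TIMEOUT, MAX_TIMEOUT = 3, 600   # documented limits, assumed inclusive


def validate_timeout(seconds=DEFAULT_TIMEOUT):
    """Return the timeout if it lies within the permitted range, else raise."""
    if not MIN_TIMEOUT <= seconds <= MAX_TIMEOUT:
        raise ValueError(
            f"Timeout must be between {MIN_TIMEOUT} and {MAX_TIMEOUT} seconds"
        )
    return seconds


print(validate_timeout())     # falls back to the default
print(validate_timeout(300))
```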

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency. 
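A setting such as 1:1 or none could be interpreted along the lines of the sketch below. The function name is hypothetical, and reading the value as normal frequency followed by abnormal frequency is an assumption based on the description above.

```python
def parse_dd_frequency(value):
    """Parse a DD Frequency setting.

    Returns None when detailed diagnosis is disabled ('none'), otherwise a
    (normal, abnormal) tuple of integers (assumed ordering).
    """
    text = value.strip().lower()
    if text == "none":
        return None
    normal, abnormal = (int(part) for part in text.split(":"))
    return normal, abnormal


print(parse_dd_frequency("1:1"))
print(parse_dd_frequency("none"))
```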

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

State

Indicates the current state of this LUN.

Status

If the state reported by this measure is Bound, it indicates that the LUN is currently in a bound state. A bind creates LUNs on a RAID GROUP. Binding a LUN involves the preparation of allocated storage space. This preparation is particularly important when storage capacity is being reallocated for reuse.

LUNs are bound after RAID GROUPS are created. LUNs are available for use immediately after they are created, but the bind is not strictly complete until after all the bound storage has been prepared and verified.

During the preparation step, the storage allocated to the LUN is overwritten with binary zeroes. These zeroes erase any previous data from the storage and set up for the parity calculation. When zeroing is complete, parity and metadata are calculated for the LUN sectors.

If the state reported by this measure is Unbound, it indicates that the LUN is currently in an unbound state.

The numeric values that correspond to each of the states discussed above are as follows:

State      Numeric Value
Bound      1
Unbound    0

Note:

By default, this measure reports the values Bound or Unbound to indicate the state of a LUN. The graph of this measure, however, represents the LUN state using the numeric equivalents, 0 or 1.

Use the detailed diagnosis of this measure to view additional details of a LUN.
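The relationship between the reported state and the value plotted in the graph amounts to a small lookup, sketched below with hypothetical names.

```python
# Reported state -> numeric value used in the graph of the measure.
STATE_TO_NUMERIC = {"Bound": 1, "Unbound": 0}
NUMERIC_TO_STATE = {number: state for state, number in STATE_TO_NUMERIC.items()}


def state_to_numeric(state):
    """Return the numeric equivalent of a LUN state."""
    return STATE_TO_NUMERIC[state]


print(state_to_numeric("Bound"))
print(NUMERIC_TO_STATE[0])
```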

Total hard errors

Indicates the number of hard errors on this LUN.

Number

The values and their respective states are listed below:

  • 10 - critical
  • 5 - major
  • 2 - minor

An increase in the value of this measure indicates that the LUN is nearing the end of its life or is likely to fail.
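One way to read the values listed above is as lower bounds for each state. The sketch below follows that reading; the "at or above" comparison and the "normal" label for counts below 2 are assumptions, not documented behavior.

```python
# (lower bound, state) pairs, checked from the most to the least severe.
THRESHOLDS = [(10, "critical"), (5, "major"), (2, "minor")]


def classify_errors(count):
    """Map an error count to a state, assuming each value is a lower bound."""
    for limit, state in THRESHOLDS:
        if count >= limit:
            return state
    return "normal"  # hypothetical label for counts below the lowest value


for sample in (0, 2, 7, 12):
    print(sample, classify_errors(sample))
```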

Total soft errors

Indicates the total number of uncorrected read and write errors on this LUN.

Number

The values and their respective states are listed below:

  • 10 - critical
  • 5 - major
  • 2 - minor

An increase in the value of this measure indicates that the disk is nearing the end of its life or is likely to fail.

Average queue requests

Indicates the average number of requests to this LUN that are in queue.

Number

A very high value could indicate a processing bottleneck on this LUN.

Current read cache hits

Indicates the number of times read requests to this LUN were fulfilled by the read cache.

Number

A high value is desired for this measure.

Current write cache hits

Indicates the number of times write requests to this LUN were fulfilled by the write cache.

Number

A high value is desired for this measure.

Read cache misses

Indicates the number of times read requests to this LUN were not serviced by the read cache.

Number

Ideally, the value of this measure should be low.

Read hit ratio

Indicates the percentage of read requests to this LUN that were serviced by the cache.

Percent

Ideally, the value of this measure should be high. A low value indicates that many read requests are serviced by direct disk accesses, which is a more expensive operation in terms of processing overheads.
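As a worked illustration, a hit ratio can be derived from hit and miss counts as shown below. The formula, hits divided by hits plus misses, is an assumption about how such a percentage is typically computed; the test may derive it differently.

```python
def hit_ratio(hits, misses):
    """Return the percentage of requests that were served from the cache."""
    total = hits + misses
    if total == 0:
        return 0.0  # no requests in the interval, so nothing to report
    return 100.0 * hits / total


# 900 cache hits and 100 misses give a ratio of 90 percent.
print(hit_ratio(900, 100))
```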

Write hit ratio

Indicates the percentage of write requests to this LUN that were serviced by the cache.

Percent

Ideally, the value of this measure should be high. A low value indicates that data is often directly written to the disk, which is a more expensive operation in terms of processing overheads.

Read requests

Indicates the number of read requests made per second to this LUN.

Reqs/Sec

Comparing the value of these measures across LUNs will clearly indicate which LUN is the busiest in terms of the number of read and write requests handled – it could also shed light on irregularities in load balancing across the LUNs. 

Write requests

Indicates the number of write requests made per second to this LUN.

Reqs/Sec

Data reads

Indicates the rate at which data was read from this LUN.

Blocks/Sec

Comparing the value of these measures across LUNs will clearly indicate which LUN is the busiest in terms of the rate at which data is read and written – it could also shed light on irregularities in load balancing across the LUNs. 

Data writes

Indicates the rate at which data was written to this LUN.

Blocks/Sec
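The cross-LUN comparison suggested above can be sketched as follows. The function names and sample figures are hypothetical, and summing the read and write rates into a single load figure is just one possible way to rank LUNs.

```python
def busiest_lun(samples):
    """samples maps a LUN name to a (read_rate, write_rate) pair."""
    return max(samples, key=lambda name: sum(samples[name]))


def load_share(samples):
    """Return each LUN's share of the combined read and write load, in percent."""
    total = sum(sum(rates) for rates in samples.values())
    if total == 0:
        return {name: 0.0 for name in samples}
    return {name: 100.0 * sum(rates) / total for name, rates in samples.items()}


# Hypothetical rates, for illustration only.
rates = {"LUN 0": (120, 30), "LUN 1": (800, 200), "LUN 2": (40, 10)}
print(busiest_lun(rates))
print(load_share(rates))
```

A share that is far larger for one LUN than for the others is the kind of load-balancing irregularity the comparison is meant to expose.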

Total I/O

Indicates the rate of the I/O activity on this LUN.

Number

 

Rebuild process completion

Indicates the percentage of this LUN that has been rebuilt.

Percent

A rebuild replaces a failed hard disk within a RAID group with an operational disk. If one or more LUNs are bound to the RAID group with the failed disk, then all the LUNs affected by the failure are rebuilt. A rebuild restores a LUN to its fully assigned number of hard drives using an available hot spare should a drive in one of the RAID groups fail. LUNs are rebuilt one by one. Each LUN is rebuilt by its owning Storage Processor (SP).

LUN binding completion

Indicates the percentage of the LUN binding process that is complete. 

Percent

A bind is an information organization, data security, and data integrity feature of CLARiiON. Binding a LUN involves the preparation of allocated storage space. This preparation is particularly important when storage capacity is being reallocated for reuse. This reuse of storage includes erasing any previous data found on the hard drives, and the setting of parity and metadata for the storage.

LUNs are typically available for use immediately after they are bound. However, the bind is not strictly complete until after all the bound storage has been prepared and verified. Depending on the LUN size and verify priority, these two steps may take several hours. Using the value of this measure, you will be able to track the progress of the binding function, and will be able to gauge how much longer it will take for the binding to complete.
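To gauge how much longer a bind might take, two readings of this measure can be extrapolated. The sketch below assumes that binding progresses at a constant rate between readings, which may not hold in practice; the function name is hypothetical.

```python
def estimate_remaining_minutes(pct_then, pct_now, minutes_between):
    """Linearly extrapolate the time left from two completion readings."""
    progress = pct_now - pct_then
    if progress <= 0:
        return None  # no measurable progress, so no estimate is possible
    return (100.0 - pct_now) * minutes_between / progress


# 20% complete, then 35% complete half an hour later.
print(estimate_remaining_minutes(20, 35, 30))
```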

LUN capacity

Indicates the total capacity of this LUN.

GB

 

LUN size

Indicates the LUN size in blocks.

Blocks

 

The detailed diagnosis of the State measure reveals whether the target LUN is a private LUN or not, the RAID group to which the LUN belongs, the RAID type, and the storage group name.

Figure 1 : The detailed diagnosis of the State measure