EMC PowerVault ME Volumes Test

A volume is a logical subdivision of a vdisk, and can be mapped to controller host ports for access by hosts. A mapped volume provides the storage for a file system partition that you create with your operating system or third-party tools. The storage system presents only volumes, not vdisks, to hosts. Consequently, if even a single volume in the EMC PowerVault ME storage system cannot process I/O requests from hosts quickly, it can degrade the user experience with the entire storage system. To improve the fault tolerance and I/O performance of a volume, you can set cache options for individual volumes. A well-tuned cache goes a long way toward reducing direct volume accesses and the related I/O processing overheads; without one, processing slowdowns become inevitable. During heavy load, weak load-balancing algorithms can aggravate the slowdown and further impair the user experience with the storage system.

To avoid this, administrators need to continuously monitor the I/O load, processing ability, and cache usage of every volume in the storage system, proactively detect I/O processing latency, and rapidly determine its exact cause: an improperly configured cache, an ineffective load-balancing engine, or both. They can then promptly rectify the root cause and restore normal operations. This is where the EMC PowerVault ME Volumes test helps.

This test auto-discovers the volumes in the storage system and reports how well each volume handles the I/O requests it receives. The test also tracks the cache usage of every volume and reveals whether any cache has been grossly under-utilized. This way, the test turns the spotlight on volumes that are experiencing a slowdown and reveals what is causing it: load-balancing irregularities across volumes or badly configured caches.

Target of the test : An EMC PowerVault ME storage system

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each volume being monitored

Configurable parameters for the test
Parameter Description

Test period

How often the test should be executed.

Host

The host for which the test is to be configured. Since the storage device is managed using the IP address of its storage controller, the same will be displayed as host. In case of a dual-controller configuration, the IP address of the primary controller will be displayed here.

Port

The port number at which the specified host listens. By default, this is NULL.

Additional Controller IP

By default, this test always connects to the Host to collect metrics. If the Host is unavailable, the test cannot execute, because the Additional Controller IP is set to none by default.

If the monitored storage device has two controllers, you can configure the test to connect to an alternate controller whenever the Host is unreachable. To do so, specify the IP address of the alternate controller in the Additional Controller IP text box.
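The fallback behaviour described above can be sketched as follows. This is only an illustration, not the eG agent's actual implementation: `pick_controller` and the `is_reachable` callback are hypothetical names, and the string "none" stands for an unset Additional Controller IP, as the text describes.

```python
from typing import Callable, Optional


def pick_controller(
    host: str,
    additional_controller_ip: str,
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the controller address to poll, or None if neither responds.

    Hypothetical helper; the real agent's logic is not documented here.
    """
    # Always prefer the configured Host (the primary controller).
    if is_reachable(host):
        return host
    # "none" (the default) means no alternate controller was configured.
    alternate = additional_controller_ip.strip()
    if alternate and alternate.lower() != "none" and is_reachable(alternate):
        return alternate
    return None
```

With the default of none, an unreachable Host yields `None`, which corresponds to the test being unable to execute.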

User and Password

To monitor an EMC PowerVault ME storage system, the eG agent has to be configured with the credentials of a user who has been assigned the Monitor role. Specify the login credentials of such a user in the User and Password text boxes. To know how to create such a user, refer to Pre-requisites for monitoring the EMC PowerVault ME storage system.

Confirm Password

Confirm the password by retyping it here.

ServicePort

The Management Controller of the EMC PowerVault ME storage system provides access for monitoring and management over HTTP and HTTPS, using XML API request/response semantics. To enable the eG agent to access the management controller, invoke the XML API commands, and collect the required metrics, specify the service port on the controller that listens for HTTP/HTTPS XML API requests. By default, this is port 80.

Timeout

Specify the time duration for which this test should wait for a response from the storage system in the Timeout text box. By default, this is 60 seconds.

SSL

By default, the EMC PowerVault ME system is not SSL-enabled, so this flag is set to False. If the system is SSL-enabled, change this flag to True.
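As an illustration of how the Host, ServicePort, and SSL parameters relate, the sketch below composes the base address of the management controller from them. The function name `build_base_url` is hypothetical, and no XML API path is shown because none is given here; the defaults mirror those stated above (port 80, SSL set to False).

```python
def build_base_url(host: str, service_port: int = 80, ssl: bool = False) -> str:
    """Compose the management controller's base URL from the test parameters.

    Hypothetical helper for illustration only.
    """
    # The SSL flag selects the protocol; the port is whatever ServicePort says.
    scheme = "https" if ssl else "http"
    return f"{scheme}://{host}:{service_port}"
```

When SSL is set to True, ServicePort will usually need to point at the controller's HTTPS listener as well (443 is the conventional HTTPS port, though the actual value depends on how the controller is configured).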

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Data transmitted

Indicates the rate at which data is transmitted through this volume of this vdisk during the last measurement period.

MB/Sec

This is a good indicator of the load on the volume. You can compare the value of this measure across volumes to figure out whether the load has been distributed uniformly across all volumes or a few volumes are overloaded. In case of the latter, you may have to fine-tune the load-balancing algorithm used.
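The cross-volume comparison suggested above could be automated along these lines. This is a minimal sketch under stated assumptions: `overloaded_volumes` is a hypothetical helper, the input is a mapping of volume name to its Data transmitted value in MB/Sec, and the 1.5x threshold is an arbitrary illustrative choice rather than a vendor recommendation.

```python
from statistics import mean
from typing import Dict, List


def overloaded_volumes(mb_per_sec: Dict[str, float], factor: float = 1.5) -> List[str]:
    """List volumes whose throughput exceeds `factor` times the mean.

    `factor` is an illustrative threshold, not a documented default.
    """
    if not mb_per_sec:
        return []
    average = mean(mb_per_sec.values())
    # A volume far above the average suggests uneven load distribution.
    return sorted(name for name, rate in mb_per_sec.items() if rate > factor * average)
```

An empty result suggests the load is spread fairly evenly; a non-empty one names the volumes worth a closer look.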

IOPS

Indicates the rate at which the I/O operations were performed by this volume during the last measurement period.

IOPS

This measure serves as a good indicator of the I/O processing ability of the volume. A consistent drop in this value is hence a cause for concern, as it indicates a processing slowdown.
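One simple way to define a "consistent drop" is a run of strictly decreasing readings. The sketch below assumes that definition; `is_consistent_drop` and the window of four samples are hypothetical choices made for illustration.

```python
from typing import Sequence


def is_consistent_drop(iops: Sequence[float], min_samples: int = 4) -> bool:
    """True if the last `min_samples` readings are strictly decreasing."""
    if len(iops) < min_samples:
        return False  # not enough history to call it a trend
    recent = iops[-min_samples:]
    # Compare each reading with the one that follows it.
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))
```

A single dip followed by a recovery does not qualify, which keeps short-lived fluctuations from being reported as a slowdown.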

Reads

Indicates the rate at which the read operations were performed on this volume during the last measurement period.

Reads/Sec

Ideally, the value of this measure should be high. A steady dip in this value could indicate a potential read bottleneck. In that case, check the values of the Read cache hits and Read cache misses measures to figure out whether under-utilization of the cache is causing the read delay.

Read cache hits

Indicates the rate at which blocks were read from the cache instead of from this volume.

Hits/Sec

Ideally, the value of the Read cache hits measure should be high, and the value of the Read cache misses measure should be low. A consistent drop in cache hits accompanied by a steady increase in cache misses over the same time frame indicates ineffective read cache usage, which can slow down read request servicing. To improve read cache usage, consider turning on read-ahead caching. You can optimize a volume for sequential reads or streaming data by changing its read-ahead cache settings. Read-ahead is triggered by two back-to-back accesses to consecutive LBA ranges, whether forward (increasing LBAs) or reverse (decreasing LBAs).

You can change the amount of data read in advance after two back-to-back reads are made. Increasing the read-ahead cache size can greatly improve performance for multiple sequential read streams; however, increasing read-ahead size will likely decrease random read performance.

  • The Default option works well for most applications: it sets one chunk for the first access in a sequential read and one stripe for all subsequent accesses. The size of the chunk is based on the chunk size used when you created the vdisk (the default is 64 KB). Non-RAID and RAID-1 vdisks are considered to have a stripe size of 64 KB.
  • Specific size options let you select an amount of data for all accesses.
  • The Maximum option lets the controller dynamically calculate the maximum read-ahead cache size for the volume. For example, if a single volume exists, this setting enables the controller to use nearly half the memory for read-ahead cache. Only use Maximum when disk latencies must be absorbed by cache. For example, for read-intensive applications, you will want data that is most often read to be in cache so that the response to the read request is very fast; otherwise, the controller has to locate which disks the data is on, move it up to cache, and then send it to the host.
  • Do not use Maximum if more than two volumes are owned by the controller on which the read-ahead setting is being made. If there are more than two volumes, there is contention on the cache as to which volume’s read data should be held and which has the priority; each volume constantly overwrites the other volume’s data in cache, which could result in taking a lot of the controller’s processing power.
  • The Disabled option turns off read-ahead cache. This is useful if the host is triggering read ahead for what are random accesses. This can happen if the host breaks up the random I/O into two smaller reads, triggering read-ahead. 
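The read-ahead trigger condition described above (two back-to-back accesses to consecutive LBA ranges, forward or reverse) can be expressed as a small check. This is a sketch of that rule as stated in the text, not the controller firmware's logic; `triggers_read_ahead` and the `(start LBA, block count)` representation are assumptions made for illustration.

```python
from typing import Tuple

Access = Tuple[int, int]  # (starting LBA, number of blocks); hypothetical shape


def triggers_read_ahead(previous: Access, current: Access) -> bool:
    """True if two back-to-back accesses cover consecutive LBA ranges."""
    prev_start, prev_len = previous
    cur_start, cur_len = current
    forward = cur_start == prev_start + prev_len  # increasing LBAs
    reverse = cur_start + cur_len == prev_start   # decreasing LBAs
    return forward or reverse
```

This also shows why the Disabled option can help: if a host splits one random read into two adjacent halves, say `(1000, 64)` followed by `(1064, 64)`, the pair satisfies the forward condition even though the workload is not sequential.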

Read cache misses

Indicates the rate at which the read requests to this volume were not serviced by the read cache.

Misses/Sec

 

Writes

Indicates the rate at which the write operations were performed on this volume during the last measurement period.

Writes/Sec

Ideally, the value of this measure should be high. A steady dip in this measure value could indicate a potential writing bottleneck. Under such circumstances, you may want to check the value of the Write cache hits and Write cache misses measures to figure out whether under-utilization of the cache is the cause for the writing delay. 

Write cache hits

Indicates the rate at which the write requests to this volume were fulfilled by the write cache.

Hits/Sec

Ideally, the value of the Write cache hits measure should be high, and the value of the Write cache misses measure should be low. A consistent drop in cache hits accompanied by a steady increase in cache misses over the same time frame indicates ineffective write cache usage, which can slow down write request servicing. To improve write cache usage, consider changing that volume's write-back cache setting.

Write-back is a cache-writing strategy in which the controller receives the data to be written to disks, stores it in the memory buffer, and immediately sends the host operating system a signal that the write operation is complete, without waiting until the data is actually written to the disk. Write-back cache mirrors all of the data from one controller module cache to the other. Write-back cache improves the performance of write operations and the throughput of the controller.
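The write-back strategy described above can be illustrated with a toy model. This is a deliberately simplified sketch: `WriteBackCache` is a hypothetical class, plain dictionaries stand in for controller memory and disks, and the mirroring between controller modules is omitted.

```python
from typing import Dict


class WriteBackCache:
    """Toy write-back cache: acknowledge first, write to disk later."""

    def __init__(self) -> None:
        self.buffer: Dict[int, bytes] = {}  # dirty blocks held in memory
        self.disk: Dict[int, bytes] = {}    # stand-in for the backing disks

    def write(self, lba: int, data: bytes) -> bool:
        # Store the data in the memory buffer and acknowledge immediately,
        # without waiting for the data to reach the disk.
        self.buffer[lba] = data
        return True

    def destage(self) -> int:
        # Flush every dirty block to disk; return how many were written.
        flushed = len(self.buffer)
        self.disk.update(self.buffer)
        self.buffer.clear()
        return flushed
```

The host sees `write` return as soon as the data is buffered; the data only reaches `disk` when `destage` runs, which is where the performance gain comes from.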

Write cache misses

Indicates the rate at which the write requests to this volume were not serviced by the write cache.

Misses/Sec
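The hit and miss rates reported for the read and write caches can be combined into a single hit ratio, which is often easier to trend than two separate rates. The helper below is a hypothetical sketch; the test itself reports only the raw Hits/Sec and Misses/Sec values.

```python
from typing import Optional


def cache_hit_ratio(hits_per_sec: float, misses_per_sec: float) -> Optional[float]:
    """Return hits / (hits + misses) as a percentage, or None when idle."""
    total = hits_per_sec + misses_per_sec
    if total <= 0:
        return None  # no cache activity in this period, so no ratio
    return 100.0 * hits_per_sec / total
```

A ratio that falls steadily over successive measurement periods corresponds to the "drop in hits, rise in misses" pattern described above.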

Data reads

Indicates the rate at which the data was read from this volume during the last measurement period.

MB/Sec

Comparing the values of the Data reads and Data writes measures across volumes reveals which volume is the busiest in terms of the rate at which data is read and written. It can also shed light on irregularities in load balancing across the volumes.

Data writes

Indicates the rate at which the data was written to this volume during the last measurement period.

MB/Sec

Small destages

Indicates the number of times data was flushed from the cache to this volume as less than a full stripe.

Number

 

Full stripe write destages

Indicates the number of times data was flushed from the cache to this volume as a full stripe.

Number
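The two destage counters can be read together as a share of full-stripe flushes. The sketch below is an assumption-laden illustration: `full_stripe_destage_pct` is a hypothetical helper, and no interpretation is given here for these measures, so treating a higher share as preferable is a general storage rule of thumb (partial-stripe writes on parity RAID typically cost extra reads) rather than guidance from this test.

```python
from typing import Optional


def full_stripe_destage_pct(full_stripe: int, small: int) -> Optional[float]:
    """Percentage of destages that wrote a full stripe, or None if there were none."""
    total = full_stripe + small
    if total == 0:
        return None  # nothing was flushed, so the share is undefined
    return 100.0 * full_stripe / total
```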

 

Write cache

Indicates the percentage of space in the write cache of this volume that is currently used.

Percent

A value close to 100% is a cause for concern, as it indicates that the cache is running out of space and may not be able to service subsequent write requests. In such a situation, you are advised to resize the write cache in order to accommodate more entries, so that the cache continues to handle requests and reduce processing overheads.