Monitoring EMC VNX Unified Storage

eG Enterprise offers a specialized VNX Unified Storage monitoring model that tracks the key indicators of EMC VNX performance across components such as the disks, file systems, volumes, DAEs, and LUNs. The model proactively alerts administrators to potential performance bottlenecks, so that they can resolve issues well before end-users complain.

Figure 1 : The layer model of EMC VNX Unified Storage

Each layer of Figure 1 above is mapped to a variety of tests, each of which reports a wealth of performance information related to the VNX unified storage. Using these metrics, administrators can find quick and accurate answers to the following performance queries:

Is the VNX storage system available over the network?
How responsive is VNX to requests over the network?
Are all the hardware components of the VNX storage system up and running? If not, which hardware component is unavailable - is it the fan, the power supply unit, or the LCC?
Is the VNX storage system using network bandwidth optimally? If not, which NIC on VNX is consuming bandwidth excessively?
Is any disk too busy? If so, which one is it?
Which disk is too slow in processing I/O requests? What type of I/O requests does it process very slowly - read or write requests?
Has any disk failed?
Is any disk consuming too much bandwidth? If so, which one is it?
Which disk is running out of disk space?
Are the read and write storage processor (SP) caches used optimally? Which storage processor's cache may require right-sizing, and which cache is it - read or write?
Which SP port is down currently?
Is the SFP (small form-factor pluggable module) of any SP port faulted?
Is any SP over-utilized?
Is any SP idle?
How are the data movers using their caches? Which cache is used least effectively - the directory name lookup cache, the open file cache, or the kernel buffer cache?
Which data mover has too many blocked threads?
Which data mover is experiencing CPU and/or RAM contention?
Is the statmon service on any data mover not running currently?
Which data mover is processing CIFS read/write requests very slowly?
Which data mover is processing NFS read/write requests very slowly?
Which file system on which data mover is using too much storage space?
Are too many I/O requests in queue for any LUN? If so, which LUN is it?
Which LUN is experiencing too many errors? What type of errors are these - hard or soft?
Is any LUN making poor use of its read/write cache?
Which disk volume is running out of space?
Which disk volume has too many pending I/O requests?
Which meta volume is experiencing a processing bottleneck?