Monitoring the Dell EqualLogic PS Series SAN Storage

eG Enterprise offers a specialized Dell EqualLogic monitoring model that monitors the core functions and components of the storage device and proactively alerts administrators to issues in its overall performance and critical operations, so that problems can be fixed before any data loss occurs.

Figure 1: The layer model of the Dell EqualLogic SAN storage

Each layer of this model is mapped to tests that monitor a critical component of the device, such as the disks, the caches, or the storage processors. The eG agent periodically polls the SNMP MIB of the storage device, extracts useful statistics, and reports them to the eG manager.
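The poll, extract, and report cycle described above can be pictured with a minimal Python sketch. It is illustrative only and is not the eG agent's actual implementation: the names `poll_once` and `run_agent`, the metric names, and the OID placeholders are all hypothetical, and the SNMP read is passed in as a plain function so that no particular SNMP library is assumed.

```python
import time
from typing import Callable, Dict

# Hypothetical metric-to-OID map. The strings are placeholders, NOT real
# EqualLogic MIB OIDs; substitute values from the vendor MIB.
HYPOTHETICAL_OIDS: Dict[str, str] = {
    "fan_speed_rpm": "<fan-speed-oid>",
    "temperature_c": "<temperature-oid>",
    "free_space_mb": "<free-space-oid>",
}


def poll_once(snmp_get: Callable[[str], str],
              oids: Dict[str, str]) -> Dict[str, float]:
    """Read each OID once and keep only the values that parse as numbers."""
    metrics: Dict[str, float] = {}
    for name, oid in oids.items():
        try:
            metrics[name] = float(snmp_get(oid))
        except (ValueError, OSError):
            # Unreadable or non-numeric value: skip it for this cycle.
            continue
    return metrics


def run_agent(snmp_get: Callable[[str], str],
              report: Callable[[Dict[str, float]], None],
              oids: Dict[str, str],
              interval_s: float,
              cycles: int) -> None:
    """Poll the device `cycles` times, reporting after every poll."""
    for i in range(cycles):
        report(poll_once(snmp_get, oids))
        if i < cycles - 1:
            time.sleep(interval_s)  # wait until the next polling cycle
```

In this sketch, `snmp_get` stands in for whatever SNMP GET call is available, and `report` stands in for the transport that delivers the metrics to the manager. Keeping both as injected functions makes the loop easy to exercise against a fake device.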

Using these metrics, the following critical performance queries can be answered:

  • Is the storage device available over the network?
  • Is the device responding quickly to client requests or are requests to the device experiencing significant latencies?
  • How many controllers and disks does the device's chassis contain?
  • Are the fans in the storage device operating at normal speeds? Is any fan in an abnormal state?
  • Have any hardware failures occurred recently? If so, which hardware failed?
  • Are all power supply units in the storage device functioning smoothly, or has any unit failed?
  • Is any fan in the power supply unit not operational now?
  • Are the temperature sensors in the device registering normal temperatures, or is any sensor in an abnormal state currently?
  • Does the storage device have adequate disk space resources, or has too much disk space been consumed?
  • Are all disks in the storage device healthy, or are there any unhealthy disks?
  • Do any disks have errors?
  • Has the RAID failed?
  • Are sufficient spare disks available to take the place of ones that may fail?
  • Is the controller cache adequately sized?
  • Does the group's storage pool have enough disk space resources? Has the pool been over-utilized?
  • How many snapshots in the group are currently in use?
  • How many volumes in the group are currently in use?
  • Is the storage device overloaded with connections from initiators?
  • Is the storage device experiencing any read/write latencies?
  • Do the controllers supported by the storage device have enough battery backup? Does any controller have a low voltage or a missing battery?
  • Is any controller's processor experiencing abnormal temperatures?
  • Of the controllers in the storage device, which one is the primary controller and which is the secondary controller?
  • Is the storage device healthy?
  • Is the temperature of the member array within the normal range?
  • Is the member array experiencing any disk space shortage?