Monitoring Apache Storm

eG Enterprise offers a special-purpose monitoring model for Apache Storm that tracks the status and overall performance of the target Apache Storm cluster.

Figure 1 depicts the layer model of Apache Storm.

Figure 1: Layer model for Apache Storm

Every layer in Figure 1 is mapped to various tests that report critical statistics on the performance of the target Apache Storm. Using the metrics reported by these tests, administrators can find accurate answers to the following performance queries:

  • Are there a sufficient number of Supervisors, Topologies and Executors in the Storm cluster?

  • Are the assigned tasks completed by the Storm cluster?

  • Are the total, used and available memory in the Storm cluster sufficient for data processing?

  • Is CPU availability stable?

  • Are the worker slots frequently used in the Storm cluster?

  • Is the Nimbus master node offline?

  • Did any unscheduled reboots of the Nimbus node occur recently?

  • How long has the Nimbus node been up since its last reboot?

  • Does the Supervisor node have a sufficient number of total, used and free worker slots?

  • Does the Supervisor node have a sufficient number of total, used and available CPU cores?

  • Is the CPU available to the Supervisor node sufficient for data processing?

  • Are the total, used and available memory on the Supervisor node sufficient for data processing?

  • Did any unscheduled reboots of the Supervisor node occur recently?

  • How long has the Supervisor node been up since its last reboot?

  • Are there a sufficient number of Topologies, Executors and Workers on the Supervisor node?

  • Are the assigned tasks completed by the owner (user)?

  • Is the total memory used by the owner too high?

  • Is the number of CPU cores guaranteed to the owner sufficient?

  • Is the number of isolated nodes for the owner high?

  • Is the heap/off-heap memory assigned to the owner insufficient?

  • Is the Topology in the target Apache Storm active?

  • Are there a sufficient number of Executors and Workers in the Storm Topology?

  • Are the assigned tasks completed by the Storm Topology?

  • Is the count of Nimbus hosts high?

  • Is the total memory assigned to the Storm Topology insufficient?

  • Is the heap/off-heap memory assigned to the Storm Topology adequate?

  • Is the number of CPU cores assigned to the Storm Topology sufficient?

  • How long has the Topology been up since its last reboot?

  • Are there a sufficient number of Executors on the worker node?

  • Is the heap/off-heap memory assigned to the worker node adequate?

  • Are the CPU cores assigned to the worker node sufficient?

  • How long has the worker node been up since its last reboot?
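
To make a few of the cluster-level questions above concrete, the following minimal Python sketch evaluates a cluster summary that has already been fetched and parsed into a dictionary. It is an illustration only and does not describe how eG Enterprise collects its metrics. The field names (supervisors, topologies, slotsTotal, slotsUsed) are assumed to follow the Storm UI REST API response for /api/v1/cluster/summary; the function name assess_cluster and the 80% slot-usage threshold are hypothetical.

```python
# Illustrative sketch only; not the eG Enterprise implementation.
# Assumption: field names follow the Storm UI REST API cluster summary
# (/api/v1/cluster/summary). The threshold below is a hypothetical value.

def assess_cluster(summary, max_slot_usage_pct=80.0):
    """Return simple findings for an already-parsed Storm cluster summary."""
    slots_total = summary.get("slotsTotal", 0)
    slots_used = summary.get("slotsUsed", 0)
    # Guard against division by zero when no worker slots are configured.
    slot_usage_pct = (100.0 * slots_used / slots_total) if slots_total else 0.0
    return {
        "has_supervisors": summary.get("supervisors", 0) > 0,
        "has_topologies": summary.get("topologies", 0) > 0,
        "slot_usage_pct": round(slot_usage_pct, 1),
        "slots_nearly_exhausted": slot_usage_pct >= max_slot_usage_pct,
    }


# Made-up sample values, used purely to exercise the function.
sample = {
    "supervisors": 3,
    "topologies": 2,
    "slotsTotal": 12,
    "slotsUsed": 10,
    "slotsFree": 2,
}
findings = assess_cluster(sample)
print(findings)
```

With the sample values, 10 of 12 worker slots are in use, so the sketch flags the slots as nearly exhausted. In practice, an appropriate threshold depends on the workload.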

The Operating System, Application Processes, Windows Service and TCP layers are discussed in detail in the Monitoring Unix and Windows Servers document, the tests mapped to the Network layer in the Monitoring Cisco Router document, and the tests mapped to the JVM layer in the Monitoring Java Applications document. The sections that follow therefore discuss only the remaining layers in detail.