Storm Nimbus Test

Apache Storm has two types of nodes: Nimbus (master node) and Supervisor (worker node). Nimbus is the central component of Apache Storm. Its main job is to run the Storm topology: Nimbus analyzes the topology and gathers the tasks to be executed.

Administrators may schedule periodic reboots of the Nimbus master nodes. If a specific node has been up for an unusually long time, an administrator can infer that the scheduled reboot task is not working on that node.

This test, included in the eG agent, monitors the uptime of critical nodes and also alerts administrators if a Nimbus node is offline.

Target of the test : Apache Storm

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for each Nimbus node in the target Apache Storm.

Configurable parameters for the test
Parameter | Description

Test period

How often should the test be executed.

Host

The IP address of the target server that is being monitored.

Port

The port number through which Apache Storm communicates. The default port is 8080.

SSL

By default, the SSL flag is set to No, indicating that the target Apache Storm is not SSL-enabled. To enable the test to connect to an SSL-enabled Apache Storm, set the SSL flag to Yes.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
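
As a rough illustration of the second condition, the sketch below parses a DD Frequency value. It assumes the value takes the form normal:abnormal (as the default of 1:1 suggests) or the keyword none. The function names are hypothetical and are not part of eG Enterprise.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Parse a DD Frequency string such as "1:1".

    Returns (normal, abnormal), or None when the value is "none".
    The "normal:abnormal" layout is an assumption inferred from the
    default value described above.
    """
    text = value.strip().lower()
    if text == "none":
        return None
    normal_text, abnormal_text = text.split(":")
    return int(normal_text), int(abnormal_text)


def dd_toggle_available(value: str) -> bool:
    """True when both frequencies are configured and neither is 0.

    Treating "none" as "not available" is an assumption made for this sketch.
    """
    parsed = parse_dd_frequency(value)
    if parsed is None:
        return False
    normal, abnormal = parsed
    return normal != 0 and abnormal != 0
```

Under these assumptions, `dd_toggle_available("1:1")` is true, while `dd_toggle_available("1:0")` is false. The license condition is not modelled here.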
Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

Status

Indicates the status of the Nimbus master node.

The values reported by this measure and their numeric equivalents are listed in the table below:

Measure Value | Numeric Value
Offline | 0
Not a Leader | 90
Leader | 100

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the status of the Nimbus master node.
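
The Status mapping described above can be sketched as a small lookup. The sample payload is modelled on the Nimbus summary returned by the Storm UI REST API (/api/v1/nimbus/summary). How the eG agent actually collects the status is not stated here, so the payload shape, the function name, and the handling of unknown statuses are assumptions made for illustration.

```python
# Numeric equivalents taken from the Status table in this document.
STATUS_TO_NUMERIC = {
    "Offline": 0,
    "Not a Leader": 90,
    "Leader": 100,
}


def nimbus_status_values(summary: dict) -> dict:
    """Map each Nimbus node ("host:port") to its numeric status.

    `summary` is assumed to look like
    {"nimbuses": [{"host": ..., "port": ..., "status": ...}, ...]}.
    A status that is not in the table raises KeyError rather than
    being guessed.
    """
    values = {}
    for node in summary.get("nimbuses", []):
        name = f"{node['host']}:{node['port']}"
        values[name] = STATUS_TO_NUMERIC[node["status"]]
    return values


# Hypothetical sample data, not output captured from a real cluster.
sample = {
    "nimbuses": [
        {"host": "nimbus1", "port": 6627, "status": "Leader"},
        {"host": "nimbus2", "port": 6627, "status": "Not a Leader"},
    ]
}
print(nimbus_status_values(sample))
```

Keying the result by host and port mirrors the test's output of one set of results per Nimbus node.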

Uptime

Indicates the time period that the Nimbus master node has been up since the last time this test ran.

Hrs/Mins/Secs

If the Nimbus master node has not been rebooted during the last measurement period and the agent has been running continuously, this value will be equal to the measurement period. If the Nimbus master node was rebooted during the last measurement period, this value will be less than the measurement period of the test. For example, if the measurement period is 300 secs, and the Nimbus master node was rebooted 120 secs ago, this metric will report a value of 120 seconds. The accuracy of this metric depends on the measurement period: the smaller the measurement period, the greater the accuracy.
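
The arithmetic behind this interpretation can be sketched in a few lines. This is a minimal illustration, assuming the agent has been running continuously and that the number of seconds since the last reboot is known. The function name is hypothetical.

```python
def uptime_in_period(measurement_period_secs: int, secs_since_reboot: int) -> int:
    """Uptime, in seconds, within the last measurement period.

    If the node was rebooted inside the period, only the time since the
    reboot counts; otherwise the node was up for the whole period.
    """
    return min(measurement_period_secs, secs_since_reboot)


# The example from the text: 300-second period, reboot 120 seconds ago.
print(uptime_in_period(300, 120))   # 120
# No reboot within the period: the value equals the measurement period.
print(uptime_in_period(300, 4000))  # 300
```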

Rebooted

Indicates whether the Nimbus master node has been rebooted during the last measurement period.

The values reported by this measure and their numeric equivalents are listed in the table below:

Measure Value | Numeric Value
Yes | 1
No | 0

Note:

By default, this measure reports the Measure Values listed in the table above to indicate whether the Nimbus master node was rebooted during the last measurement period.

If this measure shows 1, it means that the Nimbus master node was rebooted during the last measurement period. By checking the time periods when this metric changes from 0 to 1, an administrator can determine the times when this Nimbus master node was rebooted.
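
One way to pick out the reboot times from a history of this measure is to look for changes from 0 to 1, as described above. The sketch below assumes the history is available as (timestamp, value) pairs in chronological order. The function name and the sample data are hypothetical.

```python
from typing import List, Tuple


def reboot_times(samples: List[Tuple[str, int]]) -> List[str]:
    """Return the timestamps at which the Rebooted value changes from 0 to 1.

    A first sample of 1 is also reported, since the previous value is
    taken to be 0 when no earlier sample exists.
    """
    times = []
    previous = 0
    for timestamp, value in samples:
        if previous == 0 and value == 1:
            times.append(timestamp)
        previous = value
    return times


# Hypothetical history, one sample per measurement period.
history = [
    ("10:00", 0),
    ("10:05", 0),
    ("10:10", 1),
    ("10:15", 0),
    ("10:20", 1),
]
print(reboot_times(history))
```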