Ignite Cache Rebalance Test
When a new node joins the cluster, some of the partitions are relocated to the new node so that the data remains distributed equally in the cluster. This process is called cache rebalancing. If an existing node permanently leaves the cluster and backups are not configured, you lose the partitions stored on this node. When backups are configured, one of the backup copies of the lost partitions becomes a primary partition and the rebalancing process is initiated.
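The partition movement described above can be illustrated with a small, self-contained sketch. It uses rendezvous (highest-random-weight) hashing, the general idea behind Ignite's `RendezvousAffinityFunction`, but it is not Ignite's implementation. The node names, the partition count of 1024, and the helper names `owners` and `moved_partitions` are assumptions made purely for illustration.

```python
import hashlib
from typing import List


def _score(node: str, partition: int) -> int:
    """Deterministic pseudo-random weight for a (node, partition) pair."""
    digest = hashlib.sha256(f"{node}:{partition}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def owners(partition: int, nodes: List[str], backups: int = 1) -> List[str]:
    """Return [primary, backup, ...]: the highest-scoring nodes for a partition."""
    ranked = sorted(nodes, key=lambda n: _score(n, partition), reverse=True)
    return ranked[: backups + 1]


def moved_partitions(before: List[str], after: List[str],
                     partitions: int = 1024) -> List[int]:
    """Partitions whose primary owner differs between two topologies."""
    return [p for p in range(partitions)
            if owners(p, before)[0] != owners(p, after)[0]]


if __name__ == "__main__":
    old = ["node-a", "node-b", "node-c"]      # hypothetical node names
    new = old + ["node-d"]                     # a fourth node joins
    moved = moved_partitions(old, new)
    print(f"{len(moved)} of 1024 primary partitions relocate to the new node")
```

In this model, a joining node takes over only the partitions for which it scores highest, so the rest of the data stays where it is. When a primary owner leaves, the node that ranked second (the backup) becomes the new primary, which mirrors the backup promotion described above.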
The rebalancing process is key to keeping the cluster balanced and ensuring that data is spread evenly across its nodes. It is therefore important to monitor the rebalancing process to confirm that it is working efficiently and that any issues are addressed before they affect application performance.
This test monitors the rebalancing process and reports key statistics, such as the rebalancing speed, the number of keys already rebalanced, and the number of partitions still being rebalanced, which help administrators assess the efficiency of the rebalancing process.
Target of the test : Apache Ignite Server
Agent deploying the test : An internal or external agent
Outputs of the test : One set of results for each Apache Ignite Server
Parameter | Description |
---|---|
Test period | How often the test should be executed. |
Host | Enter the IP address of the Apache Ignite cluster. |
Port | Enter the port number on which the JMX connector listens for incoming connection requests. |
JMX Remote Port | Enter the port on which the JMX connector listens. The JMX connector listens on 8686 by default. If it listens on a different port in your environment, specify that port here. |
JMX User | Specify the name of the user who is authorized to use JMX. |
JMX Password | Specify the password of the authorized user. |
Confirm Password | Confirm the password by retyping it here. |
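Before configuring the test, it can help to confirm that the JMX port is reachable from the agent host. The sketch below is a minimal pre-check and not part of the test itself. It assumes the standard JMX RMI connector URL form, and the helper names `jmx_service_url` and `port_is_reachable` as well as the sample address are hypothetical.

```python
import socket


def jmx_service_url(host: str, port: int) -> str:
    """Build a JMX service URL, assuming the standard RMI connector layout."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    host, port = "192.168.10.5", 8686   # sample values; 8686 is the documented default
    print(jmx_service_url(host, port))
    print("reachable" if port_is_reachable(host, port) else "not reachable")
```

A successful TCP connection only shows that something is listening on the port. It does not validate the JMX user name or password.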
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Rebalance clearing partitions left | Indicates the number of partitions that need to be cleared before the rebalancing process can start. | Number | This gives an indication of the time left before rebalancing starts. If many partitions have to be cleared before rebalancing can start, the overall process time will be high. |
Number of already rebalanced keys | Indicates the total number of cache keys that have already been rebalanced during the last few rebalancing sessions. | Number | If a large percentage of keys has already been rebalanced, the rebalancing process should complete quickly. If the process is still taking a long time, investigate further. |
Estimated rebalancing speed | Indicates the average speed of data transfer between the nodes during the rebalancing process. | MB/Sec | If the rebalancing speed trends downwards over a number of measurements, it might be a cause for concern. |
Estimated rebalancing speed in keys | Indicates the average number of keys transferred per second between the nodes. | Number | |
Rebalancing partitions on current node | Indicates the number of partitions on the current node that are being rebalanced. | Number | If too many partitions are being rebalanced on a given node, the process might be slow. |
Rebalancing start time | Indicates the time at which rebalancing of local partitions started for the cache. This measure returns 0 if the local partitions do not participate in rebalancing. | Number | If rebalancing has been running for a very long time, the start time tells administrators exactly how long the process has been under way. |
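The start time and speed measurements can be turned into quick checks with a few lines of code. The sketch below is illustrative only. It assumes that the start time is reported as milliseconds since the Unix epoch, which this document does not state, and the helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence


def start_time_utc(start_time_ms: int) -> Optional[str]:
    """Render the start time as an ISO 8601 UTC string.

    Returns None when the value is 0, meaning the local partitions
    are not participating in rebalancing.
    """
    if start_time_ms <= 0:
        return None
    return datetime.fromtimestamp(start_time_ms / 1000, tz=timezone.utc).isoformat()


def rebalance_elapsed_seconds(start_time_ms: int, now_ms: int) -> Optional[float]:
    """Seconds since rebalancing started, or None if it is not in progress."""
    if start_time_ms <= 0:
        return None
    return max(0.0, (now_ms - start_time_ms) / 1000.0)


def speed_trend(samples: Sequence[float], min_points: int = 3) -> str:
    """Classify successive speed measurements as 'rising', 'falling' or 'flat'.

    Uses the sign of the least-squares slope; with fewer than
    min_points samples the trend is reported as 'flat'.
    """
    n = len(samples)
    if n < min_points:
        return "flat"
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    # The slope has the same sign as this covariance term.
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    if cov > 0:
        return "rising"
    if cov < 0:
        return "falling"
    return "flat"
```

For example, `speed_trend([40, 35, 31, 22])` classifies four successive MB/Sec readings as `falling`, and `rebalance_elapsed_seconds` returns `None` rather than a misleading duration when the start time is 0.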