Ignite Cache Rebalance Test

When a new node joins the cluster, some of the partitions are relocated to the new node so that the data remains distributed equally in the cluster. This process is called cache rebalancing. If an existing node permanently leaves the cluster and backups are not configured, you lose the partitions stored on this node. When backups are configured, one of the backup copies of the lost partitions becomes a primary partition and the rebalancing process is initiated.
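
The relocation described above can be illustrated with a toy model. The sketch below is not Ignite's own affinity implementation; it assumes a simplified rendezvous-style assignment in which each partition is owned by the node with the highest hash score, and the class name, node names, and hash function are all hypothetical. It only shows that when a node joins, the partitions that change owner are the ones that move to the new node.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a simplified rendezvous-style partition assignment.
// Class, method, and node names are hypothetical, not Ignite APIs.
final class RebalanceSketch {
    // Mixes a partition id and a node name into a pseudo-random score.
    public static long score(int partition, String node) {
        long h = 31L * node.hashCode() + partition;
        h ^= (h >>> 33);
        h *= 0xff51afd7ed558ccdL;
        h ^= (h >>> 33);
        return h;
    }

    // The owner of a partition is the node with the highest score for it.
    public static String owner(int partition, List<String> nodes) {
        String best = null;
        long bestScore = Long.MIN_VALUE;
        for (String node : nodes) {
            long s = score(partition, node);
            if (best == null || s > bestScore) {
                best = node;
                bestScore = s;
            }
        }
        return best;
    }

    // Lists the partitions whose owner changes when newNode joins the cluster.
    public static List<Integer> movedPartitions(List<String> nodes, String newNode, int partitions) {
        List<String> after = new ArrayList<>(nodes);
        after.add(newNode);
        List<Integer> moved = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            if (!owner(p, nodes).equals(owner(p, after))) {
                moved.add(p);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node-A", "node-B", "node-C");
        List<Integer> moved = movedPartitions(nodes, "node-D", 1024);
        System.out.println(moved.size() + " of 1024 partitions would move to node-D");
    }
}
```

In this model, partitions that do not move keep their original owner, which is why only part of the data has to be transferred when the cluster grows.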

The rebalancing process is key to balancing the cluster nodes and ensuring that data is properly spread across the nodes in the cluster. It is therefore important to monitor the rebalancing process to ensure that it is working efficiently and that any issues are addressed before they can affect application performance.

This test monitors the rebalancing process and reports key statistics, such as rebalancing speed and the number of keys and partitions involved, which help administrators assess the efficiency of the rebalancing process.

Target of the test : Apache Ignite Server

Agent deploying the test : An internal or external agent

Outputs of the test : One set of results for each Apache Ignite Server

Configurable parameters for the test
Parameter	Description
Test period	How often should the test be executed.
Host	Enter the IP address of the Apache Ignite cluster.
Port	Enter the port number on which the JMX connector listens for incoming connection requests.
JMX Remote Port	In this text box, enter the port on which the JMX connector listens. The JMX connector listens on port 8686 by default. If it listens on a different port in your environment, specify that port here.
JMX User	Specify the name of the user who is authorized to use JMX.
JMX Password	Specify the password for the authorized user.
Confirm Password	Confirm the password by retyping it here.
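
As an illustration of how the Host, JMX Remote Port, JMX User, and JMX Password parameters fit together, the sketch below opens a JMX connection using the standard javax.management.remote API. It assumes the target exposes the usual RMI-based JMX connector; the class name IgniteJmxConnect and the default values in main are placeholders, and this is not necessarily how the agent itself connects.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper; assumes an RMI-based JMX connector on the target host.
final class IgniteJmxConnect {
    // Builds the conventional RMI JMX service URL from the Host and JMX Remote Port values.
    public static String serviceUrl(String host, int jmxPort) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + jmxPort + "/jmxrmi";
    }

    // Builds the connection environment; credentials are added only when a user is given.
    public static Map<String, Object> environment(String user, String password) {
        Map<String, Object> env = new HashMap<>();
        if (user != null && !user.isEmpty()) {
            env.put(JMXConnector.CREDENTIALS, new String[] {user, password});
        }
        return env;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder defaults; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8686;

        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url, environment(null, null))) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            System.out.println("MBeans visible: " + connection.getMBeanCount());
        }
    }
}
```

If this connection fails from the agent host, the test is unlikely to report measurements, so it can serve as a quick check of the parameter values.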

Measurements made by the test
Measurement	Description	Measurement Unit	Interpretation
Rebalance clearing partitions left	Indicates the total number of partitions that need to be cleared before the rebalancing process can start.	Number	This gives an indication of the time left before rebalancing starts. If too many partitions must be cleared before the rebalancing process can start, the overall process time will be high.
Number of already rebalanced keys	Indicates the total number of cache keys that were already rebalanced during the last few rebalancing sessions.	Number	If a large percentage of the keys has already been rebalanced, the rebalancing process should complete quickly. If the process is still taking a long time, you may need to investigate.
Estimated rebalancing speed	Indicates the average speed of data transfer between the nodes during the rebalancing process.	MB/Sec	If the rebalancing speed trends upwards over a number of measurements, it might be a cause for concern.
Estimated rebalancing speed in keys	Indicates the average number of keys transferred per second between the nodes.	Number
Rebalancing partitions on current node	Indicates the number of partitions on the current node that are being rebalanced.	Number	If too many partitions are being rebalanced on a given node, the process might be slow.
Rebalancing start time	The time when rebalancing of local partitions started for the cache. This metric will return 0 if the local partitions do not participate in the rebalancing.	Number	If rebalancing has been going on for a very long time, the start time might be of great value to administrators.
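
To show how raw values behind these measures might be interpreted, the sketch below converts a transfer rate to MB/Sec and derives how long a rebalance has been running from its start time. It rests on several assumptions: that the raw rate is in bytes per second, that the start time is in epoch milliseconds, and that a non-positive start time means the local partitions are not participating. The class RebalanceMetrics and its methods are hypothetical, and any attribute name passed to readLong (for example "RebalancingStartTime") should be checked against the MBeans exposed by your Ignite version.

```java
import java.time.Duration;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helpers for interpreting rebalance values read over JMX.
final class RebalanceMetrics {
    // Converts a rate in bytes per second (assumed raw unit) to MB per second.
    public static double toMbPerSec(long bytesPerSec) {
        return bytesPerSec / (1024.0 * 1024.0);
    }

    // Time elapsed since rebalancing started; zero when the start time is not set.
    public static Duration elapsed(long startMillis, long nowMillis) {
        if (startMillis <= 0) {
            return Duration.ZERO; // assumed: local partitions are not rebalancing
        }
        return Duration.ofMillis(Math.max(0, nowMillis - startMillis));
    }

    // Reads a numeric MBean attribute; the attribute name is supplied by the caller.
    public static long readLong(MBeanServerConnection connection, ObjectName bean, String attribute)
            throws Exception {
        return ((Number) connection.getAttribute(bean, attribute)).longValue();
    }
}
```

For example, elapsed(start, System.currentTimeMillis()) gives the running time of the current rebalance, which is easier to compare across measurements than the raw start time.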