AWS MSK Volume Test

This test periodically checks the health and availability status of each volume used by the EC2 instances in the monitored region and notifies administrators if any volume is in an abnormal state. The test also tracks the I/O load on every volume and measures how well each volume processes that load, highlighting overloaded volumes and those experiencing processing hiccups.

Target of the test : AWS Managed Service Kafka

Agent deploying the test : A remote agent

Outputs of the test : One set of results for the target AWS Managed Service Kafka server.

Configurable parameters for the test
Parameter Description

Test Period

Specify how often the test should be executed.

Host

The IP address of the AWS Managed Service Kafka Broker that is being monitored.

Port

Specify the port number at which the specified HOST listens. By default, this is NULL.

AWS Default Region

This test uses the AWS CLI to interact with AWS Managed Service Kafka and pull relevant metrics. To enable the test to connect to AWS, you need to configure the test with the name of the region to which all requests for metrics should be routed by default. Specify the name of this AWS Default Region here.

AWS Access Key ID, AWS Secret Access Key and Confirm Password

To monitor AWS Managed Service Kafka, the eG agent has to be configured with the access key and secret key of a user with a valid AWS account. For this purpose, we recommend that you create a special user on the AWS cloud, obtain the access and secret keys of this user, and configure this test with these keys. The procedure for this is detailed in the Obtaining an Access key and Secret key topic. Make sure you confirm the access and secret keys you provide here by retyping them in the corresponding Confirm Password text box.

Timeout Seconds

Specify the maximum duration (in seconds) for which the test will wait for a response from the server. The default is 10 seconds.
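
The region, key, and timeout parameters above can be pictured with a short sketch. This is only an illustration under stated assumptions, not the eG agent's actual implementation: the helper names build_environment and run_cli are hypothetical, and the sketch assumes the settings are passed to the AWS CLI through its standard environment variables (AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY).

```python
import os
import subprocess
import sys


def build_environment(region, access_key_id, secret_access_key):
    """Copy the current environment and add the AWS CLI settings (hypothetical helper)."""
    env = dict(os.environ)
    env["AWS_DEFAULT_REGION"] = region                  # AWS Default Region parameter
    env["AWS_ACCESS_KEY_ID"] = access_key_id            # AWS Access Key ID parameter
    env["AWS_SECRET_ACCESS_KEY"] = secret_access_key    # AWS Secret Access Key parameter
    return env


def run_cli(command, env, timeout_seconds=10):
    """Run a command; return its stdout, or None if it exceeds the timeout."""
    try:
        completed = subprocess.run(
            command,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # mirrors the Timeout Seconds parameter
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout


if __name__ == "__main__":
    # Placeholder credentials; a stand-in command is used so the sketch runs anywhere.
    env = build_environment("us-east-1", "EXAMPLE_KEY_ID", "EXAMPLE_SECRET")
    show_region = "import os; print(os.environ['AWS_DEFAULT_REGION'])"
    print(run_cli([sys.executable, "-c", show_region], env))
```

With the AWS CLI installed, a call such as run_cli(["aws", "kafka", "list-clusters", "--output", "json"], env) would be issued the same way; which commands the agent actually runs is not stated in this topic.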

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Volume queue length

Indicates the number of read and write operation requests waiting to be completed.

Number

A consistent increase in the value of this measure could indicate an I/O processing bottleneck on the volume.
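
One possible reading of a "consistent increase" in Volume queue length is that the value rises in each of several consecutive measurement periods. The sketch below checks for that pattern; the function name and the default of three consecutive rises are assumptions made for illustration, not thresholds defined by this test.

```python
def is_consistently_increasing(samples, min_run=3):
    """Return True if each of the last min_run intervals shows a strictly higher value."""
    if len(samples) < min_run + 1:
        return False  # not enough history to judge
    recent = samples[-(min_run + 1):]
    # Compare every sample in the window with the one that follows it.
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


if __name__ == "__main__":
    queue_lengths = [2, 2, 3, 5, 8]  # hypothetical readings, oldest first
    print(is_consistently_increasing(queue_lengths))
```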

Volume read data

Indicates the rate at which data was read from this volume.

MB

Compare the value of this measure across volumes to identify the volume that is the slowest in responding to read requests.

Volume read operations

Indicates the rate at which read operations were performed on this volume.

Number

Compare the value of this measure across volumes to identify which volume is too slow in processing read requests.

Volume total read time

Indicates the total time taken by all completed read operations.

Seconds

A very high value for this measure could indicate that the volume took too long to service one or more read requests.

Volume total write time

Indicates the total time taken by all completed write operations.

Seconds

A very high value for this measure could indicate that the volume took too long to service one or more write requests.
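
A total time value is easier to judge once it is divided by the number of operations completed in the same period, which gives an average time per operation. The sketch below shows that arithmetic. It assumes the total time and the operation count cover the same measurement period; the function name is hypothetical.

```python
def average_latency_ms(total_time_seconds, operations):
    """Average time per completed operation, in milliseconds.

    Returns None when no operations completed, since the average is undefined.
    """
    if operations <= 0:
        return None
    return total_time_seconds / operations * 1000.0


if __name__ == "__main__":
    # Hypothetical period: 3 seconds spent on 1500 completed read operations.
    print(average_latency_ms(3.0, 1500))
```

The same calculation applies to writes, using Volume total write time and the number of write operations.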

Volume write data

Indicates the rate at which data was written to this volume.

MB

Compare the value of this measure across volumes to identify the volume that is the slowest in responding to write requests.

Volume write operations

Indicates the rate at which write operations were performed on this volume.

Number

Compare the value of this measure across volumes to identify which volume is too slow in processing write requests.
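
The cross-volume comparisons suggested above can be sketched as a simple ranking. The example flags volumes whose operation rate falls well below the median of all volumes. The volume names, the 50% cutoff, and the function name are assumptions made for illustration; a low rate can also simply mean low demand, so treat the result as a starting point for investigation.

```python
from statistics import median


def lagging_volumes(rate_by_volume, fraction=0.5):
    """Return volumes whose rate is below fraction * median, lowest rate first."""
    if not rate_by_volume:
        return []
    cutoff = median(rate_by_volume.values()) * fraction
    lagging = [(rate, name) for name, rate in rate_by_volume.items() if rate < cutoff]
    return [name for rate, name in sorted(lagging)]


if __name__ == "__main__":
    # Hypothetical write operation rates per volume.
    rates = {"vol-a": 400, "vol-b": 380, "vol-c": 90, "vol-d": 420}
    print(lagging_volumes(rates))
```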