Flume Sink Test
Apache Flume has three main components: Source, Channel, and Sink. A sink consumes data (events) from a channel and delivers it to a destination. The destination might be another agent or a centralized store such as HBase or HDFS. One example of a sink is the HDFS sink.
Because the sink is responsible for delivering data to a centralized store for long-term storage and processing, a sink that malfunctions or is unavailable can cause significant data loss. It is therefore important to monitor each sink, capture its operations, and flag any issue or error that could lead to data loss. The metrics and insights from monitoring help administrators identify and act on potential problems before they turn into failures.
This test monitors every Flume sink and collects key metrics such as the number of complete batches, channel read failures, and connections closed. These metrics help administrators understand the current performance of the system and alert them when intervention is required to fix a problem.
Target of the test : Apache Flume
Agent deploying the test : An internal agent
Outputs of the test : One set of results for each sink in the Apache Flume agent being monitored.
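
To illustrate what "one set of results for each sink" can look like, the sketch below groups counters by sink name. It assumes the counters are already available as a dictionary shaped like the output of Flume's JSON metrics reporting, where each component appears under a key such as `SINK.<name>`. The sample payload, the sink names, and the helper `sink_results` are hypothetical, and this is not necessarily how the test gathers its data.

```python
from typing import Dict


def sink_results(metrics: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, int]]:
    """Return one dictionary of integer counters per sink, keyed by sink name."""
    results: Dict[str, Dict[str, int]] = {}
    for component, counters in metrics.items():
        kind, _, name = component.partition(".")
        if kind != "SINK":
            continue  # skip sources and channels
        numeric: Dict[str, int] = {}
        for key, value in counters.items():
            text = str(value)
            if text.isdigit():  # drop non-numeric fields such as "Type"
                numeric[key] = int(text)
        results[name] = numeric
    return results


# Hypothetical payload: two sinks and one channel.
sample = {
    "SINK.hdfs-sink": {
        "Type": "SINK",
        "EventDrainAttemptCount": "1200",
        "EventDrainSuccessCount": "1180",
    },
    "SINK.avro-sink": {
        "Type": "SINK",
        "EventDrainAttemptCount": "300",
        "EventDrainSuccessCount": "300",
    },
    "CHANNEL.mem-ch": {"Type": "CHANNEL", "ChannelSize": "20"},
}

per_sink = sink_results(sample)
print(sorted(per_sink))
```

Each key of `per_sink` corresponds to one descriptor of the test, and the channel entry is ignored.
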
Parameter | Description |
---|---|
Test period | How often the test should be executed. |
Host | The IP address of the target server that is being monitored. |
Port | The port number through which Apache Flume communicates. The default port is 8080. |
FLUME JMX Remote Port | Specify the port on which JMX listens for requests from remote hosts. Ensure that you specify the same port that you configured in the JVM_OPTS variable of the flume-env.ps1 file. |
JMX Username, Password and Confirm Password | These parameters appear only if the Mode is set to JMX. If JMX requires authentication only (but no security), then ensure that the username and password parameters are configured with the credentials of a user with read-write access to JMX. To know how to create this user, refer to Configuring the eG Agent to Support JMX Authentication. Confirm the password by retyping it in the Confirm Password text box. |
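
The JMX port entered here has to match the port the Flume JVM was started with. As a sketch, the helper below assembles the standard `com.sun.management.jmxremote` system properties for a given port, along with the JMX service URL a remote client would typically use. The function names, the sample host, and port 5445 are illustrative assumptions; how these options are placed in the flume-env.ps1 file depends on your installation.

```python
from typing import List


def jmx_jvm_options(port: int, authenticate: bool = False, ssl: bool = False) -> List[str]:
    """Build the JVM system properties that enable remote JMX on a port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    return [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={port}",
        f"-Dcom.sun.management.jmxremote.authenticate={str(authenticate).lower()}",
        f"-Dcom.sun.management.jmxremote.ssl={str(ssl).lower()}",
    ]


def jmx_service_url(host: str, port: int) -> str:
    """Return the RMI-based JMX service URL for a host and port."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


# Hypothetical values: authentication enabled, SSL disabled.
options = jmx_jvm_options(5445, authenticate=True)
url = jmx_service_url("192.168.10.1", 5445)
print(" ".join(options))
print(url)
```

With `authenticate=True` and `ssl=False`, the options correspond to the "authentication only (but no security)" case described for the JMX Username and Password parameters.
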
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Batch complete count | Indicates the number of batches received by this sink per second whose size equals the maximum batch size. | Batches/Sec | Batches filled to capacity make the best use of Flume resources, but this can only be achieved on high-volume systems. |
Batch empty count | Indicates the number of batches received by this sink per second that contained no events. | Batches/Sec | This is not ideal, because batches are being processed but carry no data. |
Batch under flow count | Indicates the number of batches received by this sink per second that contained fewer events than the maximum batch size. | Batches/Sec | This pattern is optimal for systems that transfer a low volume of data. |
Channel read fail | Indicates the number of events that this sink failed to read from the channel. | Events/Sec | Administrators should check whether both the channel and the sink are in a healthy state and review the logs to understand why reads are failing. |
Connection closed count | Indicates the number of existing connections from this sink to the storage system or next hop that were closed per second. | Connections/Sec | If the number of connections closed is proportionate to a reduction in the number of events, this is fine. Otherwise, some issue may be causing the connections to close, and the remaining connections may not be able to service the event flow. |
Connection created count | Indicates the number of new connections created per second by this sink to the storage system or next hop. | Connections/Sec | If the number of connections created is proportionate to the number of events, this is fine. Otherwise, administrators may need to investigate. |
Connection failed count | Indicates the number of connection requests from this sink to the storage system that failed. | Connections/Sec | |
Event drain attempt count | Indicates the number of events that this sink tried to write to the storage system per second. | Events/Sec | |
Event drain success count | Indicates the number of events that this sink successfully wrote to the storage system per second. | Events/Sec | |
Event write fail | Indicates the number of events that this sink tried but failed to write to the storage system per second. | Events/Sec | |
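
The measures above are reported as per-second rates. The following is a minimal sketch of how such rates could be derived from two snapshots of cumulative sink counters, together with a simple drain success ratio. The counter names are assumed to match Flume's sink counter attributes (for example `EventDrainAttemptCount`); the sample values, the 60-second interval, and the helper names are hypothetical, and this is not necessarily how the test computes its measures.

```python
from typing import Dict, Optional


def per_second_rates(
    previous: Dict[str, int], current: Dict[str, int], interval_seconds: float
) -> Dict[str, float]:
    """Convert two cumulative counter snapshots into per-second rates."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rates: Dict[str, float] = {}
    for name, now in current.items():
        delta = now - previous.get(name, 0)
        if delta < 0:
            delta = now  # counter went backwards, e.g. after an agent restart
        rates[name] = delta / interval_seconds
    return rates


def drain_success_ratio(rates: Dict[str, float]) -> Optional[float]:
    """Share of attempted events that were written; None if nothing was attempted."""
    attempted = rates.get("EventDrainAttemptCount", 0.0)
    if attempted == 0:
        return None
    return rates.get("EventDrainSuccessCount", 0.0) / attempted


# Hypothetical snapshots taken 60 seconds apart.
previous = {
    "EventDrainAttemptCount": 1000,
    "EventDrainSuccessCount": 1000,
    "BatchEmptyCount": 40,
    "ConnectionFailedCount": 2,
}
current = {
    "EventDrainAttemptCount": 1600,
    "EventDrainSuccessCount": 1540,
    "BatchEmptyCount": 70,
    "ConnectionFailedCount": 2,
}

rates = per_second_rates(previous, current, 60)
ratio = drain_success_ratio(rates)
print(rates, ratio)
```

A ratio noticeably below 1, or a non-zero rate of failed connections, would be a reason to check the sink's destination and the agent logs.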