Etcd Snap Database Test

An etcd snap database is a point-in-time backup of the etcd key-value store that captures its consistent state, including all keys and values. It is essential for disaster recovery, because it allows the cluster to be restored after a failure or data loss. Each snapshot is a self-contained copy of the database, which makes it a reliable basis for restoring etcd.

Monitoring the etcd snap database is crucial for ensuring data integrity and quick recovery after a failure. Regular monitoring helps detect potential issues, such as insufficient storage capacity or inconsistent backups, and supports timely disaster recovery, minimizing downtime and data loss for critical Kubernetes cluster components.

The Etcd Snap Database Test continuously monitors the etcd snap database and reports its key metrics. By analyzing these metrics, administrators can determine whether there are any issues with the system.

Target of the test: A Kubernetes master node

Agent deploying the test: A remote agent

Outputs of the test: One set of results for the target Kubernetes master node being monitored

Configurable parameters for the test

| Parameter | Description |
|---|---|
| Test Period | How often the test should be executed. |
| Host | The IP address of the host for which this test is to be configured. |
| Port | Specify the port at which the specified Host listens. By default, this is 6443. |
| Timeout | Specify the duration (in seconds) beyond which the test will time out. The default value is 10 seconds. |
| Metric URL | Each Kubernetes system component exposes monitoring metrics through the /metrics endpoint of its HTTP server. For components that do not expose this endpoint by default, refer to the official documentation of your Kubernetes distribution. Specify the metric URL in the Metric URL text box. |
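To illustrate how the Host, Port, Timeout, and Metric URL parameters fit together, the following is a minimal Python sketch of a metrics request. It is not the test's actual implementation. The function names `build_metric_url` and `fetch_metrics`, the `https` scheme, and the optional bearer token are assumptions made for illustration; the defaults of 6443 and 10 seconds come from the table above.

```python
import urllib.request


def build_metric_url(host, port=6443, path="/metrics", scheme="https"):
    """Compose a metric URL from the Host and Port parameters (hypothetical helper)."""
    return f"{scheme}://{host}:{port}{path}"


def fetch_metrics(url, timeout=10, bearer_token=None):
    """Fetch the raw metrics text, giving up after `timeout` seconds.

    The bearer token is an assumption: many clusters require some form of
    authentication on this endpoint, but the mechanism varies by distribution.
    """
    request = urllib.request.Request(url)
    if bearer_token:
        request.add_header("Authorization", f"Bearer {bearer_token}")
    # urlopen raises an error if the endpoint does not respond within `timeout` seconds.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `build_metric_url("192.0.2.10")` yields `https://192.0.2.10:6443/metrics` (the address is a placeholder). Clusters that use a private certificate authority would also need a suitable TLS context passed to `urlopen`, which this sketch leaves out.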

Measurements made by the test

| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| Total latency distributions of fsyncing .snap.db file | Indicates the number of latency distributions recorded for fsyncing the .snap.db file. | Number | If the number of latency distributions keeps increasing across measurements, this may be a cause for concern. |
| Time spent in latency distributions of fsyncing .snap.db file | Indicates the time spent in latency distributions of fsyncing the .snap.db file. | Seconds | If the time spent in latency distributions keeps increasing across measurements, this may be a cause for concern. |
| Total latency distributions of v3 snapshot | Indicates the number of latency distributions recorded for the creation of v3 snapshots. | Number | If the number of latency distributions keeps increasing across measurements, this may be a cause for concern. |
| Time spent in latency distributions of v3 snapshots | Indicates the total time spent in latency distributions of v3 snapshots. | Seconds | If the time spent in latency distributions keeps increasing across measurements, this may be a cause for concern. |
| Total latency distributions of fsync called by snap | Indicates the number of latency distributions recorded for fsync calls made by snap. | Number | If the number of latency distributions keeps increasing across measurements, this may be a cause for concern. |
| Time spent in latency distributions of fsync called by snap | Indicates the total time spent in latency distributions of fsync called by snap. | Seconds | If the time spent in latency distributions keeps increasing across measurements, this may be a cause for concern. |
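The measurements above can be sketched in Python as values read from the text returned by the metrics endpoint. This sketch assumes that the three measurement pairs correspond to the `_count` and `_sum` series of the etcd histograms `etcd_snap_db_fsync_duration_seconds`, `etcd_snap_db_save_total_duration_seconds`, and `etcd_snap_fsync_duration_seconds`. That mapping is an assumption, not something this page confirms. The helper names `parse_snap_metrics`, `mean_latency`, and `delta` are hypothetical.

```python
import re

# Assumed mapping from etcd histogram names to the measurement pairs in the table above.
HISTOGRAMS = {
    "etcd_snap_db_fsync_duration_seconds": "fsyncing .snap.db file",
    "etcd_snap_db_save_total_duration_seconds": "v3 snapshot",
    "etcd_snap_fsync_duration_seconds": "fsync called by snap",
}

# Matches "<metric name>[{labels}] <value>" lines of the Prometheus text format.
_SAMPLE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)")


def parse_snap_metrics(text):
    """Return {histogram: {"count": ..., "sum": ...}} for the snap histograms found in text."""
    result = {name: {} for name in HISTOGRAMS}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if not match:
            continue
        sample_name, value = match.group(1), match.group(2)
        for suffix in ("_count", "_sum"):
            base = sample_name[: -len(suffix)]
            if sample_name.endswith(suffix) and base in HISTOGRAMS:
                result[base][suffix[1:]] = float(value)
    return result


def mean_latency(stats):
    """Average seconds per recorded operation; 0.0 when nothing has been recorded."""
    count = stats.get("count", 0.0)
    return stats.get("sum", 0.0) / count if count else 0.0


def delta(previous, current):
    """Change in count and sum between two successive polls of the same histogram."""
    return {key: current.get(key, 0.0) - previous.get(key, 0.0) for key in ("count", "sum")}
```

Comparing two successive polls with `delta` gives the growth in time spent and in the number of recorded operations over one test period. Dividing the first by the second gives a per-period average that may be easier to trend than the raw values.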