ApsaraDB for Redis Test

ApsaraDB for Redis is a database service that is compatible with native Redis protocols. It supports a hybrid of memory and hard disks for data persistence. ApsaraDB for Redis provides a highly available hot standby architecture and can scale to meet requirements for high-performance and low-latency read/write operations.

To ensure that every Redis instance in use on the cloud delivers on its promise of high performance at all times, administrators must continuously monitor the status, resource usage, query processing ability, and overall health of every instance, catch potential issues early, and eliminate them before they impact user experience with that instance. This is where the ApsaraDB for Redis test helps!

This test auto-discovers the Redis instances that have been configured, and reports the status of each instance. This way, the test points administrators to inactive, unavailable, and error-prone instances. Additionally, the test measures how quickly an instance processes queries, and in the process, sheds light on probable bottlenecks in query processing. The CPU, memory, connection and bandwidth usage of each instance is also monitored, so that instances experiencing serious resource contentions can be identified quickly. If the contention persists, then administrators can consider resizing the instances to resolve it. This way, the test helps accurately isolate problematic instances, so that administrators can quickly fix those problems and ensure that the critical database service is uninterrupted.

Target of the test : An Alibaba Cloud Account

Agent deploying the test : A remote agent

Outputs of the test : One set of results for every ApsaraDB for Redis instance

Configurable parameters for the test
Parameters	Description
Test period	How often should the test be executed.
Host	The host for which the test is to be configured.
Alibaba Access Key and Alibaba Secret Key	This test makes REST API requests to the Alibaba cloud to pull the metrics. For this purpose, the test needs to be configured with an AccessKey pair. An AccessKey pair is typically used to call an operation of an Alibaba Cloud service. It is also used to initiate an API request or to use a cloud service SDK to manage cloud resources. An AccessKey pair consists of an AccessKey ID and an AccessKey Secret. The AccessKey ID is used to identify a user/cloud account. The AccessKey Secret is used to verify a user/cloud account. The first step to configuring the eG agent with an AccessKey pair is to create an AccessKey pair for the target cloud account. To achieve this, follow the steps below:
1. Log on to the RAM console by using an Alibaba Cloud account.
2. In the left-side navigation pane, click Users under Identities.
3. On the Users page, click the username of the RAM user for which you want to create an AccessKey pair in the User Logon Name/Display Name column.
4. On the page that appears, click Create AccessKey in the User AccessKeys section. Note: You must enter a verification code if you create an AccessKey pair for the first time.
5. Click Close.
Note: The AccessKey secret is displayed only when you create an AccessKey pair. If the AccessKey pair is leaked or lost, you must create a new one. You can create a maximum of two AccessKey pairs.
Make note of the AccessKey ID and AccessKey Secret once they are displayed. Then, configure the Alibaba Access Key parameter of the test with the AccessKey ID, and the Alibaba Secret Key parameter with the AccessKey Secret you made note of. If you failed to make note of the AccessKey ID and AccessKey Secret at the time of creating the AccessKey pair, then you can obtain the same at a later point in time. Similarly, if an AccessKey pair pre-exists for the target cloud account, then you do not have to create another one. Instead, you can obtain the AccessKey ID and AccessKey Secret of the existing AccessKey pair and configure the eG agent with the same. For this, follow the steps below:
1. Use an Alibaba Cloud account to log on to the Alibaba Cloud Management console.
2. Move the pointer over the profile picture in the upper-right corner, and click AccessKey.
3. In the Security Tips message that appears, click Continue to manage AccessKey. The AccessKey ID and AccessKey Secret are displayed.
4. Make note of the displayed ID and secret.
Then, configure the Alibaba Access Key parameter of the test with the AccessKey ID, and the Alibaba Secret Key parameter with the AccessKey Secret you made note of.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: (a) the eG manager license should allow the detailed diagnosis capability; and (b) both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
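
Before entering the AccessKey pair into the Alibaba Access Key and Alibaba Secret Key parameters, it can help to confirm that both halves of the pair were actually captured. The following is a minimal sketch of such a check, not part of the eG agent. The environment variable names ALIBABA_ACCESS_KEY_ID and ALIBABA_ACCESS_KEY_SECRET, the AccessKeyPair class, and the load_access_key_pair function are all hypothetical names chosen for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessKeyPair:
    """Holds the two values that the test parameters expect."""
    access_key_id: str      # value for the Alibaba Access Key parameter
    access_key_secret: str  # value for the Alibaba Secret Key parameter

    def masked_secret(self) -> str:
        """Return the secret with all but its last 4 characters hidden."""
        visible = self.access_key_secret[-4:]
        return "*" * (len(self.access_key_secret) - len(visible)) + visible


def load_access_key_pair(env=os.environ) -> AccessKeyPair:
    """Read the pair from a mapping and reject an incomplete pair."""
    # Hypothetical variable names; use whatever your environment provides.
    key_id = env.get("ALIBABA_ACCESS_KEY_ID", "").strip()
    secret = env.get("ALIBABA_ACCESS_KEY_SECRET", "").strip()
    if not key_id or not secret:
        raise ValueError(
            "Both the AccessKey ID and the AccessKey Secret are required"
        )
    return AccessKeyPair(key_id, secret)
```

Masking the secret before printing or logging it is a design choice that follows from the note above: a leaked AccessKey pair has to be replaced with a new one.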

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Status

Indicates the current status of this instance.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value	Numeric Value
Normal	1
Creating	2
Changing	3
Flushing	4
Transforming	5
BackupRecovering	6
MinorVersionUpgrading	7
NetworkModifying	8
SSLModifying	9
MajorVersionUpgrading	10
Released	11
Inactive	12
Unavailable	13
Error	14

The Measure Values discussed in the table are described in detail below:

Normal: The instance runs as expected.
Creating: The instance is being created.
Changing: The configuration of the instance is being changed.
Inactive: The instance is disabled.
Flushing: The data of the instance is being flushed.
Released: The instance is released.
Transforming: The instance is being transformed.
Unavailable: The service is unavailable.
Error: Failed to create the instance.
Migrating: The instance is being migrated.
BackupRecovering: The instance is being backed up or restored.
MinorVersionUpgrading: The minor version is being upgraded.
NetworkModifying: The network is being changed.
SSLModifying: The SSL feature is being changed.
MajorVersionUpgrading: The major version is being upgraded and the service is available.

Note:

This measure reports the Measure Values listed in the table above to indicate the current state of a Redis instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Use the detailed diagnosis of this measure to view the complete details of the Redis instance.
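
Because the graph of the Status measure shows numeric equivalents only, a small lookup can translate a plotted number back into its Measure Value. The sketch below copies the mapping from the table above. The function names describe_status and needs_attention are hypothetical, and treating Inactive, Unavailable, and Error as the states that need attention is an assumption based on the test overview, not a rule enforced by the test.

```python
# Measure Value -> Numeric Value, as listed in the Status table.
STATUS_CODES = {
    "Normal": 1,
    "Creating": 2,
    "Changing": 3,
    "Flushing": 4,
    "Transforming": 5,
    "BackupRecovering": 6,
    "MinorVersionUpgrading": 7,
    "NetworkModifying": 8,
    "SSLModifying": 9,
    "MajorVersionUpgrading": 10,
    "Released": 11,
    "Inactive": 12,
    "Unavailable": 13,
    "Error": 14,
}

# Reverse lookup: numeric value seen in the graph -> Measure Value.
CODE_TO_STATUS = {code: name for name, code in STATUS_CODES.items()}

# Assumption: these are the states an administrator should look into first.
PROBLEM_STATES = {"Inactive", "Unavailable", "Error"}


def describe_status(code: int) -> str:
    """Translate a numeric value into its Measure Value, or 'Unknown'."""
    return CODE_TO_STATUS.get(code, "Unknown")


def needs_attention(code: int) -> bool:
    """Return True if the numeric value maps to a problem state."""
    return describe_status(code) in PROBLEM_STATES
```

For example, describe_status(6) yields BackupRecovering, and needs_attention(13) is True because 13 corresponds to Unavailable.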

Is RDS?

Indicates whether the instance is managed by Relational Database Service (RDS).

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value	Numeric Value
True	1
False	0

Note:

This measure reports the Measure Values listed in the table above to indicate whether/not the target instance is managed by RDS. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Has renewal change order?

Indicates whether there was an order of renewal with configuration change that had not taken effect.

After a subscription instance expires, you must renew the instance within days after the expiration to continue the use of the instance. To avoid service interruption caused by an expired subscription, we recommend that you manually renew the instance or enable auto-renewal before the instance expires.

You can change the specifications of a subscription instance before or after the instance expires. Higher specifications are charged more than lower specifications. For example, the price of an 8 GB read/write splitting instance with 5 read replicas is higher than that of a 16 GB cluster instance. If you want to change a 16 GB cluster instance to an 8 GB read/write splitting instance with 5 read replicas, you must upgrade the instance.

If a renewal order that includes a specification change has been placed but has not yet taken effect, then this measure will report the value True. If no such order is pending, then the value of this measure will be False.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value	Numeric Value
True	1
False	0

Note:

This measure reports the Measure Values listed in the table above to indicate whether/not the configuration change initiated at the time of instance renewal has been applied. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Storage capacity

Indicates the storage capacity of this instance.

Average query rate

Indicates the rate at which this instance processes queries.

Queries/Sec

A high value is desired for this measure. A low value signifies slowness in query processing. Compare the value of this measure across Redis instances to know which instance is processing queries slowly.

Maximum bandwidth

Indicates the maximum bandwidth that this instance can support.

MB/Sec

If network resources are sufficient, the bandwidth is unlimited for ApsaraDB for Redis instances. However, if network resources are insufficient, the maximum bandwidth takes effect for the instances.

Maximum connections

Indicates the maximum number of connections that this instance can support.

Number

Account status

Indicates the current status of the account of this instance.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value	Numeric Value
Available	1
Unavailable	2

Note:

This measure reports the Measure Values listed in the table above to indicate the account status of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Account type

Indicates the account type of this instance.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value	Numeric Value
Normal	1
Super	2

Note:

This measure reports the Measure Values listed in the table above to indicate the account type of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Account privilege

Indicates the permissions of this instance's account.

The values that this measure can report, their descriptions, and their corresponding numeric values are discussed in the table below:

Measure Value	Description	Numeric Value
RoleReadOnly	This account has read-only permissions.	1
RoleReadWrite	This account has read and write permissions.	2
RoleRepl	This account has replication permissions.	3

Note:

This measure reports the Measure Values listed in the table above to indicate the account permissions of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Backup status

Indicates the status of this instance's backup.

The values that this measure can report and their corresponding numeric values are discussed in the table below:

Measure Value	Numeric Value
Success	1
Failed	0

Note:

This measure reports the Measure Values listed in the table above to indicate the backup status of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only.

Use the detailed diagnosis of this measure to view the status of each backup of this instance. This way, you can identify the backup that failed.

Memory usage

Indicates the percentage of memory used by this instance.

Percent

If the value of this measure is close to 100% for any instance, it implies that that instance is running out of memory. You may want to consider resizing such instances, so as to avoid the memory contention.

Connection usage

Indicates the percentage of connections used by this instance.

Percent

If the value of this measure is close to 100% for any instance, it means that that instance is about to reach its connection limit. Once the limit is reached, the instance will not be able to entertain any new connections. To avoid this unpleasant outcome, you may want to consider increasing the connection limit of the instance.

Bandwidth consumed during write operations

Indicates the percentage of bandwidth consumed by this instance when performing write operations.

Percent

If the value of this measure is close to 100%, it means that that instance has spent almost its entire bandwidth limit on write operations. Without adequate bandwidth resources, read operations may slow down. To avoid this, you may want to consider increasing the maximum bandwidth that the instance can use.

Bandwidth consumed during read operations

Indicates the percentage of bandwidth consumed by this instance when performing read operations.

Percent

If the value of this measure is close to 100%, it means that that instance has spent almost its entire bandwidth limit on read operations. Without adequate bandwidth resources, write operations may slow down. To avoid this, you may want to consider increasing the maximum bandwidth that the instance can use.

Write speed

Indicates the intranet write speed of this instance.

KB/Sec

Ideally, the value of this measure should be high. A low value indicates that the instance is slow in performing write operations over the intranet.

Read speed

Indicates the intranet read speed of this instance.

KB/Sec

Ideally, the value of this measure should be high. A low value indicates that the instance is slow in performing read operations over the intranet.

Failed operations on KVSTORE

Indicates the count of operations that failed on this instance's KVStore.

Number

Redis is an in-memory non-relational key-value store (KVStore). This means that it stores data based on keys and values — think of it as a giant dictionary that uses words and their definitions to store information. The keys (or words) are required in order to retrieve their values (definitions).

Ideally, the value of this measure should be 0. A non-zero value indicates that one/more operations have failed on this instance's key-value store. This can be detrimental to the health of the instance.
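
The dictionary analogy for a key-value store can be made concrete with a few lines of code. The sketch below uses an ordinary Python dict purely as an illustration. It is not a Redis client and does not talk to an ApsaraDB for Redis instance, and the key names are made up for the example.

```python
# A plain dict standing in for a key-value store (illustration only).
store = {}

# Storing a value requires a key, much like filing a definition under a word.
store["session:42"] = "alice"
store["cart:42"] = "3 items"

# Retrieving a value requires the same key.
value = store.get("session:42")

# Looking up a key that was never stored returns nothing (None here).
missing = store.get("session:99")

print(value)    # the value stored under "session:42"
print(missing)  # None, because "session:99" was never set
```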

CPU usage

Indicates the percentage of CPU resources used by this instance.

Percent

If the value of this measure is close to 100% for any instance, it implies that that instance is consuming CPU resources excessively. You may want to consider resizing such instances, so as to avoid CPU contention.
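
The percentage measures reported by this test (Memory usage, Connection usage, the two bandwidth measures, and CPU usage) share the same interpretation: a value close to 100% signals contention. The sketch below shows one way to scan such values across instances. The function name flag_contention, the input layout, and the 90% cut-off for "close to 100%" are assumptions made for illustration, and they are not thresholds defined by the test.

```python
from typing import Dict, List


def flag_contention(
    usage_by_instance: Dict[str, Dict[str, float]],
    threshold: float = 90.0,  # assumed cut-off for "close to 100%"
) -> Dict[str, List[str]]:
    """Return, per instance, the percentage measures at or above threshold."""
    flagged: Dict[str, List[str]] = {}
    for instance, measures in usage_by_instance.items():
        hot = [name for name, pct in measures.items() if pct >= threshold]
        if hot:
            flagged[instance] = hot
    return flagged


# Illustrative values only; instance names and readings are made up.
sample = {
    "redis-a": {"Memory usage": 97.5, "CPU usage": 41.0},
    "redis-b": {"Memory usage": 35.0, "CPU usage": 22.0},
    "redis-c": {"Memory usage": 91.0, "CPU usage": 95.2},
}

for instance, measures in flag_contention(sample).items():
    print(instance, "->", ", ".join(measures))
```

Instances that stay in the flagged set over several measurement periods are the candidates for resizing.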

Used memory

Indicates the amount of memory currently used by this instance.

Compare the value of this measure across instances to identify the instance that is consuming maximum memory.

Used connections

Indicates the count of connections currently in use for this instance.

Number

Compare the value of this measure across instances to identify the instance that is supporting the maximum number of connections.

For such an instance, compare the value of this measure with that of the Maximum connections measure to figure out if that instance is about to reach its connection limit. If so, then consider increasing the connection limit of that instance, so as to avoid unnecessary contention for connections.
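
The comparison between Used connections and Maximum connections amounts to a simple ratio. A minimal sketch follows. The function name connection_usage_percent is hypothetical and the figures in the usage note are illustrative.

```python
def connection_usage_percent(used: int, maximum: int) -> float:
    """Express Used connections as a percentage of Maximum connections."""
    if maximum <= 0:
        raise ValueError("Maximum connections must be greater than 0")
    # Multiply first so that whole-number inputs divide cleanly.
    return used * 100.0 / maximum
```

For example, an instance with 9,500 connections in use against a limit of 10,000 is at 95% of its connection limit, which suggests that the limit may soon be reached.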

Total QPS

Indicates the number of queries that this instance executes every second.

Number

Compare the value of this measure across instances to know which instance is running the maximum number of queries (each second). For such an instance, compare the value of this measure with the maximum QPS configured for that instance to understand whether that instance is capable of running more queries, or is about to exhaust its query processing power. In the case of the latter, you may want to increase the QPS of the instance to make sure that the instance continues to process queries without a glitch.
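
The two-step comparison described for Total QPS (find the busiest instance, then check how much query capacity it has left) can be sketched as follows. The function names busiest_instance and qps_headroom are hypothetical, and the instance names and readings are made-up sample values.

```python
from typing import Dict


def busiest_instance(total_qps: Dict[str, float]) -> str:
    """Return the instance reporting the highest Total QPS."""
    if not total_qps:
        raise ValueError("No instances to compare")
    return max(total_qps, key=total_qps.get)


def qps_headroom(current_qps: float, maximum_qps: float) -> float:
    """Return how many more queries per second the instance can take on."""
    return maximum_qps - current_qps


# Illustrative readings only.
readings = {"redis-a": 1200.0, "redis-b": 48000.0, "redis-c": 300.0}

top = busiest_instance(readings)
remaining = qps_headroom(readings[top], maximum_qps=50000.0)
print(top, "has", remaining, "QPS of headroom left")
```

A small or negative headroom for the busiest instance indicates that the instance is about to exhaust its query processing power.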