ApsaraDB for RDS Test
ApsaraDB for RDS is a stable, reliable, and scalable online database service. Based on Apsara Distributed File System and high-performance SSD storage of Alibaba Cloud, ApsaraDB for RDS supports the MySQL, SQL Server, PostgreSQL, PPAS (highly compatible with Oracle), and MariaDB database engines. It provides a portfolio of solutions for disaster recovery, backup, restoration, monitoring, and migration to facilitate database operations and maintenance.
The first step to using RDS is to create an RDS instance. An instance is a virtualized database server on which you can create and manage multiple databases. If a cloud user complains that they are unable to access their database on an RDS instance, administrators need to quickly determine why: Is the instance hosting the database down? Is it rebooting? Is it being deleted? Or is it locked? Moreover, administrators also need to ensure that each instance is sized with adequate CPU, memory, network, and storage resources, so that no instance experiences performance degradation. If one does, administrators should be able to identify the resource-starved instances and right-size them before users notice any slowness. The ApsaraDB for RDS test helps with this and much more!
This test tracks the availability, operational state, and lock mode of every RDS instance, and alerts administrators to unavailable instances, instances that are currently in an abnormal state, and locked instances. Additionally, the test reports the CPU, memory, connection, disk space, and I/O capacity of each instance, and also measures how every instance uses the allocated capacity. In the process, the test pinpoints which instance is hogging which resource. With the help of these diagnostics, administrators can proactively identify and promptly eliminate issues hampering the overall performance of the virtual database server instances and the user experience with them.
Target of the test : An Alibaba Cloud Account
Agent deploying the test : A remote agent
Outputs of the test : One set of results for each RDS instance
Parameters | Description |
---|---|
Test period |
How often should the test be executed |
Host |
The host for which the test is to be configured. |
Alibaba Access Key and Alibaba Secret Key |
This test makes REST API requests to the Alibaba cloud to pull the metrics. For this purpose, the test needs to be configured with an AccessKey pair. An AccessKey pair is typically used to call an operation of an Alibaba Cloud service. It is also used to initiate an API request or use a cloud service SDK to manage cloud resources. An AccessKey pair consists of an AccessKey ID and an AccessKey Secret. The AccessKey ID is used to identify a user/cloud account. The AccessKey Secret is used to verify a user/cloud account. The first step to configuring the eG agent with an AccessKey pair is to create an AccessKey pair for the target cloud account. To achieve this, follow the steps below:
If you failed to make note of the AccessKey ID and AccessKey Secret at the time of creating the AccessKey pair, then you can obtain the same at a later point in time. Similarly, if an AccessKey pair pre-exists for the target cloud account, then you do not have to create another one. Instead, you can obtain the AccessKey ID and AccessKey Secret of the existing AccessKey pair and configure the eG agent with the same. For this, follow the steps below:
|
Detailed Diagnosis |
To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:
|
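The AccessKey pair described above is what allows an API request to be identified (AccessKey ID) and verified (AccessKey Secret). The following is a minimal sketch of how such a pair is commonly used to sign a request, assuming Alibaba Cloud's RPC-style signature scheme (HMAC-SHA1 over a canonicalized query string). This document does not state which scheme the eG agent uses internally, so treat this purely as an illustration. The function names `percent_encode` and `sign_rpc_request`, the parameter values, and the key strings are hypothetical placeholders; a real request would also carry further common parameters (such as a timestamp, nonce, and API version) that are omitted here.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    """RFC 3986 percent-encoding; only unreserved characters stay unescaped."""
    return quote(str(value), safe="~")


def sign_rpc_request(params, access_key_secret, http_method="GET"):
    """Return a Base64-encoded HMAC-SHA1 signature for the given parameters.

    Assumption: RPC-style signing, where the HMAC key is the AccessKey
    Secret followed by '&'.
    """
    # Sort parameters by name and build the canonicalized query string.
    canonical_query = "&".join(
        f"{percent_encode(k)}={percent_encode(v)}" for k, v in sorted(params.items())
    )
    string_to_sign = (
        f"{http_method}&{percent_encode('/')}&{percent_encode(canonical_query)}"
    )
    digest = hmac.new(
        (access_key_secret + "&").encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values only; never hard-code a real AccessKey pair.
example_params = {
    "Action": "DescribeDBInstances",
    "AccessKeyId": "EXAMPLE_ACCESS_KEY_ID",
    "RegionId": "cn-hangzhou",
    "Format": "JSON",
}
signature = sign_rpc_request(example_params, "EXAMPLE_ACCESS_KEY_SECRET")
print(signature)
```

Because the AccessKey Secret is the signing key, anyone who holds it can issue requests on behalf of the account, which is why the pair should be stored securely and supplied to the eG agent only through the test configuration.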
Measurement | Description | Measurement Unit | Interpretation | ||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Instance status |
Indicates the current status of this RDS instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
The Measure Values discussed in the table are described in detail below:
Note: This measure reports the Measure Values listed in the table above to indicate the current state of an RDS instance. In the graph of this measure however, the same is indicated using the numeric equivalents only. The detailed diagnosis of this measure reveals additional details of the RDS instance, such as, its type, version, the instance class, its port number, connection address, its network type, VPC, and the name of the zone to which it belongs. |
||||||||||||||||||||||||||||||||
Instance type |
Indicates the type of this instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
Note: This measure reports the Measure Values listed in the table above to indicate the role assigned to the RDS instance. In the graph of this measure however, the same is indicated using the numeric equivalents only. |
||||||||||||||||||||||||||||||||
Instance class type |
Indicates the instance family/class to which this instance belongs. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
The Measure Values discussed in the table are described in detail below:
Note: This measure reports the Measure Values listed in the table above to indicate the instance family. In the graph of this measure however, the same is indicated using the numeric equivalents only. |
||||||||||||||||||||||||||||||||
Lock mode |
Indicates the lock mode of this instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
The Measure Values discussed in the table are described in detail below:
Note: This measure reports the Measure Values listed in the table above to indicate the lock mode of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only. |
||||||||||||||||||||||||||||||||
Connection mode |
Indicates the access mode of this instance. |
|
The values that this measure can report and their corresponding numeric values are discussed in the table below:
Note: This measure reports the Measure Values listed in the table above to indicate the connection mode of an instance. In the graph of this measure however, the same is indicated using the numeric equivalents only. |
||||||||||||||||||||||||||||||||
Total memory |
Indicates the memory configuration of this instance. |
MB |
|
||||||||||||||||||||||||||||||||
Total capacity |
Indicates the total storage capacity of this instance. |
MB |
|
||||||||||||||||||||||||||||||||
Maximum database can be created |
Indicates the maximum number of databases that can be created on this instance. |
Number |
|
||||||||||||||||||||||||||||||||
Maximum account can be created |
Indicates the maximum number of accounts that can be created on this instance. |
Number |
|
||||||||||||||||||||||||||||||||
Availability |
Indicates whether/not this instance is available currently. |
Percent |
While the value 100 indicates that the instance is available, the value 0 denotes that the instance is unavailable. |
||||||||||||||||||||||||||||||||
Maximum I/O requests |
Indicates the maximum number of I/O requests this instance can process per second. |
Number |
|
||||||||||||||||||||||||||||||||
Maximum concurrent connections |
Indicates the maximum number of concurrent connections this instance can handle. |
Number |
|
||||||||||||||||||||||||||||||||
Total CPU |
Indicates the total number of CPU cores allocated to this instance. |
Number |
|
||||||||||||||||||||||||||||||||
Used space |
Indicates the amount of disk space that this instance is currently utilizing. |
MB |
Ideally, the value of this measure should be much lower than the value of the Total capacity measure. If this measure value is close to or is rapidly approaching the value of the Total capacity measure, it implies that the instance is fast exhausting its storage capacity. This can be detrimental to the performance of the instance. To prevent the storage crunch, you may want to configure the instance with additional storage space. Alternatively, you can compare the values of the Space occupied by data files, Space occupied by log files, Space occupied by backups, Space occupied by SQL data, and Cold backup size measures, to understand what type of data is consuming storage space. You can then see if data of any of these types can be deleted, so as to make more storage space available for critical data. |
||||||||||||||||||||||||||||||||
Space occupied by data files |
Indicates the amount of storage space of this instance that is occupied by data files. |
MB |
If the Percent usage measure of an instance is close to 100%, then you can compare the values of these measures for that instance to know what type of files is contributing to the storage crunch - data files? log files? backup files? SQL data files? or files in cold backup? |
||||||||||||||||||||||||||||||||
Space occupied by log files |
Indicates the amount of storage space of this instance that is occupied by log files. |
MB |
|||||||||||||||||||||||||||||||||
Space occupied by backups |
Indicates the amount of storage space of this instance that is occupied by backups. |
MB |
|||||||||||||||||||||||||||||||||
Space occupied by SQL data |
Indicates the amount of storage space of this instance that is occupied by SQL data. |
MB |
|||||||||||||||||||||||||||||||||
Cold backup size |
Indicates the amount of storage space of this instance that is occupied by cold backups. |
MB |
|||||||||||||||||||||||||||||||||
I/O requests rate |
Indicates the rate at which this instance processes I/O operations. |
Operations/Sec |
If the value of this measure is close to the value of the Maximum I/O requests measure for any instance, it means that the I/O load on that instance is very high. To ensure that the instance does not reject/drop I/O requests, first make sure that the instance has adequate processing power to meet the demand - i.e., that it has sufficient resources (CPU, memory, storage space etc.) - and then proceed to increase the limit set for the number of I/O requests that instance can process per second. |
||||||||||||||||||||||||||||||||
Average inbound traffic |
Indicates the average amount of data traffic flowing into this instance. |
KB |
|
||||||||||||||||||||||||||||||||
Average outbound traffic |
Indicates the average amount of data traffic flowing out of this instance. |
KB |
|
||||||||||||||||||||||||||||||||
Network throughput |
Indicates the network throughput of this instance. |
KB |
Compare the value of this measure across instances to identify the precise instance that is consuming bandwidth excessively. |
||||||||||||||||||||||||||||||||
Total current connections |
Indicates the current number of connections to this instance. |
Number |
If the value of this measure is close to the Maximum concurrent connections measure for any instance, it implies that very soon the instance may not be able to entertain new connections. Under such circumstances, you may want to check to see if there are any idle connections to the instance and terminate them, so that the instance can handle more connections. The count of idle connections is the difference between the value of the Total current connections measure and the Total currently active connections measure. Alternatively, you can also increase the concurrent connection limit of the instance. |
||||||||||||||||||||||||||||||||
Total currently active connections |
Indicates the count of connections to this instance that are currently active. |
Number |
Ideally, the value of this measure should be equal to the value of the Total current connections measure. If it is much lower than the value of the Total current connections measure, it means many connections to the instance are idle/inactive. By identifying and removing such connections, you can increase the connection handling capacity of the instance. |
||||||||||||||||||||||||||||||||
Used memory |
Indicates the amount of memory currently used by this instance. |
MB |
Ideally, the value of this measure should be much lower than that of the Total memory measure. |
||||||||||||||||||||||||||||||||
Free memory |
Indicates the amount of memory that this instance is not using currently. |
MB |
For best performance, the value of this measure should be high. |
||||||||||||||||||||||||||||||||
Free space |
Indicates the amount of storage space that this instance is not using currently. |
MB |
For best performance, the value of this measure should be high. |
||||||||||||||||||||||||||||||||
CPU utilization |
Indicates the percentage of allocated CPU resources that is used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it denotes that the instance is hogging the CPU resources. If the instance is a shared instance or a general-purpose instance, then excessive CPU utilization by that instance can cause the other instances on the same physical host to contend for the remaining CPU resources. In this case, you may want to increase the CPU capacity of the host. |
||||||||||||||||||||||||||||||||
Memory utilization |
Indicates the percentage of allocated memory resources that is used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it denotes that the instance is rapidly running out of memory. Without enough memory, the instance may fail to service user requests to it. To avoid this, make sure you size the instance with sufficient memory resources. |
||||||||||||||||||||||||||||||||
Percent usage |
Indicates the percentage of allocated disk space that is used by this instance. |
Percent |
If the value of this measure is close to 100% for an instance, then you can compare the values of the Space occupied by data files, Space occupied by log files, Space occupied by backups, Space occupied by SQL data, and Cold backup size measures for that instance to know what type of files is contributing to the storage crunch - data files? log files? backup files? SQL data files? or files in cold backup? |
||||||||||||||||||||||||||||||||
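The comparison of space-related measures described above can be sketched as follows. This is only an illustration of the reasoning: the function name `largest_storage_consumers` and the sample values are hypothetical, and in practice the figures would be read from the measures this test reports for an instance.

```python
def largest_storage_consumers(breakdown_mb):
    """Rank storage categories by space used (in MB), largest first."""
    return sorted(breakdown_mb.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample values, in MB, keyed by measure name.
sample = {
    "Space occupied by data files": 40960,
    "Space occupied by log files": 71680,
    "Space occupied by backups": 10240,
    "Space occupied by SQL data": 2048,
    "Cold backup size": 5120,
}

ranked = largest_storage_consumers(sample)
total_mb = sum(sample.values())
for name, used_mb in ranked:
    share = 100.0 * used_mb / total_mb
    print(f"{name}: {used_mb} MB ({share:.1f}% of the reported breakdown)")
```

In this sample, log files top the ranking, so they would be the first candidate to review when looking for space that can be reclaimed.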
IOPS utilization |
Indicates the percentage of I/O resources that is used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it denotes that the instance is rapidly approaching the I/O request limit configured for it. To ensure that the instance services I/O requests to it without rejecting them, you may want to consider increasing the maximum number of I/O requests that instance can handle. |
||||||||||||||||||||||||||||||||
Connection utilization |
Indicates the percentage of connections used by this instance. |
Percent |
A value close to 100% is a cause for concern, as it implies that very soon the instance may not be able to entertain new connections. Under such circumstances, you may want to check to see if there are any idle connections to the instance and terminate them, so that the instance can handle more connections. The count of idle connections is the difference between the value of the Total current connections measure and the Total currently active connections measure. Alternatively, you can also increase the concurrent connection limit of the instance. |
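The idle-connection arithmetic described above can be sketched as below. The idle count follows directly from the text (Total current connections minus Total currently active connections). The utilization formula, current connections divided by the maximum concurrent connections, is an assumption, since this document does not spell out how the percentage is computed. The function name `connection_summary` and the sample numbers are hypothetical.

```python
def connection_summary(total_current, total_active, max_concurrent):
    """Derive the idle-connection count and connection utilization for one instance."""
    if max_concurrent <= 0:
        raise ValueError("max_concurrent must be positive")
    # Idle connections: connections that exist but are not currently active.
    idle = total_current - total_active
    # Assumed formula: share of the concurrent-connection limit in use.
    utilization = 100.0 * total_current / max_concurrent
    return {"idle": idle, "utilization_pct": round(utilization, 1)}


summary = connection_summary(total_current=180, total_active=45, max_concurrent=200)
print(summary)  # {'idle': 135, 'utilization_pct': 90.0}
```

Here, 135 of the 180 connections are idle, so terminating idle connections would free up far more headroom than the 20 connections that remain under the limit.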