AWS API Gateway Test
Amazon API Gateway is an AWS service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale. API developers can create APIs that access AWS or other web services, as well as data stored in the AWS Cloud. As an API Gateway API developer, you can create APIs for use in your own client applications, or you can make your APIs available to third-party app developers.
To invoke an API, a URL is used. This URL includes an API stage, which is a logical reference to a lifecycle state of your API (for example, dev, prod, beta, or v2). API stages are identified by their API ID and stage name. Each stage is a named reference to a deployment of the API and is made available for client applications to call.
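For example, a REST API deployed to a stage is invoked through a URL of the form https://{api-id}.execute-api.{region}.amazonaws.com/{stage}/{resource}. The minimal Python sketch below calls such a URL; the API ID, region, stage name, and resource path in it are hypothetical placeholders, not values from this documentation.

```python
# Minimal sketch: invoking a deployed API Gateway stage by its invoke URL.
# The API ID, region, stage name, and resource path are placeholders.
import urllib.request

api_id = "a1b2c3d4e5"   # hypothetical API ID
region = "us-east-1"    # AWS region the API is deployed in
stage = "prod"          # stage name (e.g., dev, prod, beta, or v2)
resource = "orders"     # resource path exposed by the API

# Invoke URLs for stage deployments follow this pattern:
# https://{api-id}.execute-api.{region}.amazonaws.com/{stage}/{resource}
url = f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}/{resource}"

with urllib.request.urlopen(url) as response:
    print(response.status, response.read()[:200])
```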
If an API created using the API Gateway service throws many errors or is sluggish in responding to client requests, it will leave the corresponding application users unhappy. To improve the user experience with the APIs they create, API/app developers should be able to promptly detect errors and latencies in their APIs and resolve them before the application users notice and complain. This is now possible using the AWS API Gateway test.
By default, this test monitors each API that is created using the API Gateway service, and reports the errors and latencies observed in each API. This way, API developers can figure out if any of the APIs they have created are experiencing issues. To know which specific deployment of the API is troublesome, you can optionally configure this test to report metrics by API_ID:Stage, instead of just the API ID.
Additionally, the test provides diagnostics on API cache usage, so that API developers can figure out if the poor responsiveness is owing to improper cache configuration.
Target of the test: Amazon Cloud
Agent deploying the test: A remote agent
Output of the test: One set of results for each API created in every AWS region in the configured AWS user account.
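The measures reported by this test correspond to the Amazon CloudWatch metrics that API Gateway publishes in the AWS/ApiGateway namespace (for example, 4XXError, 5XXError, CacheHitCount, CacheMissCount, Count, IntegrationLatency, and Latency). The sketch below shows how one such metric could be pulled with boto3; it only illustrates the underlying CloudWatch API, not the eG agent's actual collection logic, and the API name, stage, and region in it are placeholders.

```python
# Minimal sketch, not the eG agent's implementation: pulling one API Gateway
# metric (4XXError) from CloudWatch with boto3. API name, stage, and region
# are hypothetical placeholders.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

end = datetime.now(timezone.utc)
start = end - timedelta(minutes=5)

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/ApiGateway",        # namespace for API Gateway metrics
    MetricName="4XXError",             # client-side error count
    Dimensions=[
        {"Name": "ApiName", "Value": "PetStore"},  # per-API metrics
        {"Name": "Stage", "Value": "prod"},        # add Stage to split by API:Stage
    ],
    StartTime=start,
    EndTime=end,
    Period=300,
    Statistics=["Sum"],
)
for point in response["Datapoints"]:
    print(point["Timestamp"], point["Sum"])
```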
Parameter | Description |
---|---|
Test Period | How often should the test be executed. |
Host | The host for which the test is to be configured. |
Access Type | eG Enterprise monitors the AWS cloud using the AWS API. By default, the eG agent accesses the AWS API using a valid AWS account ID that is assigned a special role created specifically for monitoring purposes. Accordingly, the Access Type parameter is set to Role by default. To enable the eG agent to use this default access approach, configure the tests with a valid AWS Account ID to Monitor and the special AWS Role Name you created for monitoring purposes. Some AWS cloud environments, however, may not support the role-based approach; they may allow cloud API requests only if such requests are signed by a valid Access Key and Secret Key. When monitoring such a cloud environment, change the Access Type to Secret, and configure the tests with a valid AWS Access Key and AWS Secret Key. Note that the Secret option may not be ideal when monitoring high-security cloud environments, because such environments often mandate that administrators change the Access Key and Secret Key frequently. Because of this churn in keys, Amazon recommends the role-based approach for accessing the AWS API. Sketches of both access approaches follow this table. |
AWS Account ID to Monitor | This parameter appears only when the Access Type parameter is set to Role. Specify the AWS Account ID that the eG agent should use for connecting and making requests to the AWS API. To determine your AWS Account ID, sign in to the AWS Management Console and note the 12-digit account ID displayed in the account menu, or run the aws sts get-caller-identity command from the AWS CLI. |
AWS Role Name | This parameter appears only when the Access Type parameter is set to Role. Specify the name of the role that you have specifically created on the AWS cloud for monitoring purposes. The eG agent uses this role and the configured Account ID to connect to the AWS Cloud and pull the required metrics. To know how to create such a role, refer to Creating a New Role. |
AWS Access Key, AWS Secret Key, Confirm AWS Access Key, Confirm AWS Secret Key | These parameters appear only when the Access Type parameter is set to Secret. To monitor an Amazon instance, the eG agent has to be configured with the access key and secret key of a user with a valid AWS account. For this purpose, we recommend that you create a special user on the AWS cloud, obtain the access and secret keys of this user, and configure this test with these keys. The procedure for this is detailed in the Obtaining an Access key and Secret key topic. Make sure you reconfirm the access and secret keys you provide here by retyping them in the corresponding Confirm text boxes. |
Proxy Host and Proxy Port | In some environments, all communication with the AWS cloud and its regions could be routed through a proxy server. In such environments, make sure that the eG agent connects to the cloud via the proxy server when collecting metrics. To enable metrics collection via a proxy, specify the IP address of the proxy server and the port at which it listens against the Proxy Host and Proxy Port parameters. By default, these parameters are set to none, indicating that the eG agent does not communicate via a proxy. A sketch of a proxied, key-based connection follows this table. |
Proxy User Name, Proxy Password, and Confirm Password | If the proxy server requires authentication, specify a valid user name and password against the Proxy User Name and Proxy Password parameters, and confirm the password by retyping it in the Confirm Password text box. By default, these parameters are set to none, indicating that the proxy server does not require authentication. |
Proxy Domain and Proxy Workstation | If a Windows NTLM proxy is to be used, you will additionally have to configure the Windows domain name and the Windows workstation name required for the same against the Proxy Domain and Proxy Workstation parameters. If the environment does not support a Windows NTLM proxy, set these parameters to none. |
Exclude Region | Here, you can provide a comma-separated list of region names or patterns of region names that you do not want to monitor. For instance, to exclude regions with names that contain 'east' and 'west' from monitoring, specify: *east*,*west*. By default, this parameter is set to none, indicating that all regions are monitored. A sketch of how such wildcard patterns match follows this table. |
Exclude Instance | This parameter applies only to the AWS - Instance Connectivity, AWS - Instance Resources, and AWS - Instance Uptime tests. In the Exclude Instance text box, provide a comma-separated list of instance names or instance name patterns that you do not wish to monitor. For example: i-b0c3e*,*7dbe56d. By default, this parameter is set to none. |
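For reference, the role-based access approach described above (Access Type set to Role) generally amounts to assuming the monitoring role in the target account via AWS STS. The boto3 sketch below illustrates that general pattern; the account ID and role name are hypothetical placeholders, and this is not the eG agent's internal code.

```python
# Sketch of role-based access: assume a monitoring role in the target
# account via STS, then use the temporary credentials to query CloudWatch.
# The account ID and role name are hypothetical placeholders.
import boto3

ACCOUNT_ID = "123456789012"       # "AWS Account ID to Monitor"
ROLE_NAME = "eG-Monitoring-Role"  # "AWS Role Name" created for monitoring

sts = boto3.client("sts")
assumed = sts.assume_role(
    RoleArn=f"arn:aws:iam::{ACCOUNT_ID}:role/{ROLE_NAME}",
    RoleSessionName="api-gateway-monitoring",
)
creds = assumed["Credentials"]

cloudwatch = boto3.client(
    "cloudwatch",
    region_name="us-east-1",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(cloudwatch.list_metrics(Namespace="AWS/ApiGateway")["Metrics"][:3])
```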
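Likewise, when the Access Type is Secret and all traffic must pass through a proxy server, a client can be built with explicit keys and a proxy setting. The access key, secret key, and proxy address below are placeholders; this is a generic boto3 sketch only.

```python
# Sketch of key-based access routed through a proxy. The access key,
# secret key, and proxy address are hypothetical placeholders.
import boto3
from botocore.config import Config

proxy_config = Config(
    proxies={"https": "http://proxy.example.com:3128"}  # Proxy Host:Proxy Port
)

cloudwatch = boto3.client(
    "cloudwatch",
    region_name="us-east-1",
    aws_access_key_id="AKIAXXXXXXXXXXXXXXXX",          # AWS Access Key
    aws_secret_access_key="xxxxxxxxxxxxxxxxxxxxxxxx",  # AWS Secret Key
    config=proxy_config,
)
print(cloudwatch.list_metrics(Namespace="AWS/ApiGateway")["Metrics"][:3])
```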
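The comma-separated Exclude Region and Exclude Instance patterns can be pictured as shell-style wildcard matching, as in the illustrative sketch below; the pattern lists shown are examples, not defaults.

```python
# Illustrative sketch of comma-separated wildcard exclusion patterns,
# using shell-style matching. The pattern lists are examples only.
from fnmatch import fnmatch

exclude_regions = "*east*,*west*".split(",")
exclude_instances = "i-b0c3e*,*7dbe56d".split(",")

def is_excluded(name, patterns):
    """Return True if the name matches any exclusion pattern."""
    return any(fnmatch(name, pattern.strip()) for pattern in patterns)

print(is_excluded("us-east-1", exclude_regions))     # True
print(is_excluded("eu-central-1", exclude_regions))  # False
print(is_excluded("i-b0c3e7f2", exclude_instances))  # True
```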
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
4xx errors | Indicates the number of client-side errors in this API/stage. For the Summary descriptor, this measure reports the total number of client-side errors encountered across APIs/stages. | Number | Errors in the range of 400 to 499 usually point to a problem with the API client, such as a malformed request, an authorization or authentication failure, a missing resource, or request throttling. Ideally, the value of this measure should be 0. A high value indicates that many client-side errors occurred. In this case, compare the value of this measure across descriptors to identify the exact API/stage that is experiencing the maximum number of client-side errors. Check the CloudWatch dashboard or CloudWatch logs to view the errors and figure out how to resolve them. |
5xx errors | Indicates the number of server-side errors in this API/stage. For the Summary descriptor, this measure reports the total number of server-side errors encountered across APIs/stages. | Number | Errors in the range of 500 to 599 mean that something went wrong on the server side, such as an internal integration failure, a bad gateway response from the backend, or an integration timeout. Ideally, the value of this measure should be 0. A high value indicates that many server-side errors occurred. In this case, compare the value of this measure across descriptors to identify the exact API/stage that is experiencing the maximum number of server-side errors. Check the CloudWatch dashboard or CloudWatch logs to view the errors and figure out how to resolve them. |
Requests served from API cache | Indicates the number of requests to this API/stage that were serviced from the API cache. For the Summary descriptor, this measure reports the total number of requests, across APIs/stages, that were served by the API cache. | Number | You can enable API caching in Amazon API Gateway to cache your endpoint's responses. With caching, you can reduce the number of calls made to your endpoint and also improve the latency of requests to your API. If the value of this measure is much higher than that of the Requests served from backend measure, it means that caching has effectively minimized calls to the endpoints. On the other hand, if the value of this measure is significantly lower than that of the Requests served from backend measure, it is indicative of poor caching. One of the common causes of ineffective cache usage is improper configuration of the cache capacity. When you enable caching, you must choose a cache capacity. If the cache is under-sized - i.e., if the cache capacity chosen is not commensurate with the workload of the API - then the API cache will not be able to service many API requests. A majority of the requests will hence be routed directly to the backend for processing. As a result, latencies will increase. To avoid this, consider right-sizing the cache if you observe that the value of this measure is alarmingly lower than the value of the Requests served from backend measure. A simple cache hit-ratio sketch follows this table. |
Requests served from backend | Indicates the number of requests to this API/stage that were serviced by the backend and not the cache. For the Summary descriptor, this measure reports the total number of requests, across APIs/stages, that were served by the backend. | Number | You can enable API caching in Amazon API Gateway to cache your endpoint's responses. With caching, you can reduce the number of calls made to your endpoint and also improve the latency of requests to your API. If the value of this measure is much lower than that of the Requests served from API cache measure, it means that caching has effectively minimized calls to the backend. On the other hand, if the value of this measure is significantly higher than that of the Requests served from API cache measure, it is indicative of poor caching. One of the common causes of ineffective cache usage is improper configuration of the cache capacity. When you enable caching, you must choose a cache capacity. If the cache is under-sized - i.e., if the cache capacity chosen is not commensurate with the workload of the API - then the API cache will not be able to service many API requests. A majority of the requests will hence be routed directly to the backend for processing. As a result, latencies will increase. To avoid this, consider right-sizing the cache if you observe that the value of this measure is alarmingly higher than the value of the Requests served from API cache measure. |
Total API requests | Indicates the total number of requests received by this API/stage. For the Summary descriptor, this measure reports the total number of requests received by the API Gateway service, across APIs/stages. | Number | This is a good indicator of the workload of an API/stage. |
Integration latency | Indicates the time between when this API/stage relays a request to the backend and when it receives a response from the backend. | Seconds | Ideally, the value of this measure should be low. A consistent increase in this value could be indicative of processing bottlenecks in the backend. Compare the value of this measure across APIs/stages to know which API/stage is receiving responses from the backend the slowest. |
Latency | Indicates the time between when this API/stage receives a request from a client and when it returns a response to the client. | Seconds | Ideally, the value of this measure should be low. A consistent increase in this value could be indicative of latencies in an API/stage. Compare the value of this measure across APIs/stages to know which API/stage is the most lethargic when processing requests. To diagnose what is causing this slowness, first compute the difference between the values of the Latency and Integration latency measures for the problematic API/stage. This difference represents the latencies introduced outside the backend, i.e., within the API Gateway itself. Now, compare this difference with the value of the Integration latency measure to determine whether the slowness is owing to a latent backend or to other processing latencies. If a latent backend is impacting the responsiveness of an API/stage, you may want to enable API caching and adequately size the cache. This will minimize backend processing and optimize cache usage, which will eventually improve latencies. A worked example of this diagnostic follows this table. |
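As noted for the cache-related measures in the table above, comparing cache-served and backend-served requests shows how effective caching is. The sketch below computes a simple cache hit ratio from sample values to make that comparison concrete; the counts are illustrative, not real measurements.

```python
# Simple illustration of comparing cache-served vs backend-served requests.
# The counts below are sample values, not real measurements.
cache_hits = 4200     # "Requests served from API cache"
cache_misses = 1800   # "Requests served from backend"

total = cache_hits + cache_misses
hit_ratio = cache_hits / total if total else 0.0

print(f"Cache hit ratio: {hit_ratio:.1%}")   # e.g. 70.0%
if hit_ratio < 0.5:
    # Far more requests reach the backend than the cache: consider
    # enabling caching or increasing the configured cache capacity.
    print("Cache may be under-sized for this API/stage's workload.")
```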
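Similarly, the diagnostic described for the Latency measure, separating backend time from the gateway's own overhead, reduces to a simple subtraction. The values below are samples only.

```python
# Illustration of splitting total latency into backend time and gateway
# overhead, as described for the Latency measure. Values are samples.
latency = 0.480              # "Latency" measure, in seconds
integration_latency = 0.410  # "Integration latency" measure, in seconds

overhead = latency - integration_latency  # time spent outside the backend

if integration_latency > overhead:
    print(f"Backend dominates ({integration_latency:.3f}s of {latency:.3f}s): "
          "consider enabling and right-sizing the API cache.")
else:
    print(f"Gateway-side overhead dominates ({overhead:.3f}s of {latency:.3f}s).")
```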