AWS EC2 Instance Uptime Test

In cloud-based environments, it is essential to monitor the uptime of server instances launched on the cloud. By tracking the uptime of each of the instances, administrators can determine what percentage of time an instance has been up. Comparing this value with service level targets, administrators can determine the most trouble-prone areas of the infrastructure hosted on the cloud.
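
As a rough illustration of the comparison described above, the following Python sketch computes an uptime percentage from per-period uptime samples and checks it against a service level target. The function name, the sample values, and the 99.5% target are all hypothetical; this is only a sketch of the arithmetic, not a description of how eG Enterprise computes or stores its measures.

```python
def uptime_percentage(uptime_secs, period_secs):
    """Return the share of observed time the instance was up, as a percentage.

    uptime_secs: uptime reported for each measurement period, in seconds.
    period_secs: length of each measurement period, in seconds.
    """
    if not uptime_secs:
        raise ValueError("at least one sample is required")
    observed = period_secs * len(uptime_secs)
    # Cap each sample at the period length so a bad reading cannot exceed 100%.
    up = sum(min(sample, period_secs) for sample in uptime_secs)
    return 100.0 * up / observed


# Hypothetical data: four 300-second periods, one of which saw a reboot.
samples = [300, 300, 120, 300]
sla_target = 99.5  # hypothetical service level target, in percent

pct = uptime_percentage(samples, 300)
meets_target = pct >= sla_target
print(f"uptime: {pct:.1f}% (target {sla_target}%), meets target: {meets_target}")
```

With these made-up samples the instance was up for 1020 of 1200 observed seconds, so it falls short of the assumed target.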

In some environments, administrators may schedule periodic reboots of their instances. If a specific instance has been up for an unusually long time, an administrator can infer that the scheduled reboot task is not working on that instance.

This test monitors the uptime of each instance available to the configured AWS user account.

Target of the test: Amazon Cloud

Agent deploying the test: A remote agent

Output of the test: One set of results for each instancename:instanceID available for the configured AWS user account

Configurable parameters for the test
Parameter	Description
Test Period	How often the test should be executed.
Host	The host for which the test is to be configured.
Access Type	eG Enterprise monitors the AWS cloud using the AWS API. By default, the eG agent accesses the AWS API using a valid AWS account ID, which is assigned a special role that is specifically created for monitoring purposes. Accordingly, the Access Type parameter is set to Role by default. To enable the eG agent to use this default access approach, you will have to configure the eG tests with a valid AWS Account ID to Monitor and the special AWS Role Name you created for monitoring purposes. Alternatively, some AWS cloud environment administrators may not want to share their AWS Account ID. In this case, the eG agent can access the AWS API using a Managed Identity (trusted node) based approach. In this approach, you install the eG agent on a trusted node - i.e., an EC2 instance on the target AWS Cloud, assign IAM Roles to that EC2 instance for secure access, manage the AWS Cloud using the agent installed on that EC2 instance, and collect the required metrics. To use this approach, change the Access Type to Managed Identity. Some AWS cloud environments, however, may not support the role-based or managed identity-based approach. Instead, they may allow cloud API requests only if such requests are signed by a valid Access Key and Secret Key. When monitoring such a cloud environment, you should change the Access Type to Secret. Then, you should configure the eG tests with a valid AWS Access Key and AWS Secret Key. Note that the Secret option may not be ideal when monitoring high-security cloud environments. This is because such environments may issue a security mandate that requires administrators to change the Access Key and Secret Key often. Because keys change frequently under the key-based approach, Amazon recommends the Role-based approach for accessing the AWS API.
AWS Account ID to Monitor	This parameter appears only when the Access Type parameter is set to Role. Specify the AWS Account ID that the eG agent should use for connecting and making requests to the AWS API. To determine your AWS Account ID, follow the steps below: Log in to the AWS management console with your credentials. Click on your IAM user/role in the top right corner of the AWS Console. You will see a drop-down menu containing the Account ID (see Figure 1). Figure 1: Identifying the AWS Account ID
AWS Role Name	This parameter appears when the Access Type parameter is set to Role. Specify the name of the role that you have specifically created on the AWS cloud for monitoring purposes. The eG agent uses this role and the configured Account ID to connect to the AWS Cloud and pull the required metrics. To know how to create such a role, refer to Creating a New Role.
AWS Access Key, AWS Secret Key, Confirm AWS Access Key, Confirm AWS Secret Key	These parameters appear only when the Access Type parameter is set to Secret. To monitor an Amazon cloud instance using the Secret approach, the eG agent has to be configured with the access key and secret key of a user with a valid AWS account. For this purpose, we recommend that you create a special user on the AWS cloud, obtain the access and secret keys of this user, and configure this test with these keys. The procedure for this is detailed in the Obtaining an Access key and Secret key topic. Make sure you confirm the access and secret keys you provide here by retyping them in the corresponding Confirm text boxes.
Proxy Host and Proxy Port	In some environments, all communication with the AWS cloud and its regions could be routed through a proxy server. In such environments, you should make sure that the eG agent connects to the cloud via the proxy server and collects metrics. To enable metrics collection via a proxy, specify the IP address of the proxy server and the port at which the server listens against the Proxy Host and Proxy Port parameters. By default, these parameters are set to none, indicating that the eG agent is not configured to communicate via a proxy.
Proxy User Name, Proxy Password, and Confirm Password	If the proxy server requires authentication, specify a valid proxy user name and password in the Proxy User Name and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box. By default, these parameters are set to none, indicating that the proxy server does not require authentication.
Proxy Domain and Proxy Workstation	If a Windows NTLM proxy is to be configured for use, then additionally, you will have to configure the Windows domain name and the Windows workstation name required for the same against the proxy domain and proxy workstation parameters. If the environment does not support a Windows NTLM proxy, set these parameters to none.
Exclude Region	Here, you can provide a comma-separated list of region names or patterns of region names that you do not want to monitor. For instance, to exclude regions with names that contain 'east' and 'west' from monitoring, your specification should be: east,west
Cloudwatch Enabled	This parameter only applies to the AWS - Aggregated Resource Usage test. This test reports critical metrics pertaining to the resource usage of the server instances launched in the cloud. If you want this test to report resource usage metrics very frequently - say, once every minute or less - you will have to configure the test to use the AWS CloudWatch service. This is a paid web service that enables you to monitor, manage, and publish various metrics, as well as configure alarm actions based on data from metrics. To enable this test to use this service, set the CloudWatch Enabled flag to Yes. On the other hand, to report resource usage metrics less frequently - say, once in 5 minutes or more - this test does not require the AWS CloudWatch service; in this case, set the CloudWatch Enabled flag to No. Note that enabling CloudWatch incurs CloudWatch fees. For fee details, refer to the AWS web site.
Exclude Instance	This parameter applies only to AWS- Instance Connectivity, AWS- Instance Resources, and AWS- Instance Uptime tests. In the Exclude Instance text box, provide a comma-separated list of instance names or instance name patterns that you do not wish to monitor. For example: i-b0c3e,7dbe56d. By default, this parameter is set to none.
Report Manager Time	By default, this flag is set to Yes, indicating that the detailed diagnosis of this test, if enabled, will report the shutdown and reboot times of the cloud in the manager’s time zone. If this flag is set to No, then the shutdown and reboot times are shown in the time zone of the system on which the remote agent is running.
DD Frequency	Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if both of the following conditions are fulfilled: (1) the eG manager license should allow the detailed diagnosis capability; (2) both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
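
The Exclude Region and Exclude Instance parameters above both take a comma-separated list of names or name patterns. The following Python sketch shows one way such a list could be applied. It assumes case-insensitive substring matching, which is consistent with the 'east,west' example but is not confirmed as the exact matching rule eG Enterprise uses. The function names and the instance names and IDs are hypothetical.

```python
def parse_exclusions(spec):
    """Split a comma-separated exclusion list; 'none' or empty means no exclusions."""
    if spec is None or spec.strip().lower() in ("", "none"):
        return []
    return [part.strip().lower() for part in spec.split(",") if part.strip()]


def is_excluded(name, patterns):
    """Assumed rule: a name is excluded if it contains any pattern as a substring."""
    lowered = name.lower()
    return any(pattern in lowered for pattern in patterns)


# The region names are AWS region codes; the instance names and IDs are made up.
regions = ["us-east-1", "us-west-2", "eu-central-1", "ap-south-1"]
region_patterns = parse_exclusions("east,west")
monitored_regions = [r for r in regions if not is_excluded(r, region_patterns)]

instances = ["web01:i-b0c3e112", "db01:i-7dbe56d9", "app01:i-0a1b2c3d"]
instance_patterns = parse_exclusions("i-b0c3e,7dbe56d")
monitored_instances = [i for i in instances if not is_excluded(i, instance_patterns)]

print(monitored_regions)
print(monitored_instances)
```

Under the substring assumption, a short pattern such as 'east' would also exclude a region like ap-southeast-1, so patterns should be chosen with that in mind.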

Measures reported by the test:

Measurement	Description	Measurement Unit	Interpretation
Has the instance been rebooted?	Indicates whether this instance has been rebooted during the last measurement period or not.	Boolean	If this measure shows 1, it means that the instance was rebooted during the last measurement period. By checking the time periods when this metric changes from 0 to 1, an administrator can determine the times when this instance was rebooted.
Uptime of the instance during the last measure period	Indicates the time period that the instance has been up since the last time this test ran.	Secs	If the instance has not been rebooted during the last measurement period and the agent has been running continuously, this value will be equal to the measurement period. If the instance was rebooted during the last measurement period, this value will be less than the measurement period of the test. For example, if the measurement period is 300 secs, and if the instance was rebooted 120 secs back, this metric will report a value of 120 seconds. The accuracy of this metric is dependent on the measurement period - the smaller the measurement period, the greater the accuracy.
Total uptime of the instance	Indicates the total time that this instance has been up since its last reboot.	Mins	Administrators may wish to be alerted if an instance has been running without a reboot for a very long period. Setting a threshold for this metric allows administrators to determine such conditions.
Is under maintenance?	Indicates whether this instance is under maintenance or not.		The values reported by this measure and their numeric equivalents are listed in the table below:

Measure Value	Numeric Value
Yes	1
No	0

Note:

This measure reports the Measure Values listed in the table above to indicate whether the target instance is under maintenance. However, in the graph, this measure is indicated using the Numeric Values listed in the table above.
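
To make the relationship between the three uptime measures concrete, here is a minimal Python sketch that derives them from an instance's last boot time. It assumes the boot time is known and the agent has been running continuously. The function name, dictionary keys, and timestamps are hypothetical, and the sketch does not describe how the eG agent actually detects reboots.

```python
from datetime import datetime, timedelta, timezone


def uptime_measures(boot_time, now, period_secs):
    """Derive the three uptime measures from an instance's last boot time.

    boot_time and now are timezone-aware datetimes; period_secs is the
    measurement period in seconds.
    """
    total_up_secs = (now - boot_time).total_seconds()
    # A boot that happened inside the last period counts as a reboot.
    rebooted = 1 if total_up_secs < period_secs else 0
    # Uptime in the last period cannot exceed the period itself.
    period_up_secs = min(total_up_secs, period_secs)
    return {
        "rebooted": rebooted,
        "uptime_last_period_secs": period_up_secs,
        "total_uptime_mins": total_up_secs / 60,
    }


# Hypothetical timestamps reproducing the example in the table above:
# a 300-second measurement period and a reboot 120 seconds ago.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
recent = uptime_measures(now - timedelta(seconds=120), now, 300)
steady = uptime_measures(now - timedelta(days=2), now, 300)

print(recent)
print(steady)
```

In the first case the sketch reports a reboot and 120 seconds of uptime for the period; in the second it reports no reboot, a full 300 seconds for the period, and a total uptime of 2880 minutes.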