What is AWS EC2?

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers. Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon’s proven computing environment.

Amazon EC2 offers a broad and deep compute platform with a wide choice of processors, storage, networking, operating systems, and purchasing models. EC2 also offers GPU-enabled instances for machine learning training and graphics workloads. Typical workloads include SAP, HPC, machine learning, and Windows applications.

Most instances available are virtual machines (VMs) running on a Xen-based hypervisor, although Amazon has diversified with some compute VMs based on Nitro (a flavor of KVM) and even some bare-metal instances. The EC2 web service allows organizations to rent computational resources and the associated infrastructure on demand, scaling up or down as needed. The auto-scaling features of EC2 allow organizations to adapt computing capacity to site traffic automatically.

What is AWS CloudWatch?

Amazon CloudWatch is a monitoring and observability service originally built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events from AWS resources, applications, and services that run on AWS and on-premises servers. With investment in staff skillsets and configuration, CloudWatch can be used to detect anomalous behavior in your environments, set alarms, visualize logs and metrics side by side, take automated actions, troubleshoot issues, and discover insights to keep applications running smoothly.

CloudWatch can be used to monitor more than 70 AWS services, such as Amazon EC2, Amazon DynamoDB, Amazon S3, Amazon ECS, Amazon EKS, and AWS Lambda. It automatically publishes 1-minute metrics and custom metrics with up to 1-second granularity. You can also use CloudWatch in hybrid cloud architectures by using the CloudWatch Agent or API to monitor your on-premises resources to some extent.
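As a rough illustration of the custom metrics with 1-second granularity mentioned above, the sketch below builds one entry for the CloudWatch PutMetricData API, where a StorageResolution of 1 marks a metric as high-resolution. The metric name QueueDepth, the namespace MyApp/Example, and the instance ID are hypothetical placeholders, and the publish helper assumes the caller passes in a boto3 CloudWatch client.

```python
from datetime import datetime, timezone


def build_high_resolution_datum(name, value, instance_id, unit="Count"):
    """Build one MetricData entry for the CloudWatch PutMetricData API.

    StorageResolution=1 requests 1-second (high-resolution) storage;
    60 would be the standard 1-minute resolution.
    """
    return {
        "MetricName": name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
        "StorageResolution": 1,
    }


def publish(cloudwatch_client, namespace, datum):
    """Send the datum through a boto3 CloudWatch client supplied by the caller."""
    return cloudwatch_client.put_metric_data(Namespace=namespace, MetricData=[datum])


# Hypothetical metric name and instance ID, for illustration only.
datum = build_high_resolution_datum("QueueDepth", 42, "i-0123456789abcdef0")
print(datum["MetricName"], datum["StorageResolution"])
```

With boto3 installed and AWS credentials configured, a call such as publish(boto3.client("cloudwatch"), "MyApp/Example", datum) would send the data point. Custom metrics fall outside the free basic metrics, so charges are likely to apply.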

Monitoring AWS Elastic Compute Cloud (EC2) with AWS CloudWatch

CloudWatch is Amazon’s native cloud monitoring solution for AWS. AWS itself was initially built to speed up service provisioning within Amazon, by and for its developer community, and was later extended to the public for similar purposes. Today, AWS is used extensively by businesses for their dev, staging, test, and production needs, especially the Elastic Compute Cloud (EC2). Because AWS’ initial focus was not on meeting production requirements, service provisioning and management have evolved in a way that still feels developer-led, and some workflows feel slightly too complex and manual for IT administrators, particularly those from an end-user computing (EUC) and/or on-premises background.

The basic entry-level tier of CloudWatch is free, but to monitor beyond this level you will need to move to the paid tiers. These are priced on a pay-as-you-go (PAYG) basis according to the number of metrics you sample and record and the data volumes involved. Most organizations are highly likely to need the paid tier.

What are the Limitations of the Basic Metrics about AWS EC2 that CloudWatch provides?

Amazon CloudWatch is basically a metrics repository. An AWS service, such as Amazon EC2, puts metrics into the repository, and you retrieve statistics based on those metrics. If you put your own custom metrics into the repository, you can retrieve statistics on these metrics as well.

Figure 1: AWS CloudWatch architecture
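The "retrieve statistics" half of this repository model can be sketched with the GetMetricStatistics API. This is a minimal example rather than a recommended pattern: it assumes a boto3 CloudWatch client is passed in by the caller, and the one-hour window and the choice of Average and Maximum statistics are illustrative defaults.

```python
from datetime import datetime, timedelta, timezone


def cpu_statistics_request(instance_id, hours=1, period_seconds=300):
    """Build GetMetricStatistics parameters for an instance's CPUUtilization."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,  # 300 s matches the basic 5-minute frequency
        "Statistics": ["Average", "Maximum"],
    }


def fetch_cpu_statistics(cloudwatch_client, instance_id):
    """Query CloudWatch and return the datapoints in chronological order."""
    request = cpu_statistics_request(instance_id)
    response = cloudwatch_client.get_metric_statistics(**request)
    # Datapoints are not guaranteed to arrive sorted, so order them here.
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Each returned datapoint is a dictionary carrying a Timestamp plus the requested statistics, which is the same data the console graphs are drawn from.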

The default metrics passed into and out of CloudWatch are similar to the “hypervisor-based metrics” for VMs provided by VMware, Citrix, and others. They tell you what resources the VM is using (e.g. how much CPU?) but do not give details (e.g. which application processes are consuming CPU?).

The basic free tier includes:

  • Basic Monitoring Metrics (at 5-minute frequency)
  • 10 Detailed Monitoring Metrics (at 1-minute frequency)
  • 1 million API requests (not applicable to GetMetricData and GetMetricWidgetImage)
  • 3 dashboards for up to 50 metrics per month
  • Alarms, 10 alarm metrics (not applicable to high-resolution alarms)
  • Logs, 5GB data (ingestion, archive storage, and data scanned by Logs Insights queries)
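To check planned usage against these allowances, a small helper along the following lines may be useful. The limits are copied from the list above. The dictionary keys and the sample usage figures are made up for illustration, and the sketch ignores the exclusions noted in parentheses (for example high-resolution alarms).

```python
# Free-tier allowances as listed above; key names are illustrative.
FREE_TIER_LIMITS = {
    "detailed_monitoring_metrics": 10,
    "api_requests": 1_000_000,
    "dashboards": 3,
    "alarm_metrics": 10,
    "logs_gb": 5,
}


def free_tier_overage(usage):
    """Return {item: amount above the free allowance} for every exceeded item."""
    overage = {}
    for item, limit in FREE_TIER_LIMITS.items():
        used = usage.get(item, 0)
        if used > limit:
            overage[item] = used - limit
    return overage


# Hypothetical monthly usage for a small estate.
monthly_usage = {
    "detailed_monitoring_metrics": 25,
    "api_requests": 400_000,
    "dashboards": 2,
    "alarm_metrics": 12,
    "logs_gb": 5,
}
print(free_tier_overage(monthly_usage))
# prints {'detailed_monitoring_metrics': 15, 'alarm_metrics': 2}
```

Anything reported here would be billed at the PAYG rates, which are deliberately left out of the sketch because they vary and are not covered in this post.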

The metrics collected under the hood to monitor system-level details are defined in a config.json file. The default metrics collected are shown below.

Amazon Elastic Compute Cloud (EC2) metrics for each cloud instance (monitored agentless)

By default, the Monitoring tab for each instance shows a dashboard containing a subset of these metrics (see Figure 2). The default parameters on CloudWatch are set to monitor the following basic metrics:

  • CPU: CPU Utilization, CPU credit usage (count), CPU credit balance (count)
  • Disk: Disk reads (bytes), Disk read operations (operations), Disk writes (bytes), Disk write operations (operations)
  • Network: Network in (bytes), Network out (bytes), Network packets in (count), Network packets out (count)
  • Status check failed (count): any, instance, system
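For scripting against these metrics, the display names above correspond to metric names in the AWS/EC2 namespace. The sketch below groups those names and builds query entries in the shape expected by the GetMetricData API. The grouping keys and the m0, m1, ... query IDs are arbitrary choices made for this example.

```python
# CloudWatch metric names (AWS/EC2 namespace) behind the basic metrics above.
BASIC_EC2_METRICS = {
    "CPU": ["CPUUtilization", "CPUCreditUsage", "CPUCreditBalance"],
    "Disk": ["DiskReadBytes", "DiskReadOps", "DiskWriteBytes", "DiskWriteOps"],
    "Network": ["NetworkIn", "NetworkOut", "NetworkPacketsIn", "NetworkPacketsOut"],
    "Status": [
        "StatusCheckFailed",
        "StatusCheckFailed_Instance",
        "StatusCheckFailed_System",
    ],
}


def build_metric_queries(instance_id, group, period_seconds=300, stat="Average"):
    """Build MetricDataQueries entries for one group of basic EC2 metrics."""
    return [
        {
            "Id": f"m{index}",  # query IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": name,
                    "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                },
                "Period": period_seconds,
                "Stat": stat,
            },
        }
        for index, name in enumerate(BASIC_EC2_METRICS[group])
    ]


queries = build_metric_queries("i-0123456789abcdef0", "Disk")
print([query["MetricStat"]["Metric"]["MetricName"] for query in queries])
```

The resulting list could be passed as the MetricDataQueries argument of a boto3 get_metric_data call, together with StartTime and EndTime. Note that, per the free-tier list above, GetMetricData requests are not covered by the free API allowance.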

What Basic CloudWatch Monitoring for EC2 Looks Like

Ensure you are in the region where your EC2 resources have been provisioned (highlighted top right) and switch to the Monitoring tab.

Once you have switched to the Monitoring tab, you will see graphs of the basic metrics. From here you can opt to configure “Detailed monitoring”.

In practice, detailed monitoring means collecting the basic metrics more frequently than the default 5-minute sampling interval. If you choose to do this, additional charges will be incurred. You are warned of this, although no details are given as to how much these additional charges will be.
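The same switch can be made programmatically. The sketch below assumes a boto3 EC2 client is passed in by the caller and uses its monitor_instances call. It also shows the arithmetic behind the change in sampling interval, which is one reason the additional charges arise.

```python
def datapoints_per_day(period_seconds):
    """Number of samples per metric per day at a given sampling interval."""
    return 24 * 60 * 60 // period_seconds


def enable_detailed_monitoring(ec2_client, instance_ids):
    """Turn on 1-minute monitoring and report each instance's monitoring state.

    Assumes ec2_client is a boto3 EC2 client; enabling this incurs charges.
    """
    response = ec2_client.monitor_instances(InstanceIds=list(instance_ids))
    return {
        item["InstanceId"]: item["Monitoring"]["State"]
        for item in response["InstanceMonitorings"]
    }


# Basic (5-minute) versus detailed (1-minute) sampling.
print(datapoints_per_day(300), datapoints_per_day(60))
# prints 288 1440
```

Moving from 5-minute to 1-minute sampling therefore produces five times as many data points per metric. The state returned immediately after the call may be pending rather than enabled.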

In my next blog post, I will cover how you can gain insights beyond agentless monitoring by deploying the CloudWatch agent, what this provides, and some of the limitations you need to plan for. I will also cover some of the cost and licensing implications of going beyond the free and basic tiers of CloudWatch.
