{"id":16418,"date":"2021-09-14T08:43:13","date_gmt":"2021-09-14T12:43:13","guid":{"rendered":"https:\/\/www.eginnovations.com\/blog\/?p=16418"},"modified":"2022-10-31T08:30:00","modified_gmt":"2022-10-31T12:30:00","slug":"aws-ec2-monitoring-tools","status":"publish","type":"post","link":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/","title":{"rendered":"Case Study: Importance of Choosing the Right AWS EC2 Instance Type"},"content":{"rendered":"<div class=\"inner_content\">\n<h2>Choice of EC2 Instance Can Adversely Impact Application Performance<\/h2>\n<div class=\"link_list_style\" style=\"margin: 0px auto 20px; padding: 20px 20px 10px;\">\n<p style=\"margin-bottom: 15px;\"><strong>Summary<\/strong><\/p>\n<p>It is incumbent on cloud operations teams to choose the correct type of AWS EC2 instance relative to the underlying application. The wrong choice could adversely impact business and user experience.<\/p>\n<p>This article walks through a customer case study where the EC2 instance choice impacted their business.\u00a0 It is important for application and operations (AppOps) teams to collaborate closely in understanding the workload characteristics and performance expectations while also keeping cloud cost implications in mind. 
Setting the right alarms and taking corrective actions is also key for AppOps teams for delivering successful applications in the cloud.<\/p>\n<\/div>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-16434\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-logo.jpg\" alt=\"Amazon \u2013 AWS EC2 logo\" width=\"150\" height=\"150\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-logo.jpg 150w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-logo-140x140.jpg 140w\" sizes=\"auto, (max-width: 150px) 100vw, 150px\" \/>We were recently approached by a SaaS service provider to assist with a performance issue they had encountered with a web-based application. They were looking to expand their service but had been experiencing intermittent performance issues followed by near-catastrophic gridlocks with their deployment on AWS cloud.<\/p>\n<p>They were already using <a href=\"https:\/\/www.eginnovations.com\/product\/application-performance-monitoring\">eG Enterprise<\/a> for their on-premises infrastructure and applications and were looking to see if we could help them with their application performance issues on AWS cloud as well.<\/p>\n<p>Their cloud operations\/ site reliability engineering (SRE) team, however, were yet to start using eG Enterprise and were still using their legacy workflows reliant on native cloud tooling and asked us to help them track down their issue which had become business critical.<\/p>\n<p>This article highlights how we were able to identify the issue. 
The cause turned out to be a combination of a software change and its impact on the CPU credit balance of their AWS EC2 instances, which were using a burstable instance type (T2.xlarge).<\/p>\n<h2>A Quick Primer on EC2 Instances<\/h2>\n<p>Amazon Elastic Compute Cloud (EC2) is an IaaS offering from AWS that customers use to provision virtual machines (VMs) and infrastructure resources. AWS EC2 provides a wide selection of instance types optimized to fit different use cases.<\/p>\n<p>Instance types are available in varying combinations of CPU, memory, storage, and networking capacity and give you the flexibility to choose the appropriate mix of resources for your applications. Each instance type includes one or more instance sizes, allowing you to scale your resources to the requirements of your target workload.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16433\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-instance-selection.jpg\" alt=\"AWS EC2 Instance Selection\" width=\"800\" height=\"400\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-instance-selection.jpg 800w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-instance-selection-300x150.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-instance-selection-768x384.jpg 768w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-instance-selection-310x155.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-ec2-instance-selection-140x70.jpg 140w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/p>\n<p style=\"margin-bottom: 15px;\">The list of EC2 instance types is available <a href=\"https:\/\/aws.amazon.com\/ec2\/instance-types\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">here<\/a>. 
You can choose between:<\/p>\n<ul>\n<li>General Purpose<\/li>\n<li>Compute Optimized<\/li>\n<li>Memory Optimized<\/li>\n<li>Storage Optimized<\/li>\n<li>Accelerated Computing instance types<\/li>\n<\/ul>\n<div class=\"link_list_style\" style=\"margin: 0px auto 20px; padding: 20px 20px 10px;\">\n<h2 style=\"margin-top: 5px;\">AWS EC2 CPU Credits explained &#8211; How are CPU credits used in EC2 T2 instances?<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-19146\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/coins.png\" alt=\"\" width=\"85\" height=\"85\" border=\"0\" \/>The fundamental idea behind burstable instances is that a typical server\u2019s CPU load has peaks and valleys; it does not run flat-out at 100% utilization all the time. If you can \u201cbank\u201d credits while your server is idle and spend them during peaks, you can pay for a lower baseline and save considerable money.<\/p>\n<p>CPU credits allow EC2 instances to burst above an assigned CPU <a href=\"https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/burstable-credits-baseline-concepts.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">baseline of performance.<\/a> An instance accumulates CPU credits while it runs below its baseline and spends them to burst above the baseline as required.<\/p>\n<p>CPU credits are a function of the number of vCPUs, utilization, and time:<br \/>\n1 CPU credit = 1 vCPU * 100% utilization * 1 minute. Equivalently, one credit covers 1 vCPU at 50% utilization for 2 minutes.<\/p>\n<p>EC2 instances earn CPU credits continuously. 
The rate at which credits are earned <a href=\"https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/burstable-credits-baseline-concepts.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">depends on the instance size.<\/a><\/p>\n<p>eG Enterprise <a href=\"https:\/\/www.eginnovations.com\/supported-technologies\/aws-monitoring\" rel=\"nofollow noopener noreferrer\">can monitor AWS CPU credit usage<\/a> on EC2 instances and alert you when CPU credits are running low. See the screenshots later in this post.<\/p>\n<\/div>\n<p style=\"margin-bottom: 15px;\">As the name suggests, the General Purpose category is targeted toward general use cases, and it includes two broad EC2 families &#8211; the T family and the M family. Informally, the \u201ct\u201d is often said to stand for tiny and the \u201cm\u201d for micro or medium. Two of the most popular choices include:<\/p>\n<ul>\n<li>T2 instances, which are Burstable Performance Instances that provide a baseline level of CPU performance with the ability to burst above the baseline. T3 instances can also burst, and they launch in unlimited mode by default, which sustains performance during peak periods at additional cost.<\/li>\n<li>M5 instances, which are a widely used generation of General Purpose Instances. This family provides a balance of compute, memory, and network resources, and it is a good choice for many applications.<\/li>\n<\/ul>\n<p>The initial choice of EC2 instance is often a bit of a guess based on estimates or experience. 
Often, especially in pre-production, teams creating the instance types may not have prior experience of the application or its workload, especially at scale in production.<\/p>\n<h2>AWS EC2 Instance Selection<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-16511\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/seesaw-01.jpg\" alt=\"\" width=\"400\" height=\"200\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/seesaw-01.jpg 400w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/seesaw-01-300x150.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/seesaw-01-310x155.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/seesaw-01-140x70.jpg 140w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/>In this case, considering the cost differences between T2 and M5 instances, a choice of the T2.xlarge had been made by the customer\u2019s infrastructure team. The customer\u2019s application team had specified the number of CPUs, RAM, and storage requirement but they were relatively new to the cloud and hence had not specified which instance type to use. The infrastructure team had been influenced by an internal discussion on the value proposition of burstable instances. They were aware that their production system was sized for peak demand, and that often, the demands on the system were lower and there had been a discussion about the business case for trying burstable instances. 
However, because they relied on native cloud monitoring tools, no real forecasting had been done, and the organization had no experience of the issues they were about to encounter.<\/p>\n<h2>The Application Slowness Issue<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-16514\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Hourglass.jpg\" alt=\"\" width=\"100\" height=\"100\" border=\"0\" \/>The instances were commissioned, and the applications ran in pre-production for weeks without any significant issues. The same applications had already been deployed in many on-premises environments and hence were stable and well tested. There had been a few complaints of intermittent application sluggishness, but no significant performance challenges were observed in the pre-production pilot for several weeks. No new tickets were raised, and the team running the system was under the impression that everything was working fine.<\/p>\n<p>After several weeks of operation, the application suddenly became extremely slow. There were many user complaints, and clearly there was a significant problem at hand. Therefore, the customer\u2019s application team decided to deploy eG Enterprise.<\/p>\n<p>As the application in question was a web application, <a href=\"https:\/\/www.eginnovations.com\/synthetic-monitoring\">eG Enterprise\u2019s web protocol synthetic monitoring capability<\/a> was configured. The metrics reported by this monitor showed several periods during which the application had been unavailable, as well as periods of severe slowness. 
In the graph below, you can see that from July 11 onwards, web application response time had shot up from a few milliseconds to over 10 seconds at times.<\/p>\n<p>Clearly, the problem was severe and was impacting the SLAs for the SaaS service.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/tcp-connection-time-view.jpg\" data-rel=\"lightbox-image-0\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16431\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/tcp-connection-time.jpg\" alt=\"TCP Connection Time illustration\" width=\"800\" height=\"342\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/tcp-connection-time.jpg 800w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/tcp-connection-time-300x128.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/tcp-connection-time-768x328.jpg 768w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/tcp-connection-time-310x133.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/tcp-connection-time-140x60.jpg 140w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 1: TCP connection time during HTTPs synthetic monitoring of the web application<\/div>\n<h2>Configuring End-to-End Monitoring of AWS and Application Performance<\/h2>\n<p style=\"margin-bottom: 15px;\">To provide additional diagnosis, the following monitoring capabilities were configured:<\/p>\n<ul>\n<li style=\"text-align: left;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-16430\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/real-user-monitoring.jpg\" alt=\"Real User Monitoring in the AWS EC2 environment\" width=\"400\" height=\"260\" border=\"0\" 
srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/real-user-monitoring.jpg 400w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/real-user-monitoring-300x195.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/real-user-monitoring-310x202.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/real-user-monitoring-140x91.jpg 140w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/>The application team configured <a href=\"https:\/\/www.eginnovations.com\/real-user-monitoring\">real user monitoring<\/a> (RUM) to track what pages were accessed by users and how long the page load times were.<\/li>\n<li style=\"text-align: left;\">They also monitored the <a href=\"https:\/\/www.eginnovations.com\/supported-technologies\/database-monitoring\">backend database performance<\/a> to see if any abnormalities in the database tier were causing application slowdowns.<\/li>\n<li>They also configured an agent inside the EC2 instance to track performance within the EC2 instance.<\/li>\n<\/ul>\n<h2>Troubleshooting the Web Application Slowness Issue<\/h2>\n<p>This section highlights how eG Enterprise helped identify and resolve the web application slowness issue.<\/p>\n<p>The eG agent deployed within the EC2 instance immediately highlighted the severity of the problem. 
CPU usage was pegged at 100% for several hours, though we could not attribute this to one single application or process.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-ec2-instance-full.jpg\" data-rel=\"lightbox-image-1\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16424\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-ec2-instance.jpg?no\" alt=\"CPU Usage within the EC2 instance\" width=\"800\" height=\"399\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 2: CPU usage within the EC2 instance has been at 100% for hours<\/div>\n<p>We also observed that <a href=\"https:\/\/www.eginnovations.com\/product\/capabilities\/change-configuration-tracking\">eG Enterprise\u2019s configuration and change tracking capability<\/a> showed that a new version of the threat protection software had been installed a few days before the problems started. Suspecting that this was the cause, the team uninstalled the threat protection software. 
This immediately brought the CPU usage within the EC2 instance to the normal limit, as you can see in Figure 3.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-after-threat-protection-removed-view.png\" data-rel=\"lightbox-image-2\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16440\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-after-threat-protection-removed.jpg\" alt=\" CPU usage of the EC2 instance dropped after threat protection removal\" width=\"800\" height=\"342\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-after-threat-protection-removed.jpg 800w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-after-threat-protection-removed-300x128.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-after-threat-protection-removed-768x328.jpg 768w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-after-threat-protection-removed-310x133.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-usage-after-threat-protection-removed-140x60.jpg 140w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 3: CPU usage of the EC2 instance dropped after the threat protection software was removed<\/div>\n<p>Given that the CPU usage inside the EC2 instance had dropped, we expected that the application performance would be back to normal. We could see normalcy for a few hours, but soon thereafter, performance issues were seen again, even though no unusual activity was seen within the EC2 instance. Not only was the web application slow to respond, but admin access to the system also was unbearably slow. Even typing a single character on the command line took seconds. 
There was no indication of the cause of the problem when looking at any of the metrics from within the EC2 instance.<\/p>\n<p>Suspecting that the issue could be somewhere in the AWS infrastructure, we moved to the IT infrastructure team\u2019s view. The infrastructure team had integrated AWS CloudWatch with eG Enterprise to track the performance of different EC2 services. Figure 4 below shows the type of metrics obtained using this integration.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/ec2-metrics-view.png\" data-rel=\"lightbox-image-3\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16426\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/ec2-metrics.jpg\" alt=\"EC2 monitoring metrics\" width=\"800\" height=\"420\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/ec2-metrics.jpg 800w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/ec2-metrics-300x158.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/ec2-metrics-768x403.jpg 768w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/ec2-metrics-310x163.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/ec2-metrics-140x74.jpg 140w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 4: Metrics reported by eG Enterprise about an EC2 instance<\/div>\n<p>Analysis of the alerts in the eG Enterprise console enabled us to check on the CPU credit balance.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/fig5-view.png\" data-rel=\"lightbox-image-4\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16428\" 
src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/fig5.jpg\" alt=\"\" width=\"800\" height=\"84\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/fig5.jpg 800w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/fig5-300x32.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/fig5-768x81.jpg 768w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/fig5-310x33.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/fig5-140x15.jpg 140w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 5: Alerts in the eG Enterprise console about CPU credit balance being low<\/div>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-credit-balance-change-full.png\" data-rel=\"lightbox-image-5\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16438\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/cpu-credit-balance-change.jpg?no\" alt=\"Illustration of CPU credit balance change\" width=\"800\" height=\"342\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 6: CPU credit balance change for the EC2 instance<\/div>\n<p>Figure 6 above shows that the EC2 instance\u2019s CPU credit balance had been dropping continuously for several days \u2013 because of the increased CPU load caused by the threat protection software. When the CPU credit balance became 0, the workload on the EC2 instance was too much for it to handle with the CPU throttling done by AWS.\u00a0 This caused CPU usage within the EC2 instance to increase drastically. After the threat protection software was removed and the EC2 instance rebooted, the problem was still not resolved because the CPU credit balance remained very low for several hours. 
Until the credit balance reached an acceptable number (50+), the system remained almost unresponsive.<\/p>\n<p>How burstable EC2 instances like the T2 series consume and accrue credit is covered very well in <a href=\"https:\/\/d1.awsstatic.com\/whitepapers\/t2-std-cpu-credits.pdf\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Understanding T2 Standard Instance CPU Credits (awsstatic.com)<\/a>.<\/p>\n<p>Only after this analysis did the application team even become aware that the infrastructure team had provisioned a T2.xlarge instance for them with burstable capacity.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/product\/application-performance-monitoring\/free-trial\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-19729 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-monitoring-banner.jpg\" alt=\"\" width=\"850\" height=\"170\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-monitoring-banner.jpg 850w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-monitoring-banner-300x60.jpg 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-monitoring-banner-768x154.jpg 768w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-monitoring-banner-800x160.jpg 800w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-monitoring-banner-310x62.jpg 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/aws-monitoring-banner-140x28.jpg 140w\" sizes=\"auto, (max-width: 850px) 100vw, 850px\" \/><\/a><\/p>\n<h2>How the Customer Moved Forward<\/h2>\n<p style=\"margin-bottom: 15px;\">With their new understanding of the diagnosis of the issues, the customer took several subsequent steps:<\/p>\n<ul>\n<li>They contacted the security vendor and queried about the high CPU load and were advised on some configuration changes that improved the CPU 
usage.<\/li>\n<li>They were able to reintroduce the security software and baseline by using eG Enterprise to understand the CPU pattern of their application workload over time.<\/li>\n<li>Going forward, they will look at a structured benchmark of C and M series instances, T3.xlarge instances (higher baseline for bursting), and will explore the unlimited performance mode for burstable instances. Unlimited mode allows you to exceed the baseline of a burstable instance and charges you for the excess CPU cycles consumed; this is one metric that we have advised them to watch closely.<\/li>\n<\/ul>\n<table class=\"hand_table_style\" style=\"width: 100%;\">\n<tbody>\n<tr>\n<td style=\"padding: 0;\">\n<div style=\"border-left: 0px solid #ffd392; line-height: 30px; font-family: 'Graphik-Regular';\">\n<p style=\"margin-top: 10px; margin-bottom: 15px;\"><strong>Key points on Unlimited mode on burstable VMs :<\/strong><\/p>\n<ul class=\"hand-icon-resize\" style=\"list-style-type: none;\">\n<li style=\"list-style: none !important; margin-left: 3px;\">Don\u2019t let the term \u201cUnlimited\u201d mislead you into thinking that CPU is unlimited.<\/li>\n<li style=\"list-style: none !important; margin-left: 3px;\">Even in unlimited mode, if your\u00a0average CPU usage over a 24-hour period exceeds the\u00a0baseline, you\u00a0will be\u00a0billed for the additional usage at a flat additional rate per vCPU-hour. 
(Ref:\u00a0<a href=\"https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/burstable-credits-baseline-concepts.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/burstable-credits-baseline-concepts.html<\/a>).<\/li>\n<li style=\"list-style: none !important; margin-left: 3px;\">Instead of throttling your instance, unlimited mode allows it to keep bursting beyond the <a href=\"https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/burstable-credits-baseline-concepts.html#baseline_performance\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">baseline<\/a>. However, this can add extra charges to your AWS bill.<\/li>\n<li style=\"list-style: none !important; margin-left: 3px;\">You can think of unlimited mode as an \u201cunlimited burst\u201d mode, where more bursting is likely to cost you more money, but you can burst as much as your application requires.<\/li>\n<\/ul>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Key Takeaways<\/h2>\n<p style=\"margin-bottom: 15px;\">This example highlights several challenges faced by organizations that are migrating applications to the cloud:<\/p>\n<ol>\n<li><b>Work collaboratively:<\/b> Application and infrastructure teams must work together to determine the type of AWS EC2 instance to be used. Just specifying the required CPU\/memory is not sufficient.<\/li>\n<li>\n<p style=\"margin-bottom: 10px;\"><b>Run a benchmark:<\/b> Baseline application performance as accurately as possible and choose the AWS instance type based on that data. 
T2\/T3\/T4g instances are cheaper than M and C types, but performance suffers once sustained CPU usage above the baseline exhausts the credit balance.<\/p>\n<p style=\"margin-bottom: 10px;\">T2\/T3\/T4g instances offer an unlimited performance mode setting: <a href=\"https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/burstable-performance-instances-unlimited-mode.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/burstable-performance-instances-unlimited-mode.html<\/a>. You can decide to enable this (at a cost) to ensure that you\u00a0don\u2019t\u00a0get into the situation we discussed in this case study.<\/p>\n<\/li>\n<li><b>Outside-in &amp; Inside-out:<\/b> Problems with AWS instance slowness cannot always be diagnosed by looking only at metrics from within the EC2 instance VM. It is important to combine performance views from within the EC2 instance VM with AWS-level insights. <a href=\"https:\/\/www.eginnovations.com\/product\/capabilities\/change-configuration-tracking\">Configuration and change tracking<\/a> is equally important to detect changes in the infrastructure and correlate them with application performance issues.<\/li>\n<li><b>Set up alerts:<\/b> Monitor your CPU credits and set up alarms that fire when your credit balance becomes dangerously low. Even better, consider configuring an Auto Scaling group so that it launches a new instance when your CPU credit balance is low or when your CPU usage stays high for a threshold period.<\/li>\n<li><b>Be proactive:<\/b> When a problem is rectified on-premises, you can expect your application and infrastructure to function well right after the problem is resolved. Our experience highlights that this is not always the case on the cloud. 
Hence, it is even more important that you monitor your cloud environments proactively and make sure that you\u00a0do not\u00a0let a problem escalate to a level where it impacts your business.<\/li>\n<\/ol>\n<h3>More information<\/h3>\n<ul>\n<li>Information on eG Enterprise\u2019s solutions for Amazon AWS: <a href=\"https:\/\/www.eginnovations.com\/supported-technologies\/aws-monitoring\">AWS Monitoring Solutions &amp; Performance Tools | eG Innovations<\/a><\/li>\n<\/ul>\n<p><strong><em>Customer confidentiality<\/em><\/strong> : We have kept the details of the target infrastructure and the customer\u2019s identity confidential, but with their permission, we have highlighted the challenge they faced and how it was solved so that the community at large can benefit from our experience.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Choice of EC2 Instance Can Adversely Impact Application Performance Summary It is incumbent on cloud operations teams to choose the correct type of AWS EC2 instance relative to the underlying application. The wrong choice could adversely impact business and user experience. 
This article walks through a customer case study where the EC2 instance choice impacted [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":22486,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[391,369],"tags":[395,1464,509,403,1434,545,110,115,1440,1458,1457,549,1463,1460,1461,1462,1459],"class_list":["post-16418","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-aws-monitoring","category-cloud-monitoring","tag-aws","tag-aws-application","tag-aws-ec2","tag-aws-monitoring","tag-aws-monitoring-tools","tag-aws-performance","tag-cloud","tag-cloud-monitoring","tag-cloud-monitoring-tools","tag-cpu-credit-balance","tag-cpu-credits","tag-ec2","tag-ec2-cloudwatch-metrics","tag-ec2-hang","tag-ec2-instance-monitoring","tag-ec2-metrics","tag-ec2-slow"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Select the right AWS EC2 instance type for optimal monitoring<\/title>\n<meta name=\"description\" content=\"Read this customer case study and learn why selecting the right AWS EC2 instance type is important for optimal monitoring &amp; performance.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"AWS EC2 Monitoring Tools | eG Innovations\" \/>\n<meta property=\"og:description\" content=\"Amazon Elastic Compute Cloud (EC2) is an AWS service that offers virtual machines (VMs) and infrastructure resources. 
Find out how to monitor these important functions.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/\" \/>\n<meta property=\"og:site_name\" content=\"eG Innovations\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/eGInnovations\" \/>\n<meta property=\"article:published_time\" content=\"2021-09-14T12:43:13+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-10-31T12:30:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Social-Banner.jpg\" \/>\n<meta name=\"author\" content=\"Arun Aravamudhan\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"AWS EC2 Monitoring Tools | eG Innovations\" \/>\n<meta name=\"twitter:description\" content=\"Amazon Elastic Compute Cloud (EC2) is an AWS service that offers virtual machines (VMs) and infrastructure resources. Find out how to monitor these important functions.\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Social-Banner.jpg\" \/>\n<meta name=\"twitter:creator\" content=\"@https:\/\/x.com\/perfclarity\" \/>\n<meta name=\"twitter:site\" content=\"@eginnovations\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Arun Aravamudhan\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"13 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Select the right AWS EC2 instance type for optimal monitoring","description":"Read this customer case study and learn why selecting the right AWS EC2 instance type is important for optimal monitoring & performance.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/","og_locale":"en_US","og_type":"article","og_title":"AWS EC2 Monitoring Tools | eG Innovations","og_description":"Amazon Elastic Compute Cloud (EC2) is an AWS service that offers virtual machines (VMs) and infrastructure resources. Find out how to monitor these important functions.","og_url":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/","og_site_name":"eG Innovations","article_publisher":"https:\/\/www.facebook.com\/eGInnovations","article_published_time":"2021-09-14T12:43:13+00:00","article_modified_time":"2022-10-31T12:30:00+00:00","og_image":[{"url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Social-Banner.jpg","type":"","width":"","height":""}],"author":"Arun Aravamudhan","twitter_card":"summary_large_image","twitter_title":"AWS EC2 Monitoring Tools | eG Innovations","twitter_description":"Amazon Elastic Compute Cloud (EC2) is an AWS service that offers virtual machines (VMs) and infrastructure resources. Find out how to monitor these important functions.","twitter_image":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Social-Banner.jpg","twitter_creator":"@https:\/\/x.com\/perfclarity","twitter_site":"@eginnovations","twitter_misc":{"Written by":"Arun Aravamudhan","Est. 
reading time":"13 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#article","isPartOf":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/"},"author":{"name":"Arun Aravamudhan","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/d788cb81df96a940429c3f5a3b294a6a"},"headline":"Case Study: Importance of Choosing the Right AWS EC2 Instance Type","datePublished":"2021-09-14T12:43:13+00:00","dateModified":"2022-10-31T12:30:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/"},"wordCount":2317,"commentCount":0,"publisher":{"@id":"https:\/\/www.eginnovations.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#primaryimage"},"thumbnailUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Thumbnail-1.jpg","keywords":["AWS","aws application","AWS EC2","AWS Monitoring","AWS monitoring tools","AWS Performance","cloud","Cloud Monitoring","cloud monitoring tools","CPU credit balance","CPU credits","EC2","Ec2 cloudwatch metrics","EC2 hang","Ec2 instance monitoring","EC2 metrics","EC2 slow"],"articleSection":["AWS Monitoring","Cloud Monitoring"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/","url":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/","name":"Select the right AWS EC2 instance type for optimal 
monitoring","isPartOf":{"@id":"https:\/\/www.eginnovations.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#primaryimage"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#primaryimage"},"thumbnailUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Thumbnail-1.jpg","datePublished":"2021-09-14T12:43:13+00:00","dateModified":"2022-10-31T12:30:00+00:00","description":"Read this customer case study and learn why selecting the right AWS EC2 instance type is important for optimal monitoring & performance.","breadcrumb":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#primaryimage","url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Thumbnail-1.jpg","contentUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/09\/Amazon-AWS-Cloud-Thumbnail-1.jpg","width":362,"height":235},{"@type":"BreadcrumbList","@id":"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.eginnovations.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Case Study: Importance of Choosing the Right AWS EC2 Instance Type"}]},{"@type":"WebSite","@id":"https:\/\/www.eginnovations.com\/blog\/#website","url":"https:\/\/www.eginnovations.com\/blog\/","name":"eG Innovations","description":"IT Performance Monitoring 
Insights","publisher":{"@id":"https:\/\/www.eginnovations.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.eginnovations.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.eginnovations.com\/blog\/#organization","name":"eG Innovations","alternateName":"eg innovations","url":"https:\/\/www.eginnovations.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2014\/07\/eg-logo-dark-gray1_new.jpg","contentUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2014\/07\/eg-logo-dark-gray1_new.jpg","width":362,"height":235,"caption":"eG Innovations"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/eGInnovations","https:\/\/x.com\/eginnovations"]},{"@type":"Person","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/d788cb81df96a940429c3f5a3b294a6a","name":"Arun Aravamudhan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/7ff42334d908fb4060880a4487331e4a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/7ff42334d908fb4060880a4487331e4a?s=96&d=mm&r=g","caption":"Arun Aravamudhan"},"sameAs":["https:\/\/www.linkedin.com\/in\/arun-aravamudhan\/","https:\/\/x.com\/https:\/\/x.com\/perfclarity"],"url":"https:\/\/www.eginnovations.com\/blog\/author\/arun-aravamudhan\/"}]}},"modified_by":"Review 
eG","_links":{"self":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts\/16418","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/comments?post=16418"}],"version-history":[{"count":0,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts\/16418\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/media\/22486"}],"wp:attachment":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/media?parent=16418"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/categories?post=16418"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/tags?post=16418"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}