{"id":38558,"date":"2025-11-10T12:49:15","date_gmt":"2025-11-10T17:49:15","guid":{"rendered":"https:\/\/www.eginnovations.com\/blog\/?p=38558"},"modified":"2025-11-11T07:13:42","modified_gmt":"2025-11-11T12:13:42","slug":"aws-outage","status":"publish","type":"post","link":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/","title":{"rendered":"Detecting an AWS Outage and DR Lessons"},"content":{"rendered":"<div class=\"inner_content\">\n<p>A few weeks ago, on 20th October 2025, AWS suffered a widespread outage in its US-EAST-1 region that affected a large number of customers globally. More than 1,000 apps and websites were impacted including major banks and popular games, streaming and social platforms such as WhatsApp, Snapchat, Fortnite and Pok\u00e9mon Go.<\/p>\n<p style=\"margin-bottom: 15px;\">The incident was widely reported, see:<\/p>\n<ul>\n<li><a class=\"link\" href=\"https:\/\/www.crn.com\/news\/cloud\/2025\/aws-15-hour-outage-5-big-ai-dns-ec2-and-data-center-keys-to-know?page=1&amp;itc=refresh\" target=\"blank\">AWS\u2019 15-Hour Outage: 5 Big AI, DNS, EC2 And Data Center Keys To Know<\/a><\/li>\n<li><a class=\"link\" href=\"https:\/\/www.bbc.co.uk\/news\/articles\/c20pgp3nx07o\" target=\"blank\">Amazon services &#8216;recovering&#8217; as Snapchat and banks among sites hit by outage &#8211; BBC News<\/a><\/li>\n<li><a class=\"link\" href=\"https:\/\/www.bbc.co.uk\/news\/articles\/cev1en9077ro\" target=\"blank\">What caused the AWS outage &#8211; and why did it make the internet fall apart? &#8211; BBC News<\/a><\/li>\n<li><a class=\"link\" href=\"https:\/\/www.aljazeera.com\/news\/2025\/10\/21\/what-caused-amazons-aws-outage-and-why-did-so-many-major-apps-go-offline\" target=\"blank\">What caused Amazon\u2019s AWS outage, and why did so many major apps go offline? 
| Internet News | Al Jazeera<\/a><\/li>\n<\/ul>\n<p>A long list of the affected apps and sites <a class=\"link\" href=\"https:\/\/news.sky.com\/story\/whats-affected-by-internet-outage-all-we-know-so-far-13453813\" target=\"blank\">was recorded in a Sky News article<\/a>, which underlines the scale and visibility of an outage that affected millions of end users trying to use apps and access services (over 4 million end users reported issues <a class=\"link\" href=\"https:\/\/downdetector.co.uk\/status\/aws-amazon-web-services\/\" target=\"blank\">via Downdetector<\/a>; see: <a class=\"link\" href=\"https:\/\/news.sky.com\/story\/whats-affected-by-internet-outage-all-we-know-so-far-13453813\" target=\"blank\">What&#8217;s affected by internet outage &#8211; all we know so far | Science, Climate &amp; Tech News | Sky News<\/a>).<\/p>\n<p>eG Innovations is an AWS Well-Architected Framework SaaS provider and we host a number of our SaaS services on AWS. These services are engineered in collaboration with Amazon to adhere to their best practices for resilience and security. Like all those other key customers, we were impacted and had to work with the effects of the outage whilst Amazon worked to rectify the issue.<\/p>\n<p>While monitoring our AWS services, we captured data around the outage that provided visibility on the issue and allowed us to make data-driven assessments of the situation. 
In this article we will share what we saw and what lessons you can take from this AWS outage to improve your resilience when designing services reliant on AWS and how to monitor critical AWS services to your benefit.<\/p>\n<h2>What Happened to AWS Services on 20th October 2025?<\/h2>\n<p>On 20 October 2025, AWS suffered a major outage that affected a wide range of websites, apps and services globally.<\/p>\n<p>The incident began at around 3:11AM ET (12:11AM PDT) (early morning on 20 October) with AWS reporting \u201cincreased error rates and latencies\u201d for multiple services in its US-EAST-1 region (Northern Virginia).<\/p>\n<p>The root cause was eventually identified as a DNS (Domain Name System) resolution issue affecting the AWS internal infrastructure, particularly related to the Amazon DynamoDB API endpoint and other dependent services.<\/p>\n<p>AWS stated that services were \u201cfully returned to normal operations\u201d by around 6:01PM ET (3:01PM PDT), i.e. a 15 hour outage was experienced.<\/p>\n<p>Because the US-EAST-1 region is a major hub for AWS services, many other services relied on it by default; when the DNS resolution for DynamoDB and associated services failed, the effect cascaded to many dependent applications.<\/p>\n<p>The incident highlighted the heavy dependency of modern internet services on a small number of cloud infrastructure providers, and how a failure in one major region can ripple widely.<\/p>\n<p>Full details of the timelines and technicalities are available on Amazon\u2019s AWS Health Portal, see: <a class=\"link\" href=\"https:\/\/health.aws.amazon.com\/health\/status?eventID=arn:aws:health:us-east-1::event\/MULTIPLE_SERVICES\/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE\/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE_BA540_514A652BE1A\" target=\"blank\">Service health &#8211; Oct 23, 2025 | AWS Health Dashboard | Global<\/a>. 
The incident impacted 141 AWS services and caused connectivity issues in multiple services such as Lambda, DynamoDB, and CloudWatch. This is particularly relevant to this article as CloudWatch is Amazon\u2019s native monitoring service for AWS and is one of the services we integrate with to give our customers visibility into AWS.<\/p>\n<h2>How the AWS Outage Manifested in eG Enterprise<\/h2>\n<p>The eG Enterprise monitoring solution does not rely on DynamoDB, where the primary issue arose, but it was affected by the knock-on effects on other AWS services. Those secondary effects also meant that <a href=\"https:\/\/www.eginnovations.com\/glossary\/amazon-cloudwatch\">AWS CloudWatch<\/a> was unable to report the issues. Monitoring that relied on CloudWatch was effectively rendered useless at this point.<\/p>\n<p>At 5:20AM ET (2:20AM PDT) eG Enterprise started to detect AWS issues. EBS volumes changed to an abnormal state.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-EBS-volumes-zoom.jpg\" data-rel=\"lightbox-image-0\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-38578 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-EBS-volumes.webp\" alt=\"Screenshot of alerts in the eG Enterprise console that were raised during the 20th October AWS outage because of AWS EBS experiencing issues\" width=\"750\" height=\"219\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-EBS-volumes.webp 750w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-EBS-volumes-300x88.webp 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-EBS-volumes-310x91.webp 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-EBS-volumes-140x41.webp 140w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/a><\/p>\n<div 
class=\"img_caption\">Figure 1: Alerts regarding the AWS EBS volumes were raised at around 5:30AM ET by eG Enterprise<\/div>\n<p>Within a short time, monitoring of our EC2 instances showed volumes being disconnected. This impacted applications and automatically raised incidents in our service desk. Meanwhile, Amazon reported that it had resolved the DynamoDB DNS issue but that issues had begun occurring in the \u201cinternal subsystem of EC2 that is responsible for launching EC2 instances due to its dependency on DynamoDB\u201d.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/instance-EC2-zoom.jpg\" data-rel=\"lightbox-image-1\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-38575 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/instance-EC2.webp\" alt=\"Screenshot of alerts in the eG Enterprise console that were raised during the 20th October AWS outage because of AWS EC2 experiencing issues with their AWS EBS Volumes\" width=\"750\" height=\"128\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/instance-EC2.webp 750w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/instance-EC2-300x51.webp 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/instance-EC2-310x53.webp 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/instance-EC2-140x24.webp 140w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 2: The monitoring of EC2 instances detected issues with the EBS volumes<\/div>\n<p>We also noticed that while some of the EBS volumes recovered, several others remained offline for hours, according to CloudWatch.<\/p>\n<h2>Synthetic Monitoring of AWS<\/h2>\n<p>We proactively use eG Enterprise\u2019s synthetic monitoring features to continually test and probe the user 
experience of our SaaS services. Simulated (\u201crobot\u201d) users attempt to use our SaaS services to probe the experience external customers are receiving. This testing allowed us to assess the impact of the AWS outage on our customers.<\/p>\n<p>Our synthetic monitoring showed only a small impact on our SaaS services, even while CloudWatch continued to report the volumes as down for several hours. We also had agents deployed within our EC2 instances, and they indicated that the OS drives were indeed working and available. With the benefit of the retrospective information from Amazon, it seems that CloudWatch struggled to handle the deluge of alerts caused by the outage and was updating metrics extremely slowly. So, even though our services were working, CloudWatch continued to show alerts for EBS services.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/Synthetic-monitoring-zoom.jpg\" data-rel=\"lightbox-image-2\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-38567 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/Synthetic-monitoring.webp\" alt=\"Screenshot of eG Enterprise's SaaS service availability during the AWS outage of 20th October 2025 as measured via synthetic monitoring. 
Some services are showing 100% availability and all &gt;99%\" width=\"750\" height=\"364\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/Synthetic-monitoring.webp 750w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/Synthetic-monitoring-300x146.webp 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/Synthetic-monitoring-310x150.webp 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/Synthetic-monitoring-140x68.webp 140w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 3: Synthetic monitoring showed some services degraded but the impact was low with over 99% of all web accesses succeeding.<\/div>\n<h2>Email Alerting \u2013 The Importance of a Backup System<\/h2>\n<p>While all of our systems remained up, some of our services were impacted. Email alerting on our affected SaaS systems was configured to use AWS SES on US-EAST-1, and this service remained down for a while longer.<\/p>\n<p>Relying on any cloud-based email system to deliver monitoring alerts is a classic gotcha that many overlook. Outlook via Microsoft 365 is similarly vulnerable to Azure cloud outages (an issue covered in a previous article, see: <a class=\"link\" href=\"https:\/\/www.eginnovations.com\/blog\/is-m365-down-proactive-alerting-for-microsoft-azure-outages\/\">Is M365 Down? 
\u2013 Proactive Alerting of a Microsoft Azure Outage<\/a>).<\/p>\n<p>eG Enterprise is designed to allow secondary email services to provide failover resilience. During this recent AWS outage, our email alerting continued via a backup mail service configured to use SES on US-EAST-2, which ensured that no email alerts were lost.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-outage-zoom.jpg\" data-rel=\"lightbox-image-3\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-38569 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-outage.webp\" alt=\"Screenshot showing eG Enterprise measuring failed emails within AWS SES which was affected by the AWS outage in October 2025\" width=\"750\" height=\"326\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-outage.webp 750w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-outage-300x130.webp 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-outage-310x135.webp 310w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/AWS-outage-140x61.webp 140w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 4: eG Enterprise detected the failures of AWS SES during the AWS outage incident on October 20th 2025<\/div>\n<h2>Lessons Learned<\/h2>\n<p style=\"margin-bottom: 15px;\">The above postmortem of the October 20th AWS outage demonstrates a few takeaways for monitoring cloud systems and SaaS services reliant on cloud infrastructure, namely:<\/p>\n<ul>\n<li><strong>Do not limit your observability to one source<\/strong> or to cloud-native monitoring. 
The CloudWatch information was not trustworthy during the outage and for a few hours afterwards.<\/li>\n<li>Ensure that you are <strong>monitoring key SaaS services using synthetic monitoring<\/strong> so you know exactly what your users are seeing. This is, in fact, the best way to measure your performance against SLAs (Service Level Agreements).<\/li>\n<li><strong>Have agents on the cloud instances.<\/strong> This allowed us to cross-check the results that synthetic monitoring was collecting. We could have logged in manually and checked, but that would have been a slow process and taken us hours to confirm.<\/li>\n<li>Always <strong>ensure key services have a fallback<\/strong> mechanism in place. In this example, having a backup email alerting configuration helped avoid an incident.<\/li>\n<\/ul>\n<h2>Related Information<\/h2>\n<p style=\"margin-bottom: 15px;\">If you enjoyed this article, you might like to explore some other articles we have written about cloud 
outages:<\/p>\n<ul>\n<li><a class=\"link\" href=\"https:\/\/www.eginnovations.com\/blog\/is-azure-down-proactive-alerting-for-azure-outages\/\">Is Azure Down? &#8211; Proactive Alerting for Azure Outages<\/a> | &#8211; A similar postmortem of an outage experienced by Microsoft\u2019s Azure Cloud<\/li>\n<li><a class=\"link\" href=\"https:\/\/www.eginnovations.com\/blog\/is-m365-down-proactive-alerting-for-microsoft-azure-outages\/\">Is M365 Down? \u2013 Proactive Alerting of a Microsoft Azure Outage<\/a> \u2013 A deep-dive into Azure outage considerations for those using Office 365 \/ Microsoft 365<\/li>\n<li><a class=\"link\" href=\"https:\/\/www.eginnovations.com\/blog\/how-to-protect-your-it-ops-from-cloud-outages\/\">How to Protect your IT Ops from Cloud Outages<\/a> \u2013 A guide to best practices and methodologies for resilient cloud monitoring that protect against failures and provide redundancy during cloud outages and cloud native monitoring failures.<\/li>\n<li>For information on how we certify products in partnership with AWS, see: <a href=\"https:\/\/www.eginnovations.com\/blog\/eg-innovations-achieves-amazon-web-services-aws-digital-workplace-competency-status\/\">eG Innovations achieves Amazon Web Services (AWS) Digital Workplace Competency status<\/a><\/li>\n<\/ul>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A few weeks ago, on 20th October 2025, AWS suffered a widespread outage in its US-EAST-1 region that affected a large number of customers globally. More than 1,000 apps and websites were impacted including major banks and popular games, streaming and social platforms such as WhatsApp, Snapchat, Fortnite and Pok\u00e9mon Go. 
The incident was widely [&hellip;]<\/p>\n","protected":false},"author":56,"featured_media":38580,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"_lmt_disableupdate":"yes","_lmt_disable":"","footnotes":""},"categories":[391],"tags":[634,2036,116,2034],"class_list":["post-38558","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-aws-monitoring","tag-amazon-aws","tag-aws-outage","tag-cloud-outage","tag-cloud-outages"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Detecting an AWS Outage and DR Lessons | eG Innovations<\/title>\n<meta name=\"description\" content=\"Learn how proactive monitoring can detect if AWS is down and how SaaS providers can navigate AWS outages and SLA adherence when CloudWatch fails.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.eginnovations.com\/blog\/aws-outage\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Detecting an AWS Outage and DR Lessons | eG Innovations\" \/>\n<meta property=\"og:description\" content=\"Learn how proactive monitoring can detect if AWS is down. 
Learn how SaaS providers can navigate AWS outages and quantify their SLA adherence when CloudWatch fails.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.eginnovations.com\/blog\/aws-outage\/\" \/>\n<meta property=\"og:site_name\" content=\"eG Innovations\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/eGInnovations\" \/>\n<meta property=\"article:published_time\" content=\"2025-11-10T17:49:15+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-11T12:13:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/outage-Social-banner.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Karthik G\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@eginnovations\" \/>\n<meta name=\"twitter:site\" content=\"@eginnovations\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Karthik G\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Detecting an AWS Outage and DR Lessons | eG Innovations","description":"Learn how proactive monitoring can detect if AWS is down and how SaaS providers can navigate AWS outages and SLA adherence when CloudWatch fails.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/","og_locale":"en_US","og_type":"article","og_title":"Detecting an AWS Outage and DR Lessons | eG Innovations","og_description":"Learn how proactive monitoring can detect if AWS is down. Learn how SaaS providers can navigate AWS outages and quantify their SLA adherence when CloudWatch fails.","og_url":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/","og_site_name":"eG Innovations","article_publisher":"https:\/\/www.facebook.com\/eGInnovations","article_published_time":"2025-11-10T17:49:15+00:00","article_modified_time":"2025-11-11T12:13:42+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/outage-Social-banner.jpg","type":"image\/jpeg"}],"author":"Karthik G","twitter_card":"summary_large_image","twitter_creator":"@eginnovations","twitter_site":"@eginnovations","twitter_misc":{"Written by":"Karthik G","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/#article","isPartOf":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/"},"author":{"name":"Karthik G","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/5a77adbb080dfff7dd1b8cc41beb0da8"},"headline":"Detecting an AWS Outage and DR Lessons","datePublished":"2025-11-10T17:49:15+00:00","dateModified":"2025-11-11T12:13:42+00:00","mainEntityOfPage":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/"},"wordCount":1362,"publisher":{"@id":"https:\/\/www.eginnovations.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/#primaryimage"},"thumbnailUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/outage-Thumbnail-banner.jpg","keywords":["Amazon AWS","AWS outage","Cloud Outage","Cloud outages"],"articleSection":["AWS Monitoring"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/","url":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/","name":"Detecting an AWS Outage and DR Lessons | eG Innovations","isPartOf":{"@id":"https:\/\/www.eginnovations.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/#primaryimage"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/#primaryimage"},"thumbnailUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/outage-Thumbnail-banner.jpg","datePublished":"2025-11-10T17:49:15+00:00","dateModified":"2025-11-11T12:13:42+00:00","description":"Learn how proactive monitoring can detect if AWS is down and how SaaS providers can navigate AWS outages and SLA adherence when CloudWatch 
fails.","breadcrumb":{"@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.eginnovations.com\/blog\/aws-outage\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/#primaryimage","url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/outage-Thumbnail-banner.jpg","contentUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2025\/10\/outage-Thumbnail-banner.jpg","width":362,"height":235},{"@type":"BreadcrumbList","@id":"https:\/\/www.eginnovations.com\/blog\/aws-outage\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.eginnovations.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Detecting an AWS Outage and DR Lessons"}]},{"@type":"WebSite","@id":"https:\/\/www.eginnovations.com\/blog\/#website","url":"https:\/\/www.eginnovations.com\/blog\/","name":"eG Innovations","description":"IT Performance Monitoring Insights","publisher":{"@id":"https:\/\/www.eginnovations.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.eginnovations.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.eginnovations.com\/blog\/#organization","name":"eG Innovations","alternateName":"eg 
innovations","url":"https:\/\/www.eginnovations.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2014\/07\/eg-logo-dark-gray1_new.jpg","contentUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2014\/07\/eg-logo-dark-gray1_new.jpg","width":362,"height":235,"caption":"eG Innovations"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/eGInnovations","https:\/\/x.com\/eginnovations"]},{"@type":"Person","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/5a77adbb080dfff7dd1b8cc41beb0da8","name":"Karthik G","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/f6337ed9440ea6893a5329850942535b?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/f6337ed9440ea6893a5329850942535b?s=96&d=mm&r=g","caption":"Karthik G"},"url":"https:\/\/www.eginnovations.com\/blog\/author\/karthik-g\/"}]}},"modified_by":"eG 
Innovations","_links":{"self":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts\/38558","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/users\/56"}],"replies":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/comments?post=38558"}],"version-history":[{"count":0,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts\/38558\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/media\/38580"}],"wp:attachment":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/media?parent=38558"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/categories?post=38558"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/tags?post=38558"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}