{"id":18902,"date":"2021-12-09T07:49:23","date_gmt":"2021-12-09T12:49:23","guid":{"rendered":"https:\/\/www.eginnovations.com\/blog\/?p=18902"},"modified":"2025-07-04T05:02:09","modified_gmt":"2025-07-04T09:02:09","slug":"cloud-performance-issues","status":"publish","type":"post","link":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/","title":{"rendered":"Case Study: AWS Cloud  Application Performance Troubleshooting"},"content":{"rendered":"<div class=\"inner_content\">\n<h2><span class=\"ez-toc-section\" id=\"How_a_full_stack_monitoring_solution_helped_our_customer_with_Application_Performance_Troubleshooting_on_AWS_Cloud\"><\/span>How a full stack monitoring solution helped our customer with Application Performance Troubleshooting on AWS Cloud<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<div class=\"link_list_style\" style=\"margin: 25px auto 20px; padding: 20px 20px 10px;\">\n<h4 style=\"margin-bottom: -10px;\"><span class=\"ez-toc-section\" id=\"Summary\"><\/span><strong>Summary<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-19508 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/summary-icon1.png\" alt=\"AWS myth\" width=\"85\" height=\"85\" border=\"0\" \/><br \/>\nHere&#8217;s a myth that needs to be debunked &#8211; the cloud (e.g., AWS or Azure) will take care of my performance problems!<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-19509 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/summary-icon2.png\" alt=\"AI powered AWS monitoring\" width=\"85\" height=\"85\" border=\"0\" \/><\/p>\n<p>Our experience shows that cloud architecture usually introduces new layers of complexities that did not exist in the on-premises world. 
You need a modern AI-powered full stack monitoring solution to find the needle in the multi-layered haystack that is the cloud.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19510 alignright\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/summary-icon3.png\" alt=\"\" width=\"85\" height=\"85\" border=\"0\" \/><\/p>\n<p>Sometimes, it&#8217;s the cloud vendor who has to fix the issue. An example could be a noisy or defective physical host OS that you have no access to or visibility into. You need the right information, in the form of logs, metrics, traces and events, to back up your case with evidence in conversations with the cloud provider&#8217;s support team.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19511 alignright\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/summary-icon4.png\" alt=\"\" width=\"85\" height=\"85\" border=\"0\" \/><\/p>\n<p>In this blog post, we describe the problem analysis and anomaly detection process for a cloud performance problem (high CPU in the JVM and SSL issues) that we encountered recently when working with a large customer that had a significant footprint on AWS cloud.<\/p>\n<\/div>\n<h2><span class=\"ez-toc-section\" id=\"The_Cloud_Does_Not_Make_Application_Performance_Monitoring_and_Troubleshooting_Simpler\"><\/span>The Cloud Does Not Make Application Performance Monitoring and Troubleshooting Simpler<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-18908 alignright\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/noteasy-cloud.jpg\" alt=\"Cloud is Not Easy\" width=\"300\" height=\"200\" border=\"0\" \/>Cloud migration and digital transformation of business-critical applications on the cloud are on the rise. 
At the same time, some IT executives have the misconception that when applications are migrated to the cloud, monitoring their performance becomes easier. The cloud service provider (e.g., Amazon) takes care of most of the key elements that support the application, and hence, they believe that there are fewer challenges when applications are on the cloud vs. being deployed on-premises.<\/p>\n<p>This is often far from true.<\/p>\n<p>This case study exposes the challenges that organizations deploying applications on the cloud face when performance issues happen and how difficult it can be to diagnose such problems. While migration to the cloud does simplify and automate several routine administration activities, performance monitoring is one area that is not necessarily simplified!<\/p>\n<p>The scenario covered may be useful for those evaluating the level of insight and troubleshooting capabilities various cloud tools can offer.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Target_Application_and_Infrastructure_on_AWS_Cloud\"><\/span>Target Application and Infrastructure on AWS Cloud<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-18909 alignright\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/aws-logos.jpg\" alt=\"AWS. Java and SQL Server logos\" width=\"300\" height=\"100\" border=\"0\" \/><\/p>\n<p>Our customer is a mid-sized business that offers SaaS services to clients. The SaaS application was Java-based and was hosted on AWS cloud, and relied on AWS RDS for the backend database. 
eG Enterprise was used by the customer to monitor the SaaS application and the AWS services in use.<\/p>\n<p>The application had been operational for several months and there had been no complaints about performance.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"The_Initial_Alert\"><\/span>The Initial Alert<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>The CPU usage of the <a href=\"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/\">AWS EC2 VM<\/a> hosting the key application had spiked all of a sudden around 5 am and remained high for several hours:<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/cpu-usage-view.jpg\" data-rel=\"lightbox-image-0\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18911 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/cpu-usage.jpg?noresize\" alt=\"CPU Usage of AWS EC2 instance hosting the key application\" width=\"750\" height=\"324\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 1: CPU usage of the VM had spiked suddenly<\/div>\n<p>At this time, application performance also suffered. This was evident from <a href=\"https:\/\/www.eginnovations.com\/synthetic-monitoring\">eG Enterprise\u2019s synthetic monitoring capability<\/a> that the customer had configured to proactively detect issues before real users experience issues. Initially when the problem started, application response time was poor. 
Over time, you see gaps in the response time in the graph in Figure 2 because the application became unavailable.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/application-response-time-view.jpg\" data-rel=\"lightbox-image-1\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18912 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/application-response-time.jpg?noresize\" alt=\"Application response vs. time graph \" width=\"750\" height=\"324\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 2: Response time of the application, measured using synthetic monitoring<\/div>\n<p>Below is the graph of application availability during the same time scale. You can clearly see TCP connection availability dropping to zero several times indicating that users were not able to connect to the application. This information was also highlighted in critical alerts sent to administrators and displayed on the overview dashboards in eG Enterprise to enable actionable notification.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/application-availability-view.jpg\" data-rel=\"lightbox-image-2\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18913 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/application-availability.jpg?noresize\" alt=\"AWS Application availability graph\" width=\"750\" height=\"324\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 3: Application availability measured using synthetic monitoring<\/div>\n<h3><span class=\"ez-toc-section\" id=\"Analyzing_Application_Performance_in_Detail\"><\/span>Analyzing Application Performance in Detail<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>To further analyze the problem, we analyzed the 
application performance in detail. Figure 4 below shows the <a href=\"https:\/\/www.eginnovations.com\/supported-technologies\/jvm-monitoring\">CPU usage of the JVM (Java Virtual Machine)<\/a> used by the application on the problematic AWS VM. The JVM CPU usage follows the same pattern as the VM\u2019s CPU usage, indicating that the application\u2019s JVM had been affected.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/cpu-usage-jvm-view.jpg\" data-rel=\"lightbox-image-3\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18916 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/cpu-usage-jvm.jpg?noresize\" alt=\"CPU usage of the application JVM on the AWS EC2 instance\" width=\"750\" height=\"325\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 4: CPU usage of the application&#8217;s JVM<\/div>\n<p>High CPU usage within the JVM can have numerous causes, and it is often a symptom of bugs within the application\u2019s Java code. High CPU is a symptom &#8211; not a root cause. Reasons for high CPU can range from host OS issues, poorly sized JVMs, and memory leaks to code-level deadlocks. 
My colleague has written a blog post covering some common problems, see: <a href=\"https:\/\/www.eginnovations.com\/blog\/troubleshoot-java-cpu-issues\/\" rel=\"noopener noreferrer\">How to Troubleshoot Java CPU Usage Issues | JVM High CPU Threads (eginnovations.com).<\/a><\/p>\n<div class=\"link_list_style\" style=\"margin: 0px auto 20px; padding: 20px 20px 10px;\">\n<h4><span class=\"ez-toc-section\" id=\"The_impact_of_Java_Garbage_Collection_on_application_performance\"><\/span><strong>The impact of Java Garbage Collection on application performance<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>Java Garbage Collection (GC) is intrinsically a CPU-intensive operation.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-6606\" style=\"margin-top: -10px;\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/garbage-collection.png\" alt=\"Excessive Garbage Collection\" width=\"85\" height=\"85\" border=\"0\" \/>When a stop-the-world Java GC phase runs, all application threads are paused. Most Java GC algorithms (including G1GC) must halt all application threads for at least part of a collection, a process referred to as stopping-the-world (STW) or pausing (a GC pause).<\/p>\n<p>During such a pause, the JVM (Java virtual machine) can use many of the computer&#8217;s cores to perform GC and reclaim memory before restarting the application threads.<\/p>\n<p><strong>Best Practice:<\/strong> Look for full stack monitoring tools that can help you diagnose the root cause of high CPU across various layers and tiers \u2013 Operating System, JVM (threads\/heap), application code \u2013 and connect the dots to the affected business transactions so you can quantify business impact. Full stack tools can correlate GC activity at the JVM level, thread activity at the code level and CPU utilization %, even at a request-by-request level. 
This will help you triage faster and engage the right team to fix the issue.<\/p>\n<\/div>\n<p>The <a href=\"https:\/\/www.eginnovations.com\/blog\/what-is-garbage-collection-java\/\/\">JVM Garbage Collector (GC)<\/a> can also be a source of high CPU, especially if application memory leaks are at play. When the JVM detects that it is running low on key resources, such as heap memory, it can get itself into a state where it keeps trying desperately to reclaim memory. A quick check eliminated the possibility of a GC issue in this case. See Figure 5, which shows the historical GC activity in the JVM. The percentage of time spent by the JVM on GC activities had not changed significantly during the problematic period. Hence, GC activities were not the reason why the application\u2019s CPU usage had spiked.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/java-application-activity-view.jpg\" data-rel=\"lightbox-image-4\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18918 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/java-application-activity.jpg?noresize\" alt=\"GC activity of the Java application\" width=\"750\" height=\"309\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 5: GC activity of the Java application indicated no anomalies<\/div>\n<p>Java threading issues are a common cause of application problems. So, the next step was to check whether any common threading issues were present and, if so, whether they were root causes or simply symptoms of issues elsewhere. Unfortunately, both are common possibilities!<\/p>\n<p>Figure 6 shows the total number of threads in the JVM over time. While there had been an increase, the increase was not too significant. 
This indicated that application processing in the JVM was not the issue.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/application-jvm-threads-view.jpg\" data-rel=\"lightbox-image-5\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18919 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/application-jvm-threads.jpg?noresize\" alt=\"Application JVM threads\" width=\"750\" height=\"325\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 6: Tracking the number of threads in the application&#8217;s JVM<\/div>\n<div class=\"link_list_style\" style=\"margin: 0px auto 20px; padding: 1px 20px 15px;\">\n<h3><span class=\"ez-toc-section\" id=\"3_Use_Cases_for_which_continuous_thread_analysis_is_key\"><\/span>3 Use Cases for which continuous thread analysis is key<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-19148\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/lock.png\" alt=\"\" width=\"85\" height=\"85\" border=\"0\" \/><strong>Use case #1:<\/strong> Identify the dreaded Achilles heel of threaded applications \u2013 locks and deadlocks that impact your scalability.<\/p>\n<p><strong>Use case #2:<\/strong> Automatically identify CPU-hungry threads and thread groups. On the cloud, CPU is money.<\/p>\n<p><strong>Use case #3:<\/strong> Pinpoint the root-cause of the thread anomalies to the specific processes and microservices so the right team can be alerted to fix the issue.<\/p>\n<\/div>\n<p>The Java Thread Analysis modules within eG Enterprise further indicated that no specific thread in the JVM was taking a lot of CPU. 
There were many threads, each taking a small amount of CPU \u2013 1-2% \u2013 but together these threads drove the overall CPU usage high.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/thread-diagnosis-zoom.jpg\" data-rel=\"lightbox-image-6\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18920 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/thread-diagnosis.jpg?nor\" alt=\"Thread diagnosis in the JVM\" width=\"800\" height=\"600\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 7: Details of threads in the JVM<\/div>\n<p>The auto-baselined thresholds for blocked threads had, however, triggered critical alerts when the problem happened. Reviewing the historical data, we could see that there was significant thread blocking in the JVM, which was abnormal. A quick check in the detailed diagnosis tool in eG Enterprise (see Figure 8) revealed there were issues in SSL processing at the JVM level \u2013 not in the application code.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/thread-blocking-zoom.jpg\" data-rel=\"lightbox-image-7\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18922 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/thread-blocking.jpg?nor\" alt=\"Thread blocking in JVM\" width=\"750\" height=\"376\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 8: Detailed diagnosis shows that thread blocking happened in the JVM<\/div>\n<p>Clicking through to identify the blocking thread showed that SSL Memory Caching in the JVM seemed to have triggered the issue (see Figure 9). 
Many Java threads were stuck in SSL processing \u2013 and it was this that was consuming excessive CPU.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/thread-blocking-synchronized-access-ssl-view.jpg\" data-rel=\"lightbox-image-8\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18923 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/thread-blocking-synchronized-access-ssl.jpg?noresize\" alt=\"Thread blocking caused by synchronized access to SSL memory cache in the JVM\" width=\"750\" height=\"440\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 9: Thread blocking caused by synchronized access to the SSL Memory Cache in the JVM<\/div>\n<p style=\"margin-bottom: 15px;\">By simply googling the blocked class\/method, i.e., \u201csun.security.util.MemoryCache.put\u201d, one can find a number of links that point to known Java issues:<\/p>\n<ul>\n<li><a href=\"https:\/\/bugs.java.com\/bugdatabase\/view_bug?bug_id=8259886\" target=\"blank\" rel=\"noopener noreferrer\">https:\/\/bugs.java.com\/bugdatabase\/view_bug?bug_id=8259886<\/a><\/li>\n<li><a href=\"https:\/\/access.redhat.com\/solutions\/4056181\" target=\"blank\" rel=\"noopener noreferrer\">https:\/\/access.redhat.com\/solutions\/4056181<\/a><\/li>\n<li><a href=\"https:\/\/bugs.java.com\/bugdatabase\/view_bug?bug_id=8218415\" target=\"blank\" rel=\"noopener noreferrer\">https:\/\/bugs.java.com\/bugdatabase\/view_bug?bug_id=8218415<\/a><\/li>\n<\/ul>\n<p>Vendor-recommended changes, such as setting different cache sizes and timeouts were attempted but the problem was not getting resolved.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Perplexed_Where_is_the_Root-Cause\"><\/span>Perplexed! Where is the Root-Cause?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>At this point, the customer asked for our advice as they were stumped. 
We started with some obvious sanity checks. Often, a customer says they \u201chaven\u2019t changed anything\u201d but when you check the <a href=\"https:\/\/www.eginnovations.com\/product\/capabilities\/change-configuration-tracking\" rel=\"noopener noreferrer\">eG Enterprise configuration database<\/a> and historical data, it tells a very different story! In this case, though, absolutely nothing had changed in the application for several weeks. The application code had been working fine for months. No patches had been deployed at the application level or the OS level. It was all rather mysterious!<\/p>\n<p>Could there have been an SSL attack? No \u2013 TCP connection activity had not changed much (see Figure 10):<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/connection-activity-view.jpg\" data-rel=\"lightbox-image-9\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18924 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/connection-activity.jpg?noresize\" alt=\"TCP Connection activity to an application\" width=\"750\" height=\"325\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 10: TCP connection activity to the application<\/div>\n<p>We checked what other alerts had been triggered around the time of the incident. BINGO! 
We got our clue &#8211; TCP retransmissions had increased significantly around the same time (see Figure 11).<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/tcp-retransmissions-view.jpg\" data-rel=\"lightbox-image-10\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18925 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/tcp-retransmissions.jpg?noresize\" alt=\"TCP retransmissions in an AWS EC2 instance\" width=\"750\" height=\"325\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 11: TCP retransmissions from the application server &#8211; culprit found!<\/div>\n<p>1 to 5 segments retransmitted per second used to be normal for this application. Observe that the value had risen to over 200 segments\/sec when the problem happened (see Figure 11). That\u2019s at least a 40-fold increase, even measured against the top of the normal range! There was also a clear correlation: TCP retransmissions increased right around the time system CPU shot up. eG Enterprise uses historical data within an AIOps engine to set meaningful thresholds to enable anomaly detection. 
By learning what is normal for an application or infrastructure, the engine can often detect anomalies within dynamic environments early <a href=\"https:\/\/www.eginnovations.com\/blog\/aiops-tools-capabilities\/\" rel=\"noopener noreferrer\">(read more).<\/a><\/p>\n<div class=\"link_list_style\" style=\"margin: 0px auto 20px; padding: 1px 20px 10px;\">\n<h3><span class=\"ez-toc-section\" id=\"A_primer_on_TCP_packet_retransmissions_and_how_they_impact_system_performance%E2%80%8B\"><\/span>A primer on TCP packet retransmissions and how they impact system performance<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p style=\"margin-bottom: 15px; font-size: 20px;\"><strong>What are TCP Retransmissions?<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-19065 size-full\" style=\"margin-top: 0px;\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/retransmission.png?nores\" alt=\"TCP communication\" width=\"85\" height=\"85\" border=\"0\" \/><\/p>\n<p>TCP enables two hosts to establish a connection and exchange streams of data. TCP provides reliable delivery of data and ensures that the data is delivered to the receiving application in the same order in which it was sent.<\/p>\n<p>TCP retransmission refers to the process of resending packets over the network that have been either lost or damaged. 
Retransmission is a mechanism used by TCP to provide reliable communication.<\/p>\n<p style=\"margin-bottom: 15px; font-size: 20px;\"><strong>When does retransmission happen?<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-19067 size-full\" style=\"margin-top: -5px;\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/when-retransmission.png?nores\" alt=\"TCP retransmission\" width=\"85\" height=\"85\" border=\"0\" \/>Retransmission happens when the sender does not receive an \u201cACK\u201d (acknowledgment) for a packet, typically because the packet was lost on the way or arrived damaged and was discarded by the receiver.<\/p>\n<p>When the sender\u2019s retransmission timer expires, or when it receives repeated duplicate ACKs for earlier data, it retransmits the lost or damaged packet. Once the receiver determines that it has received a packet successfully, an \u201cACK\u201d will be sent to the sender.<\/p>\n<p style=\"margin-bottom: 15px; font-size: 20px;\"><strong>Why do TCP retransmissions happen?<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-19068 size-full\" style=\"margin-top: 0px;\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/why-retransmission.png?nores\" alt=\"Why TCP retransmissions happen\" width=\"130\" height=\"85\" border=\"0\" \/><\/p>\n<p>TCP retransmissions can be caused by a number of networking issues:<\/p>\n<ul>\n<li>A poor or lossy network connection is a common cause.<\/li>\n<li>A faulty NIC or driver on the sender or recipient OS can result in packet losses.<\/li>\n<li>Issues with firewalls and proxies that lie on the path between the sender and the receiver can cause retransmissions.<\/li>\n<li>When a router on the intervening network path is heavily loaded, it might have buffer overruns leading to lost packets.<\/li>\n<li>Network congestion in a LAN can also cause network packet loss.<\/li>\n<li>Different TCP segments from the sender can take different routes to 
reach the receiver, and the delays between the routes could differ so much that segments arrive badly out of order; the resulting duplicate ACKs can lead the sender to retransmit segments that were not actually lost.<\/li>\n<li>In a virtual environment, hypervisor issues can also lead to packet loss during VM-to-VM communication.<\/li>\n<\/ul>\n<p style=\"margin-bottom: 15px; font-size: 20px;\"><strong>How do they impact system performance?<\/strong><\/p>\n<p>When packet loss over the network is significant, it can result in several abnormalities:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19066 alignright\" style=\"margin-top: 15px;\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/system-performance.png?nores\" alt=\"\" width=\"85\" height=\"85\" border=\"0\" \/>Existing SSL connections could be dropped, and new connections must be established. This adds to latency and also to processing overhead on the endpoints. SSL\/TLS handshakes, which have to happen for each new connection, involve certificate exchanges between the endpoints and take up CPU resources.<\/p>\n<p>Applications on the endpoints may have a cache of SSL connections currently established. When network issues happen, they cause connections to linger for longer. Connections could also stay in the negotiation phase for longer. This could result in the memory cache of SSL connections on the endpoints being larger than usual, which in turn increases memory requirements on the endpoints. As the memory requirement increases, it could trigger garbage collection, which could increase CPU usage. Also, the larger the memory cache, the more time it takes to serve concurrent accesses to the cache and to reorganize the cache when a connection starts or ends. 
This also adds to the CPU usage on the endpoints.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-19207\" style=\"margin-top: -14px;\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/best-practies.png\" alt=\"\" width=\"85\" height=\"85\" border=\"0\" \/><strong>Best practice: Track logs, metrics, traces and events in a unified manner, so you get the complete picture of performance anomalies in a system.<\/strong><\/p>\n<\/div>\n<p>The customer had a premier support contract with the cloud service provider. When they contacted the service provider\u2019s support desk, they were told that there were no issues at the provider\u2019s end. When the customer provided the data collected to show the excessive TCP retransmissions, the support desk suggested a system reboot to force the VM to move from one physical host to another.<\/p>\n<p>And just like that, the problem went away! Immediately, TCP retransmissions dropped and CPU usage went down (see Figure 12).<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/cpu-usage1-view.jpg\" data-rel=\"lightbox-image-11\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18927 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/cpu-usage1.jpg?noresize\" alt=\"Java CPU usage over time\" width=\"750\" height=\"325\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 12: CPU usage of the application dropped immediately after the VM was moved to a new host<\/div>\n<p>Just to check, the VM was moved back to the original host, and the problem returned (see Figure 13). 
Based on this behavior, we suspect that the cause was a malfunctioning NIC on the physical server, or a driver issue on that server, rather than a wider networking issue in the cloud provider\u2019s data center.<\/p>\n<p><a href=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/tcp-retransmission1-view.jpg\" data-rel=\"lightbox-image-12\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-18928 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/tcp-retransmission1.jpg?noresize\" alt=\"TCP retransmission diagram\" width=\"750\" height=\"325\" border=\"0\" \/><\/a><\/p>\n<div class=\"img_caption\">Figure 13: TCP retransmissions dropped after the VM was moved from one host to another<\/div>\n<div class=\"link_list_style\" style=\"margin: 0px auto 20px; padding: 20px 20px 13px;\">\n<ul class=\"hand-color\" style=\"margin-bottom: 0px;\">\n<li>Most on-prem admins will have experience with weird\/challenging VMware ESXi, Citrix Hypervisor, or Microsoft Hyper-V bugs with intermittent or perplexing symptoms that impact application performance. 
<strong>Imagine trying to debug such an issue with no access to the hypervisor \u2013 that\u2019s what you may have to do on public cloud!<\/strong><\/li>\n<\/ul>\n<\/div>\n<p>Based on the analysis of this problem, the customer submitted a helpdesk report to the cloud provider with evidence from eG Enterprise to get a credit for the hours when the application performance issue had occurred.<\/p>\n<div class=\"link_list_style\" style=\"margin: 0px auto 20px; padding: 1px 20px 10px;\">\n<h3><span class=\"ez-toc-section\" id=\"Incident_postmortem_of_the_application_performance_anomaly\"><\/span>Incident postmortem of the application performance anomaly<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ol>\n<li><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-19147\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/11\/finger.png\" alt=\"\" width=\"85\" height=\"85\" border=\"0\" \/>There was a network issue on the physical host used by the cloud service provider. This possibly affected TCP connection handling on the VMs hosted on the host.<\/li>\n<li>The network issue caused retransmissions at the TCP level.<\/li>\n<li>Retransmission during SSL handshakes meant there were more SSL connections waiting to be processed in the application stack.<\/li>\n<li>When the SSL connection cache in the JVM became too high, it caused synchronization issues in the JVM. 
This caused CPU usage to spike, and the application became slow and, at times, unavailable.<\/li>\n<li>When the network issue was circumvented by moving to another host, application performance returned to normal.<\/li>\n<\/ol>\n<\/div>\n<h2><span class=\"ez-toc-section\" id=\"Application_Performance_Troubleshooting_Key_Takeaways\"><\/span>Application Performance Troubleshooting: Key Takeaways<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"margin-bottom: 15px;\">The real-life story we have described here highlights the challenges faced by organizations that are migrating applications to the cloud. Here are my five key takeaways regarding application performance on the cloud:<\/p>\n<ol>\n<li><strong>Operating applications in a cloud environment is challenging \u2013 you no longer have complete visibility.<\/strong> When there is an issue, you often hear \u201cit\u2019s not us\u201d from your cloud service provider. If you are thinking \u201cI will go to the cloud and won\u2019t have any performance issues anymore&#8221;, think again.<\/li>\n<li><strong>You must have full stack visibility.<\/strong> You can\u2019t just monitor the application alone. In a cloud environment, you need as much proof as possible when you speak to your cloud service provider.<\/li>\n<li><strong>Monitor as many parameters as possible.<\/strong> You never know where you will get a clue to help diagnose a problem. We knew TCP retransmissions tend to affect application performance; however, we didn\u2019t anticipate such a huge CPU impact because of them. Tools that selectively pick a handful of KPIs and report on them will catch only the obvious issues or more common root causes, which you might find yourself in a few minutes. 
You need as much visibility as possible, so that you can provide proof when you contact your cloud service provider.<\/li>\n<li><strong>Historical insights are extremely important.<\/strong> You are often asked \u201cwhat changed\u201d \u2013 it is important to track config changes and to know what was updated. Your application needs to have audit logging so that you know what changed within the application. You need to <a href=\"https:\/\/www.eginnovations.com\/product\/capabilities\/change-configuration-tracking\">monitor the application config and OS config<\/a> so that you know what patches, hot fixes, or config changes were made, so you can correlate any performance issues with config changes. At the same time, you also need usage and performance baselines for your infrastructure to know what normal usage and performance look like. Several times during our analysis, we checked these statistics and used them for anomaly detection.<\/li>\n<li><strong>And YES sometimes \u2013 It&#8217;s not you, it\u2019s the cloud!<\/strong><\/li>\n<\/ol>\n<p><a href=\"https:\/\/www.eginnovations.com\/product\/application-performance-monitoring\/free-trial\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-19602 size-full\" src=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWS-cloud-banner.png\" alt=\"Full stack monitoring of AWS infra and applications\" width=\"850\" height=\"150\" border=\"0\" srcset=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWS-cloud-banner.png 850w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWS-cloud-banner-300x53.png 300w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWS-cloud-banner-768x136.png 768w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWS-cloud-banner-800x141.png 800w, https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWS-cloud-banner-310x55.png 310w, 
https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWS-cloud-banner-140x25.png 140w\" sizes=\"auto, (max-width: 850px) 100vw, 850px\" \/><\/a><\/p>\n<h3><span class=\"ez-toc-section\" id=\"Further_reading\"><\/span>Further reading<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li>If you enjoyed this Postmortem blog post \u2013 you may enjoy this similar one, <a href=\"https:\/\/www.eginnovations.com\/blog\/troubleshooting-web-application-performance\/\" rel=\"noopener noreferrer\">Troubleshooting Web Application Performance &amp; SSL Issues<\/a><\/li>\n<li>An overview of <a href=\"https:\/\/www.eginnovations.com\/supported-technologies\/java-application-monitoring\" rel=\"noopener noreferrer\">Java Performance Monitoring Tools,<\/a> which enable you to prioritize problems automatically and provide actionable notifications<\/li>\n<li><a href=\"https:\/\/www.eginnovations.com\/blog\/top-10-java-performance-problems\/\" rel=\"noopener noreferrer\">Top 10 Java Performance Problems<\/a> &#8211; An in-depth guide to the most common Java issues and identifying them<\/li>\n<li><a href=\"https:\/\/www.eginnovations.com\/white-paper\/ssl-monitoring\">Monitoring SSL Certificates in Business-Critical Applications<\/a> (eginnovations.com)<\/li>\n<li>Section 9 &#8220;Monitoring TCP Activity&#8221; in our troubleshooting guide details debugging and understanding TCP retransmission issues and their causes, see: <a href=\"https:\/\/www.eginnovations.com\/blog\/server-performance-monitoring\/\" rel=\"noopener noreferrer\">Server Performance Monitoring \u2013 KPIs &amp; Metrics<\/a><\/li>\n<li>My previous deep-dive post-mortem blog post \u2013 debugging slow performance on AWS public cloud burstable instances on EC2, see: <a href=\"https:\/\/www.eginnovations.com\/blog\/aws-ec2-monitoring-tools\/\" rel=\"noopener noreferrer\">AWS EC2 Monitoring Tools | eG Innovations<\/a><\/li>\n<li>More on how eG Enterprise leverages AIOps technologies for anomaly 
detection: <a href=\"https:\/\/www.eginnovations.com\/blog\/aiops-tools-capabilities\/\" rel=\"noopener noreferrer\">AIOps Tools \u2013 8 Proactive Monitoring Tips | eG Innovations<\/a><\/li>\n<\/ul>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>How a full stack monitoring solution helped our customer with Application Performance Troubleshooting on AWS Cloud Summary Here&#8217;s a myth that needs to be debunked &#8211; the cloud (e.g., AWS or Azure) will take care of my performance problems! Our experience shows that cloud architecture usually introduces new layers of complexities that did not exist [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":20688,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"_lmt_disableupdate":"no","_lmt_disable":"","footnotes":""},"categories":[371,391,369,27],"tags":[546,587,759,395,626,1434,545,110,1440,641,642,636,1441],"class_list":["post-18902","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-application-performance-monitoring-apm","category-aws-monitoring","category-cloud-monitoring","category-java-monitoring","tag-anomaly-detection","tag-application-performance","tag-application-performance-troubleshooting","tag-aws","tag-aws-cloud","tag-aws-monitoring-tools","tag-aws-performance","tag-cloud","tag-cloud-monitoring-tools","tag-cloud-performance-issue","tag-cloud-performance-troubleshooting","tag-observability","tag-troubleshooting-aws"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>AWS Cloud Case Study: Troubleshooting Application Performance<\/title>\n<meta name=\"description\" content=\"Experiencing application issues on AWS? 
A real-world example of AWS Cloud application troubleshooting using a full-stack monitoring tool.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"AWS Cloud Performance Anomaly Detection - A Real-life Case Study\" \/>\n<meta property=\"og:description\" content=\"How a full stack monitoring helped our customer pinpoint root-cause of slowness in AWS Cloud with observability: metrics, events, logs and traces.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/\" \/>\n<meta property=\"og:site_name\" content=\"eG Innovations\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/eGInnovations\" \/>\n<meta property=\"article:published_time\" content=\"2021-12-09T12:49:23+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-07-04T09:02:09+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Social-Banner.jpg\" \/>\n<meta name=\"author\" content=\"Arun Aravamudhan\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"AWS Cloud Performance Anomaly Detection - A Real-life Case Study\" \/>\n<meta name=\"twitter:description\" content=\"How a full stack monitoring helped our customer pinpoint root-cause of slowness in AWS Cloud with observability: metrics, events, logs and traces.\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Social-Banner.jpg\" \/>\n<meta name=\"twitter:creator\" content=\"@https:\/\/x.com\/perfclarity\" \/>\n<meta 
name=\"twitter:site\" content=\"@eginnovations\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Arun Aravamudhan\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"19 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"AWS Cloud Case Study: Troubleshooting Application Performance","description":"Experiencing application issues on AWS? A real-world example of AWS Cloud application troubleshooting using a full-stack monitoring tool.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/","og_locale":"en_US","og_type":"article","og_title":"AWS Cloud Performance Anomaly Detection - A Real-life Case Study","og_description":"How a full stack monitoring helped our customer pinpoint root-cause of slowness in AWS Cloud with observability: metrics, events, logs and traces.","og_url":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/","og_site_name":"eG Innovations","article_publisher":"https:\/\/www.facebook.com\/eGInnovations","article_published_time":"2021-12-09T12:49:23+00:00","article_modified_time":"2025-07-04T09:02:09+00:00","og_image":[{"url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Social-Banner.jpg","type":"","width":"","height":""}],"author":"Arun Aravamudhan","twitter_card":"summary_large_image","twitter_title":"AWS Cloud Performance Anomaly Detection - A Real-life Case Study","twitter_description":"How a full stack monitoring helped our customer pinpoint root-cause of slowness in AWS Cloud with observability: metrics, events, logs and 
traces.","twitter_image":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Social-Banner.jpg","twitter_creator":"@https:\/\/x.com\/perfclarity","twitter_site":"@eginnovations","twitter_misc":{"Written by":"Arun Aravamudhan","Est. reading time":"19 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#article","isPartOf":{"@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/"},"author":{"name":"Arun Aravamudhan","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/d788cb81df96a940429c3f5a3b294a6a"},"headline":"Case Study: AWS Cloud Application Performance Troubleshooting","datePublished":"2021-12-09T12:49:23+00:00","dateModified":"2025-07-04T09:02:09+00:00","mainEntityOfPage":{"@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/"},"wordCount":3019,"commentCount":0,"publisher":{"@id":"https:\/\/www.eginnovations.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#primaryimage"},"thumbnailUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Thumbnail.jpg","keywords":["Anomaly Detection","Application Performance","application performance troubleshooting","AWS","AWS Cloud","AWS monitoring tools","AWS Performance","cloud","cloud monitoring tools","Cloud performance issue","cloud performance troubleshooting","observability","troubleshooting aws"],"articleSection":["Application Performance Monitoring (APM)","AWS Monitoring","Cloud Monitoring","Java 
Monitoring"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/","url":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/","name":"AWS Cloud Case Study: Troubleshooting Application Performance","isPartOf":{"@id":"https:\/\/www.eginnovations.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#primaryimage"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#primaryimage"},"thumbnailUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Thumbnail.jpg","datePublished":"2021-12-09T12:49:23+00:00","dateModified":"2025-07-04T09:02:09+00:00","description":"Experiencing application issues on AWS? A real-world example of AWS Cloud application troubleshooting using a full-stack monitoring tool.","breadcrumb":{"@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#primaryimage","url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Thumbnail.jpg","contentUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2021\/12\/AWSAnomaly-Thumbnail.jpg","width":362,"height":235},{"@type":"BreadcrumbList","@id":"https:\/\/www.eginnovations.com\/blog\/cloud-performance-issues\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.eginnovations.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Case Study: AWS Cloud Application Performance 
Troubleshooting"}]},{"@type":"WebSite","@id":"https:\/\/www.eginnovations.com\/blog\/#website","url":"https:\/\/www.eginnovations.com\/blog\/","name":"eG Innovations","description":"IT Performance Monitoring Insights","publisher":{"@id":"https:\/\/www.eginnovations.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.eginnovations.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.eginnovations.com\/blog\/#organization","name":"eG Innovations","alternateName":"eg innovations","url":"https:\/\/www.eginnovations.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2014\/07\/eg-logo-dark-gray1_new.jpg","contentUrl":"https:\/\/www.eginnovations.com\/blog\/wp-content\/uploads\/2014\/07\/eg-logo-dark-gray1_new.jpg","width":362,"height":235,"caption":"eG Innovations"},"image":{"@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/eGInnovations","https:\/\/x.com\/eginnovations"]},{"@type":"Person","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/d788cb81df96a940429c3f5a3b294a6a","name":"Arun Aravamudhan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.eginnovations.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/7ff42334d908fb4060880a4487331e4a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/7ff42334d908fb4060880a4487331e4a?s=96&d=mm&r=g","caption":"Arun 
Aravamudhan"},"sameAs":["https:\/\/www.linkedin.com\/in\/arun-aravamudhan\/","https:\/\/x.com\/https:\/\/x.com\/perfclarity"],"url":"https:\/\/www.eginnovations.com\/blog\/author\/arun-aravamudhan\/"}]}},"modified_by":"eG Innovations","_links":{"self":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts\/18902","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/comments?post=18902"}],"version-history":[{"count":0,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/posts\/18902\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/media\/20688"}],"wp:attachment":[{"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/media?parent=18902"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/categories?post=18902"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.eginnovations.com\/blog\/wp-json\/wp\/v2\/tags?post=18902"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}