How to Troubleshoot High Java CPU Usage Issues
If you have deployed a Java application in production, say on an Apache Tomcat container or as a microservice built with Spring Boot, you’ve probably encountered a situation where the application suddenly starts to take up a large amount of CPU. Java CPU usage may even reach 100%, and you’d hear the system admin saying “Tomcat is taking all the CPU of the system”. When this happens, application response becomes sluggish and users complain about slowness when they access the Java application.
Often the solution to this problem is to restart the application and, lo and behold, the problem goes away – only to reappear a few days later. A key question then is: how to troubleshoot high CPU usage of a Java application and fix the problem once and for all?
What Causes High Java CPU Usage?
Java applications may take high CPU resources for many reasons:
- Poorly designed application code with inefficient or infinite loops
- Inefficient algorithms (poor application logic)
- Deeply nested or runaway recursive method calls (hundreds of levels of recursion)
- Inefficient usage of collections (e.g., searching a large Vector with tens of thousands of elements instead of using a HashMap for faster lookups)
- Interestingly (and very non-intuitively), a shortage of memory in the Java Virtual Machine (JVM) can also show up as high CPU usage. Instead of spending its time processing requests, the JVM spends more time in garbage collection, which in turn takes up CPU cycles.
- A JVM may max out on CPU usage because of the incoming workload. If the server is not sized to handle the rate of incoming requests, the Java application may be doing legitimate work while still struggling to keep up.
- Repeated recalculation of values that have already been calculated
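To make the collections point above concrete, here is a minimal sketch that contrasts a linear scan with a one-time HashMap index. The `Customer` type and the `findByScan` and `indexById` helpers are hypothetical names invented for this illustration, not part of any real application.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LookupDemo {
    // Hypothetical value type, used only for this illustration.
    static final class Customer {
        final String id;
        final String name;

        Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Linear scan: every lookup walks the list until it finds a match, O(n) per call.
    static Customer findByScan(List<Customer> customers, String id) {
        for (Customer c : customers) {
            if (c.id.equals(id)) {
                return c;
            }
        }
        return null;
    }

    // Build the index once in O(n); later Map.get lookups are O(1) on average.
    static Map<String, Customer> indexById(List<Customer> customers) {
        Map<String, Customer> index = new HashMap<>();
        for (Customer c : customers) {
            index.put(c.id, c);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Customer> customers = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) {
            customers.add(new Customer("C" + i, "Customer " + i));
        }
        Map<String, Customer> index = indexById(customers);
        // Both lookups return the same object; only the cost per lookup differs.
        System.out.println(findByScan(customers, "C49999").name);
        System.out.println(index.get("C49999").name);
    }
}
```

If the scan sits on a hot path that runs for every request, the difference in cost per lookup can add up to a visible share of CPU time.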
Restarting an application will not solve a CPU usage problem – it only mitigates the problem for a short while, until the problem reappears. It is, therefore, essential to identify the cause of the CPU spike: is it due to poorly designed application code, insufficient memory allocation, or an unexpectedly high workload?
JVM Monitoring Can Assist with Diagnosis of Java CPU Issues
Modern JVMs (1.5 and higher) support the Java Management Extensions (JMX) APIs. According to Wikipedia, Java Management Extensions is a Java technology that supplies tools for managing and monitoring applications, system objects, devices and service-oriented networks. Those resources are represented by objects called MBeans (for Managed Bean). Managing and monitoring applications can be designed and developed using the Java Dynamic Management Kit.
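As a small illustration of what an MBean looks like from code, the sketch below reads two attributes from the platform MBean server of the JVM it runs in. The class name `MBeanRead` and the helper `readAttribute` are made up for this example; the object names `java.lang:type=Threading` and `java.lang:type=Runtime` are the standard names of the platform thread and runtime MXBeans.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class MBeanRead {
    // Reads one attribute of a platform MBean registered in the current JVM.
    // Checked JMX exceptions are wrapped so callers get a single unchecked failure mode.
    static Object readAttribute(String objectName, String attribute) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            return server.getAttribute(new ObjectName(objectName), attribute);
        } catch (JMException e) {
            throw new IllegalStateException("Cannot read " + attribute + " of " + objectName, e);
        }
    }

    public static void main(String[] args) {
        // Number of live threads, as reported by the thread MXBean.
        System.out.println("Live threads: " + readAttribute("java.lang:type=Threading", "ThreadCount"));
        // JVM uptime in milliseconds, as reported by the runtime MXBean.
        System.out.println("Uptime (ms): " + readAttribute("java.lang:type=Runtime", "Uptime"));
    }
}
```

A monitoring tool would typically read the same attributes over a remote JMX connection rather than in-process, but the MBean names and attributes are the same.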
Using JMX, Java monitoring tools can explore what threads are running in the JVM, the state of each thread, the CPU usage of each thread, etc. By periodically collecting these statistics, monitoring tools can correlate thread-level performance information with the CPU usage of the Java application and answer the question “Why is the Java application taking high CPU?”.
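The kind of per-thread data described above can be collected with the standard `ThreadMXBean`. The following is only a sketch of the idea, not how any particular monitoring product works; `ThreadCpuSampler`, `Sample`, and `topThreadsByCpu` are hypothetical names. It reports cumulative CPU time per thread, so a percentage would require taking two samples and dividing the difference by the elapsed wall-clock time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ThreadCpuSampler {
    // One row of the snapshot: thread identity, state, and cumulative CPU time.
    static final class Sample {
        final long threadId;
        final String name;
        final Thread.State state;
        final long cpuNanos;

        Sample(long threadId, String name, Thread.State state, long cpuNanos) {
            this.threadId = threadId;
            this.name = name;
            this.state = state;
            this.cpuNanos = cpuNanos;
        }
    }

    // Returns up to `limit` live threads, ordered by cumulative CPU time (highest first).
    static List<Sample> topThreadsByCpu(int limit) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<Sample> samples = new ArrayList<>();
        if (!threads.isThreadCpuTimeSupported()) {
            return samples; // This JVM cannot measure per-thread CPU time.
        }
        if (!threads.isThreadCpuTimeEnabled()) {
            threads.setThreadCpuTimeEnabled(true);
        }
        for (long id : threads.getAllThreadIds()) {
            long cpu = threads.getThreadCpuTime(id); // -1 if the thread has already died
            ThreadInfo info = threads.getThreadInfo(id); // null if the thread has already died
            if (cpu < 0 || info == null) {
                continue;
            }
            samples.add(new Sample(id, info.getThreadName(), info.getThreadState(), cpu));
        }
        samples.sort(Comparator.comparingLong((Sample s) -> s.cpuNanos).reversed());
        return new ArrayList<>(samples.subList(0, Math.min(limit, samples.size())));
    }

    public static void main(String[] args) {
        for (Sample s : topThreadsByCpu(5)) {
            System.out.printf("%-30s %-15s %d ms%n", s.name, s.state, s.cpuNanos / 1_000_000);
        }
    }
}
```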
Figure 1 below depicts the eG Enterprise screen that monitors threads in a JVM. High and medium CPU threads are defined as threads that take up more than 50% CPU and 30-50% CPU respectively. It’s best to get alerts when CPU usage is at these levels before the Java application causes CPU usage to spike to 100%!
The existence of any high or medium CPU thread is indicative of an application bottleneck, i.e., a piece of inefficient code that is executing frequently and taking up CPU. In this example, there is one high CPU thread.
Detailed diagnosis of this metric (by clicking on the magnifying glass) reveals the stack trace of the thread, i.e., which line of code the high-CPU thread is executing.
If the thread is assigned a name in the application, the thread name is shown on the left-hand side of Figure 2 and the detailed stack trace is on the right-hand side. This information gives operations staff, service desk support, and developers exactly what they need to identify the cause of high CPU usage. The exact class, method and line of code can be determined.
In this example, the Java troubleshooting tool is telling the user to look in the com.zapstore.logic.LogicBuilder class, createLogic method and line number 223.
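For readers who want to see where class, method, and line number come from, the sketch below pulls a thread's stack trace through the standard `ThreadMXBean` and formats each frame. `StackLocator` and `describeStack` are hypothetical names for this illustration, and the output format is an assumption rather than what any particular tool displays.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;

final class StackLocator {
    // Returns up to `maxDepth` frames of the given thread, innermost frame first.
    // The list is empty if the thread is no longer alive.
    static List<String> describeStack(long threadId, int maxDepth) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(threadId, maxDepth);
        List<String> frames = new ArrayList<>();
        if (info == null) {
            return frames;
        }
        for (StackTraceElement frame : info.getStackTrace()) {
            // getLineNumber() is negative when no line information is available
            // (for example, for native methods).
            frames.add(frame.getClassName() + "." + frame.getMethodName()
                    + " (line " + frame.getLineNumber() + ")");
        }
        return frames;
    }

    public static void main(String[] args) {
        // Inspect the current thread here; a real diagnosis would pass the ID
        // of the thread that was found to be consuming the most CPU.
        long id = Thread.currentThread().getId();
        for (String frame : describeStack(id, 10)) {
            System.out.println(frame);
        }
    }
}
```

A single stack trace is only a snapshot. Sampling the same thread several times and looking for the frames that keep reappearing gives a more reliable picture of where the CPU time goes.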
If the high Java CPU usage is due to an unexpected workload increase, you should see the number of threads increase. Even if each thread consumes a small amount of CPU, the aggregate may be significant.
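A quick way to check for this pattern is to track the thread counts that the JVM already exposes. This is a minimal sketch using the standard `ThreadMXBean`; the class `ThreadCounts` and its `snapshot` method are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class ThreadCounts {
    // Returns {live, peak, daemon} thread counts for the current JVM.
    // The peak is read last so that it is never lower than the live count read before it.
    static int[] snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int live = threads.getThreadCount();
        int daemon = threads.getDaemonThreadCount();
        int peak = threads.getPeakThreadCount();
        return new int[] {live, peak, daemon};
    }

    public static void main(String[] args) {
        int[] counts = snapshot();
        System.out.println("Live threads:   " + counts[0]);
        System.out.println("Peak threads:   " + counts[1]);
        System.out.println("Daemon threads: " + counts[2]);
    }
}
```

Logging these values at a fixed interval makes it easier to see whether a CPU spike coincides with a jump in the number of threads.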
Determine if High Java CPU Usage is From the Application or the JVM
What about a case where the Java application is still consuming high CPU, but none of the application threads are taking much CPU and the aggregate CPU usage of the application threads is low? In such cases, the suspicion may fall on the garbage collection activity in the JVM. When garbage collection runs too frequently or too aggressively, it can cause troublesome CPU spikes. You may want to change the Java garbage collection algorithm or increase the heap and non-heap memory available to the JVM to alleviate the problem.
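One rough way to test the garbage collection hypothesis is to compare accumulated collection time with JVM uptime, using the standard `GarbageCollectorMXBean` and `RuntimeMXBean`. The sketch below uses hypothetical names (`GcOverhead`, `totalGcMillis`, `fraction`). Note that `getCollectionTime()` reports approximate elapsed time rather than CPU time, so treat the result as a signal to investigate further, not as an exact measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    // Sums the approximate elapsed collection time across all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Fraction of uptime spent in garbage collection; 0.0 if uptime is not positive.
    static double fraction(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("Share of uptime spent in GC: %.2f%%%n",
                100.0 * fraction(totalGcMillis(), uptime));
    }
}
```

Because both values are cumulative since JVM start, comparing two readings taken a few minutes apart says more about a current spike than a single reading does.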
Historical information captured about the JVM’s CPU usage and individual threads’ CPU usage can be used to determine the real cause of the Java application’s high CPU usage. If you have this information available, you will no longer need to restart the application and hope that the problem goes away.
Historical insights (such as those shown below in Figure 3) will help you accurately determine the cause of CPU spikes and fix them, so you do not have to deal with the same issues ever again.
Enabling JMX for a JVM has minimal impact on its performance, which makes this technique of monitoring Java applications applicable even for production environments.
Get 360° Visibility and Insights into Java Application Performance
The performance of Java applications depends on three critical factors: the JVM, the Java web container (WebLogic, JBoss, Tomcat, etc.), and the application transactions performed on the front end by the business user.
The transactions are where the end-user will experience slowness or failure. This makes it imperative to trace the transactions in real-time to identify how they are being executed and where slowness occurs.
The JVM, as we saw earlier in this article, is a core piece of the Java stack. How CPU and memory are allocated, utilized and managed determines how efficient the application processing will be.
Finally, the Java web container (Tomcat, WebLogic, JBoss, or the container embedded in a Spring Boot application), where the business logic of the application executes, is an important component of the application middleware. All three components need to be monitored in the context of one another to get full-stack visibility of the Java application.