Designing High Performance Java / J2EE Applications is not Easy!

Business applications developed in Java have become incredibly complex. Java developers must have expertise with numerous technologies – JSPs, Servlets, EJBs, Struts, Hibernate, JDBC, JMX, JMS, JSF, web services, SOAP, thread pools, object pools, and so on – not to mention core Java principles such as synchronization, multi-threading, and caching. A malfunction in any of these technologies can result in slowdowns, application freezes, and errors in key business applications.

Anatomy of a Java developer

In an article I was reading last week, I came across a very interesting table highlighting the different types of failures commonly seen in J2EE applications. Below is an adaptation of that table, listing common J2EE problems and their causes. It gives a very good idea of why designing high-performance Java/J2EE applications requires a lot of expertise (and, of course, you need the right tools at hand to be able to troubleshoot such applications rapidly, with minimal effort).

| Problem | Description | Symptoms | Remedy |
| --- | --- | --- | --- |
| Bad Coding: Infinite Loop | Threads become stuck in `while(true)` statements and the like. This comes in CPU-bound and wait-bound (spin-wait) variants. | Foreseeable lockup | You'll need to perform an invasive loop-ectomy. |
| Bad Coding: CPU-bound Component | This is the common cold of the J2EE world. One bit of bad code, or a bad interaction between bits of code, hogs the CPU and slows throughput to a crawl. | Consistent slowness; slower and slower under load | The typical big win is a cache of data or of performed calculations. |
| The Unending Retry | Continual (or, in extreme cases, continuous) retries of a failed request. | Foreseeable backup; sudden chaos | It might just be that a back-end system is completely down. Availability monitoring can help there, as can simply differentiating attempts from successes. |
| Threading: Chokepoint | Threads back up on an over-ambitious synchronization point, creating a traffic jam. | Slower and slower under load; sporadic hangs or aberrant errors; foreseeable lockup; sudden chaos | Perhaps the synchronization is unnecessary (with a simple redesign), or perhaps more exotic locking strategies (e.g., reader/writer locks) may help. |
| Threading: Deadlock / Livelock | Most commonly, it's your basic order-of-acquisition problem. | Sudden chaos | Treatment options include determining whether locking is really necessary, using a master lock, deterministic order-of-acquisition, and the banker's algorithm. |
| Over-Usage of External Systems | The J2EE application abuses a back-end system with requests that are too large or too numerous. | Consistent slowness; slower and slower under load | Eliminate redundant work requests, batch similar requests, break up large requests into several smaller ones, tune the work requests or the back-end system (e.g., indexes for common query keys), etc. |
| External Bottleneck | A back-end or other external system (e.g., authentication) slows down, slowing the J2EE application server and its applications as well. | Consistent slowness; slower and slower under load | Consult a specialist (the responsible third party or system administrator) for treatment of the external bottleneck. |
| Layer-itis | A poorly implemented bridge layer (a JDBC driver, a CORBA link to a legacy system) slows all traffic through it to a crawl with constant marshalling and unmarshalling of data and requests. This disease is easily confused with External Bottleneck in its early stages. | Consistent slowness; slower and slower under load | Check version compatibility of the bridge layer and the external system. Evaluate different bridge vendors if available. Re-architecture may be necessary to bypass the layer altogether. |
| Internal Resource Bottleneck: Over-Usage or Under-Allocation | Internal resources (threads, pooled objects) become scarce. Is over-utilization occurring in a healthy manner under load, or because of a leak? | Slower and slower under load; sporadic hangs or aberrant errors | Under-allocation: increase the maximum pool size based on the highest expected load. Over-usage: see Over-Usage of External Systems. |
| Linear Memory Leak | A per-unit (per-transaction, per-user, etc.) leak causes memory to grow linearly with time or load, degrading system performance over time or under load. Recovery is only possible with a restart. | Slower and slower over time; slower and slower under load | This is most typically linked with a resource leak, though many exotic strains exist (for example, linked-list storage of per-unit data, or a recycling buffer that grows but doesn't recycle). |
| Exponential Memory Leak | A leak with a doubling growth strategy causes an exponential curve in the system's memory consumption over time. | Slower and slower over time; slower and slower under load | This is typically caused by adding elements to a collection (Vector, HashMap) that are never removed. |
| Resource Leak | JDBC statements, CICS transaction gateway connections, and the like are leaked, causing pain for both the Java bridge layer and the back-end system. | Slower and slower over time; foreseeable lockup; sudden chaos | Typically, this is caused by a missing finally block, or a simpler failure to close objects that represent external resources. |
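To make the "Unending Retry" remedy concrete, here is a minimal sketch of a bounded retry helper. The class and method names (`BoundedRetry`, `withRetry`) are my own illustrative choices, not from the original table: the key point is that attempts are capped, so a dead back end produces a fast, visible failure instead of an endless loop.

```java
import java.util.function.Supplier;

public class BoundedRetry {
    // Retry the operation at most maxAttempts times, then rethrow the
    // last failure instead of hammering a dead back end forever.
    static <T> T withRetry(int maxAttempts, Supplier<T> op) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e; // record the failure; a real version might also back off here
            }
        }
        throw last; // give up visibly, differentiating attempts from successes
    }
}
```

A production version would typically add a sleep between attempts (exponential backoff) and only retry failures that are plausibly transient.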
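The "Threading: Chokepoint" row mentions reader/writer locks as an exotic alternative to a single synchronization point. A minimal sketch, using a hypothetical `SharedConfig` class of my own invention: with `ReentrantReadWriteLock`, many readers proceed concurrently, so a read-heavy hot spot no longer serializes every thread.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SharedConfig {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private String value = "initial";

    public String read() {
        lock.readLock().lock();   // many threads may hold the read lock at once
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    public void write(String newValue) {
        lock.writeLock().lock();  // exclusive: briefly blocks readers and writers
        try {
            value = newValue;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This pays off when reads vastly outnumber writes; for write-heavy data, a plain `synchronized` block or a redesign that removes the shared state may be better.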
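The "Deadlock / Livelock" row names deterministic order-of-acquisition as a treatment. Here is a sketch of that idea with a hypothetical `Account` class (the name and fields are mine, for illustration): both locks are always acquired in the same global order (lower id first), so two opposite-direction transfers can never hold one lock each and wait on the other.

```java
public class Account {
    final long id;
    int balance;

    Account(long id, int balance) {
        this.id = id;
        this.balance = balance;
    }

    static void transfer(Account from, Account to, int amount) {
        // Always lock the account with the lower id first, regardless of
        // transfer direction -- this is the deterministic acquisition order.
        Account first  = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```

Without the ordering, `transfer(a, b, …)` and `transfer(b, a, …)` running concurrently can each grab their first lock and deadlock waiting for the second.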
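For the memory-leak rows, the classic culprit is a collection that only ever grows. One standard defense, sketched here with an illustrative `BoundedCache` factory of my own naming, is to bound the collection: `LinkedHashMap` in access order plus an overridden `removeEldestEntry` gives a simple LRU cache that evicts instead of leaking.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedCache {
    // Returns a map capped at maxEntries; when full, the least-recently-used
    // entry is evicted on the next insertion, so memory cannot grow unbounded.
    public static <K, V> Map<K, V> create(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true /* access order */) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }
}
```

The same bounding instinct applies anywhere per-unit data accumulates: per-session lists, per-transaction buffers, and so on.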
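Finally, for the "Resource Leak" row: the missing-finally-block failure mode is exactly what try-with-resources eliminates. A sketch using a hypothetical `TrackedResource` standing in for a JDBC statement or gateway connection (the class is invented here so the example runs without a database):

```java
public class TrackedResource implements AutoCloseable {
    static int openCount = 0; // stands in for the external system's view of open handles

    TrackedResource() { openCount++; }

    void doWork(boolean fail) {
        if (fail) throw new RuntimeException("simulated back-end failure");
    }

    @Override
    public void close() { openCount--; }

    // try-with-resources guarantees close() runs even when doWork throws,
    // which is what a hand-written (and easy to forget) finally block used to do.
    static void useResource(boolean fail) {
        try (TrackedResource r = new TrackedResource()) {
            r.doWork(fail);
        } catch (RuntimeException e) {
            // the failure is handled, and the resource was still closed
        }
    }
}
```

With real JDBC objects the shape is the same: `try (Connection c = …; PreparedStatement s = …) { … }`.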

As you can see from the above table, monitoring a J2EE application end-to-end requires:

  • In-depth visibility into the Java virtual machine
  • Tracking key metrics specific to the application server in use (e.g., WebLogic, WebSphere, JBoss, Tomcat, etc.)
  • Monitoring of the external dependencies of the Java application tier – e.g., databases, Active Directory, messaging servers, networks, etc.
  • Finally, all of these metrics have to be correlated – by time, and by the inter-dependencies between applications in the infrastructure – so that when a problem occurs, administrators are equipped to quickly determine what is causing it: the network? the database? the application? the web tier?
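As a small illustration of the JVM-visibility point, the JDK's own platform MXBeans expose the low-level metrics (heap usage, thread counts) that monitoring tools ultimately draw on, typically over JMX. The `JvmMetrics` wrapper class below is my own naming; the `ManagementFactory` calls are standard `java.lang.management` API.

```java
import java.lang.management.ManagementFactory;

public class JvmMetrics {
    // Current used heap, in bytes, as reported by the JVM's memory MXBean.
    static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean()
                                .getHeapMemoryUsage()
                                .getUsed();
    }

    // Number of live threads (daemon and non-daemon) in this JVM.
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }
}
```

Monitoring products poll exactly these kinds of beans remotely (via a JMX connector) and trend the values over time, which is what turns raw numbers into the "slower and slower over time" diagnoses in the table above.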

Below are several relevant links about how eG Enterprise helps with end-to-end monitoring, diagnosis, and reporting for J2EE applications.

Also of interest is an on-line webinar titled “Managing N-Tiers without Tears”. Click here to view the webinar.