Performance Tuning Java Applications in Linux
Tips and tricks for tuning Java applications running on Linux, including heap configuration and garbage collection.
When performance tuning an application, both the code and the hardware running it should be accounted for. In this blog post, we go over the aspects that need attention to extract maximum performance from a Java application running on Linux.
Thread Contention
- Reduce the amount of code in critical sections.
- Prefer synchronized blocks over synchronized methods.
- Prefer locks over synchronized blocks.
- Keep track of the order in which you lock resources. Acquiring the same locks in different orders on different threads can end in a deadlock.
- Segregate low concurrency, medium concurrency, and high concurrency use cases. Treat them separately.
- Use Compare-And-Swap operations for low concurrency and medium concurrency settings when possible.
- Do not call third-party web services or execute other long-running methods inside synchronized blocks or while holding locks.
- Prefer higher-level concurrency constructs such as locks, CountDownLatches, and CyclicBarriers over the wait, notify, and notifyAll mechanisms.
- Make your long-running threads interruptible. From within the thread, periodically check for the "interrupted" condition.
- Resort to ReadWriteLocks when the number of readers is much greater than the number of writers. It improves read concurrency.
- Use concurrent data structures only in situations where you have multiple threads accessing the data structure.
- Don't spray mutable state over the entire application. Restrict mutable state and deal consciously with it. This reduces the number of locks and various thread contention issues.
- Look out for busy-wait loops like the following, which hog your CPU: while (true) { /* do something without sleeping */ }
- Always introduce a reasonable amount of sleep in such infinite while loops. Polling threads are an example where you might want to do this.
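The last few points can be sketched as a polling thread that sleeps between polls and stops when interrupted. This is a minimal illustration, not a definitive implementation: the class `PollingWorker`, the helper `runBriefly`, and the intervals are hypothetical names and values chosen for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical polling worker; names and intervals are illustrative only.
final class PollingWorker implements Runnable {
    private final AtomicInteger polls = new AtomicInteger(); // lock-free atomic counter
    private final long sleepMillis;

    PollingWorker(long sleepMillis) {
        this.sleepMillis = sleepMillis;
    }

    int pollCount() {
        return polls.get();
    }

    @Override
    public void run() {
        // Check the interrupted flag on every iteration so the thread can be stopped.
        while (!Thread.currentThread().isInterrupted()) {
            polls.incrementAndGet(); // stand-in for the real polling work
            try {
                TimeUnit.MILLISECONDS.sleep(sleepMillis); // give up the CPU between polls
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag so the loop exits
            }
        }
    }

    // Starts a worker, lets it poll for a while, then interrupts it and waits for it to exit.
    static int runBriefly(long sleepMillis, long runMillis) {
        PollingWorker worker = new PollingWorker(sleepMillis);
        Thread thread = new Thread(worker, "poller");
        thread.start();
        try {
            Thread.sleep(runMillis);
            thread.interrupt();
            thread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the poller", e);
        }
        if (thread.isAlive()) {
            throw new IllegalStateException("poller did not stop after interrupt");
        }
        return worker.pollCount();
    }

    public static void main(String[] args) {
        System.out.println("polls: " + runBriefly(50, 300));
    }
}
```

Because `sleep` throws `InterruptedException` and clears the flag, the catch block restores it. Without that, the loop condition would never see the interrupt.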
Heap Configuration and Garbage Collection
- Follow standard practices such as using StringBuilder to build log messages, which avoids creating many short-lived String objects.
- Profile your application for appropriate heap and garbage collection settings.
- Bad GC settings can hog your CPU, cause long application pauses, crash your application with an OutOfMemoryError, or freeze it with concurrent mode failures.
- For low-latency applications, use the Concurrent Mark Sweep (CMS) collector or G1 GC. Note that CMS was deprecated in JDK 9 and removed in JDK 14, so on current JDKs G1 is the available choice of the two.
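As a starting point for profiling, the JVM exposes heap usage and per-collector statistics through the standard `java.lang.management` API. The sketch below reads them in-process. The class name `GcSnapshot` and the output format are assumptions made for illustration, and a real profiling session would normally also use GC logs and a profiler.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper that summarizes heap usage and GC activity of the current JVM.
final class GcSnapshot {
    // Approximate total time (ms) spent in GC since JVM start, summed over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // One line for the heap, then one line per collector.
    static String describe() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder sb = new StringBuilder();
        sb.append("heap used=").append(heap.getUsed() >> 20).append("MiB")
          .append(" max=").append(heap.getMax() >> 20).append("MiB\n"); // max is -1 if undefined
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
        System.out.println("total GC ms: " + totalGcMillis());
    }
}
```

Sampling these numbers before and after a load test gives a rough view of how much time a given heap size and collector spend in GC.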
Avoid Swapping to Disk
Ensure there is enough RAM to hold your Java process. Swapping a Java process to disk is a performance killer: if the application is swapped out, garbage collection cycles take much longer because objects have to be read back from disk during GC.
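On Linux, one way to check whether the JVM itself has been swapped out is the `VmSwap` field of `/proc/self/status`. The following is a minimal sketch, assuming a kernel that reports that field. `SwapCheck` and its method names are hypothetical, and the code returns an empty result on systems without `/proc`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper that reports how much of this process is swapped out (Linux only).
final class SwapCheck {
    // Parses a line such as "VmSwap:      128 kB" from the contents of /proc/<pid>/status.
    static OptionalLong parseVmSwapKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmSwap:")) {
                String[] parts = line.substring("VmSwap:".length()).trim().split("\\s+");
                try {
                    return OptionalLong.of(Long.parseLong(parts[0]));
                } catch (NumberFormatException e) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty(); // field not present
    }

    // Reads the value for the current JVM; empty when /proc is unavailable or unreadable.
    static OptionalLong currentProcessSwapKb() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return OptionalLong.empty();
        }
        try {
            return parseVmSwapKb(Files.readAllLines(status));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        OptionalLong swapKb = currentProcessSwapKb();
        System.out.println(swapKb.isPresent()
                ? "swapped out: " + swapKb.getAsLong() + " kB"
                : "VmSwap not available on this system");
    }
}
```

A value that stays above zero under normal load suggests the machine does not have enough RAM for the configured heap plus everything else running on it.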
Disk
- Use async loggers for logging.
- Employ context logging where applicable. Context logging means storing the logs and events of an API call in a context object and writing the context data once through an async logger. This can dramatically improve the performance of the Java application and makes troubleshooting easier.
- Prefer databases to local disk writes for any kind of persistence. Databases are optimized for disk reads and writes.
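Context logging as described above can be sketched with a request-scoped object that collects events and emits one line at the end. This is only an illustration: `RequestContext` is a hypothetical class, and a single-thread executor writing to standard output stands in for a real async logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical request-scoped context; not part of any logging library.
final class RequestContext {
    private final String requestId;
    private final List<String> events = new ArrayList<>();

    RequestContext(String requestId) {
        this.requestId = requestId;
    }

    // Intended to be called from the request thread only, so no locking is used.
    void event(String message) {
        events.add(message);
    }

    // Builds the single log line for the whole request.
    String render() {
        StringBuilder sb = new StringBuilder("request=").append(requestId);
        for (String event : events) {
            sb.append(" | ").append(event);
        }
        return sb.toString();
    }

    // Renders on the caller thread, then hands the line to a background thread for output.
    void flush(ExecutorService logExecutor) {
        final String line = render();
        logExecutor.execute(() -> System.out.println(line));
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService logExecutor = Executors.newSingleThreadExecutor(); // stand-in for an async logger
        RequestContext ctx = new RequestContext("42");
        ctx.event("validated input");
        ctx.event("loaded user");
        ctx.event("status=200");
        ctx.flush(logExecutor);
        logExecutor.shutdown();
        logExecutor.awaitTermination(1, TimeUnit.SECONDS);
    }
}
```

The request thread performs one hand-off instead of one write per event, and all events of a call end up on the same line, which helps when troubleshooting.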
CPU
- Use thread pools. Threads are lightweight compared to processes, but creating them is still costly.
- Spawning threads indiscriminately drives up context switches and load average; past a threshold, the application and the machine become unresponsive.
- Along with CPU usage, look out for load averages and context switches. Together they give you the complete picture of performance.
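A bounded pool from `java.util.concurrent` is the usual way to reuse threads instead of creating one per task. The sketch below is illustrative: `PooledSum` and the toy workload are hypothetical, and sizing the pool to the number of available processors is an assumption that suits CPU-bound work, not a universal rule.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: many small tasks share a fixed number of reusable threads.
final class PooledSum {
    // Sums the squares of 1..n, submitting one task per number to a bounded pool.
    static long sumOfSquares(int n, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                futures.add(pool.submit(() -> value * value));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // always release the pool threads
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("sum of squares 1..100 = " + sumOfSquares(100, cores));
        // 1-minute system load average; -1.0 if the platform does not provide it.
        System.out.println("load average: "
                + ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());
    }
}
```

In a long-running service the pool would be created once and shared, not created per call as in this short demo.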
Keep an Eye on the Network
- Look out for packet drops on your network.
- A packet drop can mean your application is too busy to receive the packet or your network is congested.
- Tune your network buffer sizes if need be.
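From Java, per-socket buffer sizes can be requested with `Socket.setReceiveBufferSize` and `Socket.setSendBufferSize`. The sketch below requests sizes on an unconnected socket and reads back what was granted. `SocketBuffers` and the 256 KiB figure are illustrative assumptions, and the values are only hints: on Linux the kernel caps them using `net.core.rmem_max` and `net.core.wmem_max` and may report a different number than requested.

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical helper that requests socket buffer sizes and reports what was granted.
final class SocketBuffers {
    // Returns {receiveBufferBytes, sendBufferBytes} as reported after the request.
    static int[] requestBufferSizes(int receiveBytes, int sendBytes) {
        try (Socket socket = new Socket()) { // unconnected; options are set before connecting
            socket.setReceiveBufferSize(receiveBytes);
            socket.setSendBufferSize(sendBytes);
            return new int[] { socket.getReceiveBufferSize(), socket.getSendBufferSize() };
        } catch (IOException e) {
            throw new IllegalStateException("could not configure socket buffers", e);
        }
    }

    public static void main(String[] args) {
        int[] granted = requestBufferSizes(256 * 1024, 256 * 1024);
        System.out.println("receive=" + granted[0] + " bytes, send=" + granted[1] + " bytes");
    }
}
```

Comparing the requested and granted values shows whether the system-wide limits need raising before per-socket tuning can have any effect.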
Use Caching
- Cache frequently used data to reduce DB hits.
- Tune your Cache Configuration (eviction, expiry, size, consistency, concurrency) for each use case separately.
- Separate read-intensive caches from write-intensive caches for better locking.
- Use distributed caching judiciously. Inappropriate use of Distributed caching introduces new problems instead of solving existing ones.
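To make size and eviction settings concrete, here is a minimal in-process LRU cache built on `LinkedHashMap` in access order. It is a sketch under stated assumptions: `LruCache` is a hypothetical name, there is no expiry, and one coarse lock guards the whole map, so a cache library is usually the better fit for high-concurrency use cases.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-bounded cache that evicts the least recently used entry.
final class LruCache {
    static <K, V> Map<K, V> create(final int maxEntries) {
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) { // true = access order
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict once the bound is exceeded
            }
        };
        return Collections.synchronizedMap(lru); // a single lock around every operation
    }

    public static void main(String[] args) {
        Map<String, String> cache = create(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```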
Communication
- For near-real-time communication, prefer asynchronous communication.
- Set timeouts on outgoing API/RMI calls so that one long-running operation does not bring down the entire ecosystem.
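One generic way to bound an outgoing call is to run it on an executor and wait with `Future.get(timeout, unit)`. The sketch below shows that approach. `TimedCall`, `callWithTimeout`, and the fallback value are hypothetical. Where the client library offers its own connect and read timeouts, setting those is preferable, because cancelling a future only interrupts the calling thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical wrapper that bounds how long a caller waits for a remote call.
final class TimedCall {
    private static final ExecutorService POOL = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable, "timed-call");
        thread.setDaemon(true); // abandoned calls must not keep the JVM alive
        return thread;
    });

    // Returns the call's result, or the fallback if it fails or exceeds the timeout.
    static <T> T callWithTimeout(Callable<T> call, long timeoutMillis, T fallback) {
        Future<T> future = POOL.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow call
            return fallback;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the call itself threw an exception
        }
    }

    public static void main(String[] args) {
        String slow = callWithTimeout(() -> {
            Thread.sleep(2000); // simulates a slow remote service
            return "slow";
        }, 100, "fallback");
        String fast = callWithTimeout(() -> "fast", 1000, "fallback");
        System.out.println(slow + " " + fast); // prints "fallback fast"
    }
}
```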
Profile and Document the Changes
Always profile and document your changes. Document clearly why each decision was made. Decisions that are significant today may lose their relevance tomorrow. When that happens, future engineers on your application should have enough information to revert or extend them. This is how applications survive. Nobody wants to sit on a ticking time bomb, not knowing when it will explode.
Profiling Tools
- jstack: collects thread dumps.
- jmap: collects heap dumps and live object counts.
- jvisualvm/jconsole: analyze heap and threads and collect thread dumps. jconsole ships with the JDK; VisualVM was bundled with the Oracle JDK through version 8 and is a separate download for later versions.
- Eclipse Memory Analyzer (MAT): analyzes heap dumps.
- Thread Dump Analyzer (TDA): analyzes thread dumps and detects long-running threads and deadlocks.
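As an in-process complement to these tools, the standard `ThreadMXBean` can list thread states and report deadlocked threads. The sketch below is illustrative only: `ThreadReport` is a hypothetical name, the output is far less detailed than a jstack dump, and `findDeadlockedThreads` assumes a JVM that supports monitoring of ownable synchronizers, as HotSpot does.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper that reports thread states and deadlocks from inside the JVM.
final class ThreadReport {
    // Number of threads currently deadlocked on monitors or ownable synchronizers.
    static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length; // null means no deadlock was found
    }

    // One line per live thread: quoted name followed by its state.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```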
Advice on Performance Tuning
Plan and budget for performance tuning early in the project, but do not rush into it prematurely, as that can make a mess of your core business logic. One application I know of was eagerly optimized to restrict the number of records fetched from the DB, and this was not documented.
It caused all sorts of issues, and the code was buried so deep that it took a considerable amount of time to trace the problems back to it. Of course, if you have encountered and solved the same business problem multiple times, this advice does not apply.