Reloading Java Classes: Classloaders in Web Development — Tomcat, GlassFish, OSGi, Tapestry 5
In this article we’ll review how dynamic classloaders are used in real servers, containers and frameworks to reload Java classes and applications. We’ll also touch on how to get faster reloads and redeploys by using them in optimal ways. RJC101: Objects, Classes and ClassLoaders RJC201: How do ClassLoader leaks happen? RJC301: Classloaders in Web Development — Tomcat, GlassFish, OSGi, Tapestry 5 and so on RJC401: HotSwap and JRebel — what do they really do? RJC501: The impact of the redeploy phase on the development process AKA:Turnaround Java EE (web) applications In order for a Java EE web application to run, it has to be packaged into an archive with a .WAR extension and deployed to a servlet container like Tomcat. This makes sense in production, as it gives you a simple way to assemble and deploy the application, but when developing that application you usually just want to edit the application’s files and see the changes in the browser. A Java EE enterprise application has to be packaged into an archive with an .EAR extension and deployed to an application container. It can contain multiple web applications and EJB modules, so it often takes a while to assemble and deploy it. Recently, 1100+ EE developers told us how much time it takes them, and we compiled the results into the Redeploy and Restart Report. Spoiler: Avg redeploy & restart time is 2.5 minutes – which is higher than we expected. In Reloading Java Classes 101, we examined how dynamic classloaders can be used to reload Java classes and applications. In this article we will take a look at how servers and frameworks use dynamic classloaders to speed up the development cycle. We’ll use Apache Tomcat as the primary example and comment when behavior differs in other containers (Tomcat is also directly relevant for JBoss and GlassFish as these containers embed Tomcat as the servlet container). Redeployment To make use of dynamic classloaders we must first create them. When deploying your application, the server will create one classloader for each application (and each application module in the case of an enterprise application). The classloaders form a hierarchy as illustrated: In Tomcat each .WAR application is managed by an instance of the StandardContext class that creates an instance of WebappClassLoader used to load the web application classes. When a user presses “reload” in the Tomcat Manager the following will happen: StandardContext.reload() method is called The previous WebappClassLoader instance is replaced with a new one All reference to servlets are dropped New servlets are created Servlet.init() is called on them Calling Servlet.init() recreates the “initialized” application state with the updated classes loaded using the new classloader instance. The main problem with this approach is that to recreate the “initialized” state we run the initialization from scratch, which usually includes loading and processing metadata/configuration, warming up caches, running all kinds of checks and so on. In a sufficiently large application this can take many minutes, but in a in small application this often takes just a few seconds and is fast enough to seem instant, as commonly demonstrated in the Glassfish v3 promotional demos. If your application is deployed as an .EAR archive, many servers allow you to also redeploy each application module separately, when it is updated. This saves you the time you would otherwise spend waiting for non-updated modules to reinitialize after the redeployment. Hot Deployment Web containers commonly have a special directory (e.g. “webapps” in Tomcat, “deploy” in JBoss) that is periodically scanned for new web applications or changes to the existing ones. When the scanner detects that a deployed .WAR is updated, the scanner causes a redeploy to happen (in Tomcat it calls the StandardContext.reload() method). Since this happens without any additional action on the user’s side it is commonly referred to “Hot Deployment”. Hot Deployment is supported by all wide-spread application servers under different names: autodeployment, rapid deployment, autopublishing, hot reload, and so on. In some containers, instead of moving the archive to a predefined directory you can configure the server to monitor the archive at a specific path. Often the redeployment can be triggered from the IDE (e.g. when the user saves a file) thus reloading the application without any additional user involvement. Although the application is reloaded transparently to the user, it still takes the same amount of time as when hitting the “Reload” button in the admin console, so code changes are not immediately visible in the browser, for example. Another problem with redeployment in general and hot deployment in particular is classloader leaks. As we reviewed in Reloading Java Classes 201, it is amazingly easy to leak a classloader and quickly run out of heap causing an OutOfMemoryError. As each deployment creates new classloaders, it is common to run out of memory in just a few redeploys on a large enough application (whether in development or in production). Exploded Deployment An additional feature supported by the majority of web containers is the so called “exploded deployment”, also known as “unpackaged” or “directory” deployment. Instead of deploying a .WAR archive, one can deploy a directory with exactly the same layout as the .WAR archive: Why bother? Well, packaging an archive is an expensive operation, so deploying the directory can save quite a bit of time during build. Moreover, it is often possible to set up the project directory with exactly the same layout as the .WAR archive. This means an added benefit of editing files in place, instead of copying them to the server. Unfortunately, as Java classes cannot be reloaded without a redeploy, changing a .java file still means waiting for the application to reinitialize. With some servers it makes sense to find out exactly what triggers the hot redeploy in the exploded directory. Sometimes the redeploy will be triggered only when the “web.xml” timestamp changes, or as in the case of GlassFish only when a special ”.reload” file timestamp changes. In most servers any change to deployment descriptors or compiled classes will cause a hot redeploy. If your server only supports deploying by copying to a special directory (e.g. Tomcat “webapps”, JBoss “deploy” directories) you can skip the copying by creating a symlink from that special directory to your project workspace. On Linux and Mac OS X you can use the common “ln -s” command to do that, whereas on Windows you should download the Sysinternals “junction” utility. If you use Maven, then it’s quite complicated to set up exploded development from your workspace. If you have a solo web application you can use the Maven Jetty plugin, which uses classes and resources directly from Maven source and target project directories. Unfortunately, the Maven Jetty plugin does not support deploying multiple web applications, EJB modules or EARs so in the latter case you’re stuck doing artifact builds. Session Persistence Since we’re on the topic of reloading classes, and redeploying involves reinitializing an application, it makes sense to talk about session state. An HTTP session usually holds information like login credentials and conversational state. Losing that session when developing a web application means spending time logging in and browsing to the changes page – something that most web containers have tried to solve by serializing all of the objects in the HttpSession map and then deserializing them in the new classloader. Essentially, they copy all of the session state. This requires that all session attributes implement Serializable (ensuring session attributes can be written to a database or a file for later use), which is not restricting in most cases. Session persistence has been present in most major containers for many years (e.g. Restart Persistence in Tomcat), but was notoriously absent in Glassfish before v3. OSGi There is a lot of misunderstanding surrounding what exactly OSGi does and doesn’t do. If we ignore the aspects irrelevant to the current issue, OSGi is basically a collection of modules each wrapped in its own classloader, which can be dropped and recreated at will. When it’s recreated, the modules are reinitialized exactly the same way a web application is. The difference between OSGi and a web container is that OSGi is something that is exposed to your application, that you use to split your application into arbitrarily small modules. Therefore, by design, these modules will likely be much smaller than the monolithic web applications we are used to building. And since each of these modules is smaller and we can “redeploy” them one-by-one, re-initialization takes less time. The time depends on how you design your application (and can still be significant). Tapestry 5, RIFE & Grails Recently, some web frameworks, such as Tapestry 5, RIFE and Grails, have taken a different approach, taking advantage of the fact that they already need to maintain application state. They’ll ensure that state will be serializable, or otherwise easily re-creatable, so that after dropping a classloader, there is no need to reinitialize anything. This means that application developers use frameworks’ components and the lifecycle of those components is handled by the framework. The framework will initialize (based on some configuration, either xml or annotation based), run and destroy the components. As the lifecycle of the components is managed by the framework, it is easy to recreate a component in a new classloader without user intervention and thus create the effect of reloading code. In the background, the old component is destroyed (classloader is dropped) and a new one created (in a new classloader where the classes are read in again) and the old state is either deserialized or created based on the configuration. This has the obvious advantage of being very quick, as components are small and the classloaders are granular. Therefore the code is reloaded instantly, giving a smooth experience in developing the application. However such an approach is not always possible as it requires the component to be completely managed by the framework. It also leads to incompatibilities between the different class versions causing, among others, ClassCastExceptions. We’ve Covered a Lot – and simplified along the way It’s worth mentioning that using classloaders for code reloading really isn’t as smooth as we have described here – this is an introductory article series. Especially with the more granular approaches (such as frameworks that have per component classloaders, manual classloader dropping and recreating, etc), when you start getting a mixture of older and newer classes all hell can break loose. You can hold all kinds of references to old objects and classes, which will conflict with the newly loaded ones (a common problem is getting a ClassCastException), so watch what you’re doing along the way. As a side note: Groovy is actually somewhat better at handling this, as all calls through the Meta-Object Protocol are not subject to such problems. This article addressed the following questions: How are dynamic classloaders used to reload Java classes and applications? How do Tomcat, GlassFish (incl v3), and other servers reload Java classes and applications? How does OSGi improve reload and redeploy times? How do frameworks (incl Tapestry 5, RIFE, Grails) reload Java classes and applications? Coming up next, we continue our explanation of classloaders and the redeploy process with an investigation into HotSwap and JRebel, two tools used to reduce time spent reloading and redeploying. Stay tuned! From http://www.zeroturnaround.com/blog/
January 23, 2010
by Dave Booth
·
25,367 Views