The purpose of this article is to give you the details of our 100-node cluster demo. The demo was recorded, and you can watch the 5-minute screencast.

Hazelcast is an open source clustering and highly scalable data distribution platform for Java. JVMs running Hazelcast dynamically form a cluster and let you easily share and partition your application data across it. Hazelcast is a peer-to-peer solution (there is no master node; every node is a peer), so there is no single point of failure. Communication among cluster members is always over TCP/IP, using Java NIO. The default configuration keeps 1 backup, so if a node fails, no data is lost (you can specify the backup count). Using it is as simple as using java.util.{Map, Queue, Set, List}: just add hazelcast.jar to your classpath and start coding.

When you download Hazelcast, you will find test.sh under the bin directory. It runs an application that randomly performs 40% get, 40% put and 20% remove operations on a distributed map. In this demo, the same test application is used to see how it performs on a 100-node cluster.

Amazon EC2 and S3

We needed an easy-to-use and scalable cloud environment for the demo, so we decided to use Amazon EC2 for the server instances (nodes) and the S3 service to store the demo application zip and configuration files. With Amazon's newly announced Java SDK, it is very simple to start and stop server instances and upload files to S3 programmatically.

Hazelcast AMI & Launcher

The challenge here is that we are running an application on 100 nodes, and dealing with each and every server in the cluster by hand is a huge task. We don't want to ssh into every server and start the application manually. This part is automated by creating a special server image (AMI). The AMI contains a Java runtime and a launcher application we developed, which downloads the demo application from Amazon S3, unzips it, and runs the hazelcast/bin/test.sh inside it.
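To make the launcher's job concrete, here is a minimal sketch of its unzip-and-run steps. It is not the launcher we actually built: the class and method names (LauncherSketch, unzip, scriptCommand, run) are hypothetical, the S3 download is left out and assumed to have already produced an InputStream of the zip, and running the script through `sh` is an assumption as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical sketch of a generic launcher; not the code that runs on the AMI.
class LauncherSketch {

    // Extracts a zip stream into targetDir, rejecting entries that would land outside it.
    static void unzip(InputStream zipStream, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream zin = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                if (!out.startsWith(root)) {
                    throw new IOException("Entry outside target dir: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies only the current entry; the zip stream stays open for the next one.
                    Files.copy(zin, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    // Builds the command line for the script. The launcher never looks inside the script.
    static List<String> scriptCommand(Path appDir, String relativeScript) {
        return Arrays.asList("sh", appDir.resolve(relativeScript).toString());
    }

    // Runs the script with the application directory as working directory and returns its exit code.
    static int run(Path appDir, String relativeScript) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(scriptCommand(appDir, relativeScript))
                .directory(appDir.toFile())
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

With the zip stream in hand, a caller would invoke `LauncherSketch.unzip(stream, dir)` followed by `LauncherSketch.run(dir, "hazelcast/bin/test.sh")`.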
The Launcher is actually so generic that it can run any application; it neither knows nor cares what test.sh contains.

Deployer

Deployment of the demo application is also automated, so we don't need to log in to the AWS Management Console and start instances manually. The Deployer starts any number of Amazon EC2 servers with any AMI and also uploads the demo application zip file to S3.

So the idea is that the Deployer stores the application in S3 and launches 100 EC2 instances with our image. The Launcher on each instance downloads the application from S3 and runs it.

Demo Details

The smallest EC2 instances (m1.small) are used to run the demo. These are virtual instances with a CPU of about 1.0 GHz. Also keep in mind that the EC2 platform suffers from a considerable amount of network latency. That's why we increased the thread count to 250 in our application.

The following steps were performed during the demo:

1. Download hazelcast-1.8.3.zip from www.hazelcast.com.
2. Unzip the file and move the monitoring war file into the tomcat6/webapps directory.
3. Edit test.sh under the bin directory:
   - Add -Xmx1G -Xms1G.
   - Add -Dhazelcast.initial.wait.seconds=100 so that the cluster partitions evenly on start and migration can be avoided for better performance.
   - Add t250 as an argument to the application to set the thread count to 250. Remember the latency issue.
4. Run the Deployer from the IDE.
5. Check in the EC2 Management Console that 100 servers have started.
6. Start Tomcat.
7. Copy the public DNS name of one of the servers so the monitoring tool can connect to it.
8. Go to http://localhost:8080/hazelcast-monitor-1.8.3/ (Hazelcast Monitoring Tool).
9. Paste the address and connect to the cluster.
10. Enjoy!

Results

You should always look for programmatic ways of launching applications on the cloud. With these tools we were able to deploy and run the demo application on 100 servers in minutes. The entire Hazelcast cluster was performing over 400,000 operations per second on the smallest EC2 instances.
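For readers who want a feel for the workload behind these numbers, the following is a rough sketch of the 40% get, 40% put, 20% remove mix driven by a configurable number of threads. It is not the code that test.sh runs: a ConcurrentHashMap stands in for the Hazelcast distributed map so that the sketch runs in a single JVM, and the class name, the 1000-byte value size and the key space are assumptions made only for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative single-JVM stand-in for the demo workload; not the Hazelcast test application.
class MapWorkloadSketch {
    enum Op { GET, PUT, REMOVE }

    // Maps a roll in [0, 100) to an operation: 40% get, 40% put, 20% remove.
    static Op pick(int roll) {
        if (roll < 40) return Op.GET;
        if (roll < 80) return Op.PUT;
        return Op.REMOVE;
    }

    // Runs opsPerThread random operations on each of `threads` threads; returns the total performed.
    static long run(Map<Integer, byte[]> map, int threads, int opsPerThread, int keySpace)
            throws InterruptedException {
        AtomicLong done = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                for (int i = 0; i < opsPerThread; i++) {
                    int key = rnd.nextInt(keySpace);
                    switch (pick(rnd.nextInt(100))) {
                        case GET:    map.get(key); break;
                        case PUT:    map.put(key, new byte[1000]); break; // value size is an assumption
                        case REMOVE: map.remove(key); break;
                    }
                    done.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workload did not finish in time");
        }
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Integer, byte[]> map = new ConcurrentHashMap<>(); // stand-in for the distributed map
        long start = System.nanoTime();
        long ops = run(map, 8, 100_000, 10_000);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d operations in %.2f s%n", ops, seconds);
    }
}
```

Against a real cluster every operation pays a network round trip, which is why a high thread count such as 250 helps keep each node busy while individual calls wait on the network.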
In our next demo we will experiment with Hazelcast on a large data set and an even bigger cluster. Watch the screencast.
The industry is recognizing that performance testing and engineering should be part of the project execution road map, starting from the requirements-gathering phase. In many projects, however, performance engineering activities are carried out only when the customer asks for them or when the application responds slowly after the development phase is complete. Glassbox can be used by developers, testers and business users during and after the development cycle to monitor the response times of requests without knowledge of the underlying application structure or code. The analysis generated by Glassbox points directly to the bottleneck that causes the slow response time for a particular request/page/URL.

About Glassbox

Glassbox is an open source web application that aids in performance monitoring and troubleshooting of multiple web applications deployed in a container.

Troubleshooting

It contains a built-in knowledge repository of common problems, which is used to pinpoint issues and suggest causes as the Java code executes.

Performance Monitoring

It monitors requests as the Java code executes and provides details about response times. The Glassbox web client (AJAX GUI) provides a summary dashboard view with attributes such as server name, application name, operation/request URL, average time, number of executions, status (SLOW / OK) and analysis details. By default, an operation that takes more than 1 second to execute is marked with SLOW status. This SLA can be changed in the Glassbox properties file. The analysis describes the problem in plain English rather than displaying a large code/exception trace. This increases developer productivity by reducing the time developers spend in log files and IDE debuggers.

Internals

The two main components of Glassbox are the Monitor and the Agent. The Monitor uses Aspect-Oriented Programming (AOP) to monitor JVM activity.
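To illustrate the kind of measurement the Monitor takes, here is a hand-written sketch that times an operation and flags it against the 1-second default. It does not use AspectJ or any Glassbox API, and Glassbox applies the timing through woven aspects rather than an explicit wrapper. The class name, the method names and the choice to classify on the average time are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative only: a manual timing wrapper, not how Glassbox is implemented.
class ResponseTimeSketch {
    // The 1 second default mentioned above; treating it as a limit on the average is an assumption.
    static final long SLOW_THRESHOLD_MILLIS = 1000;

    static final class Stats {
        final LongAdder executions = new LongAdder();
        final LongAdder totalNanos = new LongAdder();

        double averageMillis() {
            long n = executions.sum();
            return n == 0 ? 0.0 : totalNanos.sum() / 1e6 / n;
        }

        String status() {
            return averageMillis() > SLOW_THRESHOLD_MILLIS ? "SLOW" : "OK";
        }
    }

    private final Map<String, Stats> byOperation = new ConcurrentHashMap<>();

    // Runs the body and records how long it took, even if the body throws.
    <T> T monitor(String operation, Callable<T> body) throws Exception {
        long start = System.nanoTime();
        try {
            return body.call();
        } finally {
            record(operation, System.nanoTime() - start);
        }
    }

    // Adds one execution with the given elapsed time to the operation's statistics.
    void record(String operation, long elapsedNanos) {
        Stats stats = byOperation.computeIfAbsent(operation, key -> new Stats());
        stats.executions.increment();
        stats.totalNanos.add(elapsedNanos);
    }

    // Returns the statistics for an operation, or null if it has never been recorded.
    Stats stats(String operation) {
        return byOperation.get(operation);
    }
}
```

The point of the AOP approach is that application code never has to call something like `monitor(...)` itself; the equivalent timing logic is attached from outside.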
The Agent diagnoses and presents the monitoring results and uses the knowledge repository to cross-reference each problem with suggestions/solutions. The Glassbox agent also supports viewing the analysis results in JMX consoles (e.g. the Java 5 JConsole).

Glassbox uses the AOP approach extensively to monitor the Java code. The benefit is that no changes to the source code or the build process are needed, so it can work with any legacy web application or jar file as well.

Technologies

Glassbox should work on any application server that supports Servlet 2.3 or later. The servers on which Glassbox has been tested and for which the installation process is automated are Apache Tomcat, WebLogic, WebSphere, Resin, Oracle OC4J, Jetty and GlassFish.

Overhead

Running the Glassbox application in the same container generates a performance overhead, typically in response time and memory usage. It is therefore recommended to start the Glassbox application only when it is required for performance monitoring.

Licensing

Glassbox is an open source project; it is free to download and run. Glassbox uses the GNU Lesser General Public License to distribute software and documentation.

Demo Application Development & Deployment to Tomcat

To test the capabilities of Glassbox, a sample application was developed with a TestServlet class. This servlet calls the generateDelay() method of the DelayGenerator class, which in turn calls Thread.sleep() to suspend the execution of the servlet. A counter in the DelayGenerator class determines how long the servlet is suspended.
TestServlet.java

    /**
     * File: TestServlet.java
     * @author Viral Thakkar
     */
    package com.infosys.star.glassbox;

    import java.io.IOException;
    import java.io.PrintWriter;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class TestServlet extends HttpServlet {

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            // Blocks this request for a while so that Glassbox has something slow to report.
            DelayGenerator delayObj = new DelayGenerator();
            int delay = delayObj.generateDelay();

            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            out.println("");
            out.println(" Hello World from Test Servlet : " + delay + " milliseconds ");
            out.println("");
            out.flush();
        }
    }

DelayGenerator.java

    /**
     * File: DelayGenerator.java
     * @author Viral Thakkar
     */
    package com.infosys.star.glassbox;

    public class DelayGenerator {
        private static int counter = 1; // shared by all requests; grows by 1 per call

        public int generateDelay() {
            try {
                Thread.sleep(counter * 100);
                counter++;
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            // counter has already been advanced, so this is the delay the next call will sleep for
            return counter * 100;
        }
    }

Glassbox Installation & Integration to Apache Tomcat 6.0

Glassbox installation is very straightforward in a non-clustered environment on the servers for which it is automated. Simply drop the glassbox.war file into the appropriate folder inside the server folder, or perform the server-specific steps/configuration to deploy the war file.

Browse to the server URL with glassbox as the context root: http://<host>:<port>/glassbox. Follow the instructions on this page; depending on the server, it suggests the configuration changes that are needed.

Please refer to the Glassbox User Guide for details on how to install Glassbox in a clustered application server environment.
For Apache Tomcat 6.0, add the following command-line arguments to Tomcat's Java options:

    -Dglassbox.install.dir=C:\Tomcat6.0\lib\glassbox
    -Djava.rmi.server.useCodebaseOnly=true
    -javaagent:C:\Tomcat6.0\lib\aspectjweaver.jar

Monitoring & Technical Analysis

The Glassbox web client (URL: http://<host>:<port>/glassbox) shows a summary and a detailed view of all the requests/operations that the container/JVM has executed.

Summary Section View

The attributes (columns) displayed in this table are:

Attribute / Column Name - Comments
Status - Indicates whether the operation/request is performing OK, SLOW or FAILING.
Analysis - For SLOW/FAILING status, a short summary of the cause of the problem.
Operation - Name of the operation/request of an application.
Server - Name of the server where monitoring is being done. In a clustered environment, this makes it possible to distinguish operations on different servers.
Executions - How many times this operation has run since the application server was started or Glassbox's statistics were last reset.

Click a request in the summary table to view its detailed analysis in the detailed section below it.

Detailed Section View

The details area provides information about the operation selected in the summary table. The sub-sections displayed in this view are:

Sub-section Name - Comments
Executive Summary - A high-level summary of the selected operation, displayed in table format. This is a neat view for senior stakeholders who are not interested in technical details.
Technical Summary - More technical details, in paragraph and table formats, that give insight into the root cause of the problem, if any, such as which operation or query is slow and its statistics. Details like the stack trace and thread lock name are provided to help find and fix the problem. The “Common solutions” sub-section shows pointers for resolving the identified problems.
The “Glassbox has ruled out other potential problems” sub-section saves time by showing which problems have already been ruled out.

Executive Summary View

Technical Summary -> Technical Details Views

The above two snapshots are parts of the Technical Details section and provide minute details at code level, with line numbers, to pinpoint where the problem is. Here the cause is identified in class com.infosys.star.glassbox.DelayGenerator, inside method generateDelay, at line number 12, where Thread.sleep is invoked.

Perform Load Testing Using JMeter and Monitor Using Glassbox

Apache JMeter is used to test performance on both static and dynamic resources (files, servlets, Perl scripts, Java objects, databases and queries, FTP servers and more). It can simulate a heavy load on a server, network or object to test its strength or to analyze overall performance under different load types. It can also produce a graphical analysis of performance or test server/script/object behavior under heavy concurrent load.

Using JMeter, create a test plan that simulates 10 users requesting 1 page 5 times each, i.e. 10 x 1 x 5 = 50 HTTP requests.

The first step is to add a Thread Group element. The Thread Group tells JMeter the number of users to simulate, how often the users should send requests, and how many requests they should send.

The next step is to add an HTTP Request element to the Thread Group.

In parallel, have Glassbox up and running to monitor the response time statistics of the load generated by JMeter.

Below is the Executive Summary view of the above test in the Glassbox web UI.

The section “Monitoring & Technical Analysis” contains the details needed to understand the analysis generated by Glassbox.

Conclusion

Glassbox is not a replacement for a performance testing tool like LoadRunner. Glassbox helps the various stakeholders of a project find, communicate and fix performance problems in all phases, from build (development) to post-deployment.
Glassbox should be started/installed only while monitoring is needed, to avoid the performance overhead that its CPU and memory footprint on the container imposes on other applications. During load testing of an application, Glassbox turns out to be a good option for finding root causes inside the application code.

References

Glassbox web site - http://www.glassbox.com/glassbox/Home.html
Glassbox User Guide - http://nchc.dl.sourceforge.net/sourceforge/glassbox/Glassboxv2.0UserGuide.pdf
Apache JMeter - http://jakarta.apache.org/jmeter/

Download & Support

Glassbox Download Link - http://www.glassbox.com/glassbox/Downloads.html
Glassbox forum Link - http://sourceforge.net/forum/forum.php?forum_id=575670

About Author

Viral Thakkar is a Technical Architect with the Banking and Capital Markets vertical at Infosys. He has 9.5 years of technology consulting experience, mainly on Java/JEE technologies and frameworks, with large banks and financial institutions across the globe. He has been part of many small and large-scale initiatives related to application development, architecture creation and strategy definition.

From http://viralpatel.net/blogs