Language Resources

The Latest Languages Topics

Implementing Filter and Bakery Locks in Java

In order to understand how locks work, implementing custom locks is a good approach. This post shows how to implement Filter and Bakery locks (both of which are spin locks) in Java and compares their performance with Java's ReentrantLock. Filter and Bakery locks satisfy mutual exclusion and are starvation-free; in addition, the Bakery lock is a first-come-first-served lock [1].

For performance testing, a counter value is incremented up to 10000000 with different lock types, different numbers of threads and a number of repeated runs. The test system configuration is: Intel Core i7 (8 logical cores, 4 of them physical), Ubuntu 14.04 LTS and Java 1.7.0_60.

The Filter lock has n-1 levels, which may be considered "waiting rooms". A thread must traverse these waiting rooms before acquiring the lock. There are two important properties for levels [2]:

1) At least one thread trying to enter level l succeeds.
2) If more than one thread is trying to enter level l, then at least one is blocked (i.e., continues to wait at that level).

The Filter lock is implemented as follows:

    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.locks.Lock;

    /**
     * @author Furkan KAMACI
     */
    public class Filter extends AbstractDummyLock implements Lock {
        /* Due to the Java Memory Model, int[] is not used for the level and victim variables.
           The Java programming language does not guarantee linearizability, or even sequential
           consistency, when reading or writing fields of shared objects
           [The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.61.] */
        private AtomicInteger[] level;
        private AtomicInteger[] victim;
        private int n;

        /**
         * Constructor for Filter lock
         *
         * @param n thread count
         */
        public Filter(int n) {
            this.n = n;
            level = new AtomicInteger[n];
            victim = new AtomicInteger[n];
            for (int i = 0; i < n; i++) {
                level[i] = new AtomicInteger();
                victim[i] = new AtomicInteger();
            }
        }

        /**
         * Acquires the lock.
         */
        @Override
        public void lock() {
            int me = ConcurrencyUtils.getCurrentThreadId();
            for (int i = 1; i < n; i++) {
                // announce interest in level i and become its victim
                level[me].set(i);
                victim[i].set(me);
                for (int k = 0; k < n; k++) {
                    while ((k != me) && (level[k].get() >= i && victim[i].get() == me)) {
                        // spin wait
                    }
                }
            }
        }

        /**
         * Releases the lock.
         */
        @Override
        public void unlock() {
            int me = ConcurrencyUtils.getCurrentThreadId();
            level[me].set(0);
        }
    }

The Bakery lock algorithm maintains the first-come-first-served property by using a distributed version of the number-dispensing machines often found in bakeries: each thread takes a number in the doorway, and then waits until no thread with an earlier number is trying to enter it [3].

The Bakery lock is implemented as follows:

    import java.util.concurrent.atomic.AtomicBoolean;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.locks.Lock;

    /**
     * @author Furkan KAMACI
     */
    public class Bakery extends AbstractDummyLock implements Lock {
        /* Due to the Java Memory Model, boolean[] and int[] are not used for the flag and label
           variables. The Java programming language does not guarantee linearizability, or even
           sequential consistency, when reading or writing fields of shared objects
           [The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.61.] */
        private AtomicBoolean[] flag;
        private AtomicInteger[] label;
        private int n;

        /**
         * Constructor for Bakery lock
         *
         * @param n thread count
         */
        public Bakery(int n) {
            this.n = n;
            flag = new AtomicBoolean[n];
            label = new AtomicInteger[n];
            for (int i = 0; i < n; i++) {
                flag[i] = new AtomicBoolean();
                label[i] = new AtomicInteger();
            }
        }

        /**
         * Acquires the lock.
         */
        @Override
        public void lock() {
            int i = ConcurrencyUtils.getCurrentThreadId();
            flag[i].set(true);
            // take a number that is larger than every number seen so far
            label[i].set(findMaximumElement(label) + 1);
            for (int k = 0; k < n; k++) {
                while ((k != i) && flag[k].get()
                        && ((label[k].get() < label[i].get())
                        || ((label[k].get() == label[i].get()) && k < i))) {
                    // spin wait
                }
            }
        }

        /**
         * Releases the lock.
         */
        @Override
        public void unlock() {
            flag[ConcurrencyUtils.getCurrentThreadId()].set(false);
        }

        /**
         * Finds the maximum element within an {@link java.util.concurrent.atomic.AtomicInteger} array
         *
         * @param elementArray element array
         * @return maximum element
         */
        private int findMaximumElement(AtomicInteger[] elementArray) {
            int maxValue = Integer.MIN_VALUE;
            for (AtomicInteger element : elementArray) {
                int current = element.get(); // read once so the comparison and the assignment agree
                if (current > maxValue) {
                    maxValue = current;
                }
            }
            return maxValue;
        }
    }

Algorithms of this kind need a thread id scheme that starts from 0 or 1 and increments one by one. The threads' names are set appropriately for that purpose. It should also be considered that the Java programming language does not guarantee linearizability, or even sequential consistency, when reading or writing fields of shared objects [4]. So the level and victim variables of the Filter lock and the flag and label variables of the Bakery lock are defined as atomic variables. Anyone who wants to test the effects of the Java Memory Model can change those variables into int[] and boolean[] and run the algorithm with more than 2 threads. The algorithm will then hang for either Filter or Bakery, even though the threads are alive.

To test the performance of the algorithms, a custom counter class is implemented, which has a getAndIncrement method as follows:

    /**
     * gets and increments value up to a maximum number
     *
     * @return value before increment if it didn't exceed a defined maximum number. Otherwise returns maximum number.
     */
    public long getAndIncrement() {
        long temp;
        lock.lock();
        try {
            if (value >= maxNumber) {
                return value;
            }
            temp = value;
            value = temp + 1;
        } finally {
            lock.unlock();
        }
        return temp;
    }

There is a maximum number barrier to fairly test multiple application configurations. The consideration is this: there is a fixed amount of work (incrementing a variable up to a desired number), and the question is how fast it can be finished with different numbers of threads. So, for comparison, there should be "job" equality.
With multiple threads, this approach also measures some unnecessary workload caused by this piece of code:

    if (value >= maxNumber) { return value; }

compared with an approach that calculates the unit work performance of threads (i.e. one that sets no maximum barrier, iterates in a loop up to a maximum number and then divides the last value by the thread count).

This configuration was used for the performance comparison:

Threads: 1, 2, 3, 4, 5, 6, 7, 8
Retry Count: 20
Maximum Number: 10000000

This is the chart of results, which includes standard errors:

[Chart: results for each lock type and thread count, with standard errors]

First of all, when you run a block of code in Java several times, the JVM optimizes it internally. When the algorithm is run multiple times and the first output is compared to the second, the effect of this optimization can be seen. Because of that, the first elapsed time should mostly be greater than the second. For example:

currentTry = 0, threadCount = 1, maxNumber = 10000000, lockType = FILTER, elapsedTime = 500 (ms)
currentTry = 1, threadCount = 1, maxNumber = 10000000, lockType = FILTER, elapsedTime = 433 (ms)

Conclusion:

From the chart, it can be seen that the Bakery lock is faster than the Filter lock, with a low standard error. The reason is the Filter lock's lock method. With the Bakery lock, as a fair approach, threads run one by one, but with the Filter lock they compete with each other. Java's ReentrantLock performs best when compared to the others.

On the other hand, the Filter lock gets worse linearly, but Bakery and ReentrantLock do not (the Filter lock may show a linear graph when it is run with many more threads). A higher thread count does not mean less elapsed time. 2 threads may be worse than 1 thread because of thread creation and locking/unlocking. When the thread count starts to increase, elapsed time gets better for Bakery and ReentrantLock. However, when the thread count keeps increasing, it gets worse. The reason is the number of physical cores of the test computer that runs the algorithms.
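The measurement procedure described above can be sketched compactly. The following Python sketch is an illustration only, not the author's benchmark (that is the Java code in the linked repository): it uses Python's `threading.Lock` as a stand-in for the lock under test, and the names `CappedCounter`, `run_once` and `mean_and_standard_error` are hypothetical. It shows the capped counter, one timed run with a given thread count, and one way a mean and standard error could be computed over repeated runs.

```python
import math
import threading
import time


class CappedCounter:
    """Counter that stops incrementing once it reaches max_number."""

    def __init__(self, lock, max_number):
        self._lock = lock  # stand-in for the lock under test
        self._max = max_number
        self.value = 0

    def get_and_increment(self):
        with self._lock:
            if self.value >= self._max:
                return self.value
            previous = self.value
            self.value = previous + 1
            return previous


def run_once(thread_count, max_number):
    """Time how long thread_count threads need to reach max_number together."""
    counter = CappedCounter(threading.Lock(), max_number)

    def worker():
        # Keep incrementing until the shared cap is reached.
        while counter.get_and_increment() < max_number:
            pass

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return counter.value, elapsed_ms


def mean_and_standard_error(samples):
    """Return (mean, standard error of the mean) for a list of timings."""
    n = len(samples)
    mean = sum(samples) / n
    if n < 2:
        return mean, 0.0
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(variance / n)


if __name__ == "__main__":
    # Small cap so the sketch finishes quickly; the article uses 10000000.
    timings = [run_once(thread_count=4, max_number=20000)[1] for _ in range(5)]
    print(mean_and_standard_error(timings))
```

Because every run performs the same total number of increments regardless of thread count, the timings of different configurations stay comparable, which is the "job" equality the article asks for.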
Source code for implementing Filter and Bakery locks in Java can be downloaded from here: https://github.com/kamaci/filbak

[1] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp. 31-33.
[2] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, p. 28.
[3] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, p. 31.
[4] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, p. 61.

April 28, 2015

by Furkan Kamaci


uniVocity-parsers: A powerful CSV/TSV/Fixed-width file parser library for Java

uniVocity-parsers is an open-source CSV/TSV/fixed-width file parser library for Java. It provides many capabilities for reading and writing files through a simplified API, along with the powerful features shown below. Unlike other libraries out there, uniVocity-parsers built its own architecture for parsing text files, which focuses on maximum performance and flexibility while making it easy to extend and build new parsers.

Contents

Overview
Installation
Features Overview
Reading CSV/TSV/Fixed-width Files
Writing CSV/TSV/Fixed-width Files
Performance and Flexibility
Design and Implementations

1. Overview

I'm a Java developer working on a web-based system that evaluates telecommunication carriers' networks and produces reports. In this system, the CSV format is heavily used for network-related data, such as the real-time network status (online/offline) of broadband subscribers and the real-time traffic of each subscriber. A single CSV file would generally exceed 1GB, with millions of rows. We were using the JavaCSV library as the CSV file parser.

As the capacity of the carriers' networks and the time span our system monitors grew, the amount of CSV data increased considerably. My team and I had to work out a solution that achieved better performance (down to seconds) in CSV file processing and better extensibility for much more customized functionality.

We settled on the uniVocity-parsers library as our final solution after a lot of testing and analysis, and we found it great. In addition to better performance and extensibility, the library provides developers with simplified APIs, detailed documentation and tutorials, and commercial support for highly customized functionality.

This project is hosted on GitHub with 62 stars and 8 forks (at the time of writing). Extensive documentation and tutorials are provided here and here. You can find more examples and news here as well.
In addition, the well-known open-source project Apache Camel integrates uniVocity-parsers for reading and writing CSV/TSV/fixed-width files. Find more details here.

2. Installation

I'm using version 1.5.1, but refer to the official download page to see if there's a more recent version available. The project is also available in the Maven Central repository, so you can add this to your pom.xml:

    <dependency>
        <groupId>com.univocity</groupId>
        <artifactId>univocity-parsers</artifactId>
        <version>1.5.1</version>
    </dependency>

3. Features Overview

uniVocity-parsers provides a list of powerful features, which can cover the requirements you are likely to have for processing tabular presentations of data. Check the following overview chart for the features:

[Chart: features overview]

4. Reading Tabular Presentations Data

Read all rows of a CSV file:

    CsvParser parser = new CsvParser(new CsvParserSettings());
    List<String[]> allRows = parser.parseAll(getReader("/examples/example.csv"));

For the full list of demos of the reading features, refer to: https://github.com/uniVocity/univocity-parsers#reading-csv

5. Writing Tabular Presentations Data

Write data in CSV format with just a few lines of code:

    List<String[]> rows = someMethodToCreateRows();
    CsvWriter writer = new CsvWriter(outputWriter, new CsvWriterSettings());
    writer.writeRowsAndClose(rows);

For the full list of demos of the writing features, refer to: https://github.com/uniVocity/univocity-parsers/blob/master/README.md#writing

6. Performance and Flexibility

Here is the performance comparison of uniVocity-parsers and JavaCSV that we measured in our system:

    File size               | JavaCSV parsing | uniVocity-parsers parsing
    10MB, 145453 rows       | 1138ms          | 836ms
    100MB, 809008 rows      | 23s             | 6s
    434MB, 4499959 rows     | 91s             | 28s
    1GB, 23803502 rows      | 245s            | 70s

Here are some performance comparison tables for almost all CSV parser libraries in existence. You will find that uniVocity-parsers is significantly ahead of the other libraries in performance.
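To make the comparison easier to read, the ratio between the two durations can be computed for each file. The short Python sketch below uses only the figures from the table above; the names `MEASUREMENTS` and `speedups` are made up for illustration.

```python
# (file size, rows, JavaCSV seconds, uniVocity-parsers seconds), copied from the table above
MEASUREMENTS = [
    ("10MB", 145453, 1.138, 0.836),
    ("100MB", 809008, 23.0, 6.0),
    ("434MB", 4499959, 91.0, 28.0),
    ("1GB", 23803502, 245.0, 70.0),
]


def speedups(measurements):
    """Return (file size, JavaCSV time / uniVocity-parsers time) rounded to 2 decimals."""
    return [
        (size, round(javacsv / univocity, 2))
        for size, _rows, javacsv, univocity in measurements
    ]


if __name__ == "__main__":
    for size, ratio in speedups(MEASUREMENTS):
        print(f"{size}: uniVocity-parsers took about 1/{ratio} of the JavaCSV time")
```

For these measurements the ratio is modest for the smallest file and roughly 3x to 4x for the larger ones.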
uniVocity-parsers achieves its performance and flexibility through the following mechanisms:

- Read input on a separate thread (enabled by invoking CsvParserSettings.setReadInputOnSeparateThread())
- Concurrent row processor (refer to ConcurrentRowProcessor, which implements RowProcessor)
- Extend ColumnProcessor to process columns with your own business logic
- Extend RowProcessor to read rows with your own business logic

7. Design and Implementations

A number of processors form the core modules of uniVocity-parsers. They are responsible for reading and writing data in rows and columns and for executing data conversions. Here is the diagram of processors:

[Diagram: processors]

You can create your own processors easily by implementing the RowProcessor interface or extending the provided implementations. In the following example I simply used an anonymous class:

    CsvParserSettings settings = new CsvParserSettings();
    settings.setRowProcessor(new RowProcessor() {
        // collects the column values of every processed row
        StringBuilder stringBuilder = new StringBuilder();

        /**
         * initialize whatever you need before processing the first row, with your own business logic
         **/
        @Override
        public void processStarted(ParsingContext context) {
            System.out.println("Started to process rows of data.");
        }

        /**
         * process the row with your own business logic
         **/
        @Override
        public void rowProcessed(String[] row, ParsingContext context) {
            System.out.println("The row in line #" + context.currentLine() + ": ");
            for (String col : row) {
                stringBuilder.append(col).append("\t");
            }
        }

        /**
         * After all rows were processed, perform any cleanup you need
         **/
        @Override
        public void processEnded(ParsingContext context) {
            System.out.println("Finished processing rows of data.");
            System.out.println(stringBuilder);
        }
    });
    CsvParser parser = new CsvParser(settings);
    List<String[]> allRows = parser.parseAll(new FileReader("/myFile.csv"));

The library offers a whole lot more features. I recommend having a look, as it really made a difference in our project.

April 27, 2015

by Jerry Joe


Diagnosing SST Errors with Percona XtraDB Cluster for MySQL

[This article was written by Stephane Combaudon]

State Snapshot Transfer (SST) is used in Percona XtraDB Cluster (PXC) when a new node joins the cluster or to resync a failed node if Incremental State Transfer (IST) is no longer available. SST is triggered automatically but there is no magic: if it is not configured properly, it will not work and new nodes will never be able to join the cluster. Let’s have a look at a few classic issues.

Port for SST is not open

The donor and the joiner communicate on port 4444, and if the port is closed on one side, SST will always fail. You will see in the error log of the donor that SST is started:

    [...]
    141223 16:08:48 [Note] WSREP: Node 2 (node1) requested state transfer from '*any*'. Selected 0 (node3)(SYNCED) as donor.
    141223 16:08:48 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 6)
    141223 16:08:48 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    141223 16:08:48 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.101:4444/xtrabackup_sst' --auth 'sstuser:s3cret' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '04c085a1-89ca-11e4-b1b6-6b692803109b:6''
    [...]

But then nothing happens, and some time later you will see a bunch of errors:

    [...]
    2014/12/23 16:09:52 socat[2965] E connect(3, AF=2 192.168.234.101:4444, 16): Connection timed out
    WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 1 (20141223 16:09:52.057)
    WSREP_SST: [ERROR] Cleanup after exit with status:32 (20141223 16:09:52.064)
    WSREP_SST: [INFO] Cleaning up temporary directories (20141223 16:09:52.068)
    141223 16:09:52 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.101:4444/xtrabackup_sst' --auth 'sstuser:s3cret' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '04c085a1-89ca-11e4-b1b6-6b692803109b:6'
    [...]
On the joiner side, you will see a similar sequence: SST is started, then hangs and is finally aborted:

    [...]
    141223 16:08:48 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 6)
    141223 16:08:48 [Note] WSREP: Requesting state transfer: success, donor: 0
    141223 16:08:49 [Note] WSREP: (f9560d0d, 'tcp://0.0.0.0:4567') turning message relay requesting off
    141223 16:09:52 [Warning] WSREP: 0 (node3): State transfer to 2 (node1) failed: -32 (Broken pipe)
    141223 16:09:52 [ERROR] WSREP: gcs/src/gcs_group.cpp:long int gcs_group_handle_join_msg(gcs_group_t*, const gcs_recv_msg_t*)():717: Will never receive state. Need to abort.

The solution is of course to make sure that the ports are open on both sides.

SST is not correctly configured

Sometimes you will see an error like this on the donor:

    141223 21:03:15 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.102:4444/xtrabackup_sst' --auth 'sstuser:s3cretzzz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid 'e63f38f2-8ae6-11e4-a383-46557c71f368:0''
    [...]
    WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20141223 21:03:26.973)

And if you look at innobackup.backup.log:

    141223 21:03:26 innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_file=/etc/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'sstuser' (using password: YES).
    innobackupex: got a fatal error with the following stacktrace: at /usr//bin/innobackupex line 2995
    main::mysql_connect('abort_on_error', 1) called at /usr//bin/innobackupex line 1530
    innobackupex: Error: Failed to connect to MySQL server: DBI connect(';mysql_read_default_file=/etc/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock','sstuser',...) failed: Access denied for user 'sstuser'@'localhost' (using password: YES) at /usr//bin/innobackupex line 2979

What happened?
The default SST method is xtrabackup-v2 and for it to work, you need to specify a username/password in the my.cnf file:

    [mysqld]
    wsrep_sst_auth=sstuser:s3cret

And you also need to create the corresponding MySQL user:

    mysql> GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost' IDENTIFIED BY 's3cret';

So you should check that the user has been correctly created in MySQL and that wsrep_sst_auth is correctly set.

Galera versions do not match

Here is another set of errors you may see in the error log of the donor:

    141223 21:14:27 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101
    141223 21:14:30 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101
    141223 21:14:33 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101

Here the issue is that you are trying to connect a node running Galera 2.x with a node running Galera 3.x. This can happen if you try to use a PXC 5.5 node and a PXC 5.6 node.

The right solution is probably to understand why you ended up with such inconsistent versions and make sure all nodes are using the same Percona XtraDB Cluster version and Galera version.

But if you know what you are doing, you can also instruct the node using Galera 3.x that it will communicate with Galera 2.x nodes by specifying in the my.cnf file:

    [mysqld]
    wsrep_provider_options="socket.checksum=1"

Conclusion

SST errors can have multiple causes, and the best way to diagnose the issue is to have a look at the error logs of the donor and the joiner. Galera is in general quite verbose, so you can follow the progress of SST on both nodes and see where it fails. Then it is mostly about being able to interpret the error messages.
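The first check described above, whether port 4444 can be reached at all, can be scripted. The following Python sketch is only an illustration built on the standard library; the function name `probe_port` is hypothetical, and the address in the example is the one that appears in the donor log above. A timeout, as in the socat error shown earlier, suggests that packets are being dropped on the way, while a refusal can simply mean that nothing is listening on the port at that moment, so the result is a hint and not a validation of the SST configuration.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Try a TCP connection and return 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout, for example because a firewall drops packets.
        return "timeout"
    except OSError as exc:
        # Any other network problem (unreachable host, name resolution, ...).
        return f"error: {exc}"


if __name__ == "__main__":
    # Address and port taken from the log excerpts in this article.
    print(probe_port("192.168.234.101", 4444))
```

Running the probe from the donor towards the joiner and from the joiner towards the donor covers the "open on both sides" requirement.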

April 27, 2015

by Peter Zaitsev


Inspecting Thread Dumps of Hung Python Processes and Test Runs

Sometimes, moderately complex Python applications with several threads tend to hang on exit. The application refuses to quit and just idles there waiting for something. Often this is because the process, when it tries to exit, waits for every thread that is still alive to terminate, unless Thread.daemon is set to true for that thread.

In the past, it used to be a little painful to figure out which thread and function caused the application to hang, but no longer! Since Python 3.3 the CPython interpreter comes with a faulthandler module. faulthandler is a mechanism to tell the Python interpreter to dump the stack trace of every thread upon receiving an external UNIX signal.

Here is an example of how to figure out why a unit test run, executed with pytest, does not exit cleanly. All tests finish, but the test suite refuses to quit.

First we run the tests and set a special environment variable, PYTHONFAULTHANDLER, telling the CPython interpreter to activate the fault handler. This environment variable works regardless of how your Python application is started (you run the python command, you run a script directly, etc.).

    PYTHONFAULTHANDLER=true py.test

And then the test suite has finished, printing out the last dot… but nothing happens despite our ferocious sipping of coffee.

    dotdotdotmoredotsthenthenthedotsstopappearing ..

How to proceed:

Press CTRL-Z to suspend the current active process in the UNIX shell.

Use the following command to send the SIGABRT signal to the suspended process.

    kill -SIGABRT %1

Voilà, you get the traceback. In this case, it instantly tells us that SQLAlchemy is waiting for something, and most likely the database has deadlocked due to open conflicting transactions.
    Fatal Python error: Aborted

    Thread 0x0000000103538000 (most recent call first):
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socketserver.py", line 154 in _eintr_retry
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socketserver.py", line 236 in serve_forever
      File "/Users/mikko/code/trees/pyramid_web20/pyramid_web20/tests/functional.py", line 40 in run
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/threading.py", line 921 in _bootstrap_inner
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/threading.py", line 889 in _bootstrap

    Current thread 0x00007fff75128310 (most recent call first):
      File "/Users/mikko/code/trees/venv/lib/python3.4/site-packages/SQLAlchemy-1.0.0b5-py3.4-macosx-10.9-x86_64.egg/sqlalchemy/engine/default.py", line 442 in do_execute
      ...
      File "/Users/mikko/code/trees/venv/lib/python3.4/site-packages/SQLAlchemy-1.0.0b5-py3.4-macosx-10.9-x86_64.egg/sqlalchemy/sql/schema.py", line 3638 in drop_all
      File "/Users/mikko/code/trees/pyramid_web20/pyramid_web20/tests/conftest.py", line 124 in teardown
      ...
      File "/Users/mikko/code/trees/venv/lib/python3.4/site-packages/_pytest/config.py", line 41 in main
      File "/Users/mikko/code/trees/venv/bin/py.test", line 9 in
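The same kind of dump can also be requested from inside the program instead of through the environment variable. The sketch below is a minimal illustration, assuming a UNIX platform for the signal part; the thread function `worker`, the helper `dump_all_threads` and the choice of SIGUSR1 are arbitrary choices made for this example. It enables the fault handler, registers a signal that dumps all thread stacks without aborting the process, and marks a background thread as a daemon so that it does not block interpreter exit.

```python
import faulthandler
import signal
import tempfile
import threading
import time


def worker(stop_event):
    # Stands in for a background thread that would otherwise keep the process alive.
    while not stop_event.is_set():
        time.sleep(0.05)


def dump_all_threads():
    """Return the current stack traces of all threads as text."""
    with tempfile.TemporaryFile(mode="w+") as out:
        # faulthandler writes straight to the file descriptor of `out`.
        faulthandler.dump_traceback(file=out, all_threads=True)
        out.seek(0)
        return out.read()


if __name__ == "__main__":
    # Install the handlers for fatal signals such as SIGABRT and SIGSEGV.
    faulthandler.enable()
    if hasattr(faulthandler, "register"):  # register() is not available on Windows
        # `kill -USR1 <pid>` now prints all thread stacks and lets the process continue.
        faulthandler.register(signal.SIGUSR1, all_threads=True)

    stop = threading.Event()
    # daemon=True: the interpreter does not wait for this thread at exit.
    threading.Thread(target=worker, args=(stop,), daemon=True).start()
    time.sleep(0.2)
    print(dump_all_threads())
```

Unlike SIGABRT, a registered signal leaves the process running, which can be convenient when the hang should be inspected more than once.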

April 27, 2015

by Mikko Ohtamaa


Increasing Slow Query Performance with the Parallel Query Execution

[This article was written by Alexander Rubin]

Scaling up MySQL (using more powerful hardware) has always been a hot topic. Originally MySQL did not scale well with multiple CPUs; there were times when InnoDB performed worse with more CPU cores than with fewer. MySQL 5.6 scales significantly better; however, there is still one big limitation: a single SQL query will only ever use one CPU core (no parallelism). Here is what I mean by that: let’s say we have a complex query which needs to scan millions of rows and may need to create a temporary table; in this case MySQL will not be able to scan the table in multiple threads (even with partitioning), so the single query will not be faster on the more powerful server. On the contrary, a server with more but slower CPUs will show worse performance than a server with fewer (but faster) CPUs.

To address this issue we can use parallel query execution. Vadim wrote about the PHP asynchronous calls for MySQL. Another way to increase parallelism is to use a “sharding” approach, for example with Shard Query. I’ve decided to test parallel (asynchronous) query execution with a relatively large table: I’ve used the US Flights Ontime performance database, which was originally used by Vadim in the old post Analyzing air traffic performance. Let’s see how this can help us increase the performance of complex query reports.
Parallel Query Example

To illustrate parallel query execution with MySQL I’ve created the following table:

    CREATE TABLE `ontime` (
      `YearD` year(4) NOT NULL,
      `Quarter` tinyint(4) DEFAULT NULL,
      `MonthD` tinyint(4) DEFAULT NULL,
      `DayofMonth` tinyint(4) DEFAULT NULL,
      `DayOfWeek` tinyint(4) DEFAULT NULL,
      `FlightDate` date DEFAULT NULL,
      `UniqueCarrier` char(7) DEFAULT NULL,
      `AirlineID` int(11) DEFAULT NULL,
      `Carrier` char(2) DEFAULT NULL,
      `TailNum` varchar(50) DEFAULT NULL,
      `FlightNum` varchar(10) DEFAULT NULL,
      `OriginAirportID` int(11) DEFAULT NULL,
      `OriginAirportSeqID` int(11) DEFAULT NULL,
      `OriginCityMarketID` int(11) DEFAULT NULL,
      `Origin` char(5) DEFAULT NULL,
      `OriginCityName` varchar(100) DEFAULT NULL,
      `OriginState` char(2) DEFAULT NULL,
      `OriginStateFips` varchar(10) DEFAULT NULL,
      `OriginStateName` varchar(100) DEFAULT NULL,
      `OriginWac` int(11) DEFAULT NULL,
      `DestAirportID` int(11) DEFAULT NULL,
      `DestAirportSeqID` int(11) DEFAULT NULL,
      `DestCityMarketID` int(11) DEFAULT NULL,
      `Dest` char(5) DEFAULT NULL,
      -- ... (removed number of fields)
      `id` int(11) NOT NULL AUTO_INCREMENT,
      PRIMARY KEY (`id`),
      KEY `YearD` (`YearD`),
      KEY `Carrier` (`Carrier`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And loaded 26 years of data into it. The table is 56G with ~152M rows.

Software: Percona 5.6.15-63.0.
Hardware: Supermicro; X8DTG-D; 48G of RAM; 24xIntel(R) Xeon(R) CPU L5639 @ 2.13GHz, 1xSSD drive (250G)

So we have 24 relatively slow CPUs.

Simple query

Now we can run some queries. The first query is very simple: find all flights per year (in the US):

    select yeard, count(*) from ontime group by yeard

As we have an index on YearD, the query will use the index:

    mysql> explain select yeard, count(*) from ontime group by yeard\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: ontime
             type: index
    possible_keys: YearD,comb1
              key: YearD
          key_len: 1
              ref: NULL
             rows: 148046200
            Extra: Using index
    1 row in set (0.00 sec)

The query is simple; however, it will have to scan 150M rows. Here are the results of the query (cached):

    mysql> select yeard, count(*) from ontime group by yeard;
    +-------+----------+
    | yeard | count(*) |
    +-------+----------+
    | 1988  | 5202096  |
    | 1989  | 5041200  |
    | 1990  | 5270893  |
    | 1991  | 5076925  |
    | 1992  | 5092157  |
    | 1993  | 5070501  |
    | 1994  | 5180048  |
    | 1995  | 5327435  |
    | 1996  | 5351983  |
    | 1997  | 5411843  |
    | 1998  | 5384721  |
    | 1999  | 5527884  |
    | 2000  | 5683047  |
    | 2001  | 5967780  |
    | 2002  | 5271359  |
    | 2003  | 6488540  |
    | 2004  | 7129270  |
    | 2005  | 7140596  |
    | 2006  | 7141922  |
    | 2007  | 7455458  |
    | 2008  | 7009726  |
    | 2009  | 6450285  |
    | 2010  | 6450117  |
    | 2011  | 6085281  |
    | 2012  | 6096762  |
    | 2013  | 5349447  |
    +-------+----------+
    26 rows in set (54.10 sec)

The query took 54 seconds and utilized only 1 CPU core. However, this query is perfect for running in parallel. We can run 26 parallel queries, each of which counts its own year.
I’ve used the following shell script to run the queries in the background:

    #!/bin/bash
    date
    for y in {1988..2013}
    do
      sql="select yeard, count(*) from ontime where yeard=$y"
      mysql -vvv ontime -e "$sql" &>par_sql1/$y.log &
    done
    wait
    date

Here are the results:

    par_sql1/1988.log:1 row in set (3.70 sec)
    par_sql1/1989.log:1 row in set (4.08 sec)
    par_sql1/1990.log:1 row in set (4.59 sec)
    par_sql1/1991.log:1 row in set (4.26 sec)
    par_sql1/1992.log:1 row in set (4.54 sec)
    par_sql1/1993.log:1 row in set (2.78 sec)
    par_sql1/1994.log:1 row in set (3.41 sec)
    par_sql1/1995.log:1 row in set (4.87 sec)
    par_sql1/1996.log:1 row in set (4.41 sec)
    par_sql1/1997.log:1 row in set (3.69 sec)
    par_sql1/1998.log:1 row in set (3.56 sec)
    par_sql1/1999.log:1 row in set (4.47 sec)
    par_sql1/2000.log:1 row in set (4.71 sec)
    par_sql1/2001.log:1 row in set (4.81 sec)
    par_sql1/2002.log:1 row in set (4.19 sec)
    par_sql1/2003.log:1 row in set (4.04 sec)
    par_sql1/2004.log:1 row in set (5.12 sec)
    par_sql1/2005.log:1 row in set (5.10 sec)
    par_sql1/2006.log:1 row in set (4.93 sec)
    par_sql1/2007.log:1 row in set (5.29 sec)
    par_sql1/2008.log:1 row in set (5.59 sec)
    par_sql1/2009.log:1 row in set (4.44 sec)
    par_sql1/2010.log:1 row in set (4.91 sec)
    par_sql1/2011.log:1 row in set (5.08 sec)
    par_sql1/2012.log:1 row in set (4.85 sec)
    par_sql1/2013.log:1 row in set (4.56 sec)

Complex Query

Now we can try a more complex query. Let’s imagine we want to find out which airlines have the maximum delays for flights inside the continental US during business days from 1988 to 2009 (I was trying to come up with a complex query with multiple conditions in the where clause).
    select min(yeard), max(yeard), Carrier, count(*) as cnt,
           sum(ArrDelayMinutes>30) as flights_delayed,
           round(sum(ArrDelayMinutes>30)/count(*),2) as rate
    FROM ontime
    WHERE DayOfWeek not in (6,7)
      and OriginState not in ('AK', 'HI', 'PR', 'VI')
      and DestState not in ('AK', 'HI', 'PR', 'VI')
      and flightdate < '2010-01-01'
    GROUP by carrier
    HAVING cnt > 100000 and max(yeard) > 1990
    ORDER by rate DESC

As the query has “group by” and “order by” plus multiple ranges in the where clause, it will have to create a temporary table:

               id: 1
      select_type: SIMPLE
            table: ontime
             type: index
    possible_keys: comb1
              key: comb1
          key_len: 9
              ref: NULL
             rows: 148046200
            Extra: Using where; Using temporary; Using filesort

(For this query I’ve created the combined index KEY comb1 (Carrier,YearD,ArrDelayMinutes) to increase performance.)

The query runs in ~15 minutes:

    +------------+------------+---------+----------+-----------------+------+
    | min(yeard) | max(yeard) | Carrier | cnt      | flights_delayed | rate |
    +------------+------------+---------+----------+-----------------+------+
    | 2003       | 2009       | EV      | 1454777  | 237698          | 0.16 |
    | 2006       | 2009       | XE      | 1016010  | 152431          | 0.15 |
    | 2006       | 2009       | YV      | 740608   | 110389          | 0.15 |
    | 2003       | 2009       | B6      | 683874   | 103677          | 0.15 |
    | 2003       | 2009       | FL      | 1082489  | 158748          | 0.15 |
    | 2003       | 2005       | DH      | 501056   | 69833           | 0.14 |
    | 2001       | 2009       | MQ      | 3238137  | 448037          | 0.14 |
    | 2003       | 2006       | RU      | 1007248  | 126733          | 0.13 |
    | 2004       | 2009       | OH      | 1195868  | 160071          | 0.13 |
    | 2003       | 2006       | TZ      | 136735   | 16496           | 0.12 |
    | 1988       | 2009       | UA      | 9593284  | 1197053         | 0.12 |
    | 1988       | 2009       | AA      | 10600509 | 1185343         | 0.11 |
    | 1988       | 2001       | TW      | 2659963  | 280741          | 0.11 |
    | 1988       | 2009       | CO      | 6029149  | 673863          | 0.11 |
    | 2007       | 2009       | 9E      | 577244   | 59440           | 0.10 |
    | 1988       | 2009       | DL      | 11869471 | 1156267         | 0.10 |
    | 1988       | 2009       | NW      | 7601727  | 725460          | 0.10 |
    | 1988       | 2009       | AS      | 1506003  | 146920          | 0.10 |
    | 2003       | 2009       | OO      | 2654259  | 257069          | 0.10 |
    | 1988       | 2009       | US      | 10276941 | 991016          | 0.10 |
    | 1988       | 1991       | PA      | 206841   | 19465           | 0.09 |
    | 1988       | 2005       | HP      | 2607603  | 235675          | 0.09 |
    | 1988       | 2009       | WN      | 12722174 | 1107840         | 0.09 |
    | 2005       | 2009       | F9      | 307569   | 28679           | 0.09 |
    +------------+------------+---------+----------+-----------------+------+
    24 rows in set (15 min 56.40 sec)

Now we can split this query and run the 31 queries (=31 distinct airlines in this table) in parallel. I have used the following script:

    date
    for c in '9E' 'AA' 'AL' 'AQ' 'AS' 'B6' 'CO' 'DH' 'DL' 'EA' 'EV' 'F9' 'FL' 'HA' 'HP' 'ML' 'MQ' 'NW' 'OH' 'OO' 'PA' 'PI' 'PS' 'RU' 'TW' 'TZ' 'UA' 'US' 'WN' 'XE' 'YV'
    do
      sql=" select min(yeard), max(yeard), Carrier, count(*) as cnt, sum(ArrDelayMinutes>30) as flights_delayed, round(sum(ArrDelayMinutes>30)/count(*),2) as rate FROM ontime WHERE DayOfWeek not in (6,7) and OriginState not in ('AK', 'HI', 'PR', 'VI') and DestState not in ('AK', 'HI', 'PR', 'VI') and flightdate < '2010-01-01' and carrier = '$c'"
      mysql -uroot -vvv ontime -e "$sql" &>par_sql_complex/$c.log &
    done
    wait
    date

In this case we will also avoid creating a temporary table (as we have an index which starts with carrier).
Results: total time is 5 min 47 seconds (roughly 3x faster)

    Start: 15:41:02 EST 2013
    End: 15:46:49 EST 2013

Per query statistics:

    par_sql_complex/9E.log:1 row in set (44.47 sec)
    par_sql_complex/AA.log:1 row in set (5 min 41.13 sec)
    par_sql_complex/AL.log:1 row in set (15.81 sec)
    par_sql_complex/AQ.log:1 row in set (14.52 sec)
    par_sql_complex/AS.log:1 row in set (2 min 43.01 sec)
    par_sql_complex/B6.log:1 row in set (1 min 26.06 sec)
    par_sql_complex/CO.log:1 row in set (3 min 58.07 sec)
    par_sql_complex/DH.log:1 row in set (31.30 sec)
    par_sql_complex/DL.log:1 row in set (5 min 47.07 sec)
    par_sql_complex/EA.log:1 row in set (28.58 sec)
    par_sql_complex/EV.log:1 row in set (2 min 6.87 sec)
    par_sql_complex/F9.log:1 row in set (46.18 sec)
    par_sql_complex/FL.log:1 row in set (1 min 30.83 sec)
    par_sql_complex/HA.log:1 row in set (39.42 sec)
    par_sql_complex/HP.log:1 row in set (2 min 45.57 sec)
    par_sql_complex/ML.log:1 row in set (4.64 sec)
    par_sql_complex/MQ.log:1 row in set (2 min 22.55 sec)
    par_sql_complex/NW.log:1 row in set (4 min 26.67 sec)
    par_sql_complex/OH.log:1 row in set (1 min 9.67 sec)
    par_sql_complex/OO.log:1 row in set (2 min 14.97 sec)
    par_sql_complex/PA.log:1 row in set (17.62 sec)
    par_sql_complex/PI.log:1 row in set (14.52 sec)
    par_sql_complex/PS.log:1 row in set (3.46 sec)
    par_sql_complex/RU.log:1 row in set (40.14 sec)
    par_sql_complex/TW.log:1 row in set (2 min 32.32 sec)
    par_sql_complex/TZ.log:1 row in set (14.16 sec)
    par_sql_complex/UA.log:1 row in set (4 min 55.18 sec)
    par_sql_complex/US.log:1 row in set (4 min 38.08 sec)
    par_sql_complex/WN.log:1 row in set (4 min 56.12 sec)
    par_sql_complex/XE.log:1 row in set (24.21 sec)
    par_sql_complex/YV.log:1 row in set (20.82 sec)

As we can see, there are large airlines (like AA, UA, US, DL, etc.) which took most of the time. In this case the load is not distributed as evenly as in the previous example, and the total time is set by the slowest single query (DL, 5 min 47.07 sec). Still, by running the query in parallel we got roughly 3x better response time on this server.
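As a quick sanity check on the "3x" figure, the arithmetic can be written out. This is only a sketch: the class and method names are made up for illustration, and the inputs are the wall-clock times reported above (15 min 56.40 sec for the single query, 15:41:02 to 15:46:49 for the parallel run).

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper, not part of the original benchmark.
final class SpeedupCheck {

    // Speedup = serial wall time / parallel wall time.
    static double speedup(Duration serial, Duration parallel) {
        return (double) serial.toMillis() / parallel.toMillis();
    }

    public static void main(String[] args) {
        // Serial run: 15 min 56.40 sec, as reported by the mysql client.
        Duration serial = Duration.ofMinutes(15).plusMillis(56_400);
        // Parallel run: start and end timestamps printed by `date`.
        Duration parallel = Duration.between(
                LocalTime.of(15, 41, 2), LocalTime.of(15, 46, 49));

        System.out.printf("parallel wall time: %d s%n", parallel.getSeconds());
        System.out.printf("speedup: %.2fx%n", speedup(serial, parallel));
    }
}
```

With these inputs the parallel run takes 347 seconds against 956.4 seconds for the single query, a speedup of about 2.76x, which the text rounds to 3x.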
CPU utilization:

    Cpu3  : 22.0%us,  1.2%sy,  0.0%ni, 74.4%id,  2.4%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu4  : 16.0%us,  0.0%sy,  0.0%ni, 84.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu5  : 39.0%us,  1.2%sy,  0.0%ni, 56.1%id,  3.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu6  : 33.3%us,  0.0%sy,  0.0%ni, 51.9%id, 13.6%wa,  0.0%hi,  1.2%si,  0.0%st
    Cpu7  : 33.3%us,  1.2%sy,  0.0%ni, 48.8%id, 16.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu8  : 24.7%us,  0.0%sy,  0.0%ni, 60.5%id, 14.8%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu9  : 24.4%us,  0.0%sy,  0.0%ni, 56.1%id, 19.5%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu10 : 40.7%us,  0.0%sy,  0.0%ni, 56.8%id,  2.5%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu11 : 19.5%us,  1.2%sy,  0.0%ni, 65.9%id, 12.2%wa,  0.0%hi,  1.2%si,  0.0%st
    Cpu12 : 40.2%us,  1.2%sy,  0.0%ni, 56.1%id,  2.4%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu13 : 82.7%us,  0.0%sy,  0.0%ni, 17.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu14 : 55.4%us,  0.0%sy,  0.0%ni, 43.4%id,  1.2%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu15 : 86.6%us,  0.0%sy,  0.0%ni, 13.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu16 : 61.0%us,  1.2%sy,  0.0%ni, 37.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu17 : 29.3%us,  1.2%sy,  0.0%ni, 69.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu18 : 18.8%us,  0.0%sy,  0.0%ni, 52.5%id, 28.8%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu19 : 14.3%us,  1.2%sy,  0.0%ni, 57.1%id, 27.4%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu20 : 12.3%us,  0.0%sy,  0.0%ni, 59.3%id, 28.4%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu21 : 10.7%us,  0.0%sy,  0.0%ni, 76.2%id, 11.9%wa,  0.0%hi,  1.2%si,  0.0%st
    Cpu22 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Cpu23 : 10.8%us,  2.4%sy,  0.0%ni, 71.1%id, 15.7%wa,  0.0%hi,  0.0%si,  0.0%st

Note that in the case of “order by” we will need to sort the results manually. However, sorting 10-100 rows will be fast.

Conclusion

Splitting a complex report into multiple queries and running them in parallel (asynchronously) can increase performance (3x to 10x in the above examples) and will better utilize modern hardware. It is also possible to split the queries between multiple MySQL servers (i.e. MySQL slave servers) to further increase scalability (this will require more coding).
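The split queries no longer contain GROUP BY, HAVING or ORDER BY, so those steps have to be reapplied on the client once all per-carrier results are in. The following is a minimal sketch of that idea in Java, not the script used for the measurements: `ParallelReport`, `CarrierStats`, `runAll` and `fakeQuery` are hypothetical names, and `fakeQuery` stands in for a real database call (for example via JDBC) by returning a few rows copied from the result table above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical types and names, for illustration only.
final class ParallelReport {

    // One row per carrier, mirroring the columns of the split query.
    static final class CarrierStats {
        final String carrier;
        final int minYear, maxYear;
        final long cnt, delayed;

        CarrierStats(String carrier, int minYear, int maxYear, long cnt, long delayed) {
            this.carrier = carrier;
            this.minYear = minYear;
            this.maxYear = maxYear;
            this.cnt = cnt;
            this.delayed = delayed;
        }

        double rate() {
            return cnt == 0 ? 0.0 : (double) delayed / cnt;
        }
    }

    // Runs one query per carrier in parallel, then applies the HAVING and
    // ORDER BY steps that the split queries no longer perform.
    static List<CarrierStats> runAll(List<String> carriers,
                                     Function<String, CarrierStats> query,
                                     int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<CarrierStats>> futures = new ArrayList<>();
            for (String c : carriers) {
                futures.add(pool.submit(() -> query.apply(c)));
            }
            List<CarrierStats> rows = new ArrayList<>();
            for (Future<CarrierStats> f : futures) {
                CarrierStats s = f.get(); // wait for this carrier's query
                if (s.cnt > 100_000 && s.maxYear > 1990) { // HAVING
                    rows.add(s);
                }
            }
            // ORDER BY rate DESC
            rows.sort(Comparator.comparingDouble(CarrierStats::rate).reversed());
            return rows;
        } finally {
            pool.shutdown();
        }
    }

    // Stand-in for a real database call; not a real query.
    static CarrierStats fakeQuery(String carrier) {
        switch (carrier) {
            case "EV": return new CarrierStats("EV", 2003, 2009, 1_454_777, 237_698);
            case "WN": return new CarrierStats("WN", 1988, 2009, 12_722_174, 1_107_840);
            case "UA": return new CarrierStats("UA", 1988, 2009, 9_593_284, 1_197_053);
            default:   return new CarrierStats(carrier, 1988, 1989, 50_000, 4_000); // made-up row, filtered out
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> carriers = Arrays.asList("WN", "ML", "EV", "UA");
        for (CarrierStats s : runAll(carriers, ParallelReport::fakeQuery, 4)) {
            System.out.println(s.carrier + " " + s.cnt + " " + s.delayed);
        }
    }
}
```

A fixed-size pool is used here so that the number of concurrent queries can be capped below the number of carriers; the shell script above simply starts all 31 at once.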

April 25, 2015

by Peter Zaitsev

· 12,767 Views

JSF "Loading" JavaScript -- Brief Overview

What remains unchanged is the way that JavaScript enter in the scene via the or, as:

April 25, 2015

by Anghel Leonard

· 16,201 Views

Profiling MySQL Queries from Performance Schema

[This article was written by Jarvin Real]

When optimizing queries and investigating performance issues, MySQL comes with built-in support for profiling queries, aka SET profiling=1;. This is already awesome and simple to use, so why the PERFORMANCE_SCHEMA alternative?

Because profiling will be removed soon (it is already deprecated in MySQL 5.6 and 5.7), and because the built-in profiling capability can only be enabled per session. This means that you cannot capture profiling information for queries running from other connections. If you are using Percona Server, the profiling option for log_slow_verbosity is a nice alternative; unfortunately, not everyone is using Percona Server.

Now, for a quick demo: I execute a simple query and profile it below. Note that all of these commands are executed from a single session to my test instance.

    mysql> SHOW PROFILES;
    +----------+------------+----------------------------------------+
    | Query_ID | Duration   | Query                                  |
    +----------+------------+----------------------------------------+
    |        1 | 0.00011150 | SELECT * FROM sysbench.sbtest1 LIMIT 1 |
    +----------+------------+----------------------------------------+
    1 row in set, 1 warning (0.00 sec)

    mysql> SHOW PROFILE SOURCE FOR QUERY 1;
    +----------------------+----------+-----------------------+------------------+-------------+
    | Status               | Duration | Source_function       | Source_file      | Source_line |
    +----------------------+----------+-----------------------+------------------+-------------+
    | starting             | 0.000017 | NULL                  | NULL             |        NULL |
    | checking permissions | 0.000003 | check_access          | sql_parse.cc     |        5797 |
    | Opening tables       | 0.000021 | open_tables           | sql_base.cc      |        5156 |
    | init                 | 0.000009 | mysql_prepare_select  | sql_select.cc    |        1050 |
    | System lock          | 0.000005 | mysql_lock_tables     | lock.cc          |         306 |
    | optimizing           | 0.000002 | optimize              | sql_optimizer.cc |         138 |
    | statistics           | 0.000006 | optimize              | sql_optimizer.cc |         381 |
    | preparing            | 0.000005 | optimize              | sql_optimizer.cc |         504 |
    | executing            | 0.000001 | exec                  | sql_executor.cc  |         110 |
    | Sending data         | 0.000025 | exec                  | sql_executor.cc  |         190 |
    | end                  | 0.000002 | mysql_execute_select  | sql_select.cc    |        1105 |
    | query end            | 0.000003 | mysql_execute_command | sql_parse.cc     |        5465 |
    | closing tables       | 0.000004 | mysql_execute_command | sql_parse.cc     |        5544 |
    | freeing items        | 0.000005 | mysql_parse           | sql_parse.cc     |        6969 |
    | cleaning up          | 0.000006 | dispatch_command      | sql_parse.cc     |        1874 |
    +----------------------+----------+-----------------------+------------------+-------------+
    15 rows in set, 1 warning (0.00 sec)

To demonstrate how we can achieve the same with Performance Schema, we first identify our current connection id. In the real world, you might want to get the connection/processlist id of the thread you want to watch, e.g. from SHOW PROCESSLIST.

    mysql> SELECT THREAD_ID INTO @my_thread_id
        -> FROM threads WHERE PROCESSLIST_ID = CONNECTION_ID();
    Query OK, 1 row affected (0.00 sec)

Next, we identify the bounding EVENT_IDs for the statement stages. We will look for the statement we wanted to profile using the query below from the events_statements_history_long table. Your LIMIT clause may vary depending on how many queries the server is getting.

    mysql> SELECT THREAD_ID, EVENT_ID, END_EVENT_ID, SQL_TEXT, NESTING_EVENT_ID
        -> FROM events_statements_history_long
        -> WHERE THREAD_ID = @my_thread_id
        -> AND EVENT_NAME = 'statement/sql/select'
        -> ORDER BY EVENT_ID DESC LIMIT 3 \G
    *************************** 1. row ***************************
           THREAD_ID: 13848
            EVENT_ID: 419
        END_EVENT_ID: 434
            SQL_TEXT: SELECT THREAD_ID INTO @my_thread_id FROM threads WHERE PROCESSLIST_ID = CONNECTION_ID()
    NESTING_EVENT_ID: NULL
    *************************** 2. row ***************************
           THREAD_ID: 13848
            EVENT_ID: 374
        END_EVENT_ID: 392
            SQL_TEXT: SELECT * FROM sysbench.sbtest1 LIMIT 1
    NESTING_EVENT_ID: NULL
    *************************** 3. row ***************************
           THREAD_ID: 13848
            EVENT_ID: 353
        END_EVENT_ID: 364
            SQL_TEXT: select @@version_comment limit 1
    NESTING_EVENT_ID: NULL
    3 rows in set (0.02 sec)

From the results above, we are mostly interested in the EVENT_ID and END_EVENT_ID values from the second row. They give us the stage events of this particular query from the events_stages_history_long table.

    mysql> SELECT EVENT_NAME, SOURCE, (TIMER_END-TIMER_START)/1000000000 as 'DURATION (ms)'
        -> FROM events_stages_history_long
        -> WHERE THREAD_ID = @my_thread_id AND EVENT_ID BETWEEN 374 AND 392;
    +--------------------------------+----------------------+---------------+
    | EVENT_NAME                     | SOURCE               | DURATION (ms) |
    +--------------------------------+----------------------+---------------+
    | stage/sql/init                 | mysqld.cc:998        |        0.0214 |
    | stage/sql/checking permissions | sql_parse.cc:5797    |        0.0023 |
    | stage/sql/Opening tables       | sql_base.cc:5156     |        0.0205 |
    | stage/sql/init                 | sql_select.cc:1050   |        0.0089 |
    | stage/sql/System lock          | lock.cc:306          |        0.0047 |
    | stage/sql/optimizing           | sql_optimizer.cc:138 |        0.0016 |
    | stage/sql/statistics           | sql_optimizer.cc:381 |        0.0058 |
    | stage/sql/preparing            | sql_optimizer.cc:504 |        0.0044 |
    | stage/sql/executing            | sql_executor.cc:110  |        0.0008 |
    | stage/sql/Sending data         | sql_executor.cc:190  |        0.0251 |
    | stage/sql/end                  | sql_select.cc:1105   |        0.0017 |
    | stage/sql/query end            | sql_parse.cc:5465    |        0.0031 |
    | stage/sql/closing tables       | sql_parse.cc:5544    |        0.0037 |
    | stage/sql/freeing items        | sql_parse.cc:6969    |        0.0056 |
    | stage/sql/cleaning up          | sql_parse.cc:1874    |        0.0006 |
    +--------------------------------+----------------------+---------------+
    15 rows in set (0.01 sec)

As you can see, the results are pretty close: not exactly the same, but close. SHOW PROFILE shows Duration in seconds, while the results above are in milliseconds.

Some limitations to this method, though:

- As we’ve seen, it takes a few hoops to dish out the information we need. Because we have to identify the statement we want to profile manually, this procedure may not be easy to port into tools like the sys schema or pstop.
- It is only possible if Performance Schema is enabled (by default it's enabled since MySQL 5.6.6, yay!).
- It does not cover all the metrics of native profiling, e.g. CONTEXT SWITCHES, BLOCK IO, SWAPS.
- Depending on how busy the server you are running the tests on is, the history tables may be too small, so you either have to increase their size (the performance_schema_events_stages_history_long_size variable) or lose the history too early. Using ps_history might help in this case, though, with a little modification to the queries.
- The resulting Duration per event may vary. I would think this may be due to the additional overhead described in the performance_timers table. In any case, we hope to get this cleared up when this bug is fixed.
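Comparing the two outputs requires a unit conversion: the Performance Schema timer columns are expressed in picoseconds (hence the division by 1000000000 in the query above to get milliseconds), while SHOW PROFILE reports seconds. The sketch below spells the conversion out; the class and method names are hypothetical, and the timer values in `main` are invented so that their difference matches the 0.0205 ms reported for "Opening tables".

```java
// Hypothetical helper for comparing the two profiling outputs.
final class TimerUnits {

    static final double PICOS_PER_MILLI = 1_000_000_000.0; // 10^9 ps = 1 ms

    // TIMER_END - TIMER_START from events_stages_history_long, in picoseconds.
    static double picosToMillis(long timerStart, long timerEnd) {
        return (timerEnd - timerStart) / PICOS_PER_MILLI;
    }

    // Duration column of SHOW PROFILE, in seconds.
    static double secondsToMillis(double seconds) {
        return seconds * 1000.0;
    }

    public static void main(String[] args) {
        // "Opening tables": SHOW PROFILE reported 0.000021 s.
        double fromProfile = secondsToMillis(0.000021);
        // Made-up timer values that differ by 20,500,000 ps.
        double fromPerfSchema = picosToMillis(1_000_000_000L, 1_020_500_000L);
        System.out.println(fromProfile + " ms vs " + fromPerfSchema + " ms");
    }
}
```

Put on the same scale, the two measurements for this stage are roughly 0.021 ms and 0.0205 ms, which is the "pretty close" agreement noted above.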

April 18, 2015

by Peter Zaitsev

· 7,694 Views

Using Multiple Grok Statements to Parse a Java Stack Trace

Parse your Java stack trace log information with the Logstash tool.

April 14, 2015

by Bipin Patwardhan

· 76,920 Views · 6 Likes

MQ Trace on Java MQ Clients

I find it very difficult to debug Java client applications written for MQ. Most of the error messages are self-explanatory, but some of them are difficult to diagnose and trace back to their cause. Enabling tracing really helps to identify the root of the problem. In this post, we will see how to enable tracing in Java applications.

MQ Trace

Simple MQ tracing can be enabled by adding a JVM parameter like this:

    java -Dcom.ibm.mq.commonservices=trace.properties ...

Create trace.properties with the following attributes:

    Diagnostics.MQ=enabled
    Diagnostics.Java=explorer,wmqjavaclasses,all
    Diagnostics.Java.Trace.Detail=high
    Diagnostics.Java.Trace.Destination.File=enabled
    Diagnostics.Java.Trace.Destination.Console=disabled
    Diagnostics.Java.Trace.Destination.Pathname=/tmp/trace
    Diagnostics.Java.FFDC.Destination.Pathname=/tmp/FFDC
    Diagnostics.Java.Errors.Destination.Filename=/tmp/errors/AMQJERR.LOG

Points to be noted:

- Add the libraries of the IBM JRE (from the directory $JRE_HOME/lib) to CLASSPATH.
- Create the directories /tmp/FFDC, /tmp/trace and /tmp/errors.
- Add read permission to trace.properties.
- Keep an eye on the log file size and make sure to disable logging when it is not required. Otherwise, it will keep filling the disk space (the logs can easily grow to several gigabytes in a few hours).

Once you start the Java client, it will start printing the trace into the files mentioned above. If you want to disable tracing, set the Diagnostics.MQ property to disabled so that the trace stops.

JSSE Debug

For a secure connection with MQ, if you want to enable debugging for JSSE, you can use the JVM parameter javax.net.debug with different values:

    -Djavax.net.debug=true

This prints the full JSSE trace. It can be limited to the handshake only by changing the value of the parameter to ssl:handshake:

    -Djavax.net.debug=ssl:handshake

This prints only the trace related to the handshake; all other trace output is simply ignored.

For more articles, read my blog. Happy Learning!
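The setup steps above (create the three directories, write trace.properties) can be sketched with java.nio.file. This is an illustration under assumptions rather than anything shipped with MQ: `MqTraceSetup` and `writeTraceConfig` are hypothetical names, only the property keys are taken from the article, and a base directory parameter replaces the hard-coded /tmp paths so the sketch can be tried without touching them.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical setup helper; only the property keys come from the article.
final class MqTraceSetup {

    // Creates trace/, FFDC/ and errors/ under baseDir and writes
    // trace.properties next to them. Returns the path of the written file.
    static Path writeTraceConfig(Path baseDir, boolean enabled) throws IOException {
        Path trace = Files.createDirectories(baseDir.resolve("trace"));
        Path ffdc = Files.createDirectories(baseDir.resolve("FFDC"));
        Path errors = Files.createDirectories(baseDir.resolve("errors"));

        List<String> lines = Arrays.asList(
                "Diagnostics.MQ=" + (enabled ? "enabled" : "disabled"),
                "Diagnostics.Java=explorer,wmqjavaclasses,all",
                "Diagnostics.Java.Trace.Detail=high",
                "Diagnostics.Java.Trace.Destination.File=enabled",
                "Diagnostics.Java.Trace.Destination.Console=disabled",
                "Diagnostics.Java.Trace.Destination.Pathname=" + trace,
                "Diagnostics.Java.FFDC.Destination.Pathname=" + ffdc,
                "Diagnostics.Java.Errors.Destination.Filename=" + errors.resolve("AMQJERR.LOG"));

        Path file = baseDir.resolve("trace.properties");
        return Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("mqtrace");
        Path file = writeTraceConfig(base, true);
        System.out.println("Start the client with -Dcom.ibm.mq.commonservices=" + file);
    }
}
```

Calling `writeTraceConfig(base, false)` rewrites the file with Diagnostics.MQ=disabled, which corresponds to the "disable tracing" step described above.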

April 14, 2015

by Veeresham Kardas

· 6,808 Views

Mockito & DBUnit: Implementing a Mocking Structure Focused and Independent to Automated Tests on Java

In this post, we will do a hands-on with Mockito and DBUnit, two libraries from Java's open source ecosystem that can help make our JUnit tests more focused and independent. But why is mocking so important in our unit tests?

Focusing the tests

Let's imagine a Java back-end application with a tiered architecture. This application could have two tiers:

- The service tier, which holds the business rules and acts as an interface for the front end;
- The entity tier, which holds the logic responsible for making calls to a database, using technologies like JDBC or JPA.

Of course, in an architecture of this kind, our tiers have the following dependency:

    Service >>> Entity

In this kind of architecture, the most common way of building our automated tests is to create JUnit test classes that test each tier independently, so that each test run reflects only the correctness of the tier we want to test. However, if we simply create the classes without any mocking, we will run into problems like the following:

- In the JUnit tests of our service tier, for example, a problem in the entity tier will also make our tests fail, because the error from the entity tier reverberates across the tiers;
- If different teams work on the same system, with one team responsible for building the service tier and the other for the entity tier, one team depends on the other before its tests can be written.

To resolve such issues, we can mock the entity tier in the service tier's unit test classes, so that our tests stay independent and focused on the service tier, where they belong.

Independence

One point that is especially important for the independence of our JUnit test classes is the entity tier.
Since in our example this tier is focused on connecting to a database and running SQL commands on it, it works against our independence goal, because we need a database in order to run our tests. Not only that: if a test breaks any structure that is used by the subsequent tests, all of them will fail too. This is where our other library, DBUnit, comes in. With DBUnit, we can use embedded databases, such as HSQLDB, to give our tests a database of their own. So, without further delay, let's begin our hands-on!

Hands-on

For this lab, we will create a basic CRUD for a Client entity. The structure will follow the simple example we talked about previously, with the DAO (entity) and Service tiers. We will use DBUnit and JUnit to test the DAO tier, and Mockito with JUnit to test the Service tier. First, let's create a Maven project without any archetype, and include the following dependencies in pom.xml:

    . . .
    <dependency><groupId>junit</groupId><artifactId>junit</artifactId><version>4.12</version></dependency>
    <dependency><groupId>org.dbunit</groupId><artifactId>dbunit</artifactId><version>2.5.0</version></dependency>
    <dependency><groupId>org.mockito</groupId><artifactId>mockito-all</artifactId><version>1.10.19</version></dependency>
    <dependency><groupId>org.hibernate</groupId><artifactId>hibernate-entitymanager</artifactId><version>4.3.8.Final</version></dependency>
    <dependency><groupId>org.hsqldb</groupId><artifactId>hsqldb</artifactId><version>2.3.2</version></dependency>
    <dependency><groupId>org.springframework</groupId><artifactId>spring-core</artifactId><version>4.1.4.RELEASE</version></dependency>
    <dependency><groupId>org.springframework</groupId><artifactId>spring-context</artifactId><version>4.1.5.RELEASE</version></dependency>
    <dependency><groupId>org.springframework</groupId><artifactId>spring-test</artifactId><version>4.1.5.RELEASE</version></dependency>
    <dependency><groupId>org.springframework</groupId><artifactId>spring-tx</artifactId><version>4.1.5.RELEASE</version></dependency>
    <dependency><groupId>org.springframework</groupId><artifactId>spring-orm</artifactId><version>4.1.5.RELEASE</version></dependency>
    . . .

In the previous snippet, we included not only the Mockito, DBUnit and JUnit libraries, but also Hibernate to implement the persistence layer and Spring 4 for the IoC container and transaction management. We also included the Spring Test library, which provides some features that we will use later in this lab. Finally, to simplify the setup and remove the need to install a database to run the code, we will use HSQLDB as our database.
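Before the lab starts, the "focus" argument made earlier can be made concrete without any library. The sketch below replaces the entity tier with a hand-written stub, which is the role Mockito plays later in the lab; it deliberately does not use Mockito so that it stays self-contained. All types here (`Client`, `ClientDao`, `ClientService`, `StubClientDao`) are simplified, hypothetical stand-ins and not the classes built in this lab.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins, not the classes built later in the lab.
final class StubDemo {

    static final class Client {
        final long id;
        final String name;
        Client(long id, String name) { this.id = id; this.name = name; }
    }

    // The entity tier as seen from the service tier.
    interface ClientDao {
        Client find(long id);
    }

    // The service tier; the DAO can be swapped through a setter.
    static final class ClientService {
        private ClientDao dao;
        void setDao(ClientDao dao) { this.dao = dao; }
        String nameOf(long id) {
            Client c = dao.find(id);
            return c == null ? "unknown" : c.name;
        }
    }

    // Hand-written stub: answers from memory and counts calls, no database.
    static final class StubClientDao implements ClientDao {
        final Map<Long, Client> data = new HashMap<>();
        int findCalls = 0;
        @Override public Client find(long id) {
            findCalls++;
            return data.get(id);
        }
    }

    public static void main(String[] args) {
        StubClientDao stub = new StubClientDao();
        stub.data.put(1L, new Client(1L, "Alexandre"));
        ClientService service = new ClientService();
        service.setDao(stub);
        System.out.println(service.nameOf(1));  // found in the stub
        System.out.println(service.nameOf(2));  // not in the stub
        System.out.println("find() calls: " + stub.findCalls);
    }
}
```

Because the service only sees the stub, a failure in this test can only come from the service logic itself, and no database has to exist for it to run.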
Our lab will have the following structure:

- One class will represent the application itself, as a standalone class, where we will consume the tiers like a real application would;
- Two other classes, each one with JUnit tests, will test each tier independently.

First, we define a persistence unit, where we set the name of the unit and the properties that make Hibernate create the table for us and populate it with some initial rows. The code of persistence.xml can be seen below:

    com.alexandreesl.handson.model.Client

And the initial data to populate the table can be seen below:

    insert into Client(id,name,sex, phone) values (1,'Alexandre Eleuterio Santos Lourenco','M','22323456');
    insert into Client(id,name,sex, phone) values (2,'Lucebiane Santos Lourenco','F','22323876');
    insert into Client(id,name,sex, phone) values (3,'Maria Odete dos Santos Lourenco','F','22309456');
    insert into Client(id,name,sex, phone) values (4,'Eleuterio da Silva Lourenco','M','22323956');
    insert into Client(id,name,sex, phone) values (5,'Ana Carolina Fernandes do Sim','F','22123456');

To keep the post from becoming burdensome, we will not discuss the project structure during the lab, but just show the final structure at the end. The code can be found in a GitHub repository, linked at the end of the post. With the persistence unit defined, we can start coding!
First, we create the entity class:

    package com.alexandreesl.handson.model;

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Table;

    @Table(name = "Client")
    @Entity
    public class Client {

        @Id
        private long id;

        @Column(name = "name", nullable = false, length = 50)
        private String name;

        @Column(name = "sex", nullable = false)
        private String sex;

        @Column(name = "phone", nullable = false)
        private long phone;

        public long getId() { return id; }
        public void setId(long id) { this.id = id; }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public String getSex() { return sex; }
        public void setSex(String sex) { this.sex = sex; }

        public long getPhone() { return phone; }
        public void setPhone(long phone) { this.phone = phone; }
    }

In order to create the persistence-related beans to enable Hibernate and the transaction manager, alongside all the rest of the beans necessary for the application, we use a Java-based Spring configuration class.
The code of the class can be seen below:

    package com.alexandreesl.handson.core;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.ComponentScan;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jdbc.datasource.DriverManagerDataSource;
    import org.springframework.orm.jpa.JpaTransactionManager;
    import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
    import org.springframework.orm.jpa.vendor.Database;
    import org.springframework.orm.jpa.vendor.HibernateJpaDialect;
    import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;
    import org.springframework.transaction.annotation.EnableTransactionManagement;

    @Configuration
    @EnableTransactionManagement
    @ComponentScan({ "com.alexandreesl.handson.dao", "com.alexandreesl.handson.service" })
    public class AppConfiguration {

        // In-memory HSQLDB data source used by the standalone application.
        @Bean
        public DriverManagerDataSource dataSource() {
            DriverManagerDataSource dataSource = new DriverManagerDataSource();
            dataSource.setDriverClassName("org.hsqldb.jdbcDriver");
            dataSource.setUrl("jdbc:hsqldb:mem://standalone");
            dataSource.setUsername("sa");
            dataSource.setPassword("");
            return dataSource;
        }

        @Bean
        public JpaTransactionManager transactionManager() {
            JpaTransactionManager transactionManager = new JpaTransactionManager();
            transactionManager.setEntityManagerFactory(
                    entityManagerFactory().getNativeEntityManagerFactory());
            transactionManager.setDataSource(dataSource());
            transactionManager.setJpaDialect(jpaDialect());
            return transactionManager;
        }

        @Bean
        public HibernateJpaDialect jpaDialect() {
            return new HibernateJpaDialect();
        }

        @Bean
        public HibernateJpaVendorAdapter jpaVendorAdapter() {
            HibernateJpaVendorAdapter jpaVendor = new HibernateJpaVendorAdapter();
            jpaVendor.setDatabase(Database.HSQL);
            jpaVendor.setDatabasePlatform("org.hibernate.dialect.HSQLDialect");
            return jpaVendor;
        }

        // Builds the EntityManagerFactory from META-INF/persistence.xml.
        @Bean
        public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
            LocalContainerEntityManagerFactoryBean entityManagerFactory =
                    new LocalContainerEntityManagerFactoryBean();
            entityManagerFactory.setPersistenceXmlLocation("classpath:META-INF/persistence.xml");
            entityManagerFactory.setPersistenceUnitName("persistence");
            entityManagerFactory.setDataSource(dataSource());
            entityManagerFactory.setJpaVendorAdapter(jpaVendorAdapter());
            entityManagerFactory.setJpaDialect(jpaDialect());
            return entityManagerFactory;
        }
    }

And finally, we create the classes that represent the tiers themselves. This is the DAO class:

    package com.alexandreesl.handson.dao;

    import javax.persistence.EntityManager;
    import javax.persistence.PersistenceContext;

    import org.springframework.stereotype.Component;
    import org.springframework.transaction.annotation.Transactional;

    import com.alexandreesl.handson.model.Client;

    @Component
    public class ClientDAO {

        @PersistenceContext
        private EntityManager entityManager;

        @Transactional(readOnly = true)
        public Client find(long id) {
            return entityManager.find(Client.class, id);
        }

        @Transactional
        public void create(Client client) {
            entityManager.persist(client);
        }

        @Transactional
        public void update(Client client) {
            entityManager.merge(client);
        }

        @Transactional
        public void delete(Client client) {
            entityManager.remove(client);
        }
    }

And this is the service class:

    package com.alexandreesl.handson.service;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Component;

    import com.alexandreesl.handson.dao.ClientDAO;
    import com.alexandreesl.handson.model.Client;

    @Component
    public class ClientService {

        @Autowired
        private ClientDAO clientDAO;

        public ClientDAO getClientDAO() { return clientDAO; }

        // Lets the tests replace the real DAO with a mock.
        public void setClientDAO(ClientDAO clientDAO) { this.clientDAO = clientDAO; }

        public Client find(long id) {
            return clientDAO.find(id);
        }

        public void create(Client client) {
            clientDAO.create(client);
        }

        public void update(Client client) {
            clientDAO.update(client);
        }

        public void delete(Client client) {
            clientDAO.delete(client);
        }
    }

The reader may notice that we created a getter/setter for the DAO class in the Service class. This is not necessary for Spring injection, but we did it this way to make it easier to swap the real DAO for a Mockito mock in the test class. Finally, we code the class we talked about previously, the one that consumes the tiers:

    package com.alexandreesl.handson.core;

    import org.springframework.context.ApplicationContext;
    import org.springframework.context.annotation.AnnotationConfigApplicationContext;

    import com.alexandreesl.handson.model.Client;
    import com.alexandreesl.handson.service.ClientService;

    public class App {

        public static void main(String[] args) {
            ApplicationContext context = new AnnotationConfigApplicationContext(
                    AppConfiguration.class);
            ClientService service = context.getBean(ClientService.class);

            // Clients loaded from the initial data script.
            System.out.println(service.find(1).getName());
            System.out.println(service.find(3).getName());
            System.out.println(service.find(5).getName());

            // Create a new client and read it back.
            Client client = new Client();
            client.setId(6);
            client.setName("Celina do Sim");
            client.setPhone(44657688);
            client.setSex("F");
            service.create(client);
            System.out.println(service.find(6).getName());

            System.exit(0);
        }
    }

If we run the class, we can see that the console prints all the clients we searched for and that Hibernate is initialized properly, showing that our implementation works:

    Mar 28, 2015 1:09:22 PM org.springframework.context.annotation.AnnotationConfigApplicationContext prepareRefresh
    INFO: Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@6433a2: startup date [Sat Mar 28 13:09:22 BRT 2015]; root of context hierarchy
    Mar 28, 2015 1:09:22 PM org.springframework.jdbc.datasource.DriverManagerDataSource setDriverClassName
    INFO: Loaded JDBC driver: org.hsqldb.jdbcDriver
    Mar 28, 2015 1:09:22 PM org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean createNativeEntityManagerFactory
    INFO: Building JPA container EntityManagerFactory for persistence unit 'persistence'
    Mar 28, 2015 1:09:22 PM org.hibernate.jpa.internal.util.LogHelper logPersistenceUnitInformation
    INFO: HHH000204: Processing PersistenceUnitInfo [ name: persistence ...]
    Mar 28, 2015 1:09:22 PM org.hibernate.Version logVersion
    INFO: HHH000412: Hibernate Core {4.3.8.Final}
    Mar 28, 2015 1:09:22 PM org.hibernate.cfg.Environment
    INFO: HHH000206: hibernate.properties not found
    Mar 28, 2015 1:09:22 PM org.hibernate.cfg.Environment buildBytecodeProvider
    INFO: HHH000021: Bytecode provider name : javassist
    Mar 28, 2015 1:09:22 PM org.hibernate.annotations.common.reflection.java.JavaReflectionManager
    INFO: HCANN000001: Hibernate Commons Annotations {4.0.5.Final}
    Mar 28, 2015 1:09:23 PM org.hibernate.dialect.Dialect
    INFO: HHH000400: Using dialect: org.hibernate.dialect.HSQLDialect
    Mar 28, 2015 1:09:23 PM org.hibernate.hql.internal.ast.ASTQueryTranslatorFactory
    INFO: HHH000397: Using ASTQueryTranslatorFactory
    Mar 28, 2015 1:09:23 PM org.hibernate.tool.hbm2ddl.SchemaExport execute
    INFO: HHH000227: Running hbm2ddl schema export
    Mar 28, 2015 1:09:23 PM org.hibernate.tool.hbm2ddl.SchemaExport execute
    INFO: HHH000230: Schema export complete
    Alexandre Eleuterio Santos Lourenco
    Maria Odete dos Santos Lourenco
    Ana Carolina Fernandes do Sim
    Celina do Sim

Now, let's move on to the tests themselves. For the DBUnit tests, we create a base class, which provides the basic DB operations that all of our JUnit tests will benefit from. In the @PostConstruct method, which is fired after all the injections of the Spring context are made (the reason why we couldn't use the @BeforeClass annotation, since we need Spring to instantiate and inject the EntityManager first), we use DBUnit to make a connection to our database with the DatabaseConnection class, and populate the table using the dataset we created, passing an XML structure that represents the data used in the tests.
This operation of populating the table is performed by the DatabaseOperation class, which we use with the CLEAN_INSERT operation: it empties the table first and then inserts the data from the dataset. Finally, we use one of JUnit's lifecycle annotations, @After, which marks a method to be called after every test case. In our scenario, we use it to call the clear() method on the EntityManager, which forces Hibernate to query the database again in every test case, thus eliminating possible problems between our test cases caused by data in the persistence context (the first-level cache) that differs from what is in the DB. The code for the base class is the following:

    package com.alexandreesl.handson.dao.test;

    import java.io.InputStream;
    import java.sql.SQLException;

    import javax.annotation.PostConstruct;
    import javax.persistence.EntityManager;
    import javax.persistence.EntityManagerFactory;
    import javax.persistence.PersistenceUnit;

    import org.dbunit.DatabaseUnitException;
    import org.dbunit.database.DatabaseConfig;
    import org.dbunit.database.DatabaseConnection;
    import org.dbunit.database.IDatabaseConnection;
    import org.dbunit.dataset.IDataSet;
    import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
    import org.dbunit.ext.hsqldb.HsqldbDataTypeFactory;
    import org.dbunit.operation.DatabaseOperation;
    import org.hibernate.HibernateException;
    import org.hibernate.internal.SessionImpl;
    import org.junit.After;

    public class BaseDBUnitSetup {

        private static IDatabaseConnection connection;
        private static IDataSet dataset;

        @PersistenceUnit
        public EntityManagerFactory entityManagerFactory;

        private EntityManager entityManager;

        // Runs once Spring has injected the EntityManagerFactory.
        @PostConstruct
        public void init() throws HibernateException, DatabaseUnitException, SQLException {
            entityManager = entityManagerFactory.createEntityManager();
            // Reuse Hibernate's JDBC connection for DBUnit.
            connection = new DatabaseConnection(
                    ((SessionImpl) (entityManager.getDelegate())).connection());
            connection.getConfig().setProperty(
                    DatabaseConfig.PROPERTY_DATATYPE_FACTORY,
                    new HsqldbDataTypeFactory());

            FlatXmlDataSetBuilder flatXmlDataSetBuilder = new FlatXmlDataSetBuilder();
            InputStream dataSet = Thread.currentThread().getContextClassLoader()
                    .getResourceAsStream("test-data.xml");
            dataset = flatXmlDataSetBuilder.build(dataSet);

            // Empty the tables, then load the rows from test-data.xml.
            DatabaseOperation.CLEAN_INSERT.execute(connection, dataset);
        }

        // Clear the persistence context after every test case.
        @After
        public void afterTests() {
            entityManager.clear();
        }
    }

The XML structure used in the test cases is the following:

And the code of our test class for the DAO tier is the following:

    package com.alexandreesl.handson.dao.test;

    import static org.junit.Assert.assertNotNull;
    import static org.junit.Assert.assertNull;

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.test.context.ContextConfiguration;
    import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
    import org.springframework.test.context.transaction.TransactionConfiguration;
    import org.springframework.transaction.annotation.Transactional;

    import com.alexandreesl.handson.core.test.AppTestConfiguration;
    import com.alexandreesl.handson.dao.ClientDAO;
    import com.alexandreesl.handson.model.Client;

    @RunWith(SpringJUnit4ClassRunner.class)
    @ContextConfiguration(classes = AppTestConfiguration.class)
    @TransactionConfiguration(defaultRollback = true)
    public class ClientDAOTest extends BaseDBUnitSetup {

        @Autowired
        private ClientDAO clientDAO;

        @Test
        public void testFind() {
            Client client = clientDAO.find(1);
            assertNotNull(client);
            client = clientDAO.find(2);
            assertNotNull(client);
            client = clientDAO.find(3);
            assertNull(client);
            client = clientDAO.find(4);
            assertNull(client);
            client = clientDAO.find(5);
            assertNull(client);
        }

        @Test
        @Transactional
        public void testInsert() {
            Client client = new Client();
            client.setId(3);
            client.setName("Celina do Sim");
            client.setPhone(44657688);
            client.setSex("F");
            clientDAO.create(client);
        }

        @Test
        @Transactional
        public void testUpdate() {
            Client client = clientDAO.find(1);
            client.setPhone(12345678);
            clientDAO.update(client);
        }

        @Test
        @Transactional
        public void testRemove() {
            Client client = clientDAO.find(1);
            clientDAO.delete(client);
        }
    }

The code is fairly self-explanatory, so we will just focus on explaining the class-level annotations. The @RunWith(SpringJUnit4ClassRunner.class) annotation changes the JUnit runner that executes our test cases to one made by Spring, which enables support for the IoC container and Spring's annotations. The @TransactionConfiguration(defaultRollback = true) annotation is from Spring's test library and changes the behavior of the @Transactional annotation, making transactions roll back after execution instead of committing. That ensures that our test cases won't change the contents of the DB, so one test case won't break the ones that follow. The reader may notice that we changed the configuration class to another one, exclusive to the test cases. It contains essentially the same beans we created in the original configuration class, only changing the data source bean to point to a different DB than the previous one, showing that we can change the database of our tests without breaking the code.
On a real world scenario, the configuration class of the application would be pointing to a relational database like Oracle, DB2, etc and the test cases would use a embedded database such as HSQLDB, which we are using on this case: package com.alexandreesl.handson.core.test; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.ComponentScan; import org.springframework.context.annotation.Configuration; import org.springframework.jdbc.datasource.DriverManagerDataSource; import org.springframework.orm.jpa.JpaTransactionManager; import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean; import org.springframework.orm.jpa.vendor.Database; import org.springframework.orm.jpa.vendor.HibernateJpaDialect; import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter; import org.springframework.transaction.annotation.EnableTransactionManagement; @Configuration @EnableTransactionManagement @ComponentScan({ "com.alexandreesl.handson.dao", "com.alexandreesl.handson.service" }) public class AppTestConfiguration { @Bean public DriverManagerDataSource dataSource() { DriverManagerDataSource dataSource = new DriverManagerDataSource(); dataSource.setDriverClassName("org.hsqldb.jdbcDriver"); dataSource.setUrl("jdbc:hsqldb:mem://standalone-test"); dataSource.setUsername("sa"); dataSource.setPassword(""); return dataSource; } @Bean public JpaTransactionManager transactionManager() { JpaTransactionManager transactionManager = new JpaTransactionManager(); transactionManager.setEntityManagerFactory(entityManagerFactory() .getNativeEntityManagerFactory()); transactionManager.setDataSource(dataSource()); transactionManager.setJpaDialect(jpaDialect()); return transactionManager; } @Bean public HibernateJpaDialect jpaDialect() { return new HibernateJpaDialect(); } @Bean public HibernateJpaVendorAdapter jpaVendorAdapter() { HibernateJpaVendorAdapter jpaVendor = new HibernateJpaVendorAdapter(); 
jpaVendor.setDatabase(Database.HSQL); jpaVendor.setDatabasePlatform("org.hibernate.dialect.HSQLDialect"); return jpaVendor; } @Bean public LocalContainerEntityManagerFactoryBean entityManagerFactory() { LocalContainerEntityManagerFactoryBean entityManagerFactory = new LocalContainerEntityManagerFactoryBean(); entityManagerFactory .setPersistenceXmlLocation("classpath:META-INF/persistence.xml"); entityManagerFactory.setPersistenceUnitName("persistence"); entityManagerFactory.setDataSource(dataSource()); entityManagerFactory.setJpaVendorAdapter(jpaVendorAdapter()); entityManagerFactory.setJpaDialect(jpaDialect()); return entityManagerFactory; } } If we run the test class, we can see that it runs the test cases successfully, showing that our code is a success. If we see the console, we can see that transactions were created and rolled back, respecting our configuration: . . . ar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext startTransaction INFO: Began transaction (1) for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@1a411233, testMethod = testInsert@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]; transaction manager [org.springframework.orm.jpa.JpaTransactionManager@7c2327fa]; rollback [true] Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext endTransaction INFO: Rolled back transaction for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = 
com.alexandreesl.handson.dao.test.ClientDAOTest@1a411233, testMethod = testInsert@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]. Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext startTransaction INFO: Began transaction (1) for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@2adddc06, testMethod = testRemove@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]; transaction manager [org.springframework.orm.jpa.JpaTransactionManager@7c2327fa]; rollback [true] Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext endTransaction INFO: Rolled back transaction for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@2adddc06, testMethod = testRemove@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', 
propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]. Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext startTransaction INFO: Began transaction (1) for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@4905c46b, testMethod = testUpdate@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]; transaction manager [org.springframework.orm.jpa.JpaTransactionManager@7c2327fa]; rollback [true] Mar 28, 2015 2:29:55 PM org.springframework.test.context.transaction.TransactionContext endTransaction INFO: Rolled back transaction for test context [DefaultTestContext@644abb8f testClass = ClientDAOTest, testInstance = com.alexandreesl.handson.dao.test.ClientDAOTest@4905c46b, testMethod = testUpdate@ClientDAOTest, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@70325d20 testClass = ClientDAOTest, locations = '{}', classes = '{class com.alexandreesl.handson.core.test.AppTestConfiguration}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{}', contextLoader = 'org.springframework.test.context.support.DelegatingSmartContextLoader', parent = [null]]]. Now let's move on to the Service tests, with the help of Mockito. 
The class that tests the Service tier is very simple, as we can see below: package com.alexandreesl.handson.service.test; import static org.junit.Assert.assertEquals; import org.junit.BeforeClass; import org.junit.Test; import org.mockito.Mockito; import org.mockito.invocation.InvocationOnMock; import org.mockito.stubbing.Answer; import com.alexandreesl.handson.dao.ClientDAO; import com.alexandreesl.handson.model.Client; import com.alexandreesl.handson.service.ClientService; public class ClientServiceTest { private static ClientDAO clientDAO; private static ClientService clientService; @BeforeClass public static void beforeClass() { clientService = new ClientService(); clientDAO = Mockito.mock(ClientDAO.class); clientService.setClientDAO(clientDAO); Client client = new Client(); client.setId(0); client.setName("Mocked client!"); client.setPhone(11111111); client.setSex("M"); Mockito.when(clientDAO.find(Mockito.anyLong())).thenReturn(client); Mockito.doThrow(new RuntimeException("error on client!")) .when(clientDAO).delete((Client) Mockito.any()); Mockito.doNothing().when(clientDAO).create((Client) Mockito.any()); Mockito.doAnswer(new Answer

April 14, 2015

by Alexandre Lourenco


max_allowed_packet and Binary Log Corruption in MySQL

[This article was written by Miguel Angel Nieto] The combination of the max_allowed_packet variable and replication in MySQL is a common source of headaches. In a nutshell, max_allowed_packet is the maximum size of a MySQL network protocol packet that the server can create or read. It has a default value of 1MB (<= 5.6.5) or 4MB (>= 5.6.6) and a maximum size of 1GB. This adds two constraints in our replication environment: the master server shouldn’t write events larger than max_allowed_packet to the binary log, and all the slaves in the replication chain should have the same max_allowed_packet as the master server. Sometimes, even when following those two basic rules, we can have problems. For example, there are situations (also called bugs) where the master writes more data than the max_allowed_packet limit, causing the slaves to stop working. In order to fix this, Oracle created a new variable called slave_max_allowed_packet. This configuration variable, available from 5.1.64, 5.5.26 and 5.6.6, overrides the max_allowed_packet value for slave threads. Therefore, regardless of the max_allowed_packet value, the slaves’ threads will have a 1GB limit, the default value of slave_max_allowed_packet. A nice trick that works as expected. Sometimes, even with that workaround, we can get the max_allowed_packet error on the slave servers. That means there is a packet larger than 1GB, something that shouldn’t happen in a normal situation. Why? Usually it is caused by binary log corruption. Let’s look at the following example. The slave stops working with the following message: Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master' The important part is “got fatal error 1236 from master”. The master cannot read the event it wrote to the binary log seconds ago. To check the problem we can: Use mysqlbinlog to read the binary log from the position where it failed, with --start-position.
This is an example taken from our Percona Forums:

#121003 5:22:26 server id 1 end_log_pos 398528
# Unknown event
# at 398528
#960218 6:48:44 server id 1813111337 end_log_pos 1835008
# Unknown event
ERROR: Error in Log_event::read_log_event(): 'Event too big', data_len: 1953066613, event_type: 8
DELIMITER ;
# End of log file

Check the size of the event (1953066613 bytes) and the “Unknown event” messages. Something is clearly wrong there. Another thing worth checking is the server id, which sometimes doesn’t correspond to the real value. In this example the person who posted the binary log event confirmed that the server id was wrong. Check the master’s error log:

[ERROR] Error in Log_event::read_log_event(): 'Event too big', data_len: 1953066613, event_type: 8

Again, the event is bigger than expected. There is no way the master and slave can read or write it, so the solution is to skip that event on the slave and rotate the logs on the master. Then, use pt-table-checksum to check data consistency. MySQL 5.6 includes replication checksums to avoid problems with log corruption. You can read more about it in Stephan’s blog post. Conclusion Errors on slave servers about max_allowed_packet can have very different causes. Although binary log corruption is not a common one, it is worth checking when you have run out of ideas.
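The size check described above can be sketched in a few lines of Java. This is only an illustration of the arithmetic: the class and method names are hypothetical and are not part of MySQL or any Percona tool, and the 1GB ceiling is the maximum value quoted in the article.

```java
// Hypothetical helper, for illustration only; not part of MySQL or Percona tooling.
final class BinlogEventSizeCheck {
    // 1GB: the maximum max_allowed_packet and the default slave_max_allowed_packet.
    static final long MAX_PACKET_BYTES = 1024L * 1024L * 1024L;

    /** Returns true when an event of this length can never fit in a single packet. */
    static boolean exceedsMaxPacket(long dataLen) {
        return dataLen > MAX_PACKET_BYTES;
    }

    public static void main(String[] args) {
        long dataLen = 1953066613L; // data_len reported in the error message above
        System.out.println("limit    = " + MAX_PACKET_BYTES);
        System.out.println("data_len = " + dataLen);
        System.out.println("too big  = " + exceedsMaxPacket(dataLen));
    }
}
```

An event length above the 1GB ceiling cannot come from a healthy write, which is why it points to corruption rather than to a misconfigured limit.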

April 13, 2015

by Peter Zaitsev


Live Activity Monitoring of NGINX Plus in 3 Simple Steps

[This article was written by Nick Shadrin] One of the most popular features in NGINX Plus is live activity monitoring, also known as extended status reporting. The live activity monitor reports real-time statistics for your NGINX Plus installation, which is essential for troubleshooting, error reporting, monitoring new installations and updates to a production environment, and cache management. We often get questions from DevOps engineers, experienced and new to NGINX Plus alike, about the best way to configure live activity monitoring. In this post, we’ll describe a sample configuration file that will have you viewing real-time statistics on the NGINX Plus dashboard in just a few minutes. The file is a preview of the set of sample configuration files that we’re introducing in NGINX Plus R6 to make it even easier to set up many of NGINX Plus’ advanced features in your environment. The set of examples will grow over time, and the NGINX Plus packages available at cs.nginx.com will include the latest versions available when the packages are created. You can download the first sample configuration file, for live activity monitoring, in advance of the release of NGINX Plus R6 next week (the file works with NGINX Plus R5 too). Here we’ll review the instructions for installing and customizing the file. Note: These instructions assume that you use the conventional NGINX Plus configuration scheme (in which configuration files are stored in the /etc/nginx/conf.d directory), which is set up automatically when you install an NGINX Plus package. If you use a different scheme, adjust the commands accordingly.

Installing the Configuration File

The commands do not include prompts or other extraneous characters, so you can cut and paste them directly into your terminal window. Download the sample configuration file and rename it to status.conf.
cd /etc/nginx/conf.d/
curl http://nginx.com/resources/conf/status.txt > /etc/nginx/conf.d/status.conf

Customize your configuration files as instructed in Customizing the Configuration. Test the configuration file for syntactic validity and reload NGINX Plus.

nginx -t && nginx -s reload

The NGINX Plus status dashboard is available immediately at http://nginx-server-address:8080/ (or the alternate port number you configure as described in Changing the Port for the Status Dashboard).

Customizing the Configuration

To get the most out of live activity monitoring, make the changes described in this section to both the sample configuration file and your existing configuration files.

Monitoring Servers and Upstream Server Groups

For statistics about virtual servers and upstream groups to appear on the dashboard, you must enable a shared memory zone in the configuration block for each server and group. The shared memory is used to store configuration and run-time state information referenced by the NGINX Plus worker processes. If you don’t configure shared memory, the dashboard reports only basic information about the number of connections and requests, plus caching statistics. In the figure in Preview of the NGINX Plus R6 Dashboard, this corresponds to the first line (to the right of the NGINX+ logo) and the final Caches section. Edit your existing configuration files to add the status_zone directive to the server configuration block for each server you want to appear on the dashboard. (You can specify the same zone name in multiple server blocks, in which case the statistics for those servers are aggregated together in the dashboard.)

server { listen 80; status_zone backend-servers; location / { proxy_pass http://backend; } }

Similarly, you must add the zone directive to the upstream configuration block for each upstream group you want to appear on the dashboard. The following example allocates 64 KB of shared memory for the two servers in the upstream-backend group.
The zone name for each upstream group must be unique.

upstream backend { zone upstream-backend 64k; server 10.2.3.5; server 10.2.3.6; }

Restricting Access to the Dashboard

The default settings in the sample configuration file allow anyone on any network to access the dashboard. We strongly recommend that you configure at least one of the following security measures:

IP address-based access control lists (ACLs). In the sample configuration file, uncomment the allow and deny directives, and substitute the address of your administrative network for 10.0.0.0/8. Only users on the specified network can access the status page.

allow 10.0.0.0/8; deny all;

HTTP basic authentication. In the sample configuration file, uncomment the auth_basic and auth_basic_user_file directives and add user entries to the /etc/nginx/users file (for example, by using an htpasswd generator). If you have an Apache installation, another option is to reuse an existing htpasswd file.

auth_basic on; auth_basic_user_file /etc/nginx/users;

Client certificates, which are part of a complete configuration of SSL or TLS. For more information, see NGINX SSL Termination in the NGINX Plus Admin Guide and the documentation for the HTTP SSL module.

Firewall. Configure your firewall to disallow outside access to the port for the dashboard (8080 in the sample configuration file).

Changing the Port for the Status Dashboard

To set the port number for the dashboard to a value other than the default 8080, edit the following listen directive in the sample configuration file.

listen 8080;

Limiting the Monitored IP Addresses

If your NGINX Plus server has several IP addresses and you want the dashboard to display statistics for only some of them, create a listen directive for each one that specifies its address and port. The sample configuration script includes the following example that you can uncomment and change to set the correct IP address.
You also need to comment out the listen directive (with a port number only) discussed in Changing the Port for the Status Dashboard. listen 10.2.3.4:8080; Preview of the NGINX Plus R6 Dashboard Here’s a sneak peek at the new NGINX Plus dashboard, which we’re unveiling next week in NGINX Plus R6. More Information about Live Activity Monitoring Live Activity Monitoring of NGINX Plus in the NGINX Plus Admin Guide HTTP Status module documentation
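To make the effect of the allow 10.0.0.0/8; deny all; pair in Restricting Access to the Dashboard more concrete, here is a rough Java sketch of the underlying CIDR test. It is not how NGINX implements its access module; the class and method names are hypothetical, and it handles IPv4 only.

```java
// Illustrative IPv4-only CIDR check; names are hypothetical and unrelated to NGINX internals.
final class CidrAclSketch {
    /** Packs a dotted-quad IPv4 address such as "10.2.3.4" into a 32-bit int. */
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Returns true when the address falls inside the network given as "a.b.c.d/prefix". */
    static boolean isAllowed(String address, String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        // A /8 prefix keeps the top 8 bits: mask = 0xFF000000.
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return (toInt(address) & mask) == (toInt(parts[0]) & mask);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("10.20.30.40", "10.0.0.0/8"));  // inside the admin network
        System.out.println(isAllowed("192.168.1.10", "10.0.0.0/8")); // falls through to "deny all"
    }
}
```

Any client whose address does not match the allowed network falls through to the deny all rule, which is why only the administrative network reaches the status page.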

April 10, 2015

by Patrick Nommensen


Currency Format Validation and Parsing

In Java, formatting a number according to a locale-specific currency format is pretty simple. You obtain an instance of the java.text.NumberFormat class through NumberFormat.getCurrencyInstance() and invoke one of its format() methods. The following code snippet is from https://docs.oracle.com/javase/tutorial/i18n/format/numberFormat.html static public void displayCurrency( Locale currentLocale) { Double currencyAmount = new Double(9876543.21); Currency currentCurrency = Currency.getInstance(currentLocale); NumberFormat currencyFormatter = NumberFormat.getCurrencyInstance(currentLocale); System.out.println( currentLocale.getDisplayName() + ", " + currentCurrency.getDisplayName() + ": " + currencyFormatter.format(currencyAmount)); } However, if an application allows users to enter an amount as a string using separators and currency symbols, it is quite possible that the input does not follow the currency format of the locale. For example, a user can use a wrong thousands or decimal separator, or a wrong currency symbol altogether. In that case, the application should first ensure that the format is followed and then parse the string into a proper number, so that calculations can be performed independently of the currency format. It is important to note that parsing a string with the NumberFormat.parse(String source) method requires that the currency string follow the locale-specific pattern defined in java.text.DecimalFormat and the symbols defined in java.text.DecimalFormatSymbols. If the pattern and/or symbols are not followed, the method throws java.text.ParseException. For example, the currency pattern for it_CH = Italian (Switzerland) is ¤ #,##0.00 and the grouping separator is ', hence SFr. 1'234.56 is valid while 1,234.56 SFr.
is invalid. In order to check the validity of a currency amount before actually parsing it, the Apache Commons Validator project's org.apache.commons.validator.routines.CurrencyValidator comes in handy. All you have to do is construct an instance and call its public boolean isValid(String value, Locale locale) method to check whether the locale-specific format is followed. Once validation succeeds, you can parse the number by calling the parse() method. The following code snippet shows how to parse leniently: if the currency symbol is not already present in the string, the code adds the appropriate currency symbol according to the pattern and then parses the result. /** * Converts given item price into a number based on given * currency code. * * It works generically for all currencies supported by Java * * * @param itemPrice currency amount * @param currencyCode 3-letter ISO currency code * @return {@link Double} containing the parsed price, or null * if parsing failed or formatter couldn't be constructed * @author Muhammad Haris */ public static Double convertPrice(String itemPrice, String currencyCode) { Double itemPriceConverted = null; Locale currencyLocale = LocaleUtility .getLocaleAgainstCurrency(currencyCode); DecimalFormat currencyFormatter = getCurrencyFormatter(currencyLocale); if (currencyFormatter != null) { itemPrice = appendCurrencySymbol(itemPrice, currencyFormatter); try { Number number = currencyFormatter.parse(itemPrice); itemPriceConverted = number.doubleValue(); } catch (ParseException e) { LOG.error("Failed to parse currency: " + currencyCode + ", value: " + itemPrice + ". " + e.getMessage(), e); } } else { LOG.error("No appropriate formatter found for currency: " + currencyCode + ", value: " + itemPrice + ". "); } return itemPriceConverted; } /** * Gets currency formatter against given currency locale * * @param currencyLocale * {@link Locale} of the currency * @return {@link DecimalFormat} object specialized for the currency or null * if it couldn't be composed * @author Muhammad Haris */ public static DecimalFormat getCurrencyFormatter(Locale currencyLocale) { if (currencyLocale != null) { return (DecimalFormat) NumberFormat .getCurrencyInstance(currencyLocale); } return null; } /** * Adds the appropriate currency symbol to the given price, before or after * the amount as defined by the pattern of the given currency formatter * * @param itemPrice * {@link String} containing price of the item in locale specific * format * @param currencyFormatter * {@link DecimalFormat} object containing currency locale * specific formatting info * @return the price with the currency symbol added if it was missing * @author Muhammad Haris */ public static String appendCurrencySymbol(String itemPrice, DecimalFormat currencyFormatter) { String currencySymbol = currencyFormatter.getDecimalFormatSymbols() .getCurrencySymbol(); String pattern = currencyFormatter.toPattern(); if (!itemPrice.contains(currencySymbol)) { if (pattern.startsWith("¤ ")) { itemPrice = currencySymbol + " " + itemPrice; } else if (pattern.endsWith(" ¤")) { itemPrice = itemPrice + " " + currencySymbol; } else if (pattern.startsWith("¤")) { itemPrice = currencySymbol + itemPrice; } else if (pattern.endsWith("¤")) { itemPrice = itemPrice + currencySymbol; } } return itemPrice; }

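The strictness of NumberFormat parsing described in the article above can be seen with a small, self-contained sketch that uses only the JDK. The class and method names are hypothetical, and Locale.US is chosen purely because its currency format is widely known; the article's it_CH example is not reproduced here because locale data can differ between JDK versions.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Illustrative sketch; CurrencyParseSketch and tryParse are hypothetical names.
final class CurrencyParseSketch {
    /**
     * Parses the amount with the locale's currency format.
     * Returns null when the string does not follow that format.
     */
    static Double tryParse(String amount, Locale locale) {
        NumberFormat formatter = NumberFormat.getCurrencyInstance(locale);
        try {
            return formatter.parse(amount).doubleValue();
        } catch (ParseException e) {
            // The pattern or symbols of the locale were not followed.
            return null;
        }
    }

    public static void main(String[] args) {
        // Follows the en_US currency format, so it parses.
        System.out.println(tryParse("$1,234.56", Locale.US));
        // Missing currency symbol: parse() throws ParseException, so this prints null.
        System.out.println(tryParse("1,234.56", Locale.US));
    }
}
```

Note that NumberFormat.parse(String) is documented as not necessarily consuming the entire string, so a stricter check would also need to verify that nothing is left over after the parsed number, for example by using the parse(String, ParsePosition) overload.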
April 10, 2015

by Muhammad Haris


Parse an XML Response with Java and Dom4J

Previously we’ve explored how to parse XML data using NodeJS as well as PHP. Continuing the trend of parsing data using various programming languages, this time we’re going to take a look at parsing XML data using the dom4j library with Java. Now, dom4j is not the only way to parse XML data in Java. There are many other ways, including the SAX parser. Everyone will have their own opinions on which of the many to use. To keep up with my previous two XML tutorials, we’re going to use the following XML data saved in a file called data.xml at the root of the project: Code Blog Nic Raboy Nic Raboy Maria Campos With our XML content figured out, let’s make sure we structure our project like the following:

project root
    src
        xmlparser
            MainDriver.java
    libs
        dom4j-1.6.1.jar
    build.xml
    data.xml

Based on our project structure, you can probably tell that we’re going to be using Apache Ant for building. Say what you want about Ant, but I’m one of many who still use it. Feel free to switch to Apache Maven or another build tool to better meet your needs. We’re now ready to crack open our src/xmlparser/MainDriver.java to start adding our parse logic. package xmlparser; import java.io.*; import java.util.*; import org.dom4j.*; import org.dom4j.io.*; public class MainDriver { public static void main(String[] args) { } public static void printRecursive(Element element) { } public static Document readFile(String filename) throws Exception { } } To further explain our intentions, the readFile(String filename) function will load the data.xml file and return it as a Document object for further parsing. The printRecursive(Element element) function will iterate through each node of the XML and print it out if it contains text. All levels of the XML will be iterated through.
So let’s start with readFile(String filename): public static Document readFile(String filename) throws Exception { SAXReader reader = new SAXReader(); Document document = reader.read(new File(filename)); return document; } There is nothing much to the above code. In fact, I pulled most of it from the dom4j quick-start code. The printRecursive(Element element) function is where things get more complex: public static void printRecursive(Element element) { for(int i = 0, size = element.nodeCount(); i < size; i++) { Node node = element.node(i); if(node instanceof Element) { Element currentNode = (Element) node; if(currentNode.isTextOnly()) { System.out.println(currentNode.getText()); } printRecursive(currentNode); } } } Some of the above code was taken from the dom4j quick-start, but the rest is custom work. We are basically looking at each node and trying to visit any available children. If none exist, we bail out. We also only want to print elements that contain nothing but text. Finally, we’re looking at the main(String[] args) function to bring it all together: public static void main(String[] args) { try { Element root = readFile("data.xml").getRootElement(); printRecursive(root); } catch (Exception e) { e.printStackTrace(); } } Just like that, we’ve printed each node of our XML document. In case you’re interested in the build.xml code, it can be seen below: To test the project you’d just run ant buildandrun from your command prompt or Terminal, assuming of course that you have Apache Ant configured correctly. The dom4j library is very thorough, so I recommend having a look at the Javadocs that go with it.
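For readers who want to try the recursive walk without adding dom4j to the classpath, the same idea can be sketched with the DOM parser that ships with the JDK. To be clear, this swaps in javax.xml.parsers and org.w3c.dom in place of dom4j; the class name, method names and the sample XML are all hypothetical, and the original data.xml is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustrative sketch using the JDK's built-in DOM API instead of dom4j.
final class DomTextWalkSketch {
    /** Visits every element and collects the text of those that have no child elements. */
    static void collectText(Element element, List<String> out) {
        NodeList children = element.getChildNodes();
        boolean hasChildElement = false;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                hasChildElement = true;
                collectText((Element) child, out); // recurse into nested elements
            }
        }
        if (!hasChildElement) {
            String text = element.getTextContent().trim();
            if (!text.isEmpty()) {
                out.add(text);
            }
        }
    }

    /** Parses the XML string and returns the text of all leaf elements in document order. */
    static List<String> leafTexts(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> out = new ArrayList<>();
            collectText(document.getDocumentElement(), out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document, not the article's data.xml.
        String xml = "<library><name>Example</name>"
                + "<authors><author>A</author><author>B</author></authors></library>";
        for (String text : leafTexts(xml)) {
            System.out.println(text);
        }
    }
}
```

Unlike the dom4j version, this sketch collects the values in a list instead of printing them directly, which makes the traversal easier to check in a unit test.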

April 7, 2015

by Nic Raboy


To Shard, or Not to Shard

When I talk with customers about sharding decisions I often start by telling the following true story… A couple of years ago, a customer came to me looking for advice on how to shard his system. He told me he was already convinced he needed to do that since he read that some smart people at MySQL giants like Facebook and Twitter were sharding—so naturally this was something he should be doing, too. I paused for a moment and then I asked him what the size of his database was. “10GB,” he said. I nodded and asked if he handles many queries or if they were very complicated. “No,” he said. “Just a few hundred queries per second, and they have not been loading down the system by more than a few percent.” I asked him whether he was expecting exponential growth in the near future—looking to double every week or something like that. “No, our load and data size grew about 7 percent last year and we expect about the same growth this year and for the foreseeable future.” My recommendation to him was not to waste time and effort on sharding because it is just not needed in his company’s case. Before you decide how to shard, you’d best understand whether or not you really need to shard to begin with. Yes, on the extremely large-scale side of database demands, sharding is the only game in town. And not just for MySQL, but for pretty much any technology out there. Yet thanks to emerging technologies there is an increasing amount of applications that can run databases without sharding. Today we can easily run with terabytes of data per MySQL instance and serve tens of thousands of queries in many OLTP environments. This allows organizations to build very large applications without needing to shard. And keep this in mind: Sharding is a pain under all circumstances. Even if you have sharding provided out of the box by the database system, it is a pain because it introduces more components and complexity. 
Creating good distributed query execution plans is a very complicated task that needs to take network topology and load into account in addition to the data distribution and load of individual nodes. Before you decide if you need to shard, you should look at alternatives to scale your application. In the MySQL world, the solutions are typically as follows: Alternatives to Sharding Functional Partitioning: In many environments a single MySQL instance becomes a dumping ground for all kinds of databases: you might end up having your main application share a database instance with Drupal, which powers your website, with WordPress, which powers your blog, and with vBulletin, which powers your forums. Splitting those pieces into different database instances is something you should consider before you look into sharding. Custom-made systems will often have many applications using different data sets that can be easily split out. Replication: Many applications are read-heavy, so scaling reads becomes an issue earlier than scaling writes does. Replication is a great solution for this. MySQL’s built-in replication is very robust, though due to its asynchronous nature it adds complexity to the application. The developer must decide which reads can be served from the replica servers and which can’t because they must be absolutely certain to return the most recent data. This is the reason that alternative, synchronous replication technologies for MySQL, like Percona XtraDB Cluster, are gaining popularity: They provide single database-like behavior from the cluster in most cases. Caching and Queueing: Caching is a great technology for reducing the number of reads that hit the database. There are many applications that have reduced read load on the database by 80-95% using this technology. Queueing, in contrast, optimizes writes. It does this by merging multiple write operations together so they hit the database efficiently.
Most large-scale applications should rely heavily on both of these technologies. Memcached and Redis are two popular caching technologies in the MySQL space. For queueing, the most popular technologies are ActiveMQ and RabbitMQ [1]. Supplemental Technologies: MySQL is great at many things but not at everything. If you’re looking for high-performance full-text search, consider ElasticSearch, Sphinx, or Lucene. If you’re looking at large-scale data analytics, a Hadoop-based infrastructure or Vertica might work well for you. You should let MySQL handle the things it is good at, and leave the rest to supporting tools. Optimizations to Make Before Sharding Scaling isn’t just about architecture either. You also need to make sure your system is reasonably optimized. Many people decide sharding is inevitable for them even though there are much easier and more cost-effective ways to get the performance and scale they are looking for. All of which, I might add, are also going to be valuable if sharding is indeed eventually needed. Hardware: Are you using the right hardware? I’ve seen many people looking into sharding when in fact simply purchasing decent hardware would solve their problems for years to come. Make sure you have plenty of memory and high-performance flash storage if you’re working with a large database. In many cases it can transform your system so much it will look like magic. MySQL version and Configuration: Use a recent MySQL version. By that I mean the latest GA version (MySQL 5.6 at the time of this article’s publication). Percona Server, which is free, often offers additional performance improvements for demanding workloads. Use the most recent operating system too, especially if you’re using modern hardware. Finally, make sure MySQL is configured properly. The difference in MySQL performance between poorly configured MySQL and well-tuned MySQL can be 10x or more. 
Schema and Queries: The same application logic can be expressed using a variety of schema and queries. I’ve seen a lot of similar applications approaching things differently, and the difference in the performance between an optimal approach and a poor one (but still used in production) can be 100x or more. Many of the changes can be retrofitted to existing schema—such as minor query changes and changes to the index structure—however, if your schema doesn’t fit your application needs well, then you might be looking at a complete redesign. So it is a good idea to think things through early. When to Shard So when should you start thinking about sharding? Basically, if none of the measures listed above have given you the performance you need, it might be time to consider sharding. Sharding does have the advantage of allowing you to potentially use lower-cost hardware or cheaper cloud instances. Most developers are using agile development methods these days and there is a common term, “Architectural Runway,” which defines how far the application can go with its current architecture. If you’ve already found success using replication in particular, it might be a bad decision to add sharding because it will force your developers to deal with the complexity of sharding and asynchronous replication. However, replication is still typically used to achieve high availability even if you’re already sharding, but in this case it’s not for scaling reads. If you’ve come to the point where you’re sure you need to shard, here are some of the questions you need to ask about how you’ll implement your sharding strategy: Shard Level: At which level should we shard? It does not have to be at the database level. Many applications, SaaS in particular, often “shard” on higher levels, deploying multiple copies of their full stack to offer complete isolation for availability, performance, security etc. 
In many large-scale applications you will see multiple copies of a full stack deployed, each having its own sharded MySQL environment.

Shard Key: How do we shard? In many cases the choice depends on whether you’re authenticating for user accounts or your organization, but in other cases it is not so obvious. When making a sharding choice, you need to think about two things: 1) as many data access points as possible should go into a single shard, because cross-shard access is expensive if supported at all, and 2) the sharding must not produce a shard that is too large to handle, either in terms of data size or traffic. For example, sharding by country is a poor idea because the requirements for handling traffic from Belgium won’t be the same as those for the United States or China, which require a lot more resources.

Shard by Schema or Instance: What is the unit of your shard? The typical choices are MySQL instance or database (schema). I like the shard = database approach, which doesn’t limit you to a single MySQL instance per physical box. That way you do not have to run too many MySQL instances, but you can run more than one if the application works better that way.

Shard Unit: If you shard by a single MySQL server, you will run into a problem with high availability very soon. When you have 100 MySQL servers there are roughly 100 times more chances for one of them to crash compared with having only one, so ensuring there is a high availability solution becomes critical. Instead of sharding across MySQL servers you will usually be sharding across “Replication Clusters,” such as one MySQL primary node and one or several replica or PXC (Percona XtraDB Cluster) nodes.

Shard Technology: What technology can you use to assist you with sharding? Within the MySQL world there is no standard sharding technology as of yet that everyone uses.
Most of the large web properties have implemented something in-house for their sharding needs, and some have released their solutions as open source projects. One example is Vitess, contributed by Google, and another is JetPants, contributed by Tumblr. Rolling out your own simple sharding framework might look easy for some developers until you have to deal with operational issues like balancing the shards, resharding, etc., on a large scale. There are a number of purpose-built technologies that can help you with sharding if this doesn’t sound like something your team can manage.

Sharding Technologies

Here are technologies that you should consider:

MySQL Fabric: This is the sharding technology being developed by the MySQL team at Oracle. MySQL Fabric is GA, but its functionality right now is rather limited, especially in terms of its support for multi-shard queries. Given more time, however, it has the potential to become the standard sharding technology for MySQL.

Tesora: Tesora has a proxy-based solution for MySQL sharding that became open source some time ago. I would especially look at Tesora if you’re also looking at deploying OpenStack, as they’ve invested a lot into the integration.

ScaleArc: ScaleArc is a commercial database proxy solution that can do caching, filtering, routing, and sharding. It is a pretty mature solution that handles multiple database technologies and not just MySQL.

ScaleBase: ScaleBase is a sharding solution designed specifically for MySQL and the cloud, which, similarly to ScaleArc, operates at the proxy level.

There are many technologies in the MySQL space that can help you scale your application without sharding. If you’re going to build the next “Facebook,” however, you will surely need to shard, and there are a number of technologies that can help you do it as painlessly as possible.
Large-scale applications on large-scale databases will always introduce complexity, which makes them more complicated to develop against and manage. Success comes with cost. [1] http://dzone.com/research/guide-to-enterprise-integration Peter Zaitsev co-founded Percona in 2006, assuming the role of CEO. Percona helps companies of all sizes maximize their success with MySQL. Peter enjoys mixing business leadership with hands on technical expertise. Peter is also the co-author of O’Reilly’s High Performance MySQL, one of the most popular books on MySQL performance.

April 2, 2015

by Peter Zaitsev


Dismantling invokedynamic

Many Java developers regarded the JDK's version seven release as somewhat of a disappointment. On the surface, merely a few language and library extensions made it into the release, namely Project Coin and NIO2. But under the covers, the seventh version of the platform shipped the single biggest extension to the JVM's type system ever introduced after its initial release. Adding the invokedynamic instruction did not only lay the foundation for implementing lambda expressions in Java 8, it also was a game changer for translating dynamic languages into the Java byte code format.

While the invokedynamic instruction is an implementation detail for executing a language on the Java virtual machine, understanding how this instruction functions gives true insight into the inner workings of executing a Java program. This article gives a beginner's view on what problem the invokedynamic instruction solves and how it solves it.

Method handles

Method handles are often described as a retrofitted version of Java's reflection API, but this is not what they are meant to represent. While method handles can represent a method, constructor or field, they are not intended to describe properties of these class members. It is for example not possible to directly extract metadata from a method handle such as modifiers or annotation values of the represented method. And while method handles allow for the invocation of a referenced method, their main purpose is to be used together with an invokedynamic call site. For gaining a better understanding of method handles, looking at them as an imperfect replacement for the reflection API is however a reasonable starting point.

Method handles cannot be instantiated directly. Instead, method handles are created by using a designated lookup object. These objects are themselves created by using a factory method that is provided by the MethodHandles class.
Whenever this factory is invoked, it first creates a security context which ensures that the resulting lookup object can only locate methods that are also visible to the class from which the factory method was invoked. A lookup object can then be created as follows:

    import java.lang.invoke.MethodHandles;

    class Example {
        void doSomething() {
            // The lookup object captures the access rights of Example.
            MethodHandles.Lookup lookup = MethodHandles.lookup();
        }
        private void foo() { /* ... */ }
    }

As argued before, the above lookup object can only be used to locate methods that are also visible to the Example class, such as foo. It would for example be impossible to look up a private method of another class. This is a first major difference to using the reflection API, where private methods of outside classes can be located just as any other method and where these methods can even be invoked after marking such a method as accessible. Method handles are therefore sensitive to their creation context.

Apart from that, a method handle is more specific than the reflection API by describing a specific type of method rather than representing just any method. In a Java program, a method's type is a composite of both the method's return type and the types of its parameters. For example, the only method of the following Counter class returns an int representing the number of characters of the only String-typed argument:

    class Counter {
        static int count(String name) {
            return name.length();
        }
    }

A representation of this method's type can be created by using another factory. This factory is found in the MethodType class which also represents instances of created method types.
Using this factory, the method type for Counter::count can be created by handing over the method's return type and its parameter types bundled as an array:

    MethodType methodType = MethodType.methodType(int.class, new Class[] {String.class});

By using the lookup object that was created before and the above method type, it is now possible to locate a method handle that represents the Counter::count method as depicted in the following code:

    MethodType methodType = MethodType.methodType(int.class, new Class[] {String.class});
    MethodHandles.Lookup lookup = MethodHandles.lookup();
    MethodHandle methodHandle = lookup.findStatic(Counter.class, "count", methodType);
    // The (int) cast is required: it tells the compiler which return type to expect.
    int count = (int) methodHandle.invokeExact("foo");
    assertThat(count, is(3));

At first glance, using a method handle might seem like an overly complex version of using the reflection API. However, keep in mind that the direct invocation of a method using a handle is not the main intent of its use.

The main difference between the above example code and invoking a method via the reflection API is only revealed when looking into how the Java compiler translates both invocations into Java byte code. When a Java program invokes a method, this method is uniquely identified by its name, by its (non-generic) parameter types and even by its return type. It is for this reason that it is possible to overload methods in Java. And even though the Java programming language does not allow it, the JVM does in theory allow overloading a method by its return type.

Following this principle, a reflective method call is executed as a common method call of the Method::invoke method. This method is identified by its two parameters which are of the types Object and Object[]. In addition to this, the method is identified by its Object return type. Because of this signature, all arguments to this method always need to be boxed and enclosed in an array.
Similarly, the return value needs to be boxed if it is primitive, or null is returned if the method is void.

Method handles are the exception to this rule. Instead of invoking a method handle by referring to the declared signature of MethodHandle::invokeExact, which takes an Object[] as its single argument and returns Object, method handles are invoked by using a so-called polymorphic signature. A polymorphic signature is created by the Java compiler depending on the types of the actual arguments and the expected return type at a call site. For example, when invoking the method handle as above with

    int count = (int) methodHandle.invokeExact("foo");

the Java compiler translates this invocation as if the invokeExact method was defined to accept a single argument of type String and to return an int. Obviously, such a method does not exist and for (almost) any other method, this would result in a linkage error at runtime. For method handles, the Java Virtual Machine does however recognize this signature to be polymorphic and treats the invocation of the method handle as if the Counter::count method that the handle refers to was inserted directly into the call site. Thus, the method can be invoked without the overhead of boxing primitive arguments or the return value and without placing the argument values inside an array.

At the same time, when using the invokeExact invocation, it is guaranteed to the Java virtual machine that the method handle always references a method at runtime that is compatible with the polymorphic signature. In the example, the JVM expects that the referenced method actually accepts a String as its only argument and that it returns a primitive int. If this constraint is not fulfilled, the execution instead results in a runtime error. However, any other method that accepts a single String and that returns a primitive int could be successfully filled into the method handle's call site to replace Counter::count.
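The interchangeability described above can be tried out in isolation. The following is only a minimal sketch: the class name ExactSwap and the vowels method are made up for this illustration, while count mirrors the Counter class of this article. All three handles have the method type (String)int and therefore fit the same invokeExact call site.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ExactSwap {

    // Same shape as Counter::count from the article.
    static int count(String name) {
        return name.length();
    }

    // Hypothetical second method with the identical type (String)int.
    static int vowels(String name) {
        return name.replaceAll("[^aeiouAEIOU]", "").length();
    }

    // A single call site: the compiler emits the descriptor (String)int here,
    // so any handle of exactly that type can be passed in.
    public static int call(MethodHandle handle, String argument) throws Throwable {
        return (int) handle.invokeExact(argument);
    }

    // Looks up one of the static methods above by name.
    public static MethodHandle find(String name) throws ReflectiveOperationException {
        MethodType type = MethodType.methodType(int.class, String.class);
        return MethodHandles.lookup().findStatic(ExactSwap.class, name, type);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(call(find("count"), "foo"));  // prints 3
        System.out.println(call(find("vowels"), "foo")); // prints 2

        // String::length is virtual; the receiver becomes the first argument,
        // so this handle also has the type (String)int.
        MethodHandle length = MethodHandles.lookup()
                .findVirtual(String.class, "length", MethodType.methodType(int.class));
        System.out.println(call(length, "foo"));         // prints 3
    }
}
```

Note that the call method never learns which method it invokes; only the method type has to match exactly.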
In contrast, using the Counter::count method handle at the following three invocations would result in runtime errors, even though the code compiles successfully:

    int count1 = (int) methodHandle.invokeExact((Object) "foo");
    int count2 = (Integer) methodHandle.invokeExact("foo");
    methodHandle.invokeExact("foo");

The first statement results in an error because the argument that is handed to the handle is too general. While the JVM expected a String as an argument to the method, the Java compiler suggested that the argument would be an Object type. It is important to understand that the Java compiler took the casting as a hint for creating a different polymorphic signature with an Object type as a single parameter type while the JVM expected a String at runtime. Note that this restriction also holds for handing over arguments that are too specific, for example when casting an argument to an Integer where the method handle requires a Number type as its argument. In the second statement, the Java compiler suggested to the runtime that the handle's method would return an Integer wrapper type instead of the primitive int. And without suggesting a return type at all in the third statement, the Java compiler implicitly translated the invocation into a void method call. Hence, invokeExact really does mean exact.

This restriction can sometimes be too harsh. For this reason, instead of requiring an exact invocation, the method handle also allows for a more forgiving invocation where conversions such as type casts and boxing are applied. This sort of invocation can be applied by using the MethodHandle::invoke method. Using this method, the Java compiler still creates a polymorphic signature. This time, the Java virtual machine does however test the actual arguments and the return type for compatibility at runtime and converts them by applying boxing or casts, if appropriate. Obviously, these transformations can sometimes add a runtime overhead.
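The difference between the two invocation modes can be made visible with a small experiment. This is a sketch under the assumption of a made-up class named LenientInvoke that carries its own copy of the count method; both call sites below are compiled with the signature (Object)Integer, which does not match the handle's type (String)int.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

public class LenientInvoke {

    // Same shape as Counter::count from the article.
    static int count(String name) {
        return name.length();
    }

    public static MethodHandle countHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(LenientInvoke.class, "count",
                MethodType.methodType(int.class, String.class));
    }

    // invoke() adapts the call: the Object argument is cast to String
    // and the int return value is boxed into an Integer.
    public static Integer lenient(MethodHandle handle, Object argument) throws Throwable {
        return (Integer) handle.invoke(argument);
    }

    // invokeExact() with the same call site signature is rejected at runtime.
    public static boolean exactFails(MethodHandle handle, Object argument) throws Throwable {
        try {
            Integer ignored = (Integer) handle.invokeExact(argument);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle handle = countHandle();
        System.out.println(lenient(handle, "foo"));    // prints 3
        System.out.println(exactFails(handle, "foo")); // prints true
    }
}
```

The lenient variant is not free of failure either: handing it an argument that is not a String makes the inserted cast fail with a ClassCastException.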
Fields, methods and constructors: handles as a unified interface

Unlike Method instances of the reflection API, method handles can equally reference fields or constructors. The name of the MethodHandle type could therefore be seen as too narrow. Effectively, it does not matter what class member is referenced via a method handle at runtime as long as its MethodType, another type with a misleading name, matches the arguments that are passed at the associated call site.

Using the appropriate factories of a MethodHandles.Lookup object, a field can be looked up to represent a getter or a setter. Using getters or setters in this context does not refer to invoking an actual method that follows the Java bean specification. Instead, the field-based method handle directly reads from or writes to the field, but in the shape of a method call via invoking the method handle. By representing such field access via method handles, field access and method invocations can be used interchangeably.

As an example for such interchange, take the following class:

    class Bean {
        String value;
        void print(String x) {
            System.out.println(x);
        }
    }

For the above Bean class, the following method handles can be used for either writing a string to the value field or for invoking the print method with the same string as an argument:

    MethodHandle fieldHandle = lookup.findSetter(Bean.class, "value", String.class);
    MethodType methodType = MethodType.methodType(void.class, new Class[] {String.class});
    MethodHandle methodHandle = lookup.findVirtual(Bean.class, "print", methodType);

As long as the method handle call site is handed an instance of Bean together with a String while returning void, both method handles could be used interchangeably as shown here:

    anyHandle.invokeExact((Bean) mybean, (String) myString);

Note that the polymorphic signature of the above call site does not match the method type of the above handle.
However, within Java byte code, non-static methods are invoked as if they were static methods where the this reference is handed as a first, implicit argument. A non-static method's nominal type does therefore diverge from its actual runtime type. Similarly, access to a non-static field requires an instance on which the field is accessed.

Similarly to fields and methods, it is possible to locate and invoke constructors which are considered as methods with a void return value for their nominal type. Furthermore, one can not only invoke a method directly but even invoke a super method as long as this super method is reachable for the class from which the lookup factory was created. In contrast, invoking a super method is not possible at all when relying on the reflection API. If required, it is even possible to return a constant value from a handle.

Performance metrics

Method handles are often described as being more performant than the Java reflection API. At least for recent releases of the HotSpot virtual machine, this is not true. The simplest way of proving this is writing an appropriate benchmark. Then again, it is not all too simple to write a benchmark for a Java program which is optimized while it is executed. The de facto standard for writing a benchmark has become using JMH, a harness that ships under the OpenJDK umbrella. The full benchmark can be found as a gist in my GitHub profile. In this article, only the most important aspects of this benchmark are covered.

From the benchmark, it becomes obvious that reflection is already implemented quite efficiently. Modern JVMs know a concept named inflation where a frequently invoked reflective method call is replaced with runtime generated Java byte code. What remains is the overhead of applying the boxing for passing arguments and receiving return values. These boxings can sometimes be eliminated by the JVM's just-in-time compiler but this is not always possible.
For this reason, using method handles can be more performant than using the reflection API if method calls involve a significant amount of primitive values. This does however require that the exact method signatures are already known at compile time such that the appropriate polymorphic signature can be created. For most use cases of the reflection API, this guarantee can however not be given because the invoked method's types are not known at compile time. In this case, using method handles does not offer any performance benefits and should not be used to replace it.

Creating an invokedynamic call site

Normally, invokedynamic call sites are created by the Java compiler only when it needs to translate a lambda expression into byte code. It is worthwhile to note that lambda expressions could have been implemented without invokedynamic call sites altogether, for example by converting them into anonymous inner classes. As a main difference to the suggested approach, using invokedynamic delays the creation of a similar class to runtime. We are looking into class creation in the next section. For now, bear however in mind that invokedynamic does not have anything to do with class creation, it only allows delaying the decision of how to dispatch a method until runtime.

For a better understanding of invokedynamic call sites, it helps to create such call sites explicitly in order to look at the mechanics in isolation. To do so, the following example makes use of my code generation framework Byte Buddy which provides explicit byte code generation of invokedynamic call sites without requiring any knowledge of the byte code format.

Any invokedynamic call site eventually yields a MethodHandle that references the method to be invoked. Instead of invoking this method handle manually, it is however up to the Java runtime to do so.
Because method handles have become a known concept to the Java virtual machine, these invocations are then optimized similarly to a common method call. Any such method handle is received from a so-called bootstrap method which is nothing more than a plain Java method that fulfills a specific signature. For a trivial example of a bootstrap method, look at the following code:

    class Bootstrapper {
        public static CallSite bootstrap(Object... args) throws Throwable {
            MethodType methodType = MethodType.methodType(int.class, new Class[] {String.class});
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle methodHandle = lookup.findStatic(Counter.class, "count", methodType);
            // Bind the call site permanently to Counter::count.
            return new ConstantCallSite(methodHandle);
        }
    }

For now, we do not care much about the arguments of the method. Instead, notice that the method is static, which is as a matter of fact a requirement. Within Java byte code, an invokedynamic call site references the full signature of a bootstrap method but not a specific object which could have a state and a life cycle. Once the invokedynamic call site is invoked, control flow is handed to the referenced bootstrap method which is now responsible for identifying a method handle. Once this method handle is returned from the bootstrap method, it is invoked by the Java runtime.

As is obvious from the above example, a MethodHandle is not returned directly from a bootstrap method. Instead, the handle is wrapped inside of a CallSite object. Whenever a bootstrap method is invoked, the invokedynamic call site is later permanently bound to the CallSite object that is returned from this method. Consequently, a bootstrap method is only invoked a single time for any call site. Thanks to this intermediate CallSite object, it is however possible to exchange the referenced MethodHandle at a later point. For this purpose, the Java class library already offers different implementations of CallSite. We have already seen a ConstantCallSite in the example code above.
As the name suggests, a ConstantCallSite always references the same method handle without a possibility of a later exchange. Alternatively, it is however also possible to use, for example, a MutableCallSite which allows changing the referenced MethodHandle at a later point in time, or it is even possible to implement a custom CallSite class.

With the above bootstrap method and Byte Buddy, we can now implement a custom invokedynamic instruction. For this, Byte Buddy offers the InvokeDynamic instrumentation that accepts a bootstrap method as its only mandatory argument. Such instrumentations are then fed to Byte Buddy. Assuming the following class:

    abstract class Example {
        abstract int method();
    }

we can use Byte Buddy to subclass Example in order to override method. We are then going to implement this method to contain a single invokedynamic call site. Without any further configuration, Byte Buddy creates a polymorphic signature that resembles the method type of the overridden method. However, for non-static methods, the this reference is set as a first, implicit argument. Assuming that we want to bind the Counter::count method which expects a String as a single argument, we could not bind this handle to Example::method because of this type mismatch. Therefore, we need to create a different call site without the implicit argument but with a String in its place.
This can be achieved by using Byte Buddy's domain specific language:

    Instrumentation invokeDynamic = InvokeDynamic
        .bootstrap(Bootstrapper.class.getDeclaredMethod("bootstrap", Object[].class))
        .withoutImplicitArguments()
        .withValue("foo");

With this instrumentation in place, we can finally extend the Example class and override method to implement the invokedynamic call site as in the following code snippet:

    Example example = new ByteBuddy()
        .subclass(Example.class)
        .method(named("method")).intercept(invokeDynamic)
        .make()
        .load(Example.class.getClassLoader(), ClassLoadingStrategy.Default.INJECTION)
        .getLoaded()
        .newInstance();
    int result = example.method();
    assertThat(result, is(3));

As is obvious from the above assertion, the characters of the "foo" string were counted correctly. By setting appropriate break points in the code, it is further possible to validate that the bootstrap method is called and that control flow further reaches the Counter::count method.

So far, we did not gain much from using an invokedynamic call site. The above bootstrap method would always bind Counter::count and can therefore only produce a valid result if the invokedynamic call site really wanted to transform a String into an int. Obviously, bootstrap methods can however be more flexible thanks to the arguments they receive from the invokedynamic call site. Any bootstrap method receives at least three arguments:

As a first argument, the bootstrap method receives a MethodHandles.Lookup object. The security context of this object is that of the class that contains the invokedynamic call site that triggered the bootstrapping. As discussed before, this implies that private methods of the defining class could be bound to the invokedynamic call site using this lookup instance.

The second argument is a String representing a method name. This string serves as a hint to indicate from the call site which method should be bound to it.
Strictly speaking, this argument is not required as it is perfectly legal to bind a method with another name. Byte Buddy simply serves the name of the overridden method as this argument, if not specified differently.

Finally, the MethodType of the method handle that is expected to be returned is served as a third argument. For the example above, we specified explicitly that we expect a String as a single parameter. At the same time, Byte Buddy derived that we require an int as a return value from looking at the overridden method, as we again did not specify any explicit return type.

It is up to the implementor of a bootstrap method what exact signature this method should portray as long as it can at least accept these three arguments. If the last parameter of a bootstrap method represents an Object array, this last parameter is treated as varargs and can therefore accept any excess arguments. This is also the reason why the above example bootstrap method is valid.

Additionally, a bootstrap method can receive several arguments from an invokedynamic call site as long as these arguments can be stored in a class's constant pool. For any Java class, a constant pool stores values that are used inside of a class, largely numbers or string values. As of today, such constants can be primitive values of at least 32 bit size, Strings, Classes, MethodHandles and MethodTypes. This allows bootstrap methods to be used more flexibly, if locating a suitable method handle requires additional information in the form of such arguments.

Lambda expressions

Whenever the Java compiler translates a lambda expression into byte code, it copies the lambda's body into a private method inside of the class in which the expression is defined. These methods are named lambda$X$Y with X being the name of the method that contains the lambda expression and with Y being a zero-based sequence number.
The parameters of such a method are those of the functional interface that the lambda expression implements. Given that the lambda expression makes no use of non-static fields or methods of the enclosing class, the method is also defined to be static.

In exchange, the lambda expression is itself substituted by an invokedynamic call site. On its invocation, this call site requests the binding of a factory for an instance of the functional interface. As arguments to this factory, the call site supplies any values of the lambda expression's enclosing method which are used inside of the expression and a reference to the enclosing instance, if required. As a return value, the factory is required to provide an instance of the functional interface.

For bootstrapping a call site, any invokedynamic instruction currently delegates to the LambdaMetafactory class which is included in the Java class library. This factory is then responsible for creating a class that implements the functional interface and which invokes the appropriate method that contains the lambda's body which, as described before, is stored in the original class. In the future, this bootstrapping process might however change, which is one of the major advantages of using invokedynamic for implementing lambda expressions. If one day a better suited language feature became available for implementing lambda expressions, the current implementation could simply be swapped out.

In order to be able to create a class that implements the functional interface, any call site representing a lambda expression provides additional arguments to the bootstrap method. For the obligatory arguments, it already provides the name of the functional interface's method. Also, it provides a MethodType of the factory method that the bootstrapping is supposed to yield as a result. Additionally, the bootstrap method is supplied another MethodType that describes the signature of the functional interface's method.
To that, it receives a MethodHandle referencing the method that contains the lambda's method body. Finally, the call site provides a MethodType of the generic signature of the functional interface's method, i.e. the signature of the method at the call site before type-erasure was applied. When invoked, the bootstrap method looks at these arguments and creates an appropriate implementation of a class that implements the functional interface. This class is created using the ASM library, a low-level byte code parser and writer that has become the de facto standard for direct Java byte code manipulation. Besides implementing the functional interface's method, the bootstrap method also adds an appropriate constructor and a static factory method for creating instances of the class. It is this factory method that is later bound to the invokedyanmic call site. As arguments, the factory receives an instance to the lambda method's enclosing instance, in case it is accessed and also any values that are read from the enclosing method. As an example, consider the following lambda expression: class Foo { int i; void bar(int j) { Consumer consumer = k -> System.out.println(i + j + k); } } In order to be executed, the lambda expression requires access to both the enclosing instance of Foo and to the value j of its enclosing method. Therefore, the desugared version of the above class looks something like the following where the invokedynamic instruction is represented by some pseudo-code: class Foo { int i; void bar(int j) { Consumer consumer = ; } private /* non-static */ void lambda$foo$0(int j, int k) { System.out.println(this.i + j + k); } } In order to being able to invoke lambda$foo$0, both the enclosing Foo instance and the j variable are handed to the factory that is bound by the invokedyanmic instruction. This factory then receives the variables it requires in order to create an instance of the generated class. 
This generated class would then look something like the following:

    class Foo$$Lambda$0 implements Consumer<Integer> {
      private final Foo _this;
      private final int j;
      private Foo$$Lambda$0(Foo _this, int j) {
        this._this = _this;
        this.j = j;
      }
      private static Consumer<Integer> get$Lambda(Foo _this, int j) {
        return new Foo$$Lambda$0(_this, j);
      }
      public void accept(Object value) { // type erasure
        // lambda$foo$0 is an instance method, so the receiver is not passed again as an argument
        _this.lambda$foo$0(j, (Integer) value);
      }
    }

Eventually, the factory method of the generated class is bound to the invokedynamic call site via a method handle that is contained by a ConstantCallSite. However, if the lambda expression is fully stateless, i.e. it does not require access to the instance or method in which it is enclosed, the LambdaMetafactory returns a so-called constant method handle that references an eagerly created instance of the generated class. Hence, this instance serves as a singleton that is used every time the lambda expression's call site is reached. Obviously, this optimization decision affects your application's memory footprint and is something to keep in mind when writing lambda expressions. Also, no factory method is added to the class of a stateless lambda expression.

You might have noticed that the lambda expression's method body is contained in a private method which is now invoked from another class. Normally, this would result in an illegal access error. To overcome this limitation, the generated classes are loaded using so-called anonymous class loading. Anonymous class loading can only be applied when a class is loaded explicitly by handing over a byte array. Also, it is not normally possible to apply anonymous class loading in user code as it is hidden away in the internal classes of the Java class library. When a class is loaded using anonymous class loading, it receives a host class from which it inherits its full security context.
This involves both method and field access rights and the protection domain, such that a lambda expression can also be generated for signed jar files. Using this approach, lambda expressions can be considered more secure than anonymous inner classes because private methods are never reachable from outside of a class.

Under the covers: lambda forms

Lambda forms are an implementation detail of how MethodHandles are executed by the virtual machine. Because of their name, lambda forms are however often confused with lambda expressions. In fact, lambda forms are inspired by lambda calculus and received their name for that reason, not for their actual usage to implement lambda expressions in the OpenJDK.

In earlier versions of OpenJDK 7, method handles could be executed in one of two modes. Method handles were either directly rendered as byte code or they were dispatched using explicit assembly code that was supplied by the Java runtime. The byte code rendering was applied to any method handle that was considered to be fully constant throughout the lifetime of a Java class. If the JVM could however not prove this property, the method handle was instead executed by dispatching it to the supplied assembly code. Unfortunately, because assembly code cannot be optimized by Java's JIT compiler, this led to non-constant method handle invocations "falling off the performance cliff". As this also affected the lazily bound lambda expressions, this was obviously not a satisfactory solution.

LambdaForms were introduced to solve this problem. Roughly speaking, lambda forms represent byte code instructions which, as stated before, can be optimized by a JIT compiler. In the OpenJDK, a MethodHandle's invocation semantics are today represented by a LambdaForm to which the handle carries a reference. With this optimizable intermediate representation, the use of non-constant MethodHandles has become significantly more performant.
As a matter of fact, it is even possible to see a byte code-compiled LambdaForm in action. Simply place a break point inside of a bootstrap method or inside of a method that is invoked via a MethodHandle. Once the break point kicks in, the byte code-translated LambdaForms can be found on the call stack.

Why this matters for dynamic languages

Any language that should be executed on the Java virtual machine needs to be translated to Java byte code. And as the name suggests, Java byte code aligns rather closely with the Java programming language. This includes the requirement to define a strict type for any value. Also, before invokedynamic was introduced, a method call was required to specify an explicit target class for dispatching a method. Looking at the following JavaScript code, specifying either piece of information is however not possible when translating the method into byte code:

    function (foo) {
      foo.bar();
    }

Using an invokedynamic call site, it has become possible to delay the identification of the method's dispatcher until runtime and, furthermore, to rebind the invocation target in case a previous decision needs to be corrected. Before, using the reflection API with all of its performance drawbacks was the only real alternative for implementing a dynamic language.

The real beneficiaries of the invokedynamic instruction are therefore dynamic programming languages. Adding the instruction was a first step away from aligning the byte code format with the Java programming language, making the JVM a powerful runtime even for dynamic languages. And as lambda expressions proved, this stronger focus on hosting dynamic languages on the JVM does not interfere with evolving the Java language either. On the contrary, the Java programming language gained from these efforts.
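The foo.bar() problem above, where the receiver's class is only known at runtime, can be approximated in plain Java with the method handle API. The following is a minimal sketch and not how any particular language runtime is implemented: the classes Duck and Robot and the helper callBar are hypothetical names made up for illustration, and the lookup is repeated on every call, whereas a real runtime would presumably cache the resolved handle in a call site.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class DynamicDispatchSketch {

    // Two unrelated, hypothetical classes that merely share a method name.
    public static class Duck {
        public String bar() { return "duck"; }
    }

    public static class Robot {
        public String bar() { return "robot"; }
    }

    // Resolves "bar" against the runtime class of the receiver and invokes it,
    // roughly what a dynamic language has to do for foo.bar().
    public static String callBar(Object foo) {
        try {
            MethodHandle handle = MethodHandles.lookup()
                    .findVirtual(foo.getClass(), "bar", MethodType.methodType(String.class));
            // invoke() adapts the (Object)String call to the handle's exact type.
            return (String) handle.invoke(foo);
        } catch (Throwable t) {
            // Lookup failures and exceptions thrown by the target end up here.
            throw new IllegalStateException("Could not dispatch bar()", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callBar(new Duck()));  // prints duck
        System.out.println(callBar(new Robot())); // prints robot
    }
}
```

Neither Duck nor Robot implements a common interface, yet both calls succeed because the target is identified only when callBar runs.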

April 2, 2015

by Rafael Winterhalter


How CAS (Compare And Swap) in Java works

Before we dig into the CAS (Compare And Swap) strategy and how it is used by atomic constructs like AtomicInteger, first consider this code:

    public class MyApp {
        private volatile int count = 0;
        public void updateVisitors() {
            ++count; //increment the visitors count
        }
    }

This sample code tracks the count of visitors to the application. Is there anything wrong with this code? What will happen if multiple threads try to update count? The problem is that simply marking count as volatile does not guarantee atomicity, and ++count is not an atomic operation: it reads the value, adds one, and writes the result back, so two threads can read the same value and one update is lost.

Can we solve this problem if we mark the method itself synchronized, as shown below?

    public class MyApp {
        private int count = 0;
        public synchronized void updateVisitors() {
            ++count; //increment the visitors count
        }
    }

Will this work? If yes, then what have we actually changed? Does this code guarantee atomicity? Yes. Does this code guarantee visibility? Yes. Then what is the problem? It makes use of locking, and that introduces a lot of delay and overhead. This is a very expensive way of making things work.

To overcome these problems, atomic constructs were introduced. If we make use of an AtomicInteger to track the count, it will work:

    import java.util.concurrent.atomic.AtomicInteger;

    public class MyApp {
        private AtomicInteger count = new AtomicInteger(0);
        public void updateVisitors() {
            count.incrementAndGet(); //increment the visitors count
        }
    }

The classes that support atomic operations, e.g. AtomicInteger, AtomicLong etc., make use of CAS. CAS does not make use of locking; rather, it is very optimistic in nature. It follows these steps:

1. Compare the value of the primitive to the value we have in hand.
2. If the values do not match, it means some other thread has changed the value in between. Otherwise, go ahead and swap the value with the new value.
Check the following code in the AtomicLong class:

    public final long incrementAndGet() {
        for (;;) {
            long current = get();
            long next = current + 1;
            if (compareAndSet(current, next))
                return next;
        }
    }

In JDK 8 the above code has been changed to a single intrinsic:

    public final long incrementAndGet() {
        return unsafe.getAndAddLong(this, valueOffset, 1L) + 1L;
    }

What advantage does this single intrinsic have? This single line is a JVM intrinsic which is translated by the JIT into an optimized instruction sequence. On the x86 architecture it is just a single CPU instruction, LOCK XADD, which might yield better performance than the classic load/CAS loop.

Now think about the case where we have high contention and a number of threads want to update the same atomic variable. In that case there is a possibility that locking will outperform the atomic variables, but at realistic contention levels atomic variables outperform locks.

There is one more construct introduced in Java 8, LongAdder. As per the documentation:

This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.

So LongAdder is not always a replacement for AtomicLong. We need to consider the following aspects:

1. When no contention is present, AtomicLong performs better.
2. LongAdder allocates Cells (a final class declared in the abstract class Striped64) to avoid contention, which consumes memory. So in case we have a tight memory budget we should prefer AtomicLong.

That's all folks. Hope you enjoyed it.
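To see the CAS retry loop and LongAdder side by side, here is a small sketch. It is an illustration under assumptions rather than a benchmark: the class name CasCounterSketch and the helpers incrementWithCas and runWorkers are made up for this example, and the thread and iteration counts are arbitrary.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

public class CasCounterSketch {

    // Hand-written CAS retry loop with the same effect as incrementAndGet().
    public static int incrementWithCas(AtomicInteger counter) {
        for (;;) {
            int current = counter.get();
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
            // Another thread changed the value in between: read again and retry.
        }
    }

    // Starts the given number of threads; each one increments both counters perThread times.
    // Returns { AtomicInteger total, LongAdder total }.
    public static long[] runWorkers(int threads, int perThread) {
        AtomicInteger atomic = new AtomicInteger();
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    incrementWithCas(atomic);
                    adder.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // sum() is exact here because all writers have finished.
        return new long[] { atomic.get(), adder.sum() };
    }

    public static void main(String[] args) {
        long[] totals = runWorkers(4, 100_000);
        System.out.println(totals[0] + " " + totals[1]); // prints 400000 400000
    }
}
```

Both counters end at the same total without any lock. Which one is faster depends on contention, as discussed above, and would have to be measured on the target machine.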

April 1, 2015

by Akhil Mittal


Fork/Join Framework vs. Parallel Streams vs. ExecutorService: The Ultimate Fork/Join Benchmark

How does the Fork/Join framework act under different configurations?

Just like the upcoming episode of Star Wars, there has been a lot of excitement mixed with criticism around Java 8 parallelism. The syntactic sugar of parallel streams brought some hype, almost like the new lightsaber we’ve seen in the trailer. With many ways now to do parallelism in Java, we wanted to get a sense of the performance benefits and the dangers of parallel processing. After over 260 test runs, some new insights rose from the data and we wanted to share these with you in this post.

ExecutorService vs. Fork/Join Framework vs. Parallel Streams

A long time ago, in a galaxy far, far away... I mean, some 10 years ago, higher-level concurrency utilities were available in Java only through 3rd party libraries. Then came Java 5 and introduced the java.util.concurrent library as part of the standard library, strongly influenced by Doug Lea. The ExecutorService became available and provided us a straightforward way to handle thread pools. Of course java.util.concurrent keeps evolving, and in Java 7 the Fork/Join framework was introduced, building on top of the ExecutorService thread pools. With Java 8 streams, we’ve been provided an easy way to use Fork/Join that remains a bit enigmatic for many developers. Let’s find out how they compare to one another.

We’ve taken 2 tasks, one CPU-intensive and the other IO-intensive, and tested 4 different scenarios with the same basic functionality. Another important factor is the number of threads we use for each implementation, so we tested that as well. The machine we used had 8 cores available, so we had variations of 4, 8, 16 and 32 threads to get a sense of the general direction the results are going.
For each of the tasks, we’ve also tried a single threaded solution, which you’ll not see in the graphs since, well, it took much, much longer to execute. To learn more about exactly how the tests ran you can check out the groundwork section below. Now, let’s get to it.

Indexing a 6GB file with 5.8M lines of text

In this test, we’ve generated a huge text file, and created similar implementations for the indexing procedure. Here’s what the results looked like:

** Single threaded execution: 176,267msec, or almost 3 minutes.
** Notice the graph starts at 20000 milliseconds.

1. Fewer threads will leave CPUs unutilized, too many will add overhead

The first thing you notice in the graph is the shape the results are starting to take - you can get an impression of how each implementation behaves from only these 4 data points. The tipping point here is between 8 and 16 threads: since some threads are blocking on file IO, adding more threads than cores helped utilize the cores better. With 32 threads, performance got worse because of the additional overhead.

2. Parallel Streams are the best! Almost 1 second better than the runner up: using Fork/Join directly

Syntactic sugar aside (lambdas! we didn’t mention lambdas), we’ve seen parallel streams perform better than the Fork/Join and the ExecutorService implementations. 6GB of text indexed in 24.33 seconds. You can trust Java here to deliver the best result.

3. But… Parallel Streams also performed the worst: The only variation that went over 30 seconds

This is another reminder of how parallel streams can slow you down. Let’s say this happens on machines that already run multithreaded applications. With a smaller number of threads available, using Fork/Join directly could actually be better than going through parallel streams - a 5 second difference, which makes for about an 18% penalty when comparing these 2 together.

4. Don’t go for the default pool size with IO in the picture

Using the default pool size for Parallel Streams, which equals the number of cores on the machine (8 here), performed almost 2 seconds worse than the 16 threads version. That’s a 7% penalty for going with the default pool size. The reason this happens is related to threads blocking on IO. There’s more waiting going on, so introducing more threads lets us get more out of the CPU cores involved while other threads wait to be scheduled instead of being idle.

How do you change the default Fork/Join pool size for parallel streams? You can either change the common Fork/Join pool size using a JVM argument:

    -Djava.util.concurrent.ForkJoinPool.common.parallelism=16

(All Fork/Join tasks are using a common static pool the size of the number of your cores by default. The benefit here is reducing resource usage by reclaiming the threads for other tasks during periods of no use.)

Or... You can use this trick and run Parallel Streams within a custom Fork/Join pool. This overrides the default use of the common Fork/Join pool and lets you use a pool you’ve set up yourself. Pretty sneaky. In the tests, we’ve used the common pool.

5. Single threaded performance was 7.25x worse than the best result

Parallelism provided a 7.25x improvement, and considering the machine had 8 cores, it got pretty close to the theoretical 8x prediction! We can attribute the rest to overhead. With that being said, even the slowest parallelism implementation we tested, which this time was parallel streams with 4 threads (30.24sec), performed 5.8x better than the single threaded solution (176.27sec).

What happens when you take IO out of the equation? Checking if a number is prime

For the next round of tests, we’ve eliminated IO altogether and examined how long it would take to determine if some really big number is prime or not. How big? 19 digits.
1,530,692,068,127,007,263, or in other words: one quintillion five hundred thirty quadrillion six hundred ninety two trillion sixty eight billion one hundred twenty seven million seven thousand two hundred sixty three. Argh, let me get some air. Anyhow, we haven’t used any optimization other than running to its square root, so we checked all even numbers even though our big number doesn’t divide by 2, just to make it process longer. Spoiler alert: it’s a prime, so each implementation ran the same number of calculations. Here’s how it turned out:

** Single threaded execution: 118,127msec, or almost 2 minutes.
** Notice the graph starts at 20000 milliseconds.

1. Smaller differences between 8 and 16 threads

Unlike the IO test, we don’t have IO calls here, so the performance of 8 and 16 threads was mostly similar, except for the Fork/Join solution. We actually ran a few more sets of tests to make sure we’re getting good results here because of this “anomaly”, but it turned out very similar time after time. We’d be glad to hear your thoughts about this in the comment section below.

2. The best results are similar for all methods

We see that all implementations share a similar best result of around 28 seconds. No matter which way we tried to approach it, the results came out the same. This doesn’t mean that we’re indifferent to which method to use. Check out the next insight.

3. Parallel streams handle the thread overload better than other implementations

This is the more interesting part. With this test, we see again that the top results for running 16 threads are coming from using parallel streams. Moreover, in this version, using parallel streams was a good call for all variations of thread numbers.

4. Single threaded performance was 4.2x worse than the best result

In addition, the benefit of using parallelism when running computationally intensive tasks is almost 2 times smaller than in the test with file IO.
This makes sense since it’s a CPU intensive test, unlike the previous one where we could get an extra benefit from cutting down the time our cores were waiting on threads stuck with IO.

Conclusion

I’d recommend going to the source to learn more about when to use parallel streams and applying careful judgement anytime you do parallelism in Java. The best path to take would be running similar tests to these in a staging environment where you can try and get a better sense of what you’re up against. The factors you have to be mindful of are of course the hardware you’re running on (and the hardware you’re testing on), and the total number of threads in your application. This includes the common Fork/Join pool and code other developers on your team are working on. So try to keep those in check and get a full view of your application before adding parallelism of your own.

Groundwork

To run this test we’ve used an EC2 c3.2xlarge instance with 8 vCPUs and 15GB of RAM. A vCPU means there’s hyperthreading in place, so in fact we have here 4 physical cores that each act as if they were 2. As far as the OS scheduler is concerned, we have 8 cores here. To try and make it as fair as we could, each implementation ran 10 times and we’ve taken the average run time of runs 2 through 9. That’s 260 test runs, phew! Another thing that was important is the processing time. We’ve chosen tasks that would take well over 20 seconds to process so the differences will be easier to spot and less affected by external factors.

What’s next?

The raw results are available right here, and the code is on GitHub. Please feel free to tinker around with it and let us know what kind of results you’re getting. If you have any more interesting insights or explanations for the results that we’ve missed, we’d be happy to read them and add them to the post.

Originally posted on Takipi's blog
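The custom pool trick mentioned in the IO test, running a parallel stream inside a Fork/Join pool you created yourself, could look roughly like the sketch below. It is not the benchmark's code: the class name ParallelPrimeSketch and the method isPrime are made up for illustration, and the approach relies on the commonly observed behavior that a parallel stream started from inside a Fork/Join task runs in that task's pool, which the stream API does not formally promise.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class ParallelPrimeSketch {

    // Trial division up to the square root, spread over a pool of the given size.
    public static boolean isPrime(long n, int parallelism) {
        if (n < 2) {
            return false;
        }
        long limit = (long) Math.sqrt((double) n);
        Callable<Boolean> task = () ->
                LongStream.rangeClosed(2, limit)
                        .parallel()
                        .noneMatch(divisor -> n % divisor == 0);
        // A dedicated pool instead of the common pool.
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // join() waits for the result without checked exceptions.
            return pool.submit(task).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(isPrime(2147483647L, 16)); // prints true (2^31 - 1 is prime)
        System.out.println(isPrime(2147483649L, 16)); // prints false (divisible by 3)
    }
}
```

Changing the second argument is the in-code counterpart of the JVM argument shown earlier, except that it only affects this one computation and leaves the common pool alone.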

April 1, 2015

by Chen Harel


Get Client (Browser) timezone and maintain it in cookie

Recently, I came across a requirement to get the browser's timezone and maintain it so that our Spring MVC application can use it. Our application needs to convert dates and times from the server timezone to the client timezone.

Below is the overall idea of the implementation. Get the browser timezone with JavaScript. We can use the open source 'jstz.min.js' file for this, which can be found at 'http://pellepim.bitbucket.org/jstz/'. We need to maintain this timezone, so we will store it in a cookie. This can be done by creating a JSP, 'findTimeZonePage.jsp'. This page stores the timezone in a cookie and then redirects back to the original page. Every method of the Spring MVC controller checks whether the cookie is available. If not, it redirects to findTimeZonePage.jsp. While doing this we also pass the current URL (set in the model) so that findTimeZonePage.jsp can redirect to the same page again.

Code:

1. findTimeZonePage.jsp

loading the page...

2. Add the methods below to a Util class:

    public static TimeZone getBrowserTimeZone(HttpServletRequest request) {
        Cookie[] cookieArray = request.getCookies();
        if (cookieArray != null) {
            for (Cookie cookie : cookieArray) {
                if ("CalenderAppTimeZone".equals(cookie.getName())) {
                    String timeZoneId = cookie.getValue();
                    return TimeZone.getTimeZone(timeZoneId);
                }
            }
        }
        // No timezone cookie has been set yet.
        return null;
    }

    public static String getFullURL(HttpServletRequest request) {
        StringBuffer requestURL = request.getRequestURL();
        String queryString = request.getQueryString();
        if (queryString == null) {
            return requestURL.toString();
        } else {
            return requestURL.append('?').append(queryString).toString();
        }
    }

3. In each method of the MVC controller class, add the code below at the start of the method:

    TimeZone currentTimeZone = MyUtil.getBrowserTimeZone(request);
    if (currentTimeZone == null) {
        String url = MyUtil.getFullURL(request);
        System.out.println("Url=" + url);
        model.addAttribute("redirectUrl", url);
        //Redirect to 'findTimeZone' for setting timezone.
        System.out.println("####Timezone is not set. Redirecting to findTimeZone.jsp for setting timezone.");
        return "findTimeZonePage";
    }
    System.out.println("####Current TimeZone=" + currentTimeZone.getID());

Hope this will help.
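Once the timezone ID is available from the cookie, the actual conversion from server time to client time could be sketched as follows. This is only an illustration of the conversion step, kept free of the servlet API: the class name ClientTimeSketch, the method formatForClient, and the date pattern are assumptions and not part of the code above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class ClientTimeSketch {

    // Formats a server-side Date as wall-clock time in the client's timezone.
    public static String formatForClient(Date serverTime, String timeZoneId) {
        // Note: TimeZone.getTimeZone falls back to GMT for IDs it does not understand.
        TimeZone clientZone = TimeZone.getTimeZone(timeZoneId);
        // Locale.ROOT keeps the digits and calendar independent of the server's locale.
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        format.setTimeZone(clientZone);
        return format.format(serverTime);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // 1970-01-01T00:00:00Z
        System.out.println(formatForClient(epoch, "UTC"));          // prints 1970-01-01 00:00
        System.out.println(formatForClient(epoch, "Asia/Kolkata")); // prints 1970-01-01 05:30
    }
}
```

Because an unknown ID silently becomes GMT, it may be worth validating the cookie value (for example against TimeZone.getAvailableIDs()) before trusting it.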

March 28, 2015

by Rajeshkumar Dave


6 Python Performance Tips

[This post was written by John Paul Mueller]

Python is such a cool language because you can do so much with it in such a short time with so little code. Not only that, it supports many tasks, such as multiprocessing, with ease. Python detractors sometimes claim Python is slow. But it doesn’t have to be that way: try these six tips to speed up your Python applications.

1. Rely on an external package for critical code

Python makes many programming tasks easy, but it may not always provide the best performance with time-critical tasks. Using a C, C++, or machine language external package for time-critical tasks can improve application performance. These packages are platform-specific, which means that you need the appropriate package for the platform you’re using. In short, this solution gives up some application portability in exchange for performance that you can obtain only by programming directly to the underlying host. Here are some packages you should consider adding to your performance arsenal:

Cython
PyInline
PyPy
Pyrex

The packages act in different ways. For example, Pyrex makes it possible to extend Python to do things like use C data types to make memory tasks more efficient or straightforward. PyInline lets you use C code directly in your Python application. The inline code is compiled separately, but it keeps everything in one place while making use of the efficiencies that C can provide.

2. Use keys for sorts

There is a lot of really old Python sorting code out there that will cost you time in creating a custom sort and speed in actually performing the sort during runtime. The best way to sort items is to use keys and the default sort() method whenever possible.
For example, consider the following code:

    import operator
    somelist = [(1, 5, 8), (6, 2, 4), (9, 7, 5)]
    somelist.sort(key=operator.itemgetter(0))
    somelist
    #Output = [(1, 5, 8), (6, 2, 4), (9, 7, 5)]
    somelist.sort(key=operator.itemgetter(1))
    somelist
    #Output = [(6, 2, 4), (1, 5, 8), (9, 7, 5)]
    somelist.sort(key=operator.itemgetter(2))
    somelist
    #Output = [(6, 2, 4), (9, 7, 5), (1, 5, 8)]

In each case the list is sorted according to the index you select as part of the key argument. This approach works just as well with strings as it does with numbers.

3. Optimizing loops

Every programming language emphasizes the need to optimize loops. When working with Python, you can rely on a wealth of techniques for making loops run faster. However, one method developers often miss is to avoid the use of dots within a loop. For example, consider the following code:

    lowerlist = ['this', 'is', 'lowercase']
    upper = str.upper
    upperlist = []
    append = upperlist.append
    for word in lowerlist:
        append(upper(word))
    print(upperlist)
    #Output = ['THIS', 'IS', 'LOWERCASE']

Every time you make a call to str.upper, Python evaluates the method. However, if you place the evaluation in a variable, the value is already known and Python can perform tasks faster. The point is to reduce the amount of work that Python performs within loops because the interpreted nature of Python can really slow it down in those instances.

(Note: There are many ways to optimize loops; this is only one of them. For example, many programmers would say that list comprehension is the best way to achieve speed benefits in loops. The key is that optimizing loops is one of the better ways to achieve higher application speed.)

4. Use a newer version

Anyone who searches for Python information online will find countless messages asking about moving from one version of Python to another. In general, every version of Python has included optimizations that make it faster than the previous version.
The limiting factor is whether your favorite libraries have also made the move to the newer version of Python. Rather than asking whether the move should be made, the key question is to determine when a new version has sufficient support to make a move viable.

You need to verify that your code still runs. You need to use the new libraries you obtained to use with the new version of Python and then check your application for breaking changes. Only after you make the required corrections will you notice any difference.

However, if you just ensure your application runs with the new version, you could miss out on new features found in the update. Once you make the move, profile your application under the new version, check for problem areas, and then update those areas to use new version features first. Users will see a larger performance gain earlier in the upgrade process.

5. Try multiple coding approaches

Using precisely the same coding approach every time you create an application will almost certainly result in some situations where the application runs slower than it might. Try a little experimentation as part of the profiling process. For example, when managing items in a dictionary, you can take the safe approach of determining whether the item already exists and updating it, or you can add the item directly and then handle the situation where the item doesn’t exist as an exception. Consider this first coding example:

    n = 16
    mydict = {}
    for i in range(0, n):
        char = 'abcd'[i%4]
        if char not in mydict:
            mydict[char] = 0
        mydict[char] += 1
    print(mydict)

This code will generally run faster when mydict is empty to start with. However, when mydict is usually filled (or at least mostly filled) with data, an alternative approach works better.

    n = 16
    mydict = {}
    for i in range(0, n):
        char = 'abcd'[i%4]
        try:
            mydict[char] += 1
        except KeyError:
            mydict[char] = 1
    print(mydict)

The output of {'d': 4, 'c': 4, 'b': 4, 'a': 4} is the same in both cases.
The only difference is how the output is obtained. Thinking outside the box and creating new coding techniques can help you obtain faster results with your applications.

6. Cross-compile your application

Developers sometimes forget that computers don’t actually speak any of the languages used to create modern applications. Computers speak machine code. In order to run the application, you use an application to convert the human readable code you use into something the computer can understand. There are times when writing an application in one language, such as Python, and running it in another language, such as C++, makes sense from a performance perspective. It depends on what you want the application to do and the resources that the host system can provide.

One interesting cross-compiler, Nuitka, converts your Python code into C++ code. The result is that you can execute the application in native mode instead of relying on an interpreter. Depending on the platform and task, you could see a significant performance increase.

(Note: Nuitka is currently in beta, so use it with care on production applications. In fact, it’s best used for experimentation right now. There is also some discussion as to whether cross-compilation is the best way to achieve better performance. Developers have used cross-compilation for years to achieve specific goals, such as better application speed. Just remember that every solution comes with trade-offs and you should consider them before using the solution in a production environment.)

When working with a cross-compiler, be sure it supports the version of Python you work with. Nuitka supports Python 2.6, 2.7, 3.2, and 3.3. To make this solution work, you need both a Python interpreter and a C++ compiler. Nuitka supports a number of C++ compilers, including Microsoft Visual Studio, MinGW, and Clang/LLVM.

Cross-compilation can bring some serious downsides.
For example, when working with Nuitka, you find that even a small program can consume major drive space because Nuitka implements Python functionality using a number of dynamic link libraries (DLLs). So this solution may not work well if you’re dealing with a resource-constrained system.

Bottom line

Each of the six tips in this article can help you create faster Python applications. But there are no silver bullets. None of the tips will work every time. Some work better than others with specific versions of Python, and even the platform can make a difference. You need to profile your application to determine where it works slowly and then try the tips that appear to best address those issues.

March 27, 2015

by Fredric Paul
