Java vs. Go Microservices - Load testing (Rematch)
When Go first appeared in November 2009, we didn't hear much about it. Our first interaction with it came in 2012, when Google officially released Go version 1. Our team tried to convince our customer to use it for their project, but it was a hard sell, and the customer rejected our recommendation (mostly due to a lack of Go knowledge on their support team).
A lot has changed since then: Docker and microservice architecture became popular, and Go matured as a programming language (without any changes to its syntax). So, my brother and I decided to take another look at Go, and our journey began. We started reading the official documentation, tutorials, blog posts, and articles about Go, especially ones where the authors shared their experience of migrating from Java to Go or compared Java with Go, since at that point we had been using Java for 15+ years.
One of the articles we came across compared Java and Go for microservice implementation and asked which could serve more users on similar hardware. With long experience in Java and a good sense of its strong and weak points, and with only a couple of months of experience in Go but already aware of its virtues (compiled language, goroutines != threads), we expected to see very close results, with a smaller footprint (CPU/memory usage) for Go.
So, we were surprised to see that Go started falling behind at 2k concurrent users. Since we had decided to invest our time in learning Go, it was very important for us to find out what caused such results. In this article, we'll explain how we investigated the issue and how we tested our changes.
A Brief Description of the Original Experiment
The author created a bank service with 3 APIs:
POST /client/new/{balance} — create a new client with an initial balance.
POST /transaction — moves money from one account to another.
GET /client/{id}/balance — returns current balance for the client.
The service was implemented in both Java and Go, with PostgreSQL for persistent storage. To test the services, the author created a JMeter scenario and ran it with different numbers of concurrent users, from 1k to 10k.
All this was run on AWS.
| Number of users | Java: response time (sec) | Java: errors (%) | Go: response time (sec) | Go: errors (%) |
|---|---|---|---|---|
| ... | ... | ... | ... | ... |
| 4k | 5.96 | 2.63% | 14.20 | 6.62% |
| ... | ... | ... | ... | ... |
| 10k | 42.59 | 16.03% | 46.50 | 39.30% |
TL;DR
The root cause of the problem was the limited number of available connections on Postgres (100 by default) combined with improper use of SQL DB objects. After fixing those issues, both services showed similar results, and the only difference was a bigger CPU and memory footprint in Java (everything's bigger in Java).
Let's Dive In...
We decided to start by analyzing why the error rate was so high in the Go version of the service. To do so, we added logging to the original code and ran the load tests ourselves. After analyzing the logs, we noticed that all the errors were caused by failures to open connections to the database.
After looking into the code, the first thing that came to our attention was that for each API call it created a new sql.DB (https://github.com/nikitsenka/bank-go/blob/2ab1ef2ce8959dd1bc5eb5d324e39ab296efbbe5/bank/postgres.go#L57).
func GetBalance(client_id int) int {
    db, err := newDb()
    ...
}
func newDb() (*sql.DB, error) {
    dbinfo := fmt.Sprintf("host=%s user=%s password=%s dbname=%s sslmode=disable",
        DB_HOST, DB_USER, DB_PASSWORD, DB_NAME)
    db, err := sql.Open("postgres", dbinfo)
    ...
    return db, err
}
The first thing that you need to know about is that a sql.DB isn’t a database connection. Here is what official documentation says about it:
DB is a database handle representing a pool of zero or more underlying connections. It's safe for concurrent use by multiple goroutines. The sql package creates and frees connections automatically; it also maintains a free pool of idle connections. The returned DB is safe for concurrent use by multiple goroutines and maintains its own pool of idle connections. Thus, the Open function should be called just once. It is rarely necessary to close a DB.
If an app fails to release connections back to the pool, it can cause a sql.DB to open many other connections, potentially running out of resources (too many connections, too many open file handles, lack of available network ports, etc.). The following code snippet can cause a connection leak (https://github.com/nikitsenka/bank-go/blob/2ab1ef2ce8959dd1bc5eb5d324e39ab296efbbe5/bank/postgres.go#L74):
func GetBalance(client_id int) int {
    ...
    err = db.QueryRow(...)
    checkErr(err)
    db.Close()
    ...
}
func checkErr(err error) {
    if err != nil {
        fmt.Println(err);
        panic(err)
    }
}
If QueryRow fails with an error, the code panics during the checkErr call, and db.Close is never called.
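The ordering problem can be shown in isolation. The following is a minimal sketch, not the original author's code: `resource` is a hypothetical stand-in for `*sql.DB` so that it runs without a database, and `closeAfter`, `closeDeferred`, and `leaked` are names made up for this illustration. It contrasts calling Close after the work with registering it through `defer` before the work starts.

```go
package main

import "fmt"

// resource is a hypothetical stand-in for *sql.DB.
type resource struct{ closed bool }

func (r *resource) Close() { r.closed = true }

// closeAfter mirrors the original ordering: Close is skipped if work panics.
func closeAfter(r *resource, work func()) {
	work()
	r.Close()
}

// closeDeferred registers Close up front, so it runs even if work panics.
func closeDeferred(r *resource, work func()) {
	defer r.Close()
	work()
}

// leaked runs fn with work that always panics, recovers the panic,
// and reports whether the resource was left open.
func leaked(fn func(*resource, func())) (open bool) {
	r := &resource{}
	defer func() {
		recover()
		open = !r.closed
	}()
	fn(r, func() { panic("query failed") })
	return !r.closed
}

func main() {
	fmt.Println("close after work, leaked:", leaked(closeAfter))    // true
	fmt.Println("deferred close, leaked:  ", leaked(closeDeferred)) // false
}
```

Deferred calls run while a panic unwinds the stack, which is why the second variant releases the resource even when the query fails.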
The Java implementation also doesn't use a connection pool, but at least it doesn't leak connections, since each connection is opened in a try-with-resources block (https://github.com/nikitsenka/bank-java/blob/4708bccaff32023078fbd8e6a1e8e4c1d1d4296f/src/main/java/com/nikitsenka/bankjava/BankPostgresRepository.java#L67):
public Balance getBalance(Integer clientId) {
    Balance balance = new Balance();
    try (Connection con = getConnection();
         PreparedStatement ps = getBalanceStatement(con, clientId);
         ResultSet rs = ps.executeQuery()) {
        ...
    }
Another thing that caught our attention was that the number of goroutines and DB connections was not limited in the app, while PostgreSQL has a connection limit. Since the original article doesn't mention tuning PostgreSQL, we assume the default limit of 100 connections was in effect. This means that during the 10k-user test run, all of the Java threads and goroutines were competing for at most 100 connections.
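To illustrate what a bounded resource looks like from the application side, here is a small sketch that uses a buffered channel as a counting semaphore. It is an illustration only, not code from either repository: `runWithLimit` is a hypothetical helper, and the numbers 10,000 and 100 simply echo the test size and the default Postgres limit mentioned above. In `database/sql`, the same effect is available through `SetMaxOpenConns`, which makes callers wait for a free connection once the cap is reached.

```go
package main

import (
	"fmt"
	"sync"
)

// runWithLimit starts `workers` goroutines that each need one of `limit`
// slots, and returns the highest number of slots held at the same time.
func runWithLimit(workers, limit int) int {
	slots := make(chan struct{}, limit) // counting semaphore
	var (
		mu          sync.Mutex
		inUse, peak int
		wg          sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			slots <- struct{}{} // blocks while all slots are taken
			mu.Lock()
			inUse++
			if inUse > peak {
				peak = inUse
			}
			mu.Unlock()
			// A real handler would run its query here.
			mu.Lock()
			inUse--
			mu.Unlock()
			<-slots // release the slot
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	peak := runWithLimit(10000, 100)
	fmt.Println("peak slots in use never exceeds the limit:", peak <= 100)
}
```

With a cap in place, excess requests wait inside the application instead of failing when the database refuses a new connection.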
After that, we started to analyze the JMeter test scenario (https://github.com/nikitsenka/bank-test/blob/master/jmeter/bank-test.jmx). Each run creates a new user, then transfers money from the user with id 1 to the user with id 2, and then gets the balance for the user with id 1. The newly created user is ignored. Inserts into and selects from the transaction table always happen for the same user id, which might also slow things down, especially if the DB isn't wiped after each run.
The last thing that caught our attention was the instance type used to run the Java and Go services. T2 instances offer good value for money, but they are not the best choice for performance testing due to their burstable nature: you can't guarantee that the result will be the same from run to run.
What's Next...
Before running tests, we decided to tackle the issues we found. Our goal was not to make perfect/refined code but to fix the issues. We copied the author's repository and applied fixes.
For the Go implementation, we moved the creation of the sql.DB to application startup and closed it when the application shuts down. We also removed the panics on failed DB operations. Instead of panicking, the application returns a 500 status code with the error message to the client. We only kept the panic during sql.DB creation, since there is no point in starting the application for testing if the DB is not running. In addition, we made it possible to configure the connection limit using an environment variable.
For the Java implementation, we added a connection pool. We chose HikariCP since it's considered to be lightweight and better performing (https://github.com/brettwooldridge/HikariCP#jmh-benchmarks-checkered_flag). We also made it possible to configure the connection limit using environment variables, as we did for the Go implementation.
For both versions, we changed the Dockerfile to use a multistage build based on Alpine images. It doesn't affect performance, but it makes the final image considerably smaller.
You can check the final result here:
We also modified the JMeter test scenario. In the new scenario, each run creates two new users with predefined balances. Then, it performs a get balance request for each user to check their balances. After that, it transfers money from one user to the other. Finally, it performs another get balance request for each user to check that their balances are correct after the transfer.
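The sequence of checks in one iteration can be sketched in Go. This is an illustration of the test logic only, not the JMeter plan itself: the `bank` type is a hypothetical in-memory stand-in for the service, and the amounts 1000 and 100 are made-up values. Because every iteration creates its own pair of users, concurrent iterations don't touch each other's balances.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// bank is a hypothetical in-memory stand-in for the service under test.
type bank struct {
	mu       sync.Mutex
	balances []int // client id is index+1
}

func (b *bank) NewClient(balance int) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.balances = append(b.balances, balance)
	return len(b.balances)
}

func (b *bank) Balance(id int) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.balances[id-1]
}

func (b *bank) Transfer(from, to, amount int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.balances[from-1] -= amount
	b.balances[to-1] += amount
}

// runScenario performs one iteration: create two users, check balances,
// transfer, check balances again. It returns an error on the first failed check.
func runScenario(b *bank, initial, amount int) error {
	from, to := b.NewClient(initial), b.NewClient(initial)
	if b.Balance(from) != initial || b.Balance(to) != initial {
		return errors.New("unexpected initial balance")
	}
	b.Transfer(from, to, amount)
	if b.Balance(from) != initial-amount || b.Balance(to) != initial+amount {
		return errors.New("unexpected balance after transfer")
	}
	return nil
}

func main() {
	b := &bank{}
	errs := make(chan error, 100)
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			errs <- runScenario(b, 1000, 100)
		}()
	}
	wg.Wait()
	close(errs)
	failed := 0
	for err := range errs {
		if err != nil {
			failed++
		}
	}
	fmt.Println("failed iterations:", failed) // 0
}
```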
New Experiment
To test the performance of the modified versions of the service, we picked the following instance types from AWS:
Service itself (both Java and Go versions): m5.large.
JMeter runner: m5.xlarge.
PostgreSQL: c5d.xlarge.
All elements of the experiment (the Java and Go versions of the service, JMeter, and PostgreSQL) were run in Docker containers using AWS ECS. For the PostgreSQL container, we created a volume to store all data, to avoid using the container's writable layer, which hurts DB performance, especially under heavy load.
After each performance test, we started PostgreSQL from scratch to prevent any influence from previous runs.
We picked m5.large to host the services because of its balance of compute, memory, and networking resources. For PostgreSQL, we picked c5d.2xlarge, as it's compute-optimized and equipped with local NVMe-based SSD block-level storage.
Below, you can see JMeter output of the 4k concurrent users for the Go version of the service:
Here is the JMeter output of the 4k concurrent users for the Java version of service:
JMeter output of run 10k concurrent users for the Go version of the service:
And the same 10k concurrent users running on the Java version of the service:
You can find the JMeter results report for the Go version here, and here for the Java version.
Go version of service CPU/memory footprint:
Java version of service CPU/Memory footprint
Summary
In this article, we're not going to draw conclusions or pick a winner. You need to decide for yourself whether you want to switch to a new programming language and invest the effort to become an expert in it. There are plenty of articles that cover the pros and cons of both Java and Go. At the end of the day, any programming language that can solve your problem is the right one.