What Java Developers Need to Know About Geo-Distributed Databases

A helpful launchpad for connectivity and CRUD operations geo-distributed DBs in Java.

Denis Magda

CORE ·

Updated Nov. 07, 22 · Tutorial

Likes (8)

Comment

Save

5.4K Views

I’ve been working with distributed systems, platforms, and databases for the last seven years. Back in 2015, many architects began using distributed databases to scale beyond the boundaries of a single machine or server. They selected such a database for its horizontal scalability, even if its performance remained comparable to a conventional single-server database.

Now, with the rise of cloud-native applications and serverless architecture, distributed databases need to do more than provide horizontal scalability. Architects require databases that can stay available during major cloud region outages, enable hybrid cloud deployments, and serve data close to customers and end users. This is where geo-distributed databases come into play.

As a Java developer, two questions jump out at me:

How much effort should I put into creating applications for cloud-native geo-distributed databases?
Will it be just a quick refactoring of my existing application or a complete redesign/rewrite?

Effort varies from use case to use case. But even then, you can learn a lot from the “getting started” experience when building a simple application. In this post, I’ll share key insights when creating a Java application using YugabyteDB as the geo-distributed database. You can find my complete source code on GitHub. Now let’s dive in!

Database Deployment

YugabyteDB provides a fully-managed cloud version that supports AWS and GCE, similar to other cloud-native databases. This is a big deal for me as a developer. I want nothing more than to get an instance running so I can focus on my application logic.

Eventually, it took me a few minutes to spin up a free instance on AWS and copy the connectivity details for my application. As expected, the experience was smooth and quick. Gone are the days when I had to download, install, and configure databases before writing a single line of code.

YugabyteDB Cloud Panel with my running instance

Database Connectivity

As a backend developer, I’m grateful for a database that speaks SQL natively. This shortens the learning curve and lets me reuse existing logic. Even if I use Spring Data or Micronaut, I still write and execute direct SQL queries.

As long as YugabyteDB speaks in Postgres dialect, I wanted my simple Java application to connect to my running database instance through a good-old JDBC interface. With YugabyteDB, you can pick from the standard PostgreSQL JDBC driver or native Yugabyte JDBC driver that comes with some performance perks. I selected the latter.

In a few minutes, I added my laptop’s IP address to the IP allow list in YugabyteDB Managed. I also compiled and launched my sample application with a successful connection to the cloud instance. The JDBC connectivity logic was no different than what MySQL, Postgres, and other relational databases require me to follow. This was a very good sign.

     Java 
   
   YBClusterAwareDataSource ds = new YBClusterAwareDataSource();

ds.setUrl("jdbc:yugabytedb://" + settings.getProperty("host") + ":"
    + settings.getProperty("port") + "/yugabyte");
ds.setUser(settings.getProperty("dbUser"));
ds.setPassword(settings.getProperty("dbPassword"));

// Additional SSL-specific settings. See the source code for details.

Connection conn = ds.getConnection();

What’s even better is even though I used a free single-node instance for my tests, the connectivity logic remains the same even if my database counts 60 nodes spanning across several continents. For application developers, YugabyteDB is a single logical instance with all the complexity related to data partitioning, inter-node communication, and distributed queries execution happening transparently behind the scenes.

Basic CRUD Operations

Once the connectivity logic was put in place, I introduced a few methods to create a sample table and then query and update its records through the JDBC connection. This meant my simple Java application had to be as primitive as possible. As a result, I picked a pretty basic use case: a money transfer between two accounts.

The sample table was created with the standard CREATE TABLE command:

     Java 
   
 
 
   Statement stmt = conn.createStatement();

stmt.execute("CREATE TABLE IF NOT EXISTS " + TABLE_NAME +
    "(" +
    "id int PRIMARY KEY," +
    "name varchar," +
    "age int," +
    "country varchar," +
    "balance int" +
    ")"); 
  

And populated with exactly two records (enough to assess the getting started experience):

     Java 
   
   stmt.execute("INSERT INTO " + TABLE_NAME + " VALUES" +
    "(1, 'Jessica', 28, 'USA', 10000)," +
    "(2, 'John', 28, 'Canada', 9000)");

Finally, SQL queries that would query and update a similar table in Postgres or MySQL worked the same way in my geo-distributed database. Below is a complete implementation of two methods: the first selects your distributed records, and the second updates the records consistently with distributed transactions:

     Java 
   
 
 
   private static void selectAccounts(Connection conn) throws SQLException {
	Statement stmt = conn.createStatement();
  
    ResultSet rs = stmt.executeQuery("SELECT * FROM " + TABLE_NAME);

	while (rs.next()) {
    	System.out.println(String.format("name = %s, age = %s, country = %s, balance = %s",
        	rs.getString(2), rs.getString(3), rs.getString(4), rs.getString(5)));
  	}
}

private static void transferMoneyBetweenAccounts(Connection conn, int amount) throws SQLException {
	Statement stmt = conn.createStatement();

  	try {
    	stmt.execute(
      	    "BEGIN TRANSACTION;" +
      	    "UPDATE " + TABLE_NAME + " SET balance = balance - " + amount + "" + " WHERE name = 'Jessica';" +
      	    "UPDATE " + TABLE_NAME + " SET balance = balance + " + amount + "" + " WHERE name = 'John';" +
      	    "COMMIT;"
    	);
 	} catch (SQLException e) {
    	if (e.getErrorCode() == 40001) {
      		// The operation aborted due to a concurrent transaction trying to modify the same set of rows.
      		// Consider adding retry logic for production-grade applications.
      		e.printStackTrace();
    	} else {
      		throw e;
    	}
  	}

    System.out.println();
    System.out.println(">>>> Transferred " + amount + " between accounts.");
} 
  

Closing Thoughts

I was glad to confirm the creators of contemporary geo-distributed databases shield me (the application developer) from most complexities related to distributed systems. I started a distributed database instance in a minute, connected to it as a single logical instance, and queried the database through familiar SQL and JDBC interfaces. I do acknowledge my simple Java application is far from a real-world solution containing low-level, database-specific optimizations. However, getting started was as easy as a single-server database, which is a big deal.

You can explore my complete application on GitHub. I encourage you to try and run it yourself.

Database Relational database Java (programming language) dev application

Opinions expressed by DZone contributors are their own.

Related

Trending