Efficiently Transforming JDBC Query Results to JSON
In this article, learn how to avoid common pitfalls when returning large amounts of data from a REST service.
Join the DZone community and get the full member experience.
Join For FreeA lot of enterprise data is stored in relational databases and accessed via SQL queries. Many web services are little more than HTTP-based wrappers around such queries.
For example, the following service method (built using the open-source Kilo framework) retrieves all rows from an employees
table and returns the results to the caller as JSON:
@RequestMethod("GET")
public List<Employee> getEmployees() throws SQLException {
var queryBuilder = QueryBuilder.select(
EMPLOYEE_NUMBER.as("employeeNumber"),
FIRST_NAME.as("firstName"),
LAST_NAME.as("lastName"),
GENDER,
BIRTH_DATE.as("birthDate"),
HIRE_DATE.as("hireDate")
).from(Employee.Schema.class);
try (var connection = getConnection();
var statement = queryBuilder.prepare(connection);
var results = new ResultSetAdapter(queryBuilder.executeQuery(statement))) {
return results.stream().map(result -> BeanAdapter.coerce(result, Employee.class)).toList();
}
}
Unfortunately, this solution, while straightforward, is not efficient. The entire data set must be traversed twice – once to read it from the database and again to write it to the output stream. Additionally, the caller will not begin receiving a response until all of the rows have been read, and none of the elements will become eligible for garbage collection until all of the results have been written.
For small result sets, the latency and memory implications associated with this approach might be acceptable. However, for larger data volumes, the following alternative may be preferable. The query is executed on a background thread, and the transformed results are streamed back to the caller via a pipe:
@RequestMethod("GET")
@ResourcePath("stream")
public List<Employee> getEmployeesStream() {
var queryBuilder = QueryBuilder.select(
EMPLOYEE_NUMBER.as("employeeNumber"),
FIRST_NAME.as("firstName"),
LAST_NAME.as("lastName"),
GENDER,
BIRTH_DATE.as("birthDate"),
HIRE_DATE.as("hireDate")
).from(Employee.Schema.class);
var pipe = new Pipe<Employee>(4096, 15000);
executorService.submit(() -> {
try (var connection = getConnection();
var statement = queryBuilder.prepare(connection);
var results = new ResultSetAdapter(queryBuilder.executeQuery(statement))) {
pipe.accept(results.stream().map(result -> BeanAdapter.coerce(result, Employee.class)));
} catch (SQLException exception) {
throw new RuntimeException(exception);
}
});
return pipe;
}
From the Kilo documentation:
The
Pipe
class provides a vehicle by which a producer thread can submit a sequence of elements for retrieval by a consumer thread. It implements theList
interface and produces an iterator that returns values as they become available, blocking if necessary.
The pipe in this example is configured with a capacity of 4K elements and a timeout of 15s. Limiting the capacity ensures that the producer does not do more work than necessary if the consumer fails to retrieve all of the data. Similarly, specifying a timeout ensures that the consumer does not wait indefinitely if the producer stops submitting data.
This implementation of the method is slightly more verbose than the first one. However, because no intermediate buffering is required, results are available to the caller sooner, and CPU and memory load is reduced.
Note that while the examples above use Kilo’s QueryBuilder
and ResultSet
adapter classes to process results, the same approach can be used with an ORM framework such as Hibernate:
@RequestMethod("GET")
@ResourcePath("hibernate")
public List<Employee> getEmployeesHibernate() throws SQLException {
var configuration = new Configuration();
configuration.addAnnotatedClass(HibernateEmployee.class);
try (var connection = getConnection();
var sessionFactory = configuration.configure().buildSessionFactory();
var session = sessionFactory.withOptions().connection(connection).openSession()) {
var criteriaQuery = session.getCriteriaBuilder().createQuery(Employee.class);
var query = session.createQuery(criteriaQuery.select(criteriaQuery.from(HibernateEmployee.class)));
return query.list();
}
}
@RequestMethod("GET")
@ResourcePath("hibernate-stream")
public List<Employee> getEmployeesHibernateStream() {
var pipe = new Pipe<Employee>(4096, 15000);
executorService.submit(() -> {
var configuration = new Configuration();
configuration.addAnnotatedClass(HibernateEmployee.class);
try (var connection = getConnection();
var sessionFactory = configuration.configure().buildSessionFactory();
var session = sessionFactory.withOptions().connection(connection).openSession()) {
var criteriaQuery = session.getCriteriaBuilder().createQuery(Employee.class);
var query = session.createQuery(criteriaQuery.select(criteriaQuery.from(HibernateEmployee.class)));
try (var stream = query.stream()) {
pipe.accept(stream);
}
} catch (SQLException exception) {
throw new RuntimeException(exception);
}
});
return pipe;
}
The performance of the Hibernate versions is comparable to the preceding examples, although they do result in slightly higher memory consumption and GC activity.
The complete source code for this example can be found here.
Published at DZone with permission of Greg Brown, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments