Simplify Your Microservices Architecture With a Data API

In this post, learn how adopting a data API gateway dramatically reduces development and maintenance costs for microservices architectures.

Jeffrey Carpenter

Aug. 23, 22 · Analysis

This is an article from DZone's 2022 Microservices and Containerization Trend Report.

Have you struggled to manage data in a microservices architecture? In this article, we examine traditional approaches and introduce the data API gateway (sometimes called a "data gateway"), a new type of data infrastructure. We explore the features of a data API gateway, why you should implement one, and how to apply it to your architecture.

The Traditional Data Service Pattern

First, let's consider how data is managed in microservices architectures. A common pattern is a layer of data services that perform "CRUD" (create, read, update, delete) operations. For example, consider the notional hotel reservation application shown in Figure 1.

Figure 1: Data services in a sample microservice architecture

This microservices architecture includes a layer of data services which manage specific data types including hotels, rates, inventory, reservations, and guests, and a layer of business services which implement specific processes such as shopping and booking reservations. The business services provide a primary interface to web and mobile applications and delegate the storage and retrieval of data to the data services. The data services are responsible for performing CRUD operations on an underlying database.
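The split between the two layers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the hotel application: the class names, fields, and in-memory dictionaries standing in for databases are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Reservation:
    confirmation_id: str
    hotel_id: str
    guest_id: str
    nights: int


class ReservationDataService:
    """Data service: owns the reservation data type and exposes only CRUD."""

    def __init__(self) -> None:
        # A dict stands in for the underlying database in this sketch.
        self._rows: Dict[str, Reservation] = {}

    def create(self, reservation: Reservation) -> None:
        if reservation.confirmation_id in self._rows:
            raise ValueError("reservation already exists")
        self._rows[reservation.confirmation_id] = reservation

    def read(self, confirmation_id: str) -> Optional[Reservation]:
        return self._rows.get(confirmation_id)

    def update(self, reservation: Reservation) -> None:
        if reservation.confirmation_id not in self._rows:
            raise KeyError(reservation.confirmation_id)
        self._rows[reservation.confirmation_id] = reservation

    def delete(self, confirmation_id: str) -> None:
        self._rows.pop(confirmation_id, None)


class InventoryDataService:
    """Data service: tracks the number of available rooms per hotel."""

    def __init__(self, rooms_by_hotel: Dict[str, int]) -> None:
        self._available = dict(rooms_by_hotel)

    def read(self, hotel_id: str) -> int:
        return self._available.get(hotel_id, 0)

    def update(self, hotel_id: str, available: int) -> None:
        self._available[hotel_id] = available


class BookingService:
    """Business service: implements the booking process and delegates storage."""

    def __init__(self, reservations: ReservationDataService,
                 inventory: InventoryDataService) -> None:
        self._reservations = reservations
        self._inventory = inventory

    def book(self, confirmation_id: str, hotel_id: str,
             guest_id: str, nights: int) -> Reservation:
        available = self._inventory.read(hotel_id)
        if available < 1:
            raise RuntimeError("no rooms available")
        reservation = Reservation(confirmation_id, hotel_id, guest_id, nights)
        self._reservations.create(reservation)
        self._inventory.update(hotel_id, available - 1)
        return reservation
```

Note that the business logic (checking availability before booking) lives only in `BookingService`, while the data services contain nothing but storage and retrieval.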

While there are many ways of integrating and orchestrating interactions between these services, the basic pattern of separating services responsible for data and business logic has been around since the early days of service-oriented architecture (SOA).

Identifying, Designing, and Implementing Data Services

The typical approach to developing these microservices has included steps like the following:

Identifying services to manage specific data types in the domain using a technique such as domain-driven design. For more on the interaction between domain-driven design, service identification, and data modeling, see Chapter 7 of Cassandra: The Definitive Guide, 3rd Edition.
Designing services including APIs and schema to manage assigned data types. Each individual service is the primary owner of a specific data type and is responsible for data storage, retrieval, and potential messaging or streaming. We'll expand on the implications of this for database selection below.
Implementing services using a selected language and framework. In the Java world, frameworks like Spring Boot make it easy to build services with an embedded HTTP server that are then packaged into VMs or containers. Quarkus is a more recent framework which can build, test, and containerize services in a single CI workflow.

Data Services and Polyglot Persistence

Many early SOA architectures included services that interacted with a single legacy relational database schema. An unfortunate consequence of this was a tendency to "integrate by database," where services might read and write freely to multiple tables. This lack of strong ownership often led to data integrity issues that were hard to debug.

As the move toward large-scale microservices architectures in the cloud began in the 2010s, large-scale innovators, including Netflix, advocated strongly for independent services managing their own data types. One consequence of this was that individual data services were free to select their own databases, a pattern known as polyglot persistence. An example of what this might look like in our hypothetical hotel application is shown in Figure 2.

Figure 2: Polyglot persistence approach for microservices architectures

In this architecture, data of a modest size that changes less frequently, like hotel descriptions, might be a natural choice for a document database or traditional relational database. Data with high volume or high read/write traffic such as rates, inventory, and reservations might use a clustered NoSQL-based solution in order to scale effectively. Other data services might be a front for a third-party API, such as guest information sourced from a customer relationship management (CRM) system.
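As a rough sketch of polyglot persistence, each data service below is wired to a different kind of store behind the same small interface. The store classes are in-memory stand-ins invented for this example; they only imitate the shape of a document database, a partitioned NoSQL cluster, and a third-party CRM API.

```python
import json
from typing import Any, Dict, List, Optional, Protocol


class Store(Protocol):
    """Minimal storage interface assumed for this sketch."""

    def put(self, key: str, value: Dict[str, Any]) -> None: ...

    def get(self, key: str) -> Optional[Dict[str, Any]]: ...


class DocumentStore:
    """Stand-in for a document database: values are kept as serialized JSON."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def put(self, key: str, value: Dict[str, Any]) -> None:
        self._docs[key] = json.dumps(value)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        raw = self._docs.get(key)
        return json.loads(raw) if raw is not None else None


class PartitionedStore:
    """Stand-in for a clustered NoSQL store: keys are spread across partitions."""

    def __init__(self, partitions: int = 3) -> None:
        self._partitions: List[Dict[str, Dict[str, Any]]] = [
            {} for _ in range(partitions)
        ]

    def _partition_for(self, key: str) -> Dict[str, Dict[str, Any]]:
        # A byte sum is a toy partitioner, but it is stable across runs.
        return self._partitions[sum(key.encode("utf-8")) % len(self._partitions)]

    def put(self, key: str, value: Dict[str, Any]) -> None:
        self._partition_for(key)[key] = dict(value)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        return self._partition_for(key).get(key)


class CrmBackedStore:
    """Stand-in for a data service that fronts a third-party CRM API."""

    def __init__(self, crm_records: Dict[str, Dict[str, Any]]) -> None:
        self._crm = crm_records

    def put(self, key: str, value: Dict[str, Any]) -> None:
        raise NotImplementedError("guest data is owned by the CRM in this sketch")

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        return self._crm.get(key)


class DataService:
    """A data service that owns one data type and one store of its choosing."""

    def __init__(self, data_type: str, store: Store) -> None:
        self.data_type = data_type
        self._store = store

    def save(self, key: str, value: Dict[str, Any]) -> None:
        self._store.put(key, value)

    def find(self, key: str) -> Optional[Dict[str, Any]]:
        return self._store.get(key)


# Each service picks the backend that suits its data.
hotels = DataService("hotel", DocumentStore())
rates = DataService("rate", PartitionedStore())
guests = DataService("guest", CrmBackedStore({"g1": {"name": "Ada"}}))
```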

Replacing Data Services With a Data API Gateway

In creating multiple data services, development teams often find that the implementations are highly similar, almost boilerplate code, due to their focused responsibility of executing simple CRUD operations on top of a database backend. Recognizing this duplication of effort, many organizations have begun to adopt data API gateways as an alternative to maintaining a layer of containerized data services.

A data API gateway is a piece of software infrastructure that provides access to data via APIs of various styles including REST, gRPC, and others. The gateway abstracts the details of storing and retrieving data using one or more persistent stores. This allows application developers to focus on writing business services that access data via easy-to-use APIs instead of having to learn the intricacies of a database query language.
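From the point of view of a business service, using a gateway's REST API can look like ordinary HTTP calls with no database driver or query language in sight. The sketch below only builds the requests and does not send them. The base URL and the `/v1/{type}/{key}` path layout are assumptions made up for illustration and do not correspond to any particular gateway product.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical gateway address; replace with a real endpoint before sending.
GATEWAY_URL = "https://gateway.example.com"


def build_read_request(data_type: str, key: str) -> Request:
    """Build a GET request for one record of the given data type."""
    url = f"{GATEWAY_URL}/v1/{quote(data_type, safe='')}/{quote(key, safe='')}"
    return Request(url, method="GET", headers={"Accept": "application/json"})


def build_create_request(data_type: str, record: dict) -> Request:
    """Build a POST request that stores a new record as JSON."""
    body = json.dumps(record).encode("utf-8")
    return Request(
        f"{GATEWAY_URL}/v1/{quote(data_type, safe='')}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing either request to `urllib.request.urlopen` would perform the call against a live gateway.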

Figure 3 shows an example of how such a gateway could be applied to the hotel application example. The data API gateway takes on the responsibility of managing data persistence for hotels, rates, inventory, and other data types, eliminating the need for an entire layer of data services.

Figure 3: Sample usage of a data API gateway

Data types can be added to the gateway by providing a schema or data model. Alternatively, document-style endpoints can provide a "schema-less" experience in which any valid JSON document can be stored, such as a hotel description, the structure of which could change frequently.
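The difference between the two modes can be sketched with a toy in-memory gateway. `TinyGateway` and its method names are hypothetical and exist only for this example: schema-based types reject rows that do not match the registered fields, while document collections accept any JSON-serializable value.

```python
import json
from typing import Any, Dict, Optional


class TinyGateway:
    """In-memory stand-in for a data API gateway (illustrative only)."""

    def __init__(self) -> None:
        self._schemas: Dict[str, Dict[str, type]] = {}
        self._tables: Dict[str, Dict[str, Dict[str, Any]]] = {}
        self._documents: Dict[str, Dict[str, str]] = {}

    # Schema-based access: a data type must be registered before use.
    def register_type(self, name: str, schema: Dict[str, type]) -> None:
        self._schemas[name] = dict(schema)
        self._tables[name] = {}

    def put_row(self, name: str, key: str, row: Dict[str, Any]) -> None:
        schema = self._schemas[name]  # KeyError if the type was never registered
        if set(row) != set(schema):
            raise ValueError(f"fields must be exactly {sorted(schema)}")
        for field, expected in schema.items():
            if not isinstance(row[field], expected):
                raise TypeError(f"{field} must be of type {expected.__name__}")
        self._tables[name][key] = dict(row)

    def get_row(self, name: str, key: str) -> Optional[Dict[str, Any]]:
        return self._tables[name].get(key)

    # Document-style access: any JSON-serializable value is accepted.
    def put_document(self, collection: str, key: str, document: Any) -> None:
        # json.dumps raises TypeError for values that are not valid JSON data.
        self._documents.setdefault(collection, {})[key] = json.dumps(document)

    def get_document(self, collection: str, key: str) -> Optional[Any]:
        raw = self._documents.get(collection, {}).get(key)
        return json.loads(raw) if raw is not None else None
```

With this sketch, a rate would be stored through `register_type` and `put_row`, while two hotel descriptions with entirely different fields could sit side by side in the same document collection.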

As a result, adopting a data API gateway involves much the same design work as implementing the data services pattern. The design consists of identifying key data types and creating schemas or JSON formats to describe them. These data types are then made available via APIs provided by the gateway.

Comparing API Styles

Data API gateways provide developers the freedom to access data types through the API that makes the most sense for their client services and applications. Figure 4 compares some of the most common API styles in terms of how they structure data and their performance characteristics.

Figure 4: Characteristics of APIs provided by a data API gateway

API styles like gRPC provide more structured data representations, which can lead to better performance. GraphQL and REST APIs provide more flexibility in how data is represented at the cost of additional latency. The maximum flexibility is provided by document-style APIs, which can store and search JSON in whatever format the client chooses, at the cost of potentially lower performance for more complex queries.
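One reason structured representations can perform better is that a schema agreed on in advance lets the payload omit field names and use compact binary types. The sketch below illustrates that idea with Python's `struct` module as a stand-in. It is not the Protocol Buffers encoding that gRPC actually uses, and the inventory record layout is an assumption made for this example.

```python
import json
import struct

# Assumed fixed layout for an inventory record: little-endian
# unsigned 32-bit hotel_id, unsigned 16-bit room_type, unsigned 16-bit available.
INVENTORY_FORMAT = "<IHH"


def encode_structured(hotel_id: int, room_type: int, available: int) -> bytes:
    """Encode a record whose schema both sides already know."""
    return struct.pack(INVENTORY_FORMAT, hotel_id, room_type, available)


def decode_structured(payload: bytes) -> dict:
    hotel_id, room_type, available = struct.unpack(INVENTORY_FORMAT, payload)
    return {"hotel_id": hotel_id, "room_type": room_type, "available": available}


def encode_json(record: dict) -> bytes:
    """Encode the same record as self-describing JSON text."""
    return json.dumps(record).encode("utf-8")


record = {"hotel_id": 1234, "room_type": 2, "available": 17}
structured = encode_structured(**record)
as_json = encode_json(record)
```

Here the structured payload is smaller than the JSON payload because the field names travel with every JSON message, whereas the binary form relies on the shared schema. The trade-off is that the JSON form can change shape freely without coordinating a schema change.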

Data API Gateway Projects

Many organizations have built their own data API gateways, and a few of these are in various stages of being released as open-source projects. An example is Stargate, an open-source project that provides multiple API styles as a stateless proxy layer on top of Apache Cassandra. GraphQL frameworks such as the Apollo Supergraph Platform or Netlify's OneGraph might also be considered a specific tailoring of the data API gateway pattern in the sense that they aggregate data from multiple persistence backends and APIs.

Deploying a Data API Gateway

A containerized gateway deployment can include multiple API services and backing data stores. Let's look at Stargate as an example of a data API gateway that is itself deployed as a microservices application. Figure 5 shows an example deployment of Stargate in Kubernetes.

Figure 5: Example deployment of Stargate in Kubernetes with a backing Cassandra cluster

A Cassandra cluster is deployed using a StatefulSet, which allows pods to be bound to PersistentVolumes for high availability of stored data. Stateless Stargate coordinator nodes and API services, such as document, gRPC, and REST, are managed in Kubernetes Deployments so that each microservice can scale independently. Kubernetes Services provide load balancing across multiple microservices instances.
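To make the stateless half of this layout concrete, the sketch below generates a Kubernetes Deployment and a matching Service for one API microservice as plain Python dictionaries, which can be serialized to JSON and applied with kubectl. The service name, image, and port are placeholders chosen for illustration and are not taken from the Stargate project's manifests; the Cassandra StatefulSet is omitted.

```python
import json
from typing import Any, Dict


def api_deployment(name: str, image: str, replicas: int, port: int) -> Dict[str, Any]:
    """Build an apps/v1 Deployment for one stateless API microservice."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Each API service gets its own replica count, so it scales independently.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def api_service(name: str, port: int) -> Dict[str, Any]:
    """Build a v1 Service that load balances across the Deployment's pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


# Placeholder values: the image and port are hypothetical.
rest_deployment = api_deployment(
    "rest-api", "registry.example.com/gateway/rest-api:1.0", replicas=3, port=8082
)
rest_service = api_service("rest-api", port=8082)
manifest_json = json.dumps([rest_deployment, rest_service], indent=2)
```

Calling `api_deployment` again with a different name and replica count would describe a second API service, such as a document or gRPC endpoint, that scales separately from the first.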

Conclusion

The data API gateway is a new type of data infrastructure that can help eliminate layers of CRUD-style microservices that you have to develop and maintain. While there are multiple styles of gateway, they have a common set of features that benefit both developers and operators. Data API gateways enable developer productivity by providing a variety of API styles over a single supporting database. From an operations perspective, data API gateways and their supporting databases can be run in containers alongside other applications to simplify your overall deployment process. In summary, adopting a data API gateway is a great way to reduce development and maintenance costs for microservices architectures.