The Stairway to Apache Kafka® Tiered Storage

Tiered Storage, a feature of Apache Kafka 3.6 which allows the offloading of data to object storage in the cloud, is a prime example of the power of open source.

Matthew de Detrich

Sep. 28, 23 · Opinion

It started at Current 2022, when I attended Satish Duggana's session on Tiered Storage, a feature of Apache Kafka® that allows the offloading of data to object storage in the cloud. It was an excellent presentation of what Tiered Storage hoped to achieve, and it inspired me so much that I whipped out my laptop and started implementing an S3 prototype plugin at the end of the talk.

What Is Tiered Storage and Why Should You Be Interested?

Tiered Storage is arguably one of the most sought-after features of Kafka 3.6. It allows Kafka’s core data to be stored in other locations, such as object storage, in addition to local disks, transparently and without any changes to Kafka’s producers or consumers. The Kafka brokers control whether the data is stored on local disks, which are fast but expensive and limited, or in alternative storage, such as Amazon S3. When Tiered Storage is properly configured, you get the best of both worlds: recent data stays on fast local disks (as it does today), while older, less frequently accessed data is stored elsewhere, where it's cheaper and capacity is less of a concern (sometimes effectively unlimited).
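As a rough mental model of the paragraph above, the sketch below classifies a log segment by its age. It is a toy illustration and not the broker's actual logic: the function name and the reduction of a segment to a single age are assumptions made for this example. The two thresholds mirror the topic settings `local.retention.ms` and `retention.ms`, which apply once `remote.storage.enable=true` is set on a topic.

```python
def segment_locations(age_ms, local_retention_ms, retention_ms, is_active=False):
    """Return where a segment of the given age would live in this toy model."""
    if is_active:
        # The segment still being written to stays on local disk only.
        return {"local"}
    if age_ms > retention_ms:
        # Past total retention: eligible for deletion from every tier.
        return set()
    if age_ms > local_retention_ms:
        # Past local retention: only the remote copy remains.
        return {"remote"}
    # Closed but still within local retention: present in both tiers.
    return {"local", "remote"}
```

With a local retention of one hour and a total retention of one week, for example, a closed segment that is two hours old would be served from remote storage only, while producers and consumers keep using the same topic as before.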

Satish's talk showed the promise of Tiered Storage, and while upstream progress in the community was ongoing, I wanted to speed things up. I decided to kickstart Tiered Storage work in the open so that both Aiven’s Open Source Program Office (OSPO) and the rest of the Kafka community could start trying it out.

Porting Tiered Storage to Apache Kafka 3.3

When investigating the existing state of Tiered Storage in Apache Kafka shortly after Current 2022, we found two almost complete implementations: one by Uber for Kafka 2.8 and the other by LinkedIn for Kafka 3.0. With initial pointers from LinkedIn, I started forward porting to Kafka 3.3 (i.e., taking the Tiered Storage-specific changes from LinkedIn’s Kafka 3.0 branch and updating them to make sure they work in Kafka 3.3), the result of which can be seen at the branch here.

This was not an easy feat, since significant changes had been made to the LogSegment subsystem, Kafka’s core data structure, which defines how Kafka log data is stored on disk and is therefore critical to Tiered Storage. Specifically, in Kafka 3.0, there was only a single Log, whereas in Kafka 3.3, this was split into a LocalLog and a UnifiedLog in preparation for Tiered Storage.

Great care had to be taken when forward porting, with each commit from LinkedIn’s Kafka 3.0 fork having a corresponding cherry-pick commit for Apache Kafka 3.3. In more detail, there are two different ways one can forward port such a big change. One way is to cherry-pick modifications (i.e., copy and apply them to a new branch, in our case, Kafka 3.3), and if there are any issues, make additional commits to fix those. While this is simple, the additional commits add noise that makes it hard for external observers to track what is going on. Instead, I opted for strictly one cherry-pick commit matching one original commit in LinkedIn’s Kafka 3.0 fork. Even when a cherry-picked commit requires additional changes for compatibility or bug fixing, having a matching commit makes it easier to track bugs/issues.
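The one-to-one discipline described above lends itself to a simple automated check. The sketch below is only an illustration, under the assumption (not stated in this article) that the port was made with `git cherry-pick -x`, which appends a "(cherry picked from commit ...)" line to each new commit message. The function name and its inputs are hypothetical.

```python
import re

# `git cherry-pick -x` appends this line to the new commit's message.
TRAILER = re.compile(r"\(cherry picked from commit ([0-9a-f]{40})\)")


def check_one_to_one(original_shas, ported_messages):
    """Return (missing, duplicated) original commits for a ported branch."""
    seen = {}
    for message in ported_messages:
        for sha in TRAILER.findall(message):
            seen[sha] = seen.get(sha, 0) + 1
    # Original commits that no ported commit refers to.
    missing = [s for s in original_shas if s not in seen]
    # Original commits referenced by more than one ported commit.
    duplicated = [s for s in original_shas if seen.get(s, 0) > 1]
    return missing, duplicated
```

An empty result for both lists would indicate that every original commit has exactly one counterpart on the ported branch.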

Testing Tiered Storage in Apache Kafka 3.6

While forward porting was an obvious first step to getting Tiered Storage running on a more modern version of Kafka, it was only part of the work. Tiered Storage requires extensive testing since it touches the core of Kafka, changing both the way data is stored and the logic that decides where to store data and how to read it. To make things more difficult, Tiered Storage involves many moving parts: the Kafka side of Tiered Storage provides only an interface to an external storage system; it’s up to others to write an implementation, called a plugin, defining how and where to store/read the data in the remote storage.
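To make the split between interface and plugin more concrete, here is a deliberately tiny stand-in for a plugin. In Kafka itself the contract is the Java interface `RemoteStorageManager`; the Python class below is only an analogy, and its class name, method names, and in-memory dictionary are all assumptions made for illustration. A real plugin would talk to an object store such as Amazon S3 instead.

```python
class InMemoryRemoteStorage:
    """Toy stand-in for a remote storage plugin; keeps segments in a dict."""

    def __init__(self):
        self._segments = {}

    def copy_segment(self, segment_id, data):
        # Upload: the broker hands over a closed segment's bytes.
        self._segments[segment_id] = bytes(data)

    def fetch_segment(self, segment_id, start=0, end=None):
        # Read back a byte range, as a fetch of older data would require.
        return self._segments[segment_id][start:end]

    def delete_segment(self, segment_id):
        # Remove the segment once retention expires; unknown ids are ignored.
        self._segments.pop(segment_id, None)
```

The point of the shape is that the broker only ever calls these few operations, so the same Kafka code can be tested against any backend that implements them.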

Moreover, to properly test the functionality, we needed to expose the metrics crucial to understanding how Tiered Storage is performing. An example of such work is a Docker image that lets you easily create a full Kafka cluster along with our Tiered Storage Amazon S3 plugin, with the needed metrics already preconfigured. The Docker image was necessary, especially when collaborating with other teams in the community, since it provides a common environment in which we can test and validate. Working in the open also meant that interested contributors (from companies such as Apple, Datadog, and Slack) found the project and were keen to help out and test not only the core Tiered Storage functionality but also the open source plugin that was being developed for Amazon S3 object storage at the time (see here).

All the contributors were key in testing and providing accurate feedback, which helped us discover and validate various bugs and gave us suggestions for improving the performance of our Amazon S3 plugin implementation. In the spirit of open source, we collected and reported these issues in the open (see this GitHub repository), referencing them to upstream Kafka as necessary.

Bugs and Fixes Found in the Process

As expected, the community was able to overcome many challenges, from interesting race conditions that only occurred when testing and not in production (first reported here and then solved properly here) to changes and updates to multiple KIPs. An example of such changes can be seen in KIP-917 made by Ivan Yurchenko, which added custom metadata to RemoteLogSegment, necessary for a future Azure object storage implementation. Not to mention the many improvements by Jorge Esteban Quilcate Otoya to code quality and documentation in pull requests such as KAFKA-15131, KAFKA-15135, and KAFKA-15181.

Along the way, various upstream pull requests started getting attention and began to land in Kafka; of note are pull requests that brought forward changes all the way from Kafka 3.0 to what is now available in Kafka 3.6 (with KAFKA-9564 being an example of such a pull request that was manually forward ported here). While it may appear that the work of forward porting Tiered Storage was in vain since the first preview of Tiered Storage landed in Kafka 3.6, the critical thing to note is that this work unblocked not only Aiven’s internal teams, who could start on the Amazon S3 and GCS plugins and related features, but also other companies and contributors, who could collaborate on the development of the feature.

Building New Features in the Open, Together

Ultimately, this demonstrates the power of open source: a community-driven approach to software development in which everyone can build on the community's work to enhance the technology, introduce new features, and move Kafka forward, setting the stage for what’s next in data streaming with Apache Kafka. If Uber’s implementation of Tiered Storage had not been open, LinkedIn would not have been able to use and benefit from it, and in turn, others would likely never have started.

All in all, this is what we should take solace in: while others in the industry move away from open source, this journey demonstrates how powerful open source is in fostering collaboration between companies and users.

Open source kafka

Opinions expressed by DZone contributors are their own.
