RocksDB: The Bedrock of Modern Stateful Applications
Apache Flink, Kafka Streams, and Apache Kvrocks use RocksDB as a persistent layer. Consider before embarking on the creation of your own data storage engine; use RocksDB.
Join the DZone community and get the full member experience.
Join For FreeRocksDB is an embedded database that manages (key, value) pairs in the disk by a single write process. Originally developed on top of LevelDB (by Google); Meta founded a database that in 10 years, is projected to become the default choice for storage in application mediums.
Adopters
- MyRocks is a MySQL storage engine that was built on top of RocksDB
- MongoRocks is a Mondo DB storage engine that was built on top of RocksDB
- Dragon is a distributed graph query engine
- LogDevice is a distributed data store for logs
- Apache Samza uses RocksDB for State Management in streaming aggregates
Yahoo
- Sherpa is a distributed data store
- CockroachDB is a database with PostgreSQL PostgreSQL-compatible interface that uses RocksDB as a storage engine
- Apache Flink for state management in data streaming aggregates
- Kafka Streams / KSQL for state management in data streaming aggregates
- TiKV for the storage engine
- Uber for task queue
- Apache Doris for metadata management
You can other adopters on the RocksDB wiki page about users and use cases.
Usage Cases
RocksDB supports only one writer process (not thread) and multiple read-only focused instances. Secondary instances (not named as read-only instances) support read-only mode with dynamic catch-up updates from the primary replica that are triggered by the scheduler or some event. They can be run on different hosts which allows for it to distribute a read workload across nodes. But at the same time, secondary instances don't support snapshot-based read and tailing iterators.
The main operations that supported are:
- Get and MultiGet
- Put
- Delete and DeleteRange
- Write and WriteBatch
- CompactRange and CompactFiles
- Iterator over records
That makes this database key value-oriented. With different value type support including large files with BlobDB extension that allows to store effectively large files in separate data files (as LOBs in PostgreSQL).
Apache Kvrocks (compatible with Redis protocol) describes how they implemented Redis-related data structures on top of RocksDB here.
Apache Kafka Streams use RocksDB for state management. For example, KTable which represents a stream's snapshot uses KeyValueStore for data management that is implemented by RocksDBStore. But at the same time, KTable does not necessarily materialize as a local store because KTable may persist to a topic. This change was introduced by KAFKA-2856 (commit).
TiKV is a highly scalable and low latency key-value database that uses RocksDB for managing data in disk because RocksDB is mature and high-performance. They mention prefix bloom filter, event listener and ingest external file capabilities when describing the advantages of RocksDB. Other features such as data distribution across nodes, transactions, and leader elections are implemented by TiKV.
For a long time, CockroachDB used RocksDB as a local disk-backed KV store that provides efficient writes and range scans to enable performant SQL execution (SIGMOD 20). But at the same time, they rewrite this solution in Go because it allows them to tune better for their workload (Pebble). So they started with a boxed solution, and only then replaced it with their own.
Your Adoption
I recommend that you start with simple solutions like SQLite or H2 that support SQL and relations. It might be easier to start and support because the community has a lot of expertise.
When you face a performance barrier like a wall that is not breakable by a previous solution it might be time to use tools like RocksDB. It will require more investigation and deeper understanding because in some cases you might go to source code instead of StackOverflow.
But at the same time, I suggest you ignore your own data format before RocksDB adoption because it might be hard to overperform this solution and it might take less time to adopt RocksDB than to do it yourself.
Opinions expressed by DZone contributors are their own.
Comments