Redis TLS Can Significantly Reduce Performance — A Look at How KeyDB Addressed This
We saw a ~36-61% performance decline in Redis 6 with TLS enabled. KeyDB avoids this decline by using its multithreaded architecture.
We were extremely excited about TLS (Transport Layer Security) support, which arrived in the 6.0 versions of Redis and KeyDB. TLS database connections are part of a continuing trend towards defense in depth that has been a long time in the making, starting with Google encrypting links between its datacenters in 2013.
Unfortunately, with Redis, TLS came with a big performance hit, ranging from 36% to 61%. While security is important, trading away that much performance is not always a viable compromise. We thought carefully about the TLS implementation in KeyDB to keep our users from experiencing this. By taking advantage of KeyDB's multithreaded architecture, we were able to maintain performance, achieving nearly 1M ops/sec, which was over 7X faster than the Redis TLS implementation.
For those who may not be aware, KeyDB is an open source project that is compatible with the Redis API, protocol and clients.
An Analysis of TLS Perf by Redis
I was recently looking through a detailed GitHub issue response by a Redis Labs performance engineer. He did a great job analyzing CPU holdup and performance degradation when using TLS.
He reports an approximately 36% drop in ops/sec for single-threaded performance. In his analysis he states "TLS involves another layer in the stack, that brings additional overhead. The added SSL/TLS stack implies 28% CPU time devoted to it ( writing/reading bytes from ssl connetion, encrypt/decrypt and integrity check" ... "Based on the flame graphs they moved from spending 17% on the handler to spending 45% of the CPU time with tls."
Per its docs, Redis currently does not support io-threads when TLS is enabled. Redis Labs documentation also warns that "TLS encryption can significantly impact database throughput and latency".
A Comparison of Redis, TLS and Multithreaded KeyDB
Benchmarking Ops/sec
Running tests of our own, we saw the same result (a 36% decline) as the one reported in the Redis Labs analysis for a single node. We also include a comparison with Redis io-threads and show how KeyDB performs with TLS enabled. This test looks at the performance of a single node. Tests were done on AWS m5 instances large enough to ensure the machines were not a bottleneck.
The tests below were performed using memtier on an m5.8xlarge with the following command:
$ memtier_benchmark --hide-histogram --tls --cert=/path/to/redis.crt --key=/path/to/tls/redis.key --cacert=/path/to/tls/ca.crt -s 172.31.56.132 --threads=32
KeyDB and Redis were run on an m5.4xlarge. For more details on the setup for reproducing the benchmarks, please see the end of this blog.
Our testing shows the same measured decline of 36% with TLS enabled on Redis 6 (single-threaded). However, if you were previously using the io-threads feature, you could see a 61% drop in performance, because io-threads is not supported with TLS. This is stated in TLS.md.
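The percentage declines quoted here are a simple ratio of two throughput measurements. As a minimal sketch, the helper below (`pct_drop` is a hypothetical name, not part of Redis, KeyDB or memtier) computes that ratio. The inputs are illustrative values chosen to show the arithmetic, not figures read from the benchmark charts.

```shell
# Percentage drop from a baseline throughput to a measured throughput.
# Usage: pct_drop BASELINE MEASURED   (both in ops/sec)
pct_drop() {
  awk -v base="$1" -v measured="$2" \
    'BEGIN { printf "%.0f\n", (base - measured) / base * 100 }'
}

# Illustrative inputs only.
pct_drop 200000 128000    # prints 36
pct_drop 1000000 800000   # prints 20
```

Feeding in your own ops/sec totals with and without TLS gives a number directly comparable to the 36% and 61% figures above.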
Why KeyDB Perf Does Not Decline
With KeyDB there is almost no decline when using TLS, because multithreading is supported and it can scale vertically. Performance also remains considerably higher in general due to architectural differences between the projects. TLS uses considerably more CPU resources, but KeyDB can absorb that cost by spreading it across additional threads.
With 8 threads allocated, we were hitting closer to 800K ops/sec with TLS vs 1M ops/sec without it. However, increasing the thread count to as high as 16 threads got us back up to near the 1M ops/sec mark.
This lets us compensate for the added load and still offer the security without the performance penalty for users who rely on that performance.
Latency Benchmark (Lower Is Better)
Similar trends can be seen in the latency measurements from the same test. Latency is significantly higher at these loads when using TLS. KeyDB not only serves very high volumes, but its latency is also up to 7x lower than that of Redis with TLS. Note that the latencies measured with memtier are at peak load; a server that is not heavily loaded will achieve much lower latencies. These numbers should be taken as a relative comparison under load.
Flamegraphs
The Redis Labs analysis provided flame graphs for context. For those interested, we ran flame graphs on Redis 6, Redis 6 with io-threads, Redis 6 + TLS, KeyDB and KeyDB + TLS. Full expandable breakdowns can be seen by following the links below the charts.
Links to complete expandable flamegraphs:
- Redis (single-threaded)
- Redis with io-threads
- Redis with TLS enabled
- KeyDB (Multithreaded)
- KeyDB with TLS enabled (multithreaded)
Conclusion
TLS encryption is a great option when it comes to security. However, if you are currently using Redis, it may come with a performance penalty. If you are not heavily loading Redis you may be able to handle the additional overhead, but it's something that should be taken into account.
To summarize performance:
- Redis with TLS experienced a 36% drop from single-threaded Redis, and a 61% drop compared to Redis with io-threads enabled.
- KeyDB showed little to no performance drop with TLS once additional threads were added.
- KeyDB got close to 1 million ops/sec with TLS enabled, while Redis got ~130k ops/sec.
Options for Redis users to increase performance would come in the form of sharding or increasing cluster size. KeyDB can also be used as a drop-in replacement for the Redis binary.
KeyDB comes in at over 7X faster than Redis when using TLS. It may be a viable alternative when using this feature if performance is an issue for you.
Reproducing Benchmarks
For benchmark testing, an AWS m5.8xlarge was used as the benchmarking machine, with memtier as the benchmark tool. For the Redis/KeyDB instance, an m5.4xlarge was used. Machine sizes were chosen to ensure that neither machine was the bottleneck. A larger machine size would not make a difference in results, but a smaller machine may result in lower throughput.
If you are using TLS for the first time, you can generate certificates by cloning the GitHub project and running ./utils/gen-test-certs.sh to create the certificates for Redis or KeyDB.
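Before starting the server, it can help to confirm that the certificate files are where the commands in this post expect them. The sketch below assumes the script writes `ca.crt`, `redis.crt` and `redis.key` under `./tests/tls`, which is the path the server commands here use. `check_tls_files` is a hypothetical helper name.

```shell
# Check that the three TLS files referenced by the server and memtier
# commands exist and are non-empty.
# The directory defaults to ./tests/tls (assumed output location).
check_tls_files() {
  tls_dir="${1:-./tests/tls}"
  tls_missing=0
  for tls_file in ca.crt redis.crt redis.key; do
    if [ ! -s "$tls_dir/$tls_file" ]; then
      echo "missing or empty: $tls_dir/$tls_file" >&2
      tls_missing=1
    fi
  done
  return "$tls_missing"
}

# Usage (not run here):
#   check_tls_files ./tests/tls && echo "TLS files present"
# Optionally, if openssl is installed, check the cert against the CA:
#   openssl verify -CAfile ./tests/tls/ca.crt ./tests/tls/redis.crt
```

The same check can be run on the benchmarking machine after copying the files over, passing the directory you copied them to.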
You can now run keydb-server with the following command:
$ keydb-server --tls-port 6379 --port 0 --tls-cert-file ./tests/tls/redis.crt --tls-key-file ./tests/tls/redis.key --tls-ca-cert-file ./tests/tls/ca.crt --server-threads 16 --server-thread-affinity true --protected-mode no
For Redis we ran the following command:
$ redis-server --tls-port 6379 --port 0 --tls-cert-file ./tests/tls/redis.crt --tls-key-file ./tests/tls/redis.key --tls-ca-cert-file ./tests/tls/ca.crt --protected-mode no
To connect with memtier via TLS, make sure you transfer the ./tests/tls/* files over to the benchmarking machine. You can then run the following command:
$ memtier_benchmark --hide-histogram --tls --cert=/path/to/redis.crt --key=/path/to/tls/redis.key --cacert=/path/to/tls/ca.crt -s 172.31.56.132 --threads=32
For tests without TLS, the following commands were used: KeyDB:
Redis:
$ redis-server --io-threads 8 --protected-mode no
Memtier:
$ memtier_benchmark -s 172.31.56.132 --hide-histogram --threads=32
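When comparing several runs, it is convenient to pull just the total ops/sec out of each memtier run. The sketch below assumes memtier prints a summary table on standard output with a row whose first column is `Totals` and whose second column is ops/sec. That layout can differ between memtier versions, so check it against your own output first. `memtier_total_ops` is a hypothetical helper name.

```shell
# Print the total ops/sec from a memtier_benchmark summary read on stdin.
# Assumption: the summary has a row starting with "Totals" whose second
# whitespace-separated field is Ops/sec.
memtier_total_ops() {
  awk '$1 == "Totals" { print $2 }'
}

# Usage (not run here):
#   memtier_benchmark -s 172.31.56.132 --hide-histogram --threads=32 | memtier_total_ops
```

Running this once with TLS and once without gives the two numbers needed for a relative comparison.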
Avoid Bottlenecks
- Because of KeyDB's multithreading and performance gains, we typically need a much larger benchmark machine than the one KeyDB is running on. We have found that a 32-core m5.8xlarge is needed to produce enough throughput with memtier. This supports throughput for up to a 16-core KeyDB instance (medium to 4xlarge).
- When using memtier, run 32 threads.
- Run tests over the same network. If comparing instances, make sure your instances are in the same availability zone (AZ), i.e. both instances in us-east-2a.
- Run with private IP addresses. If you are using AWS public IPs, there can be more variance associated with the network.
- Beware of running through a proxy or VPC. With proxies, firewalls and other additional layers, it can be difficult to know for sure what the bottleneck is. It is best to benchmark in a simple environment (within the same VPC) and add the layers afterwards to make sure you are optimized.
- When comparing different machine instances, ensure they are in the same AZ and tested as closely together in time as possible. Network throughput changes throughout the day, so performing tests close to each other provides the most representative relative comparison.
- KeyDB is multithreaded. Ensure you specify multiple threads when running it.
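One of the checks above, using private IP addresses, is easy to automate before a benchmark run. The sketch below tests whether an IPv4 address falls in one of the RFC 1918 private ranges. `is_private_ipv4` is a hypothetical helper name, and the function only matches on prefixes; it does not validate that the input is a well-formed address.

```shell
# Return 0 if the given IPv4 address is in an RFC 1918 private range
# (10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16), 1 otherwise.
# Prefix match only: malformed input is not rejected.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# The server address used in the commands in this post is private.
if is_private_ipv4 172.31.56.132; then
  echo "private address"
fi
```

A wrapper script could refuse to start memtier when the target passed to `-s` fails this check.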
Published at DZone with permission of Ben Schermel. See the original article here.