Review of “Benchmarks: Redpanda versus Kafka (Open House 2022)”
Apr 18, 2023
What is it about?
Data streaming platforms have become increasingly popular in recent years as more businesses look to leverage real-time data to gain insights and make more informed decisions. Two of the most popular data streaming platforms are Kafka and Redpanda, which are both designed to handle large-scale, high-throughput data streams.
In a presentation at Open House 2022, Alexander Gallego, the CEO of Redpanda Data (formerly Vectorized), compared the performance of these two platforms using various benchmarks. The results he presented showed that Redpanda outperformed Kafka in terms of throughput, latency, and CPU usage.
Throughput is a measure of how much data can be processed by a system in a given time. In the benchmark tests, Redpanda demonstrated a higher throughput than Kafka, with up to 50% more messages processed per second.
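As a rough illustration of how such a throughput comparison is computed, the sketch below divides a message count by the elapsed time and compares two runs. The function names (throughput, relative_gain) and all figures are hypothetical, chosen only so that the gain matches the 50% upper bound quoted above; they are not measurements from the presentation.

```python
def throughput(messages: int, seconds: float) -> float:
    """Messages processed per second over a measurement window."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return messages / seconds


def relative_gain(candidate: float, baseline: float) -> float:
    """Fractional improvement of candidate over baseline (0.5 means 50% more)."""
    return (candidate - baseline) / baseline


# Hypothetical figures for a 60-second window, not measured results.
baseline_rate = throughput(6_000_000, 60.0)   # 100,000 msg/s
candidate_rate = throughput(9_000_000, 60.0)  # 150,000 msg/s
gain = relative_gain(candidate_rate, baseline_rate)
print(f"{gain:.0%} more messages per second")
```

Note that "50% more" is relative to the baseline: the same two runs could also be described as the baseline handling about 33% fewer messages, so it matters which system a percentage is anchored to.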
Latency, on the other hand, measures the time it takes for a message to travel from the producer to the consumer (end-to-end latency). Redpanda again outperformed Kafka in this regard, with lower median and 99th percentile latencies in all test scenarios.
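To make the two latency figures concrete, here is a minimal sketch of how a median and a 99th percentile can be derived from raw latency samples. It uses the nearest-rank definition of a percentile, which is only one of several common conventions; the percentile helper and the sample values are hypothetical and are not taken from the benchmark.

```python
import math
import statistics


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical end-to-end latencies in milliseconds: mostly fast, two slow outliers.
latencies_ms = [2.0] * 98 + [40.0, 250.0]

p50 = statistics.median(latencies_ms)
p99 = percentile(latencies_ms, 99)
print(f"median = {p50} ms, p99 = {p99} ms")
```

In this made-up sample the median is 2.0 ms while the 99th percentile is 40.0 ms, which is why benchmarks usually report tail latency alongside the median: a small fraction of slow messages is invisible in the median but dominates the tail.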
CPU usage is an important metric for any streaming platform as it directly impacts the cost and scalability of the system. The benchmark tests showed that Redpanda consumed significantly fewer CPU resources than Kafka, in the best case about one third of Kafka's CPU usage.
So, what makes Redpanda more performant than Kafka? According to Gallego, the key differences lie in the architecture and design of the two platforms. Redpanda is built on top of the Raft consensus protocol, which allows for faster leader election and replication of data. Additionally, Redpanda uses Linux kernel bypass techniques to reduce overhead and achieve higher throughput.
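For readers unfamiliar with Raft, the core replication rule is that a write counts as committed once a majority of the replicas have stored it. The sketch below shows only that majority arithmetic; it is a simplified illustration, not Redpanda's implementation, and the function names are hypothetical.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def is_committed(acks: int, replicas: int) -> bool:
    """True once a majority of replicas (leader included) have stored the entry."""
    return acks >= quorum_size(replicas)


for n in (3, 5):
    # A cluster keeps accepting writes as long as a majority is still reachable.
    tolerated = n - quorum_size(n)
    print(f"{n} replicas: quorum = {quorum_size(n)}, tolerates {tolerated} failure(s)")
```

With three replicas the leader needs only one follower to acknowledge a write, so a single slow or failed node does not hold up the commit.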
It's worth noting that Redpanda is still a relatively new player in the data streaming market, having been released only in 2020. As such, it may not yet have the same level of community support and tooling as Kafka. However, the benchmark tests suggest that Redpanda is a viable alternative for organizations that require high-performance, low-latency data streaming.
In summary, the benchmarks presented by Gallego show Redpanda outperforming Kafka on throughput, latency, and CPU usage. While Redpanda may not have the same level of community support and tooling as Kafka, its performance advantages make it a compelling option for businesses looking to leverage real-time data.