Alex Gallego has been working in the streaming data space for more than 13 years. He first created an equivalent of Apache Flink, optimized for low latency and written in C++ on top of Mesos. In 2016, he sold it to Akamai, which now also owns Linode. At the time, he could not find a storage engine that could keep up with the volume of data he was trying to handle. That led to the creation of Redpanda.
“Today, almost a decade later, the bottleneck for storage engines has shifted from the spinning disk to CPU coordination problems. Redpanda was the reinvention of a wheel at the right time to build a Kafka replacement for mission-critical systems,” said Gallego, Founder and CEO of Redpanda.
Key highlights from this episode of Let’s Talk are:
- Gallego shares his data streaming backstory: he could not find a storage engine that could keep up with the volumes of data he was trying to handle, and this led to Redpanda.
- As more businesses move from batch to real-time, Gallego explains why it was critical to build a new storage engine for modern hardware from the ground up in C++, and to have it speak the same language as Kafka so that users could follow a classical migration path.
- Redpanda is being used in a variety of use cases, from space exploration, where a satellite is being shipped to outer space, to fraud detection. Gallego discusses how all of these use cases involve a huge back pressure of tiny events that you need to make sense of and extract value from.
- In the past, big data has largely been driven by Java, and the JVM is notoriously difficult to tune and optimize. Gallego feels that there were a lot of useful projects, such as Kafka and ZooKeeper, which have acted as inspiration for some of the advanced systems of today. The bottleneck for adoption is that these systems tend to be hard to run and optimize, which is now driving the need to make them easy to use.
- Kafka and similar systems were designed in the era of datacenters with low-latency, high-bandwidth links. However, Gallego does not feel this is where the world is heading. Nowadays, you build a React application and deploy it to a serverless provider where it just runs, and that is where streaming needs to meet the developer today. He discusses how Redpanda is working to meet developers’ needs and the ways they think.
- People initially get started with real-time streaming by connecting an application, database, or machine learning framework, which works fine. However, they start to hit fundamental limits of some of the technologies when they begin to scale. One of the key challenges enterprises face is the cost of storage. Gallego explains how Redpanda is trying to solve this.