Ep. 115 Exploring Kafka with Kris Jenkins and Rob Walters
Today on the show we're talking about streaming data and streaming applications with Apache Kafka. We're joined by Kris Jenkins, Developer Advocate from Confluent, and Rob Walters, Product Manager at MongoDB, who will discuss how you can leverage this technology to your benefit and use it in your applications.
Kafka is traditionally used for building real-time streaming data pipelines and real-time streaming applications. It began its life in 2010 at LinkedIn and made its way to the public open-source space through the Apache Software Foundation in 2011. Since then, the use of Kafka has grown massively, and it's estimated that approximately 30% of all Fortune 500 companies are already using Kafka in one way or another.
A great example of why you might want to use Kafka is capturing all of the user activity that happens on your website. As users visit your website, they click links on the page and scroll up and down, which can generate large volumes of data. You may want to store this data to understand how users are interacting with your website in real time. Kafka aids in this process by ingesting and storing all of this activity data while serving reads to the applications on the other side.
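To make the website-activity example a little more concrete, here is a toy Python sketch of the underlying idea: events are appended to a log, and readers consume from their own position without removing anything. It is an in-memory illustration only and does not use Kafka or any Kafka client library; the `ActivityLog` class, its methods, and the event fields are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ActivityLog:
    """Toy stand-in for one stream of activity events (not a real Kafka topic)."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> int:
        """Store an event and return its offset (its position in the log)."""
        self.events.append(event)
        return len(self.events) - 1

    def read_from(self, offset: int) -> list:
        """Return every event at or after the offset, without removing any."""
        return self.events[offset:]


log = ActivityLog()

# The website side: record activity as it happens (hypothetical event shapes).
log.append({"user": "u1", "action": "click", "target": "/pricing"})
log.append({"user": "u1", "action": "scroll", "depth": 0.6})
log.append({"user": "u2", "action": "click", "target": "/docs"})

# The application side: two independent readers, each tracking its own offset.
analytics_offset = 0   # wants the full history
dashboard_offset = 2   # only wants the most recent event

analytics_batch = log.read_from(analytics_offset)
dashboard_batch = log.read_from(dashboard_offset)
```

The point of the sketch is that writing and reading are decoupled: the website keeps appending while each application reads at its own pace from wherever it left off.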
Conversation highlights include:
- [03:38] What is Kafka?
- [05:29] At the heart of every database
- [08:03] The difference between Kafka and a database
- [09:03] What Kafka's architecture looks like
- [12:03] Kafka as a data backbone of system architecture
- [14:06] MongoDB and Kafka working together
- [15:40] What are "Topics" in Kafka?
- [17:53] Change stream events
- [19:58] Kafka's history
- [22:07] MongoDB Connector, and Kafka via Confluent Cloud
- [25:53] Popular use cases using Kafka and MongoDB
- [27:48] Kafka and stream processing with games and event data
- [29:13] KSQL and processing against the stream of data
- [30:59] developer.confluent.io, a place to learn everything about Kafka