History

Apache Kafka was originally developed at LinkedIn, open-sourced in 2011, and became a top-level Apache Software Foundation project in 2012. In 2014, several of Kafka's original developers left LinkedIn and founded Confluent, a company focused on developing and improving Apache Kafka and its surrounding ecosystem.

What is Apache Kafka?

Nowadays there are many message brokers available, such as RabbitMQ, ActiveMQ, the many MQTT brokers, and more. Every message broker has its own area of strength. RabbitMQ, for example, lets you set up complex routing scenarios for efficient message delivery. What makes Apache Kafka special? Apache Kafka describes itself as a “distributed event streaming platform”. Its focus lies on raw throughput, scalability, and reliability. Messages, or events as Kafka calls them, are written to a distributed append-only log that is persisted to disk. Clients can choose where they begin reading from that log. Events can be stored for a very long period. In fact, if there is sufficient storage available, there is no need to delete events at all. A Kafka cluster can also be spread across multiple servers for a higher degree of availability and fault tolerance.
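To make the idea of an append-only log with client-chosen read positions more concrete, here is a minimal in-memory sketch. It is a toy model and not Kafka's API: the class `AppendOnlyLog` and its methods `append` and `read_from` are hypothetical names invented for this illustration, and a real Kafka log is distributed and persisted to disk.

```python
from typing import Any, List, Tuple


class AppendOnlyLog:
    """Toy in-memory stand-in for a Kafka-style log (hypothetical, not Kafka's API)."""

    def __init__(self) -> None:
        self._events: List[Any] = []

    def append(self, event: Any) -> int:
        """Add an event at the end and return its offset (its position in the log)."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset: int, max_events: int = 100) -> List[Tuple[int, Any]]:
        """Return (offset, event) pairs starting at `offset`.

        Reading never removes anything, so several clients can read
        the same events independently.
        """
        if offset < 0:
            raise ValueError("offset must be non-negative")
        chunk = self._events[offset:offset + max_events]
        return list(enumerate(chunk, start=offset))


log = AppendOnlyLog()
for event in ["A", "B", "C", "D"]:
    log.append(event)

replay = log.read_from(0)  # a new client replays the full history
tail = log.read_from(2)    # another client starts at offset 2
```

Because reading does not consume events, the client that starts at offset 2 and the client that replays from offset 0 do not interfere with each other. That is the property that sets a log apart from a classic queue.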

When to use Kafka?

If you need stream processing with high throughput, you should consider using Kafka. Let’s consider a scenario where you must track GPS coordinates from cell phones and cars. You need to handle millions or maybe billions of events every day.

Your customer wants real-time statistics and reports. As you can probably imagine, with this much data every millisecond really matters. You need a fast and scalable system where you can add or remove application instances based on your needs.
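One way to picture how work can be shared when instances are added or removed is the sketch below. It assumes the event stream is split into a fixed number of partitions, which is how Kafka organizes topics, and that each device's events are routed by device ID. The names `partition_for` and `assign_partitions`, the partition count, and the use of CRC32 and round-robin assignment are simplifications chosen for this illustration. Kafka's own partitioner and rebalancing logic work differently in detail.

```python
import zlib
from typing import Dict, List

NUM_PARTITIONS = 6  # hypothetical partition count, chosen for this example


def partition_for(device_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a device ID to a partition with a stable hash.

    CRC32 is only a stand-in here. The point is that the same device
    always lands in the same partition.
    """
    return zlib.crc32(device_id.encode("utf-8")) % num_partitions


def assign_partitions(
    instances: List[str], num_partitions: int = NUM_PARTITIONS
) -> Dict[str, List[int]]:
    """Spread partitions round-robin over the running application instances."""
    if not instances:
        raise ValueError("at least one instance is required")
    assignment: Dict[str, List[int]] = {name: [] for name in instances}
    for partition in range(num_partitions):
        owner = instances[partition % len(instances)]
        assignment[owner].append(partition)
    return assignment


two = assign_partitions(["app-1", "app-2"])
three = assign_partitions(["app-1", "app-2", "app-3"])
```

With two instances each one handles three partitions. After a third instance is started, each handles two. Events from a given phone or car keep going to the same partition, so only the ownership of partitions changes.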

Apache Kafka is a good foundation for these kinds of systems. Clients exist for nearly every modern programming language, including Java, .NET, Python, PHP, Erlang, Rust, Swift, and many more. If Java is your world and you want to give Kafka a try, you can use frameworks like Spring-Kafka or Micronaut Kafka to get easy access.

TL;DR

Kafka is a messaging system that can handle tons of messages, fast and reliably. It is a distributed system, so you can scale it easily, and you can read messages from different servers. You can use it for real-time data processing, for example to track GPS coordinates from cell phones and cars. You can also use it to store messages for a long time.

Btw: this is how the davinci AI from OpenAI summarized the article

Title image from PIXNIO