Apache Kafka is the de facto standard for data streaming, i.e., for processing data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How do you qualify Kafka out when it is not the right tool for the job? This blog post contains a lightboard video that gives you a twenty-minute explanation of the DOs and DON'Ts.
Disclaimer: This blog post shares a lightboard video explaining when NOT to use Apache Kafka. For a much more detailed and technical discussion with various use cases and case studies, check out this blog post from 2022 (which is still valid today, whenever you read it).
Kafka is often misunderstood. For instance, I still hear way too often that Kafka is just a message queue. Part of the reason is that some vendors only pitch it for a specific problem (such as data ingestion into a data lake or data warehouse) to sell their products. So, in short: Kafka combines real-time messaging, durable event storage, data integration, and stream processing.
This combination of characteristics in a single platform makes Kafka unique (and successful).
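To make this tangible, here is a minimal sketch using the plain Kafka Java client that publishes an event to a hypothetical "orders" topic. The broker address, topic name, and payload are placeholders for illustration, not a reference implementation:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) {
        // Connection and serialization settings (a local broker on localhost:9092 is assumed for the example)
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Publish an event to the (hypothetical) "orders" topic; Kafka appends it to a durable, replayable log
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-4711", "{\"item\":\"book\",\"quantity\":1}"));
            producer.flush();
        }
    }
}
```

Any number of consumers can read and replay such events independently and at their own pace, which is what decouples producing and consuming applications.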
For these reasons, Kafka is complementary, not competitive, to other technologies such as databases, data lakes, data warehouses, and analytics platforms. Choose the right tool for the job and combine them!
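As a simple illustration of such a combination, the following sketch consumes events from Kafka and writes them into a relational database for analytical queries. The Postgres connection details and the orders_raw table are hypothetical; in a real project, a Kafka Connect sink connector typically handles this integration without custom code:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OrdersToPostgres {
    public static void main(String[] args) throws Exception {
        // Consumer settings (local broker and group id are assumptions for the example)
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "orders-to-postgres");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        // The JDBC URL, credentials, and table are hypothetical and require the Postgres driver on the classpath
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             Connection db = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/shop", "shop", "secret")) {
            consumer.subscribe(List.of("orders"));
            PreparedStatement insert =
                    db.prepareStatement("INSERT INTO orders_raw (order_key, payload) VALUES (?, ?)");
            while (true) {
                // Stream events out of Kafka and land them in the database, where other tools query them at rest
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    insert.setString(1, record.key());
                    insert.setString(2, record.value());
                    insert.executeUpdate();
                }
            }
        }
    }
}
```

The point is not the code itself but the division of labor: Kafka moves and stores the events in motion, while the database serves the queries at rest.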
The following video explores the key concepts of Apache Kafka. Afterwards, the DOs and DON'Ts of Kafka show how to complement data streaming with other technologies for analytics, APIs, IoT, and other scenarios.
The research company Forrester defines data streaming platforms as a new software category in a new Forrester Wave. Apache Kafka is the de facto standard used by over 100,000 organizations.
Plenty of vendors offer Kafka platforms and cloud services. Many complementary open source stream processing frameworks, such as Apache Flink, and related cloud offerings have emerged. And competitive technologies like Pulsar, Redpanda, or WarpStream try to gain market share by leveraging the Kafka protocol. Check out the data streaming landscape of 2024 for a summary of existing solutions and market trends. The end of that article gives an outlook on potential new entrants in 2025.
More than 150,000 organizations use Apache Kafka by now. The Kafka protocol has become the de facto standard for many open source frameworks, commercial products, and serverless cloud SaaS offerings.
However, Kafka is not an all-rounder for every use case. Many projects combine Kafka with other technologies, such as databases, data lakes, data warehouses, IoT platforms, and so on. Additionally, Apache Flink is becoming the de facto standard for stream processing (but Kafka Streams is not going away and remains the better choice for specific use cases).
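To give a feeling for what stream processing on top of Kafka looks like, here is a minimal, illustrative Kafka Streams sketch that continuously filters events from one topic into another. The topic names, application id, and payload format are assumptions for the example:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class PaymentFilterApp {
    public static void main(String[] args) {
        // Application id and broker address are placeholders for the example
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "payment-filter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Continuously filter large payments from one topic into another, inside a plain Java application
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> payments = builder.stream("payments");
        payments
                .filter((key, value) -> value.contains("\"category\":\"large\""))
                .to("large-payments");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

The same kind of logic could also run in Apache Flink; which framework fits better depends on the use case, as mentioned above.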
Where do you (not) use Apache Kafka? What other technologies do you combine Kafka with? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.