Google announced its Managed Service for Apache Kafka cloud service at its Google Cloud Next 2024 conference in Las Vegas. Welcome to the data streaming club, joining Amazon, Microsoft, IBM, Oracle, Confluent, and others. This blog post explores this new managed Kafka offering for GCP, reviews the current status of the data streaming landscape, and shares some criteria to evaluate when Kafka in general and Google Apache Kafka in particular should (not) be used.
Better late than never… Google announced a brand new Apache Kafka cloud service for GCP at Google Cloud Next 2024. All other leading cloud providers already have one, including AWS, Azure, Oracle, IBM, and Alibaba. Various other software vendors provide Kafka services, including Confluent, Aiven, Redpanda, WarpStream, and many more. Most leverage the open source Kafka project as their core component; others re-implement the Kafka protocol.
Apache Kafka and Apache Flink dominate the open source data streaming ecosystem. Vendors and cloud solutions provide cloud-native offerings. Some developers, data engineers and business people still struggle with a paradigm shift: Continuous data processing enables better data quality, reduced cost, and faster time to market with innovative new applications. Kafka and Flink are a match made in heaven for data streaming.
Use cases for data streaming exist across all industries. Google Managed Service for Apache Kafka is potentially a good fit for some of them, but not for others.
Data streaming represents a new software category that revolutionizes the way businesses harness and process data in real time. Unlike traditional batch processing methods, data streaming enables continuous ingestion, analysis, and processing of data as it flows through systems. I explored this topic in the past when many people wanted to put Apache Kafka and its vendors into the integration platform category.
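To make the difference between batch and continuous processing concrete, here is a minimal sketch in plain Python. It does not use Kafka or any streaming framework; the function names and the sample readings are hypothetical and only illustrate the idea that a streaming computation emits an updated result per event instead of one result at the end.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch style: wait until all data is collected, then compute once."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Streaming style: emit an updated result after every incoming event."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


# Hypothetical sensor readings, used only for illustration.
events = [10.0, 20.0, 30.0, 40.0]

batch_result = batch_average(events)
streaming_results = list(streaming_average(iter(events)))

print(batch_result)
print(streaming_results)
```

The batch function needs the complete data set before it can answer, while the generator has a usable answer after the first event and refines it as more data arrives. A real data streaming platform adds durability, scalability, and decoupling on top of this basic idea.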
Many software companies have emerged in the data streaming category in the last few years, and several mature players in the data market have added support for data streaming to their platforms or cloud service ecosystems. Most software vendors use Kafka for their data streaming platforms. However, the category includes more than solutions powered by open source Kafka. Some vendors only implement the Kafka protocol (e.g., Azure Event Hubs) or offer entirely different APIs (like Amazon Kinesis).
The following Data Streaming Landscape 2024 summarizes the current status of relevant products and cloud services for data streaming around Kafka and additional stream processing engines.
Apache Kafka became the de facto standard for data streaming, similar to how Amazon S3 became the de facto standard for object storage.
In December 2023, the research company Forrester published “The Forrester Wave™: Streaming Data Platforms, Q4 2023”. Get free access to the report here. The leaders are Microsoft, Google, and Confluent, followed by Oracle, Amazon, Cloudera, and a few others.
In April 2024, IDC named Confluent a leader in the IDC MarketScape for Worldwide Analytic Stream Processing 2024.
It would not be a surprise if we see a Gartner Magic Quadrant for Data Streaming soon, too. Gartner reports mention Kafka and related vendors more and more year by year.
Qualifying out a technology is often the easier option. Why evaluate a service if it does not meet the requirements? Let’s explore when NOT to use Kafka at all, and specifically when the Google Apache Kafka service is probably NOT the right choice for you.
Apache Kafka overlaps with message brokers (such as IBM MQ, TIBCO, or RabbitMQ) and with other streaming analytics platforms, and it actually is a database, too. But Apache Kafka is not an all-rounder that solves every problem.
Apache Kafka is NOT:
Read the thorough analysis “When NOT to use Apache Kafka?” for more details.
If Apache Kafka is the right choice for your project, you still have plenty of options.
Here are a few criteria that let you easily qualify out Google Cloud Managed Service for Apache Kafka (MSK):
If you have not yet qualified out Kafka in general or Google Apache Kafka in particular, that’s great. Start evaluating Google Cloud’s Managed Service for Apache Kafka (MSK) and compare it against self-managed open source Kafka and other semi-managed or fully managed Kafka cloud services on GCP.
As we look ahead, the future possibilities for data streaming are boundless, promising more agile, intelligent, and real-time insights into the ever-increasing streams of data. I recently looked at the past, present and future of stream processing.
I am often asked whether I am worried about the emerging competition, since I work for Confluent, where we “only do data streaming”.
No, I am not! Actually, the new Google Apache Kafka cloud service is great news for the industry! Data streaming has established itself as a new software category. Research analysts like Forrester and IDC have already created dedicated waves and comparisons. What could be better than working with the people that invented Kafka and the company that created this software category across all industries and continents? And competition is always good for innovation, too.
Real-time data beats slow data. That’s true in almost every use case. At Confluent, we are now ~3000 people working only on one thing: Data Streaming. I think we should celebrate this Google announcement and look forward to more mass adoption of data streaming around the world.
And as a strategic Google partner, customers can
Are you excited about the new Google Cloud Managed Service for Apache Kafka (MSK)? Or do you still plan to use open source Kafka or another cloud service like Confluent Cloud? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.