Request-response communication with REST / HTTP is simple, well understood, and supported by most technologies, products, and SaaS cloud services. Contrarily, data streaming with Apache Kafka is a fundamental change to process data continuously. HTTP and Kafka complement each other in various ways. This post explores the architectures and use cases to leverage request-response together with data streaming in the control plane for management or in the data plane for producing and consuming events.
Prior to discussing the relationship between HTTP/REST and Apache Kafka, let’s explore the concepts behind both. Traditionally, request-response and data streaming are two different paradigms.
The following characteristics make HTTP so prevalent in software engineering for request-response (aka request-reply) communication:
HTTP is about communication between two applications. On the contrary, data streaming is much more than just data communication between a client and a server. Hence, data streaming platforms like Apache Kafka have very different characteristics:
Please note that this article specifically discusses Apache Kafka as it is the established de facto standard for data streaming. It powers most data streaming distributions like Confluent, Red Hat, IBM, Cloudera, TIBCO, and many more, plus cloud services like Confluent Cloud and Amazon MSK. Nevertheless, other frameworks and cloud services like Apache Pulsar, Redpanda, Apache Flink, AWS Kinesis, and many other data streaming technologies follow the same principles. Just be aware of the technical differences and trade-offs between data streaming products.
Most architectures need request-response for point-to-point communication (e.g., between a server and mobile app) and data streaming for continuous event processing.
Event sourcing with CQRS is the better design for most data streaming scenarios. However, developers can implement the request-response message exchange pattern natively with Apache Kafka.
Nevertheless, direct HTTP communication with Kafka is the easier and often better approach for appropriate use cases. With this in mind, let’s look at use cases where HTTP is used with Kafka and how they complement each other.
Most developers and administrators are familiar with REST APIs. They are the natural option for many best practices and security guidelines. Here are some good reasons why this will not change in the future:
Other great implementations exist for request-response communication. For instance:
Nevertheless, HTTP is the first choice in most projects. gRPC, GraphQL, or other implementations are chosen for specific problems if HTTP is not good enough.
RESTful interfaces to an Apache Kafka cluster make it easy to produce and consume messages, view the cluster’s metadata, and perform administrative actions using standard HTTP(S) instead of the native TCP-based Kafka protocol or clients.
Each scenario differs significantly in its purpose. Some use cases are implemented out of convenience, while others are required because of technical specifications.
There are two major categories of use cases for combining HTTP with Kafka. In terms of cloud-native architectures, this can be divided into the management plane (i.e., administration and configuration) and the data plane (i.e., producing and consuming data).
The management and administration of a Kafka cluster involve various tasks, such as:
Many scenarios require or prefer the usage of REST APIs for producing and consuming messages to/from Kafka, such as
The Confluent REST Proxy has been around for a long time and is available under the Confluent Community License. Many companies use it in production as a management plane and data plane as a self-managed component in conjunction with open source Apache Kafka, Confluent Platform, or Confluent Cloud.
While not being a lawyer, the short version is that you can use the Confluent REST Proxy for free – even for your production workloads at any scale – as long as you don’t build a competitive cloud service with it (say, e.g., the “AWS Kafka HTTP Proxy”) and charge per hour or volume for the serverless offering.
The Confluent REST Proxy and REST APIs are separated into both a data plane and a management plane:
While some applications require both, in many scenarios, only one or the other is used.
The management plane is typically used for very low throughout and a few API calls. The data plane, on the other hand, varies. Many applications produce and consume data continuously. The biggest limitation of the REST Proxy data plane is that it is a synchronous request-response protocol.
Don’t underestimate the power of the REST Proxy as a data plane because Kafka provides batch capabilities to scale up to many parallel REST Proxy instances. There are deployments where four REST Proxy instances can handle ~20,000 events per second, which is sufficient for many use cases.
Many HTTP use cases do not require millions of events per second. Hence, the Confluent REST Proxy is often good enough. This is even true for many IoT use cases I have seen in the wild where devices or machines connect to Kafka via HTTP.
Please note that chunked transfer encoding is a streaming data transfer mechanism available in version 1.1 of HTTP. In chunked transfer encoding, the data stream is divided into a series of non-overlapping “chunks”. The chunks are sent out and received independently of one another.
Some Kafka REST Produce APIs support a streaming mode that allows sending multiple records over a single stream. The stream remains open unless explicitly terminated. The streaming mode can be achieved by setting an additional header “Transfer-Encoding: chunked” on the initial request. Check if your favorite Kafka proxy or cloud API supports the HTTP streaming mode.
Different deployment options exist for the Confluent REST Proxy:
The self-managed REST Proxy instance or cluster of instances (as a “dedicated node”) is still decoupled from the open-open source Kafka broker or commercial Confluent Server. This is the ideal option for a data plane to produce and consume messages.
The management plane is also embedded as a unified REST API into Confluent Server (as a “broker plugin”) and Confluent Cloud for administrative operations. This simplifies the architecture because no additional nodes are required for using the administration APIs.
In some deployments, both approaches may be combined: The management plane is used via the embedded REST APIs in Confluent Server or in Confluent Cloud. Meanwhile, data plane use cases are decoupled into their own REST Proxy instances to easily handle scalability and be independent of the server side.
The developer does not have to care about the infrastructure or architecture for REST APIs in Confluent Cloud. The HTTP interfaces are fully managed by the vendor.
The REST APIs of the self-managed REST Proxy and Confluent Cloud are compatible. Hybrid architectures and cloud migration are possible without implementing any breaking changes.
Data governance is an important part of most data streaming projects. Kafka deployments usually include various decoupled producers and consumers, often following the DDD principle for microservice architectures. Hence, Confluent Schema Registry is used in most projects for schema enforcement and versioning.
Any Kafka client built by Confluent can leverage the Schema Registry using Avro, Protobuf, or JSON Schema. This includes programming APIs like Java, Python, Go, or Python, but also Kafka Connect sources and sink, Kafka Streams, ksqlDB, and the Confluent REST Proxy. Like the REST Proxy, Schema Registry is available under the Confluent Community License.
Schema Registry lives separately from your Kafka brokers. Confluent REST Proxy still talks to Kafka to publish and read data (messages) to topics. Concurrently, the REST Proxy can also talk to Schema Registry to send and retrieve schemas that describe the data models for the messages.
Schema Registry provides a serving layer for your metadata and enables data governance and schema enforcement for all events. It provides a RESTful interface for storing and retrieving your Avro, JSON Schema, and Protobuf schemas. The Schema Registry stores a versioned history of all schemas based on a specified subject name strategy, provides multiple compatibility settings, and allows the evolution of schemas according to the configured compatibility settings and expanded support for these schema types. It provides serializers that plug into Kafka clients that handle schema storage and retrieval for Kafka messages that are sent in any of the supported formats:
Schema enforcement happens on the client side. Additionally, Confluent Platform and Confluent Cloud provide server-side schema validation. The latter is helpful if incorrect or malicious client applications send messages to Kafka without using the client-side Schema Registry integration.
API Management is a term from the Open API world that puts a layer on top of HTTP / REST APIs for the management, monitoring, and monetization of APIs. Solutions include Apigee, MuleSoft Anypoint, Kong, IBM API Connect, TIBCO Mashery, and many more.
API Gateway and API Management Tools provide many outstanding features:
These features are unavailable in any data streaming platform like Kafka, Pulsar, Kinesis, et al. on the other side, API tools like MuleSoft or Kong are not built for processing real-time data at scale with low latency.
Hence, API Management and data streaming are complementary, not competitive! The blog post “Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies?” explores this:
API == REST/HTTP for most API Management products and related API gateways. Vendors start to integrate either the Kafka API or event standards like AsyncAPI to get into event-based architectures. That’s great news!
Data sharing becomes crucial to modern and flexible enterprise architectures that build on concepts like microservices and data mesh. Real-time data beats slow data. That’s not just true for applications but also for data replication across business units, organizations, B2B, clouds, hybrid environments, and other scenarios. Therefore, the next generation of data sharing is built on top of data streaming.
HTTP APIs make little sense in many data streaming scenarios, especially if you expect high volumes or require low latency. Hence, data sharing in real-time by linking Kafka clusters or using a stream data exchange is the much better approach:
I won’t go into more detail here. The dedicated blog post “Streaming Data Exchange with Kafka for Real-Time Data Sharing” explores the idea, its trade-offs, and some real-world examples.
Various use cases employ HTTP/REST with Apache Kafka as a management plane or data plane. This combination will not go away in the future.
The Confluent REST Proxy can be used for HTTP(S) communication with your favorite client interface. No matter if you run open-source Apache Kafka, Confluent Platform, or Confluent Cloud. Check out the source code on GitHub or get started with an intuitive tutorial.
How do you combine data streaming with request-response principles? How do you combine HTTP and Kafka? What proxy are you using? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.
Discover when Apache Flink is the right tool for your stream processing needs. Explore its…
Data streaming with Apache Kafka and Flink is transforming the airline industry, enabling real-time efficiency…
The rise of stream processing has changed how we handle and act on data. While…
Siemens Healthineers, a global leader in medical technology, delivers solutions that improve patient outcomes and…
Discover my journey to achieving Lufthansa HON Circle (Miles & More) status in 2025. Learn…
Data streaming is a new software category. It has grown from niche adoption to becoming…