Is Apache Kafka really real-time? This is a question I get asked every week. Real-time is a great marketing term to describe how businesses can add value by processing data as fast as possible. Most software and product vendors use it these days, including messaging frameworks (e.g., IBM MQ, RabbitMQ), event streaming platforms (e.g., Apache Kafka, Confluent), data warehouse/analytics vendors (e.g., Spark, Snowflake, Elasticsearch), and security/SIEM products (e.g., Splunk). This blog post explores what “real-time” really means and how Apache Kafka and other messaging frameworks accomplish the mission of providing real-time data processing.
Defining the term “real-time” is not easy. However, it is essential to define it before starting any discussion about this topic.
In general, real-time computing (sometimes called reactive computing) is the computer science term for hardware and software systems subject to a “real-time constraint”, for example, from event to system response. Real-time programs must guarantee a response within specified time constraints, often referred to as “deadlines”. Real-time processing fails if not completed within a specified deadline relative to an event; deadlines must always be met, regardless of system load.
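As a trivial illustration of the “deadline” notion, consider the following sketch; the 10 ms budget is an arbitrary assumption, not a value from any specific system. A hard real-time system must guarantee that the check never fails, regardless of load, whereas a soft real-time system tolerates occasional misses.

```java
public class DeadlineCheck {

    public static void main(String[] args) {
        long deadlineNanos = 10_000_000L; // assumed 10 ms processing budget (arbitrary)
        long start = System.nanoTime();

        handleEvent(); // placeholder for the actual event handling

        long elapsed = System.nanoTime() - start;
        if (elapsed > deadlineNanos) {
            // In a hard real-time system, a missed deadline is a failure, regardless of system load.
            System.err.printf("Deadline missed by %.2f ms%n", (elapsed - deadlineNanos) / 1_000_000.0);
        }
    }

    private static void handleEvent() {
        // placeholder: the work that must complete within the deadline
    }
}
```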
Unfortunately, there is more than one “real-time”. The graphic from embedded.com describes it very well:
Here are a few different nuances of real-time data processing:
From a more technical point of view, hard real-time is a synchronous push operation: the caller invokes something and must wait for the return. This cannot be implemented effectively with event distribution; it is rather an API call. Soft and near real-time are asynchronous: the caller propagates an event without waiting for a response, and downstream processing does not affect the caller’s outcome.
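A minimal Java sketch of that difference, assuming a local Kafka broker at localhost:9092 and a topic named machine-events (both hypothetical): the synchronous variant blocks the caller until the write is acknowledged, while the asynchronous variant hands the event off and moves on.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SyncVsAsync {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>("machine-events", "machine-42", "temperature=87.3");

            // Synchronous style: the caller blocks until the broker acknowledges the write.
            // This resembles an API call; the latency is still non-deterministic, so it is not hard real-time.
            producer.send(record).get();

            // Asynchronous style: the caller hands the event off and continues immediately.
            // Consumers process it later; their outcome does not affect the caller.
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                }
            });
            producer.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```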
Hermann Kopetz’s book “Real-Time Systems: Design Principles for Distributed Embedded Applications” is a great resource if you want to dig deeper. The Wikipedia article is also a good, detailed summary with further references.
Always question what is meant by the term “real-time” if the context is not clear yet. While it is not always accurate, it is okay to use the term “real-time” in many cases, especially when you talk to business people.
The use cases in the next sections will make different real-time scenarios more clear.
Hard real-time requires a deterministic network with zero latency and no spikes. Common scenarios include embedded systems, field bus and PLCs in manufacturing, cars, robots, etc. Time-Sensitive Networking (TSN) is the right keyword if you want to do more research. TSN is a set of standards under development by the Time-Sensitive Networking task group of the IEEE 802.1 working group.
This is NOT Java, NOT Cloud, and NOT anything else a web developer knows and uses for daily routine.
Here are a few examples that are only doable (and safe) with hard real-time:
Most hard real-time implementations are proprietary and have a long history. Nevertheless, the industry is getting more and more open. Industry 4.0, Industrial IoT, autonomous driving, smart cities, and similar scenarios are impossible without an open architecture.
Soft real-time or near real-time is what most people actually talk about when they say “real-time”. The use cases include everything that is not “hard real-time”. End-to-end communication has latency, delays, and spikes. Near real-time can be very fast, but it can also take a long time. Most use cases across all verticals sit in this category.
Some verticals, such as retail or gaming, might never have to think about hard real-time at all. However, if you dig deeper, retailers often also operate production lines, and gaming involves game consoles and hardware. Hence, it always depends on your business department.
Some examples of soft/near real-time use cases:
As you can see, “near real-time” can mean many different things. It is okay to use the term “real-time” for these use cases, not just in marketing, but also in business and technical meetings! But make sure to understand your requirements and find the appropriate technologies.
I will focus on Apache Kafka, as it has established itself as the de facto standard for near real-time processing (aka event streaming) in the market. However, Kafka is also often used as a messaging platform and to ingest data into other analytics tools. Hence, it fits into most of the near real-time use cases.
Kafka is real-time. But not for everybody’s definition of real-time. Let’s understand this better…
Apache Kafka became the de facto standard for reliable data processing at scale in real-time. Most people in the IT world agree with this. Kafka provides capabilities to process trillions of events per day. Each Kafka broker (= server) can process tens of thousands of messages per second. End-to-end latency from producer to consumer can be as low as ~10 ms if the hardware and network setup are good enough. Kafka is battle-tested at thousands of companies for hundreds of different use cases. It is licensed under Apache 2.0 and provides a huge community and ecosystem. So far so good…
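As a rough illustration of the latency side, here is a minimal producer sketch with latency-oriented settings, assuming a local broker and a hypothetical latency-test topic. Settings such as linger.ms=0, acks=1, and disabled compression are typical knobs for trading throughput against per-record latency; the measured time covers only the produce leg, not the full producer-to-consumer path.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class LowLatencyProducer {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // hypothetical broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("linger.ms", "0");            // send immediately instead of waiting to batch
        props.put("acks", "1");                 // wait only for the leader, not all replicas
        props.put("compression.type", "none");  // skip compression to avoid CPU-induced delay

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            long start = System.nanoTime();
            producer.send(new ProducerRecord<>("latency-test", "key", "value")).get();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            // Only the produce leg; end-to-end latency also includes the consumer poll.
            System.out.println("Produce acknowledged after " + elapsedMs + " ms");
        }
    }
}
```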
However, in the OT world, things are different: Kafka is only soft real-time. Many OT applications require hard real-time. Hence, scenarios around automotive, manufacturing, and smart cities need to make this distinction. Consortiums and standards provide hard real-time frameworks and guidelines on how to integrate with them from the IT side. Two examples:
Most companies I have talked to in these industries combine hard real-time and soft real-time. Both approaches are complementary and have different trade-offs.
The following section shows how enterprises combine the OT world (cars, machines, PLCs, robots, etc.) with the IT world (analytics, reporting, business applications).
Kafka is not hard real-time, but most enterprises combine it with hard real-time applications to correlate the data, integrate with other systems in near real-time, and build innovative new business applications.
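A minimal sketch of the IT side of such a combination, assuming a hypothetical ot-gateway-events topic that an edge gateway fills with data from the hard real-time controllers; the consumer processes the events in near real-time for analytics, reporting, or business applications.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OtGatewayConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        props.put("group.id", "it-analytics");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("ot-gateway-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    // Hand the event to reporting, analytics, or a business application.
                    System.out.printf("machine=%s event=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```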
Some notes on the above architecture about the relation between Kafka and the OT world:
Let’s conclude this post with a specific example for combining hard real-time systems and near real-time using Apache Kafka: Cybersecurity and SIEM in the smart factory.
Most factories require hard real-time for their machines, PLCs, DCS, robots, etc. Unfortunately, many applications are 10, 20, or even 30+ years old. They run on unsecured and unsupported operating systems (Windows XP is still far from going away in factories!).
I have seen a few customers leveraging Apache Kafka as a cybersecurity platform in the middle, i.e., between the monolithic, proprietary legacy systems and the modern IT world.
Kafka monitors all data communication in near real-time to implement access control, detect anomalies, and provide secure communication. This architecture enables the integration with non-connected legacy systems to collect sensor data, but also ensures that no external system gets access to the unsecured machines. Intel is a great public example of building a modern, scalable cyber intelligence platform with Apache Kafka and Confluent Platform.
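A minimal Kafka Streams sketch of this “Kafka in the middle” idea, assuming hypothetical topics plc-sensor-readings and security-alerts; the hard-coded threshold is a toy stand-in for a real anomaly-detection model or SIEM rule set.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class FactoryAnomalyDetector {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "factory-anomaly-detector");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> readings = builder.stream("plc-sensor-readings");

        // Flag readings above a hard-coded threshold; a real deployment would apply
        // a proper anomaly-detection model or SIEM rules instead of this toy check.
        readings
            .filter((machineId, value) -> Double.parseDouble(value) > 100.0)
            .to("security-alerts");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```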
One common security design pattern in Industrial IoT is the data diode. Implementations often include a hardware/software combination such as the products from Owl Cyber Defense. Another option is the Kafka Connect based Data Diode Connector (Source and Sink) to build a Kafka-native, high-security, unidirectional network. In such networks, the network settings do not permit TCP/IP packets, and UDP packets are only allowed in one direction.
Hard real-time is critical for some use cases, such as car engines, medical systems, and industrial process controllers. However, most other use cases only require near real-time. Apache Kafka comes into play to build scalable, reliable, near real-time applications and connect to the OT world. The open architecture and backpressure handling of huge volumes from IoT interfaces are two of the key reasons why Kafka is such a good fit in OT/IT architectures.
How do you use Kafka for (near) real-time applications? How is it combined with machines, PLCs, cars, and other hard real-time applications? What are your strategy and timeline? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.