Data integration and processing are a huge challenge in Industrial IoT (IIoT, also known as Industry 4.0 or the automation industry) because of monolithic systems and proprietary protocols. Apache Kafka, its ecosystem (Kafka Connect, KSQL) and Apache PLC4X are a great open source choice for implementing this IIoT integration end to end in a scalable, reliable and flexible way.
This blog post gives a high-level overview of the challenges and of a flexible architecture that solves them. At the end, I share a video recording and the corresponding slide deck, which provide many more details and insights.
Here are some of the key challenges in IIoT / Industry 4.0:
This is still the state of the art in the automation industry. That is no surprise given such long product life cycles, but it is still very concerning.
Today, everybody talks about cloud, big data analytics, machine learning and real-time processing at scale. The convergence of IT and the automation industry is coming, as the analyst report from the IoT research company IoT Analytics shows:
There is a huge demand for an open, flexible, scalable platform. It opens up many opportunities from both a business and a technical perspective:
So, how do you get from legacy technologies and proprietary IIoT protocols to cloud, big data, machine learning and real-time processing? How do you build a reliable, scalable and flexible architecture and infrastructure?
I assume you already know it: Apache Kafka is the de facto standard for real-time event streaming. It provides
If you need more details about Apache Kafka, check out the Kafka website, the extensive Confluent documentation or some free video recordings and slides from any Kafka Summit to learn about the technology and use cases.
The one important thing I want to point out is that Apache Kafka includes Kafka Connect and Kafka Streams:
Kafka Connect enables reliable and scalable integration of Kafka with other systems. Kafka Streams lets you write standard Java apps and microservices that continuously process your data in real time with a lightweight stream processing API. Finally, KSQL, which Confluent built on top of Kafka Streams and which is not part of Apache Kafka itself, enables stream processing with SQL-like semantics.
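To make "continuously process your data" a bit more concrete, here is a minimal sketch in plain Python. It is not the Kafka Streams or KSQL API. The function names, the sensor names and the threshold are made up for illustration. It only shows the idea of handling one record at a time, keeping a small piece of state per key, and filtering the result.

```python
from collections import defaultdict


def running_average(readings):
    """Yield (sensor, average so far) for every incoming (sensor, value) record.

    Toy stand-in for a stateful stream aggregation. Hypothetical helper,
    not part of Kafka Streams or KSQL.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sensor, value in readings:
        totals[sensor] += value
        counts[sensor] += 1
        yield sensor, totals[sensor] / counts[sensor]


def alerts(averages, threshold):
    """Pass on only those (sensor, average) records above the threshold."""
    for sensor, average in averages:
        if average > threshold:
            yield sensor, average


# Made-up temperature readings from two machines.
readings = [("m1", 70.0), ("m1", 90.0), ("m2", 60.0), ("m1", 110.0)]

# Both steps are generators, so each record flows through the whole
# pipeline before the next one is read, as in a streaming application.
for sensor, average in alerts(running_average(readings), threshold=75.0):
    print(sensor, average)
```

In a real deployment the input would be a Kafka topic instead of a list, and the state would be managed by the stream processing framework rather than by two dictionaries.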
Apache PLC4X is less established on the market than Apache Kafka. It also “just covers a niche” (a big one, of course) compared to Kafka, which is used across industries for many different use cases. However, PLC4X is a very interesting top-level Apache project for the automation industry.
The goal is to open up PLC interfaces from the IIoT world to the outside world. PLC4X allows vertical integration and lets you write software independent of specific PLCs, using JDBC-like adapters for various protocols such as Siemens S7, Modbus, Allen Bradley, Beckhoff ADS, OPC-UA, Emerson, Profinet, BACnet and Ethernet.
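The "JDBC-like adapter" idea can be sketched in a few lines of Python. This is not the PLC4X API. The class names, the connection URLs, the address strings and the canned values are all hypothetical. The sketch only shows the pattern: the application asks for a connection by URL, a registry picks the protocol driver from the URL scheme, and the calling code stays the same regardless of the protocol.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class PlcDriver(ABC):
    """Hypothetical protocol driver: one subclass per PLC protocol."""

    @abstractmethod
    def read(self, host, address):
        ...


class FakeS7Driver(PlcDriver):
    def read(self, host, address):
        # A real driver would speak the S7 protocol here; this returns a canned value.
        return 21.5


class FakeModbusDriver(PlcDriver):
    def read(self, host, address):
        # Same interface, different (pretend) protocol.
        return 42.0


# Registry that maps a URL scheme to a driver, similar in spirit to JDBC.
DRIVERS = {"s7": FakeS7Driver(), "modbus": FakeModbusDriver()}


class PlcConnection:
    """Protocol-independent handle that application code works with."""

    def __init__(self, driver, host):
        self._driver = driver
        self._host = host

    def read(self, address):
        return self._driver.read(self._host, address)


def get_connection(url):
    """Pick the driver from the URL scheme and return a connection."""
    parsed = urlparse(url)
    try:
        driver = DRIVERS[parsed.scheme]
    except KeyError:
        raise ValueError(f"no driver registered for scheme {parsed.scheme!r}")
    return PlcConnection(driver, parsed.hostname)


# The application code is identical for both protocols (made-up URLs and addresses).
for url in ("s7://192.168.0.10", "modbus://10.0.0.5:502"):
    print(url, get_connection(url).read("temperature"))
```

Swapping a machine for one that speaks a different protocol then means changing the connection URL, not the application logic.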
PLC4X provides a Kafka Connect connector. Therefore, you can leverage the benefits of Apache Kafka (high availability, high throughput, high scalability, reliability, real-time processing) to deploy PLC4X integration pipelines. With this, you can build one single architecture and infrastructure for
As Kafka decouples the producers from the consumers, you can consume the IIoT machine sensor data from any application – some might be real time, some might be batch, and some might be request-response communication for human interaction on a web or mobile app.
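The decoupling described above comes from the fact that the log keeps the records and every consumer tracks its own read position. The following toy model in plain Python illustrates that idea. It is not the Kafka client API. `TopicLog`, `Consumer` and the sensor records are hypothetical names and data invented for this sketch, and the model ignores partitions, persistence and networking.

```python
class TopicLog:
    """Toy append-only log standing in for a single topic partition."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return all records at or after the given offset."""
        return self._records[offset:]


class Consumer:
    """Each consumer keeps its own offset, independent of all others."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self):
        batch = self._log.read_from(self.offset)
        self.offset += len(batch)
        return batch


log = TopicLog()
realtime = Consumer(log)  # polls after every record
batch = Consumer(log)     # polls once, later

seen_realtime = []
for reading in (
    {"sensor": "press-1", "temp_c": 71.0},
    {"sensor": "press-1", "temp_c": 73.5},
    {"sensor": "press-2", "temp_c": 64.2},
):
    log.append(reading)                      # the producer knows nothing about consumers
    seen_realtime.extend(realtime.poll())    # real-time consumer keeps up record by record

seen_batch = batch.poll()                    # batch consumer catches up in one go
print(len(seen_realtime), len(seen_batch))
```

Both consumers end up with the same records, although they read at completely different times and the producer never waits for either of them.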
A little bit off-topic: how do you choose between Apache PLC4X (open source framework for IIoT) and OPC-UA (open standard for IIoT)? In short, they are different things and can also be complementary. Here is a comparison:
As you see, both have their pros and cons. To me, and this is clearly my subjective opinion, PLC4X provides a great alternative with high flexibility and a low footprint.
Many IoT platform solutions are available on the market. This includes products like Siemens MindSphere or Cisco Kinetic, and cloud services from the major cloud providers like AWS, GCP or Azure. And you have Kafka + PLC4X, as you just learned above. Often, this is not an “either … or” decision:
You can either use
All alternatives have their pros and cons. There is no single solution that fits every use case! It is therefore no surprise that most IoT platform solutions provide Kafka source and sink connectors.
If you are curious about more details and insights, please check out my video recording and slide deck.
We are also building a nice and simple demo on GitHub these days:
PLC4X gets most exciting when you try it out yourself and connect it to your own machines or tools. So, check out the example and adjust it to connect to your infrastructure.
Please let me know your feedback and questions about Kafka, its ecosystem and PLC4X for IIoT integration. Let’s also connect on LinkedIn to discuss interesting IIoT use cases and technologies in the future.
Hi Kai, Do you know if PLC4X is suitable for production? The project still looks pretty immature, version 0.1 I think, and it's been that way for quite a long time.
Apache PLC4X is production-ready. Several companies use it in their OT/IT infrastructure. Having said that, it is "just" a framework. You need to operate and support it by yourself (or find a vendor or system integrator who helps).