Streaming Analytics: Comparison of Open Source Frameworks, Products and Cloud Services

In November 2016, I was at Big Data Spain in Madrid for the first time. It was a great conference with many awesome speakers and sessions about very hot topics such as Apache Hadoop, Apache Spark, Stream Processing / Streaming Analytics and Machine Learning. If you are interested in big data, then this conference is for you! My two talks:

“How to Apply Machine Learning to Real Time Processing” (see slides and video recording from a similar conference talk).
“Comparison of Streaming Analytics Options” (the reason for this blog post; an updated version of my talk from JavaOne 2015)

Here I want to share the slides and a video recording of the latter one…

Abstract: Comparison of Stream Processing Options

This session discusses the technical concepts of stream processing / streaming analytics and how they relate to big data, mobile, cloud and the Internet of Things. Different use cases such as predictive fault management or fraud detection are used to show and compare alternative frameworks and products for stream processing and streaming analytics.

The focus of the session lies on comparing

different open source frameworks such as Apache Apex, Apache Flink or Apache Spark Streaming
engines from software vendors such as IBM InfoSphere Streams or TIBCO StreamBase
cloud offerings such as AWS Kinesis
real time streaming UIs such as Striim, Zoomdata or TIBCO Live Datamart

Live demos will give the audience a good feeling for how to use these frameworks and tools.

The session will also discuss how stream processing is related to Apache Hadoop frameworks (such as MapReduce, Hive, Pig or Impala) and machine learning (such as R, Spark ML or H2O.ai).

Slides – Alternatives for Streaming Analytics

The following slide deck is a more extensive version of the talk at Big Data Spain (as the conference talks were only 30 minutes):


Video Recording: Apache Storm, Flink, Apex, Spark, StreamBase, Striim, et al

The video recording walks you through the above slide deck:

As always, I appreciate any comments, questions or other feedback.

Kai Waehner

bridging the gap between technical innovation and business value for real-time data streaming, processing and analytics
