KSQL Deep Dive – The Open Source Streaming SQL Engine for Apache Kafka

I gave a workshop at the Kafka Meetup in Tel Aviv in May 2018: “KSQL Deep Dive – The Open Source Streaming SQL Engine for Apache Kafka”.

Here are the agenda, slides and video recording.

KSQL – The Open Source Streaming SQL Engine for Apache Kafka

KSQL is the open source, Apache 2.0 licensed streaming SQL engine on top of Apache Kafka. It aims to simplify stream processing and make it available to everyone. Even though it is simple to use, KSQL is built for mission-critical and scalable production deployments (using Kafka Streams under the hood).
Benefits of using KSQL include: no coding required, no additional analytics cluster needed, streams and tables as first-class constructs, and access to the rich Kafka ecosystem. This session introduces the concepts and architecture of KSQL. Use cases such as Streaming ETL, Real-Time Stream Monitoring or Anomaly Detection are discussed. A live demo shows how to set up and use KSQL quickly and easily on top of your Kafka ecosystem.
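To give a feel for what these constructs look like in practice, here is a minimal KSQL sketch; the pageviews topic and its columns follow the quick start example and are just placeholders for your own data:

  -- Register an existing Kafka topic as a stream
  CREATE STREAM pageviews (viewtime BIGINT, userid VARCHAR, pageid VARCHAR)
    WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='JSON');

  -- Derive a continuously updated table from the stream
  CREATE TABLE pageviews_per_user AS
    SELECT userid, COUNT(*) AS views
    FROM pageviews
    GROUP BY userid;

Both statements run as continuous queries on the KSQL servers, so there is no batch job to schedule and no separate analytics cluster to operate.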

If you want to get started, try out the KSQL quick start guide. It gets you started in ten minutes, either locally on your laptop or in a Docker environment.
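Once the CLI is connected to a KSQL server, the first exploratory statements from the quick start look roughly like the following; topic and stream names depend on your environment:

  SHOW TOPICS;                            -- list the Kafka topics KSQL can see
  PRINT 'pageviews' FROM BEGINNING;       -- inspect the raw messages of a topic
  SELECT pageid FROM pageviews LIMIT 3;   -- continuous query that stops after three rows

The SELECT statement requires the pageviews stream to be registered first (see the CREATE STREAM example above).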

Agenda

  1. Apache Kafka Ecosystem
  2. Kafka Streams as Foundation for KSQL
  3. Motivation for KSQL
  4. KSQL Concepts
  5. Live Demo #1 – Intro to KSQL
  6. KSQL Architecture
  7. Live Demo #2 – Clickstream Analysis
  8. Building a User Defined Function (Example: Machine Learning)
  9. Getting Started

Slides

The slides are embedded from www.slideshare.net.

Video Recording

There was a YouTube live stream. Unfortunately, we had some technical problems, so the audio of the first half is not great. Sorry for that. I still want to share it; the second half has good sound quality:

I am looking forward to your feedback. Please feel free to ask questions in the Confluent Slack community (where you can also get help from the KSQL engineers) or create GitHub issues if you have problems or want to contribute to this great open source project.
