Apache Iceberg Open Table Format for Data Lake Lakehouse Streaming wtih Kafka Flink Databricks Snowflake AWS GCP Azure
Read More

Apache Iceberg – The Open Table Format for Lakehouse AND Data Streaming

An open table format framework like Apache Iceberg is essential in the enterprise architecture to ensure reliable data management and sharing, seamless schema evolution, efficient handling of large-scale datasets and cost-efficient storage. This blog post explores market trends, adoption of table format frameworks like Iceberg, Hudi, Paimon, Delta Lake and XTable, and the product strategy of leading vendors of data platforms such as Snowflake, Databricks (Apache Spark), Confluent (Apache Kafka / Flink), Amazon Athena and Google BigQuery.
Read More
Kappa Architecture vs Lambda Architecture for Apache Kafka Pulsar Data Lakes
Read More

Kappa Architecture is Mainstream Replacing Lambda

Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers. This blog post explores why a single real-time pipeline, called Kappa architecture, is the better fit. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for Lambda.
Read More

Kafka Streams + H2O.ai + TensorFlow (Video Recording / Live Demo)

I do a lot of presentations these days at meetups and conferences about how to leverage Apache Kafka and Kafka Streams to apply analytic models (built with H2O, TensorFlow, DeepLearning4J and other frameworks) to scalable, mission-critical environments. As many attendees have asked me, I created a video recording about this talk (focusing on live demos).
Read More

Why I Move (Back) to Open Source for Messaging, Integration and Stream Processing

After three great years at TIBCO Software, I move back to open source and join Confluent, the company behind the open source project Apache Kafka to build mission-critical, scalable infrastructures for messaging, integration and stream processsing. In this blog post, I want to share why I see the future for middleware and big data analytics in open source technologies, why I really like Confluent, what I will focus on in the next months, and why I am so excited about this next step in my career.
Read More