
Slides online: “Big Data beyond Apache Hadoop – How to Integrate ALL your Data” – JavaOne 2013

Slides from my session “Big Data beyond Apache Hadoop – How to Integrate ALL your Data” at JavaOne 2013 in San Francisco are online.

Abstract

Big data represents a significant paradigm shift in enterprise technology. Big data radically changes the nature of the data management profession as it introduces new concerns about the volume, velocity and variety of corporate data.

Apache Hadoop is the open source de facto standard for implementing big data solutions on the Java platform. Hadoop consists of its kernel, MapReduce, and the Hadoop Distributed File System (HDFS). A challenging task is to send all data to Hadoop for processing and storage (and then get it back to your application later), because in practice data comes from many different applications (SAP, Salesforce, Siebel, etc.) and data stores (files, SQL, NoSQL), uses different technologies and concepts for communication (e.g. HTTP, FTP, RMI, JMS), and arrives in different data formats such as CSV, XML, binary data, or other alternatives.

This session shows different open source frameworks and products (especially Apache Camel and Talend Open Studio for Big Data) that solve this challenging task. Learn how to use every conceivable kind of data with Hadoop – without lots of complex or redundant boilerplate code.

Slides

The slides are hosted on www.slideshare.net.

Kai Waehner

bridging the gap between technical innovation and business value for real-time data streaming, processing and analytics
