Categories: EAI

Slides from NoSQLmatters: “Big Data beyond Apache Hadoop – How to integrate ALL your data with Apache Camel and Talend”

Slides from my talk “Big Data beyond Apache Hadoop – How to integrate ALL your data” at NoSQLmatters 2013 in Cologne are online.

Here the abstract:

Big data represents a significant paradigm shift in enterprise technology. Big data radically changes the nature of the data management profession as it introduces new concerns about the volume, velocity and variety of corporate data.
Apache Hadoop is the open source defacto standard for implementing big data solutions on the Java platform. Hadoop consists of its kernel, MapReduce, and the Hadoop Distributed Filesystem (HDFS). A challenging task is to send all data to Hadoop for processing and storage (and then get it back to your application later), because in practice data comes from many different applications (SAP, Salesforce, Siebel, etc.) and databases (File, SQL, NoSQL), uses different technologies and concepts for communication (e.g. HTTP, FTP, RMI, JMS), and consists of different data formats using CSV, XML, binary data, or other alternatives.
This session shows different open source frameworks and tools to solve this challenging task. Learn how to use every thinkable data with Hadoop – without plenty of complex or redundant boilerplate code.

Here the slides:

Click on the button to load the content from www.slideshare.net.

Load content

Kai Waehner

bridging the gap between technical innovation and business value for real-time data streaming, processing and analytics

Next Big Data beyond Hadoop - How to integrate ALL data with Apache Camel and Talend => Video from NoSQL matters 2013 online »

Previous « Book Review: "Getting Started with NoSQL" by Gaurav Vaish (Packt Publishing)

Published by

Kai Waehner

Tags: ApacheBig DataClouderaEAIEnterprise Integration PatternEnterprise Service BusESBHadoopHortonworksIntegration FrameworkJavaMapRopen sourcetalend

12 years ago

Virta’s Electric Vehicle (EV) Charging Platform with Real-Time Data Streaming: Scalability for Large Charging Businesses

The rise of Electric Vehicles (EVs) demands a scalable, efficient charging network—but challenges like fluctuating…

7 hours ago

Apache Kafka

Apache Kafka 4.0: The Business Case for Scaling Data Streaming Enterprise-Wide

Apache Kafka 4.0 represents a major milestone in the evolution of real-time data infrastructure. Used…

3 days ago

Agentic AI

How Apache Kafka and Flink Power Event-Driven Agentic AI in Real Time

Agentic AI marks a major evolution in artificial intelligence—shifting from passive analytics to autonomous, goal-driven…

1 week ago

Shift Left Architecture

Shift Left Architecture at Siemens: Real-Time Innovation in Manufacturing and Logistics with Data Streaming

Industrial enterprises face increasing pressure to move faster, automate more, and adapt to constant change—without…

2 weeks ago

Data Streaming

The Importance of Focus: Why Software Vendors Should Specialize Instead of Doing Everything (Example: Data Streaming)

As real-time technologies reshape IT architectures, software vendors face a critical decision: specialize deeply in…

2 weeks ago

Batch Processing

The Top 20 Problems with Batch Processing (and How to Fix Them with Data Streaming)

Batch processing introduces delays, complexity, and data quality issues that modern businesses can no longer…

3 weeks ago