
Slides from NoSQLmatters: “Big Data beyond Apache Hadoop – How to integrate ALL your data with Apache Camel and Talend”

The slides from my talk “Big Data beyond Apache Hadoop – How to integrate ALL your data” at NoSQLmatters 2013 in Cologne are now online.

Here is the abstract:

Big data represents a significant paradigm shift in enterprise technology. Big data radically changes the nature of the data management profession as it introduces new concerns about the volume, velocity and variety of corporate data.
Apache Hadoop is the open source de facto standard for implementing big data solutions on the Java platform. Hadoop consists of its kernel, MapReduce, and the Hadoop Distributed Filesystem (HDFS). Sending all data to Hadoop for processing and storage (and getting it back to your application later) is a challenging task, because in practice data comes from many different applications (SAP, Salesforce, Siebel, etc.) and data stores (files, SQL, NoSQL), uses different technologies and concepts for communication (e.g. HTTP, FTP, RMI, JMS), and arrives in different formats such as CSV, XML, or binary data.
This session shows different open source frameworks and tools that solve this challenging task. Learn how to use every conceivable kind of data with Hadoop, without writing lots of complex or redundant boilerplate code.

The slides are available on SlideShare (www.slideshare.net).

Kai Waehner

builds cloud-native event streaming infrastructures for real-time data processing and analytics

