SAP is the leading ERP solution across industries around the world. Data integration with other data platforms, applications, databases, and APIs is one of the hardest challenges in the IT and software landscape. This blog post explores how SAP Datasphere in conjunction with the data streaming platform Apache Kafka enables a reliable, scalable and open data fabric for connecting SAP business objects of ECC and S/4HANA ERP with other real-time, batch, or request-response interfaces.
SAP is a German multinational software corporation that develops enterprise software to manage business operations and customer relations. SAP is best known for its ERP (Enterprise Resource Planning) software, which helps organizations integrate and streamline their business processes.
Companies of all sizes across a wide range of industries use SAP software. SAP ERP is one of the most widely used ERP solutions globally. Contrary to what many people think, SAP is not a single product. Over the years, SAP has expanded its product portfolio to include cloud-based solutions, analytics, database management, and other enterprise software applications.
SAP offers a range of ERP products that cater to different business needs and industries. Some of the key SAP ERP products include:
This product list might be out of date when you read it. SAP continuously develops its product offerings. Products are renamed, consolidated, or deprecated from time to time. In other words, SAP modernization, integration, and migration are usually an ongoing effort that never ends.
SAP Datasphere is the next generation of SAP Data Warehouse Cloud. The platform provides a comprehensive data service that enables data professionals to deliver seamless and scalable access to critical business data.
SAP Datasphere is a cloud-based product packaged within SAP’s Business Technology Platform (BTP). Datasphere brings together two previously standalone products, SAP Data Intelligence Cloud (DIC) and SAP Data Warehouse Cloud (DWC), into one cloud-native data integration and data management platform. The solution allows SAP customers to ingest, integrate, store, and analyze core SAP ERP data, as well as to share this data with other analytical services and downstream applications.
Datasphere is the core part of a new solution, known as Business Data Fabric, to simplify data integration and management involving SAP ERP backend data. A key focus of SAP Datasphere is business intelligence and analytics.
I see Datasphere as similar to Snowflake or Databricks: a general data warehouse / data lake / lakehouse, but focused on SAP data, with deep integration into the SAP ERP ecosystem and surrounding applications.
However, the out-of-the-box availability of SAP ERP data from SAP ECC, S/4HANA, and other SAP apps enables a simple but powerful opportunity for data integration beyond the SAP landscape. No need to use legacy SAP protocols like BAPI or IDoc anymore. Instead, SAP Datasphere provides a unified way to discover, connect, and manage data across different data sources, systems, and landscapes.
The key features of SAP Datasphere include:
This sounds like any other data management platform, doesn’t it?
But the above features focus mainly on SAP environments. Therefore, Datasphere has a few strategic software partnerships:
This emphasizes the strength of Datasphere around the SAP ecosystem. The other partners connect non-SAP IT infrastructure and applications with SAP environments bidirectionally.
SAP Datasphere is more than just an analytical platform for SAP ERP data.
Datasphere leverages SAP-internal tooling to access data directly from SAP systems. It is a complete data integration and analytics solution optimized for collecting and preparing data from all SAP ERP systems of multiple generations. For the first time in its history, SAP is making core ERP data from numerous back-end systems available in a one-stop-shop fashion through Datasphere.
This brings us to the excellent opportunity of combining SAP business objects with Apache Kafka and the rest of the enterprise architecture.
Apache Kafka is a distributed streaming platform that has gained widespread popularity for its ability to handle large-scale, real-time data streaming and event processing. When it comes to SAP integration, there are several reasons organizations choose to use Apache Kafka:
An ERP environment should be real-time, scalable, and open. SAP ERP is not just one product or technology, and organizations always combine it with other open source frameworks, proprietary standard software, and SaaS. “Building a Postmodern ERP with Apache Kafka” explores how SAP ERP and other technologies provide the most value together in a flexible, open environment. Many next-generation ERP systems use Kafka under the hood, too, even if you don’t see it because the system is a proprietary product or SaaS. Event-driven architectures are as helpful for software products as they are for any other software project.
Integration between SAP ERP and other applications is crucial. Another kind of project is the migration and ERP modernization, e.g., from SAP ECC to S/4HANA or the migration between SAP and another software vendor.
An SAP migration project involves moving an SAP system or landscape from one environment to another. This could include moving from an on-premises environment to the cloud, upgrading to a newer version of SAP software, or consolidating multiple SAP instances. The exact steps and considerations for an SAP migration can vary based on the specific migration scenario.
Most SAP ERP migrations these days are from SAP ECC to SAP S/4HANA. These projects usually take years. Apache Kafka can provide valuable help in different SAP integration and migration scenarios.
The combination of real-time capabilities, event storage for true decoupling and data consistency across real-time and non-real-time systems, and data integration with non-SAP systems and APIs makes Kafka the perfect middleware for SAP modernization and ERP migrations.
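To make the decoupling idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the Kafka client API: the `EventLog` and `Consumer` classes, the consumer names, and the event fields are all hypothetical. It only illustrates the principle that an append-only log plus per-consumer offsets lets a real-time consumer and a batch consumer read the same events at their own pace.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Toy append-only log; stands in for a single Kafka topic partition."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> int:
        self.events.append(event)
        return len(self.events) - 1  # offset of the newly stored event

    def read(self, offset: int, max_events: int) -> list:
        # Reading never removes events, so any consumer can replay them later.
        return self.events[offset:offset + max_events]


@dataclass
class Consumer:
    """Tracks its own read position, similar in spirit to a consumer offset."""
    name: str
    log: EventLog
    offset: int = 0

    def poll(self, max_events: int = 1) -> list:
        batch = self.log.read(self.offset, max_events)
        self.offset += len(batch)
        return batch


log = EventLog()
realtime = Consumer("fraud-check", log)    # hypothetical real-time consumer
nightly = Consumer("warehouse-load", log)  # hypothetical batch consumer

processed_realtime = []
for order_id in ("A-1", "A-2", "A-3"):
    log.append({"type": "OrderCreated", "order_id": order_id})
    # The real-time consumer picks up each event right after it is written.
    processed_realtime.extend(realtime.poll())

# The batch consumer reads everything later, in one go, from its own offset.
processed_batch = nightly.poll(max_events=100)
```

Both consumers end up with the same three events in the same order, although neither knows about the other and the producer never waited for either of them.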
I covered such a migration via Apache Kafka in a data warehouse modernization story where legacy and modern applications live in parallel for some months or even years until the final cutover is done.
Until the completion of the S/4HANA migration in the cloud, SAP ECC on-premises continues to exist for years. The hybrid deployment and synchronization capabilities of Kafka make it unique for SAP migration and modernization projects.
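One way to picture the parallel run is a shared, canonical event format that both the legacy and the new system are mapped to, so downstream consumers do not care which ERP produced a record. The following sketch is only an illustration under assumptions: the field names in both mappings, the `to_canonical` helper, and the sample records are hypothetical, and real extracts will differ per system and release.

```python
# Hypothetical source-to-canonical field mappings (illustrative names only).
ECC_TO_CANONICAL = {
    "VBELN": "order_id",
    "KUNNR": "customer_id",
    "NETWR": "net_value",
    "WAERK": "currency",
}
S4_TO_CANONICAL = {
    "SalesOrder": "order_id",
    "SoldToParty": "customer_id",
    "TotalNetAmount": "net_value",
    "TransactionCurrency": "currency",
}


def to_canonical(record: dict, mapping: dict, source: str) -> dict:
    """Rename source fields to the canonical names and tag the origin system."""
    event = {target: record[src] for src, target in mapping.items()}
    event["source_system"] = source
    return event


# The same business object as it might arrive from either system.
ecc_record = {"VBELN": "12345", "KUNNR": "C-100", "NETWR": "250.00", "WAERK": "EUR"}
s4_record = {
    "SalesOrder": "12345",
    "SoldToParty": "C-100",
    "TotalNetAmount": "250.00",
    "TransactionCurrency": "EUR",
}

from_ecc = to_canonical(ecc_record, ECC_TO_CANONICAL, "ECC")
from_s4 = to_canonical(s4_record, S4_TO_CANONICAL, "S4HANA")
```

Apart from the `source_system` tag, the two events are identical, which is what allows consumers to stay unchanged while the source behind them is swapped at cutover.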
Data streaming defines a new software category. Confluent leads the data streaming industry. It provides a serverless cloud offering on all major public clouds and an offering for self-managed deployments powered by Apache Kafka and Flink. In December 2023, the research company Forrester published “The Forrester Wave™: Streaming Data Platforms, Q4 2023”. Get free access to the report here. The report explores what Confluent and other vendors like AWS, Microsoft, Google, Oracle, and Cloudera provide.
Confluent is now available in the SAP® Store, the online marketplace for SAP and partner offerings. The data streaming platform integrates with SAP Datasphere. The combination delivers a secure, governed solution for accessing SAP data as fully managed data streams for customers.
Confluent provides businesses that use SAP solutions with a cloud-native and complete data streaming platform available everywhere it’s needed – in the cloud, across clouds, on-premises, and hybrid environments. Configured directly within SAP Datasphere, the new Confluent integration allows businesses to:
Confluent is a partner in the SAP PartnerEdge program. The SAP PartnerEdge program provides the enablement tools, benefits, and support to facilitate building high-quality, innovative applications focused on specific business needs, quickly and cost-effectively.
Here is an example architecture connecting SAP ERP and non-SAP applications (Flink and Snowflake in this example) with Datasphere and Confluent:
Confluent and SAP Datasphere are the perfect combination for building a data fabric for all enterprise data, in the same way that many companies already leverage Apache Kafka as a data fabric for AI and machine learning.
Is SAP Datasphere the new silver bullet for SAP ERP integration scenarios? No! As you learned in the above sections, Datasphere enables easy access to old and new SAP ERP data objects. However, Datasphere might have some drawbacks, too:
One solution does not fit all integration use cases. Know the different options and make your evaluation.
Plenty of other options exist for SAP-Kafka integration. I explored tens of APIs, tools, and connectors for data integration between SAP ERP and Apache Kafka.
For instance, look at the Confluent Hub and search for SAP Kafka integration. You will find many mature, lightweight, and innovative solutions from various vendors. INIT, Asapio, Advantco, KaTe, Onibex, and Qlik, among others, provide integrations via different open and proprietary SAP interfaces like ODP, OData, REST, BAPI, or IDoc.
It has never been easier to integrate the SAP ecosystem with the rest of the IT world in an enterprise architecture. SAP Datasphere supports straightforward access to SAP S/4HANA, SAP BW/4HANA, SAP BW, SAP ECC, and SAP HANA ERP data without the need for complex integration projects. In addition, SAP supports connectivity to Business Warehouse, SAP’s on-premises data warehouse solution.
Apache Kafka enables data consistency across SAP and non-SAP applications, both in the data center and in the public cloud, no matter if the data source or sink is real-time, near-real-time, batch, file-based, or a request-response API like HTTP/REST. The heart of the data fabric is event-based, scalable, and reliable.
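A request-response interface can sit on top of an event stream by folding the events into a lookup table that keeps the latest state per key. The sketch below shows that idea in plain Python without any Kafka library. The event fields, the `build_view` and `get_order_status` functions, and the status values are hypothetical names chosen for illustration.

```python
def build_view(events: list) -> dict:
    """Fold an ordered event stream into 'latest status per order'."""
    view = {}
    for event in events:
        # Later events overwrite earlier ones for the same key.
        view[event["order_id"]] = event["status"]
    return view


# Hypothetical order events in the order they were written to the stream.
events = [
    {"order_id": "A-1", "status": "CREATED"},
    {"order_id": "A-2", "status": "CREATED"},
    {"order_id": "A-1", "status": "PAID"},
]

view = build_view(events)


def get_order_status(order_id: str) -> str:
    """What a request-response endpoint could return for a single lookup."""
    return view.get(order_id, "UNKNOWN")
```

A synchronous caller asks for the current state of one order, while the stream itself still holds the full history for real-time and batch consumers.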
Confluent is the leading vendor of data streaming technologies like Apache Kafka. The strategic partnership and deep product integration between SAP Datasphere and Confluent provide an excellent opportunity for any organization that needs to integrate SAP and the rest of the IT infrastructure.
Some people might tell you that Kafka is great for analytical use cases but not suited for operational, critical use cases (often because they want to pitch another product for SAP integrations). That’s not accurate. Apache Kafka supports analytical AND transactional workloads. In fact, almost all customers I work with around the world use Confluent for transactional data from the SAP ERP for orders, payments, fraud detection, and similar operational use cases.
How do you integrate with your SAP systems today? Do you already use modern technologies like Apache Kafka? What connectors or solutions do you use? Will you use SAP Datasphere in the future? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.