NoSQL Matters Conference 2014
NoSQL Matters is a great conference about different NoSQL topics. A lot of great NoSQL products and use cases are presented. In November 2014, I gave a talk about “Real World Use Cases and Success Stories for In-Memory Data Grids” in Barcelona, Spain. I discussed several different use cases that our TIBCO customers implemented using our in-memory data grid “TIBCO ActiveSpaces”. I will present the same content at data2day, a German conference in Karlsruhe about big data topics.
In-Memory Data Grids: TIBCO ActiveSpaces, Oracle Coherence, Infinispan, IBM eXtreme Scale, Hazelcast, Gigaspaces, etc.
Many in-memory data grid products are available: TIBCO ActiveSpaces, Oracle Coherence, Infinispan, IBM WebSphere eXtreme Scale, Hazelcast, Gigaspaces, GridGain and Pivotal Gemfire, to name the most important ones. 451 Research Group publishes a great graphic that shows different databases and how data grids fit into that landscape. You can always get the newest version: 451 DataBase Landscape.
It is important to understand that an in-memory data grid offers much more than just caching and storing data in memory. Further features include event processing, publish / subscribe, ACID transactions, continuous queries and fault tolerance, to name a few. Therefore, let’s discuss one example in the next section to get a better understanding of what an in-memory data grid actually is.
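To make the difference between a plain cache and a data grid a bit more tangible, here is a minimal Python sketch of two of the features listed above: publish / subscribe and continuous queries. It is a single-process toy and not the API of TIBCO ActiveSpaces or any other product; the names ToyGrid, put, get and listen are made up for illustration, and distribution, replication and transactions are left out entirely.

```python
from typing import Any, Callable, Dict, List, Tuple

# A predicate decides whether an entry matches a query.
Predicate = Callable[[str, Any], bool]
# A listener receives (event_type, key, value) for every matching change.
Listener = Callable[[str, str, Any], None]


class ToyGrid:
    """Hypothetical single-process stand-in for a data grid (illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._queries: List[Tuple[Predicate, Listener]] = []

    def put(self, key: str, value: Any) -> None:
        # Plain cache behaviour: store the value in memory ...
        self._data[key] = value
        # ... plus eventing: push the change to every interested listener.
        self._notify("put", key, value)

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def listen(self, predicate: Predicate, listener: Listener) -> None:
        """Register a continuous query.

        Entries that already match are delivered once as "initial" events;
        afterwards every matching put is pushed to the listener.
        """
        for key, value in list(self._data.items()):
            if predicate(key, value):
                listener("initial", key, value)
        self._queries.append((predicate, listener))

    def _notify(self, event: str, key: str, value: Any) -> None:
        for predicate, listener in self._queries:
            if predicate(key, value):
                listener(event, key, value)


grid = ToyGrid()
grid.put("order-0", {"amount": 300})

events: List[Tuple[str, str]] = []
# Continuous query: "tell me about every order above 100, now and in the future".
grid.listen(
    lambda key, value: value["amount"] > 100,
    lambda event, key, value: events.append((event, key)),
)

grid.put("order-1", {"amount": 250})  # matches, listener is called
grid.put("order-2", {"amount": 20})   # stored, but does not match the query
print(events)
```

The point of the sketch is the direction of the data flow: the application does not poll the store for large orders, the store pushes matching changes to the application. A real data grid does the same across many nodes and adds the fault tolerance and transactions mentioned above.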
TIBCO ActiveSpaces In-Memory Data Grid
TIBCO ActiveSpaces combines the best of NoSQL and in-memory features. The following description is taken from TIBCO’s website:
To lift the burden of big data, TIBCO ActiveSpaces provides a distributed in-memory data grid that can increase processing speed so you can reduce reliance on costly transactional systems.
ActiveSpaces EE provides an infrastructure for building highly scalable, fault-tolerant applications. It creates large virtual data caches from the aggregate memory of participating nodes, scaling automatically as nodes join and leave. Combining the features and performance of databases, caching systems, and messaging software, it supports very large, highly volatile data sets and event-driven applications. And it frees developers to focus on business logic rather than on the complexities of distributing, scaling, and making applications autonomously fault-tolerant.
ActiveSpaces EE supplies configurable replication of virtual shared memory. This means that the space autonomously re-replicates and re-distributes lost data, resulting in an active-active fault-tolerant architecture without resource overhead.
Benefits
- Reduce Management Cost: Off-load slow, expensive, and hard-to-maintain transactional systems.
- Deliver Ultra-Low, Predictable Latency: Use peer-to-peer communication, avoiding intervention by a central server.
- Drastically Improve Performance: Create next-generation elastic applications including high performance computing, extreme transaction processing, and complex event processing.
- Simplify Administration: Eliminate the complexity of implementing and configuring a distributed caching platform using a command-line administration tool with shell-like control keys that provide command history, syntax completion, and context-sensitive help.
- Become Platform Independent: Store database rows and objects and use the system as middleware to exchange information between heterogeneous platforms.
- Speed Development: Enable data virtualization and let developers focus on business logic rather than on the details of data implementation.
If you want to learn more about TIBCO ActiveSpaces, take a look at a great recording from QCon 2013, in which TIBCO Fellow Jean-Noel Moyne discusses in-memory data grids in more detail.
SAP HANA is not an In-Memory Data Grid
I should write an additional blog post about this topic. Nevertheless, to make it clear: SAP HANA is not an in-memory data grid. This is important to mention because everybody thinks about SAP HANA when talking about in-memory, right? Take a look at the 451 database landscape, which I mentioned above. SAP HANA is put into the “relational zone” under appliances (SAP HANA is only available as an appliance), whereas all the other products I named are put in the “Grid / Cache Zone”.
SAP HANA is primarily used to reduce dependency on other relational databases (e.g. Oracle). It is designed to make SAP run faster, not to speed up other (non-SAP) applications. SAP HANA is more like a traditional database that is meant to ‘run reports faster’ by leveraging the large amount of RAM on the servers. It is great for some analytical use cases, e.g. faster reporting and “after the fact analysis”.
Compared to other in-memory products (i.e. “real data grids”) such as TIBCO ActiveSpaces and the other products mentioned above, SAP HANA lacks several features such as implicit eventing (publish / subscribe) or deployment with flexible elasticity on commodity hardware. Of course, you can implement custom logic on SAP HANA with JavaScript or a proprietary SQL-like language (SQLScript). However, building several of the use cases in my presentation below is much more difficult with SAP HANA than with other “real data grid” products.
Be aware: I am not saying that SAP HANA is a bad product. It simply serves different use cases than in-memory data grids such as TIBCO ActiveSpaces! For example, SAP HANA is great for replacing Oracle RAC as the database backend of SAP ERP systems to speed them up and improve the user experience.
Real World Use Cases and Success Stories for In-Memory Data Grids
My talk was not meant to be very technical. Instead, I discussed several different real world use cases and success stories for using in-memory data grids. Here is the abstract for my talk:
NoSQL is not just about different storage alternatives such as document stores, key-value stores, graph databases or column-based databases. The hardware is also getting much more important. Besides common disks and SSDs, enterprises are beginning to use in-memory storage more and more because a distributed in-memory data grid provides very fast data access and updates. While its performance will vary depending on multiple factors, it is not uncommon to be 100 times faster than corresponding database implementations. For this reason and others described in this session, in-memory computing is a great solution for lifting the burden of big data, reducing reliance on costly transactional systems, and building highly scalable, fault-tolerant applications. The session begins with a short introduction to in-memory computing. Afterwards, different frameworks and product alternatives are discussed for implementing in-memory solutions. Finally, the main part of this session shows several different real world use cases where in-memory computing delivers business value by supercharging the infrastructure.
The slide deck is available on SlideShare (www.slideshare.net).
As always, I appreciate any feedback. Please post a comment or contact me via email, Twitter, LinkedIn or Xing.