Friday, November 26, 2010

In-Memory Data Grid (IMDG) for Linear Horizontal Scalability & Extreme Transaction Processing (XTP)

IMDG products handle transactions in memory (so they are faster) & let you build a data grid (so they scale linearly) to meet extreme transaction processing (XTP) needs.
Both commercial & open-source products are available in the market, but before deciding on a product, or even on IMDG as a technology, I would recommend considering the following design/architectural points first:
1. Your needs: First & foremost, as with any other solution, this is the most important factor in deciding whether you need such a product at all. Licenses for commercial products can cost a lot (~$2,500 per processor for an enterprise edition), so the cost-benefit needs to be assessed first. I don’t see the need for an IMDG if there is no XTP requirement (e.g. it is not needed below 200 TPS).
2. Parallel Computing:
A distributed grid can offer processing ability similar to a mainframe by utilizing the cumulative capacity of the nodes in the grid. Processing can be seamlessly distributed across the available nodes, facilitating “parallel query” execution for faster response.
3. Caching Needs:
IMDG products can fulfil all caching needs and support all the common cache types, e.g. distributed cache, replicated cache, partitioned cache, and local cache with a distributed cache as backup. But if you have only caching needs, you are better off with dedicated caching products (see the end of the article).
4. Events based Processing Needs:
IMDG products support Complex Event Processing (CEP) based business architectures & can consume large numbers of events in a scalable way.
5. High Availability (Failover support):
Failure of any node does not impact the rest of the cluster, and as soon as the failed node comes back, it starts contributing again seamlessly (without any configuration change or manual effort). Also, real-time configuration changes (e.g. changing cache high units) or product upgrades are possible without any downtime.
6. Scalability:
If you need to add more nodes to your grid, this is seamless and has no impact on the existing grid. Most IMDG products are designed to be “linearly scalable” so they can take full advantage of the added capacity.
7. In-memory Database (IMDB) Support:
An IMDG can also keep the entire database in memory for faster response, throughput & performance. All transactions can happen in memory and be persisted asynchronously to the database during non-peak hours.
8. Monitoring & Management:
Some products offer great real-time monitoring & management capabilities (including JMX support), which are very handy for troubleshooting or for finding bottlenecks to improve.
9. In-line with Cloud Computing:
With cloud computing as the future, this becomes more important, as an IMDG can offer “data as a service” or “data virtualization”.
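
To make points 2, 3, 5 and 6 a little more concrete, here is a minimal Python sketch of a partitioned cache with one backup copy per entry, a scatter-gather "parallel query", and node failure/join handling. It is a toy under stated assumptions, not how any of the products listed below actually work: the class and method names (Node, Grid, fail, join, query) are hypothetical, the "nodes" are plain in-process objects, and rebalancing simply re-inserts every entry, whereas real products use partition tables or consistent hashing to move as little data as possible.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


class Node:
    """One grid member holding primary and backup copies of entries."""

    def __init__(self, name):
        self.name = name
        self.primary = {}
        self.backup = {}


class Grid:
    """Toy partitioned cache: each key has one primary owner and one backup."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def _owners(self, key):
        # Stable hash, so a key always maps to the same pair of nodes
        # for a given membership.
        i = zlib.crc32(key.encode()) % len(self.nodes)
        return self.nodes[i], self.nodes[(i + 1) % len(self.nodes)]

    def put(self, key, value):
        owner, backup = self._owners(key)
        owner.primary[key] = value
        backup.backup[key] = value

    def get(self, key):
        owner, _ = self._owners(key)
        return owner.primary.get(key)

    def query(self, predicate):
        # Scatter: every node scans only its own primary partition.
        def scan(node):
            return [(k, v) for k, v in node.primary.items() if predicate(v)]

        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            parts = pool.map(scan, self.nodes)
        # Gather: merge the partial results into one sorted list.
        return sorted(kv for part in parts for kv in part)

    def fail(self, name):
        # Drop the failed node; its entries survive as copies on other nodes.
        self._rebalance([n for n in self.nodes if n.name != name])

    def join(self, name):
        # Add a node and spread the existing entries over the larger grid.
        self._rebalance(self.nodes + [Node(name)])

    def _rebalance(self, members):
        # Naive rebalance (assumption): collect every surviving copy,
        # then re-insert everything under the new membership.
        entries = {}
        for n in members:
            entries.update(n.backup)
            entries.update(n.primary)
        for n in members:
            n.primary.clear()
            n.backup.clear()
        self.nodes = members
        for key, value in entries.items():
            self.put(key, value)


grid = Grid(["n1", "n2", "n3"])
for i in range(100):
    grid.put(f"order-{i}", i)
grid.fail("n2")   # a single node failure loses no data
grid.join("n4")   # new capacity is picked up without reloading
print(grid.query(lambda v: v >= 98))
```

Because every entry lives on two different nodes, a single failure never removes both copies, which is the property point 5 relies on. The query fans out one scan per node, so in a real grid (with nodes on separate machines rather than threads in one process) adding nodes adds scan capacity.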
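
The asynchronous persistence mentioned in point 7 is usually called "write-behind". A minimal Python sketch of the idea follows, assuming a plain dict stands in for the database and a background thread drains a queue of pending writes. The names WriteBehindCache and flush are hypothetical, and a real product would typically batch, coalesce, or delay these writes (for example until non-peak hours) rather than write them one at a time as soon as possible.

```python
import queue
import threading


class WriteBehindCache:
    """Serve reads and writes from memory; persist to the store asynchronously."""

    def __init__(self, store):
        self._data = {}
        self._store = store              # stands in for the database (assumption)
        self._pending = queue.Queue()    # writes not yet persisted
        self._lock = threading.Lock()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def put(self, key, value):
        # The caller returns as soon as memory is updated; the database
        # write is only queued here. Queuing under the lock keeps the
        # persisted order consistent with the in-memory order.
        with self._lock:
            self._data[key] = value
            self._pending.put((key, value))

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def _drain(self):
        # Background writer: apply queued writes to the store in order.
        while True:
            key, value = self._pending.get()
            self._store[key] = value
            self._pending.task_done()

    def flush(self):
        # Block until every queued write has reached the store.
        self._pending.join()


database = {}
cache = WriteBehindCache(database)
cache.put("balance:42", 100)
cache.put("balance:42", 75)
print(cache.get("balance:42"))   # served from memory immediately
cache.flush()
print(database)
```

The trade-off is that writes still sitting in the queue are lost if the process dies before they are persisted, which is why write-behind is normally combined with the backup copies described in point 5.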

Commercial Products:
Oracle Coherence (earlier known as Tangosol Coherence), Gigaspaces XAP, IBM WebSphere eXtreme Scale (WXS), Tibco ActiveSpaces (recently launched), ScaleOut StateServer

Open-source Products: 
Terracotta, JBoss Infinispan, Hazelcast

Other distributed caching solutions are also available. In my opinion they do not offer the full set of IMDG capabilities, but if you have only caching needs they are worth considering (though outside the scope of this discussion):
NCache (distributed caching for .NET only)
Apache JCS, Terracotta Ehcache, OpenSymphony OSCache

Disclaimer:
All data and information provided on this site is for informational purposes only. This site makes no representations as to accuracy, completeness, correctness, suitability, or validity of any information on this site and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis. This is a personal weblog. The opinions expressed here represent my own and not those of my employer or any other organization.