ehcache-jms-wan-replicator

This is a customizable module that builds upon Ehcache's replication framework (specifically the JMS Replicated Caching module) and allows lighter-weight cache events to be disseminated across different data-centers in a hub/spoke pub-sub model with batching. The first implementation leverages Nevado over AWS SNS/SQS and is intended for GSLB'd applications where eventual (near-real-time) cache consistency is acceptable, and preferred over relying solely on cache TTLs.

Caveat: this has yet to be used in a production environment, but it is currently being evaluated/tested as part of a larger project where it (or something like it) might be leveraged.

[Diagram: hub/spoke pub-sub replication across multiple WAN-connected data-centers]

What does this add to Ehcache JMS replication?

  • Decorates OUTBOUND JMS messages with additional custom key-value pairs (strings)
  • Defines primitive JMS selectors to control which INBOUND replication messages are actually processed (see the sketch after this list)
  • Adds batching of events into a simplified JMS TextMessage optimized for sending hundreds of events in one message, which works much better with SNS/SQS JMS backends (avoiding Java-serialization-based ObjectMessages when possible)
  • Permits totally overriding how local Ehcache event actions are replicated. For example, a local PUT can be replicated to the other DCs as an EXPIRE, a PUT can do NOTHING, an UPDATE can yield a REMOVE, etc. Ehcache already gives you some hooks for this via replicatePuts, replicatePutsViaCopy, and replicateUpdatesViaCopy, but you may want to completely override everything, and this module lets you do that explicitly.
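
To illustrate the first two features, here is a minimal sketch of the plain-JMS mechanics they build on. This is not this module's actual API; the property name sourceDC and its values are hypothetical, but the decorate-then-filter pattern uses only standard javax.jms calls:

```java
import javax.jms.JMSException;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

public class DcFilterSketch {

    // Hypothetical property name; the module's real property keys may differ.
    static final String SOURCE_DC_PROP = "sourceDC";

    // Outbound: decorate each replication message with the identity of the
    // data-center that produced it.
    static void publish(Session session, Topic topic, String payload, String localDc)
            throws JMSException {
        MessageProducer producer = session.createProducer(topic);
        TextMessage msg = session.createTextMessage(payload);
        msg.setStringProperty(SOURCE_DC_PROP, localDc);
        producer.send(msg);
    }

    // Inbound: a JMS selector keeps the broker from delivering messages that
    // originated in our own data-center, avoiding replication loops.
    static MessageConsumer subscribe(Session session, Topic topic, String localDc)
            throws JMSException {
        String selector = SOURCE_DC_PROP + " <> '" + localDc + "'";
        return session.createConsumer(topic, selector);
    }
}
```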

What kinds of apps has this been used with?

This has been tested in a standalone dummy application environment via unit tests, and has also been configured for Liferay Portal 6.2 in a GSLB'd topology similar to the diagram below (multiple WAN-connected data-centers), where all Liferay caches are additionally bound to this cache peer provider. Enabling batching is critical for this setup.

Background

This little module came out of a need to ensure that caches across a GSLB'd (globally load balanced) application could be kept reasonably in "sync" without having to push the actual cached data around across WANs, while letting the user configure more explicitly what local event actions get translated to for cross-DC event replication. The specific need is for an app that uses Ehcache internally for caching various data points.

In the use-case that drove this little test, within any given data-center that runs the app, Ehcache is already configured to use the existing Ehcache replication facilities (RMI/JGroups) for within-DC peer node replication. Generally, for cache puts/updates, Ehcache always includes the actual cached data in its replication. Note that this is configurable via replicatePuts, replicatePutsViaCopy, replicateUpdatesViaCopy, etc., to force removes on update/put.
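
For reference, a minimal sketch of a standard within-DC Ehcache 2.x RMI replicator using those flags (cache name and sizing are placeholders, and the peer provider/discovery configuration is omitted for brevity):

```xml
<cache name="exampleCache" maxEntriesLocalHeap="10000" timeToLiveSeconds="3600">
    <!-- Within-DC replication: disabling the "via copy" flags makes puts and
         updates replicate as removes, so peers fault the latest data back in
         from the source of record on the next read. -->
    <cacheEventListenerFactory
        class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
        properties="replicateAsynchronously=true,
                    replicatePuts=true,
                    replicatePutsViaCopy=false,
                    replicateUpdates=true,
                    replicateUpdatesViaCopy=false,
                    replicateRemovals=true"/>
</cache>
```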

The point being that, for a different data-center across the world, there is really no need to send that cached data over the wire; you may even want to send nothing at all (i.e. a PUT might warrant nothing happening). Instead, you really want to be able to fully override the replication behavior per local DC action. So in the case of a local update forcing a cross-DC cache remove, just let the next request for the data trigger a cache miss and repopulate the latest data from the shared data-source (which is replicated/clustered via a totally separate process). How soon a local cache event actually triggers a remove in another DC basically comes down to the principle of eventual consistency. Everyone's individual requirements will vary, but my requirement is basically within a few minutes, which is better than forcing disparate DCs to rely solely on TTLs, which might be much, much longer.
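
As a minimal sketch of that per-action override idea (the enum names and override table below are hypothetical illustrations, not this module's actual configuration API):

```java
import java.util.EnumMap;
import java.util.Map;

public class ActionOverrideSketch {
    // Local cache events and the cross-DC actions they translate to.
    enum LocalEvent { PUT, UPDATE, REMOVE, EXPIRE }
    enum WanAction { PUT, REMOVE, EXPIRE, NOTHING }

    // Hypothetical override table: locally cheap events become
    // lightweight REMOVEs (or no-ops) in the other data-centers.
    static final Map<LocalEvent, WanAction> OVERRIDES = new EnumMap<>(LocalEvent.class);
    static {
        OVERRIDES.put(LocalEvent.PUT, WanAction.NOTHING);   // new data: remote DCs just cache-miss on first read
        OVERRIDES.put(LocalEvent.UPDATE, WanAction.REMOVE); // stale data: evict remotely, repopulate on next read
        OVERRIDES.put(LocalEvent.REMOVE, WanAction.REMOVE);
        OVERRIDES.put(LocalEvent.EXPIRE, WanAction.NOTHING);
    }
}
```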

So what does this all mean in how it relates to this chunk of code? In a nutshell, this code allows you to define separate cacheManagerPeerProviderFactories and cacheEventListenerFactories that build upon the Ehcache JMS Replication functionality, specifically adding the four features listed above, which are key to doing what is described here.


With these features you can then configure the JMS replication to ignore messages generated from the local DC (via a selector and custom replication message properties) and publish lighter-weight messages to other DCs (i.e. we only tell other DCs to REMOVE for any cache event that occurs locally).

Think of this as a second DC-to-DC cluster layer that is entirely separate from your within-DC Ehcache replication. See the diagram below.

This implementation was specifically built around using Nevado JMS, which lets us leverage AWS (SNS/SQS) for the globally available "topic" that all DCs subscribe to.
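
For orientation, a minimal sketch of bootstrapping a Nevado connection to the shared topic. The setter names follow Nevado's documented awsAccessKey/awsSecretKey properties, and the topic name here is a placeholder:

```java
import javax.jms.Connection;
import javax.jms.Session;
import javax.jms.Topic;
import org.skyscreamer.nevado.jms.NevadoConnectionFactory;

public class NevadoBootstrapSketch {
    public static void main(String[] args) throws Exception {
        // Credentials for an AWS account with SNS/SQS access.
        NevadoConnectionFactory factory = new NevadoConnectionFactory();
        factory.setAwsAccessKey(System.getenv("AWS_ACCESS_KEY"));
        factory.setAwsSecretKey(System.getenv("AWS_SECRET_KEY"));

        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        // The SNS-backed topic all data-centers subscribe to (placeholder name).
        Topic topic = session.createTopic("ehcache-wan-replication");
        connection.start();
    }
}
```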

Getting Started

  • Look at the unit test that boots up 3 separate instances of Ehcache, validates that "updates" result in "removes" on the other "DC instances", and exercises the batching behavior.
  • Look at the ehcache.xml example config file.
  • You will need an AWS account and access to SNS/SQS; define a topic there and get an accessKey/secretKey, which need to go into the peer provider configuration. See the ehcache.xml example files and the sketch below.
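
For orientation, the stock Ehcache JMS Replicated Caching module is wired up via a cacheManagerPeerProviderFactory element like the sketch below. This module ships its own factory class and extra properties (such as the AWS accessKey/secretKey), so treat the class name and property keys shown here as illustrative and copy the real values from the example ehcache.xml files:

```xml
<!-- Sketch only: this is the stock Ehcache JMS factory; this module's
     factory class and property keys will differ (see the example files). -->
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jms.JMSCacheManagerPeerProviderFactory"
    properties="initialContextFactoryName=com.example.MyJmsInitialContextFactory,
                providerURL=vm://localhost,
                topicConnectionFactoryBindingName=topicConnectionFactory,
                topicBindingName=ehcacheWanReplication"
    propertySeparator=","/>
```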

If this evolves I'll update this w/ more info.

Reference

author: bitsofinfo.g[at]gmail.com


ehcache-jms-wan-replicator's Issues

Add batching of events

Basically, SQS with its 10-message receive limit, combined with the way Nevado polls, can be way too slow when an app generates thousands of events.

Need a way to batch the events

a) JMSCachePeer or JMSCacheReplicator:
   - proxy or extend
   - add batch queueing to roll up multiple JMS messages into one message
     (i.e. ~100 at a time)

b) MessageListenerWrapper:
   - detect batch messages and explode them back into multiple JMS messages,
     then delegate to the CachePeer
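
A minimal receive-side sketch of idea (b), assuming a hypothetical batch encoding (a boolean isBatch property plus newline-delimited event records inside one TextMessage); the module's actual wire format may differ:

```java
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

public class BatchExplodingListener implements MessageListener {
    // Hypothetical property/format; the real batch encoding may differ.
    static final String IS_BATCH_PROP = "isBatch";

    private final MessageListener delegate; // e.g. the stock Ehcache JMS listener

    public BatchExplodingListener(MessageListener delegate) {
        this.delegate = delegate;
    }

    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage
                    && message.getBooleanProperty(IS_BATCH_PROP)) {
                // One TextMessage carries many newline-delimited event records;
                // hand each record on as its own logical event.
                for (String record : ((TextMessage) message).getText().split("\n")) {
                    handleSingleEvent(record);
                }
            } else {
                delegate.onMessage(message);
            }
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }

    private void handleSingleEvent(String record) {
        // Parse e.g. "action|cacheName|key" and apply it to the local cache (omitted).
    }
}
```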
