
rbmhtechnology / eventuate


Global-scale event sourcing and event collaboration with causal consistency. (This project is in maintenance mode: only critical bugs will be fixed; there is no further feature development.)

Home Page: http://rbmhtechnology.github.io/eventuate/

License: Apache License 2.0

Scala 88.00% Shell 0.35% Python 0.57% HTML 1.30% CSS 0.05% JavaScript 0.37% Java 9.35%

eventuate's Introduction


Eventuate

Please note: This project has been stopped; no bug fixes, new features, etc. will be provided.

Eventuate is a toolkit for building applications composed of event-driven and event-sourced services that communicate via causally ordered event streams. Services can either be co-located on a single node or distributed up to global scale. Services can also be replicated with causal consistency and remain available for writes during network partitions.

Eventuate has a Java and Scala API, is written in Scala and is built on top of Akka, a toolkit for building highly concurrent, distributed, and resilient message-driven applications on the JVM. Eventuate

  • provides abstractions for building stateful event-sourced services, persistent and in-memory query databases and event processing pipelines
  • enables services to communicate over a reliable and partition-tolerant event bus with causal event ordering and distribution up to global scale
  • supports stateful service replication with causal consistency and concurrent state updates with automated and interactive conflict resolution
  • provides implementations of operation-based CRDTs as specified in A comprehensive study of Convergent and Commutative Replicated Data Types
  • supports the development of always-on applications by allowing services to be distributed across multiple availability zones (locations)
  • supports the implementation of reliable business processes from event-driven and command-driven service interactions
  • supports the aggregation of events from distributed services for updating query databases
  • provides adapters to 3rd-party stream processing frameworks for analyzing event streams
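
For illustration, here is a minimal sketch of the Scala API mentioned above: an event-sourced actor backed by a LevelDB event log. The Append/Appended messages and actor names are placeholders and not part of this README; consult the project documentation for the authoritative API.

import akka.actor.{ActorRef, ActorSystem, Props, Status}
import com.rbmhtechnology.eventuate.EventsourcedActor
import com.rbmhtechnology.eventuate.log.leveldb.LeveldbEventLog

import scala.util.{Failure, Success}

// Hypothetical commands and events of a minimal event-sourced service.
case class Append(entry: String)
case class Appended(entry: String)
case object Print

class ExampleActor(override val id: String, override val eventLog: ActorRef)
  extends EventsourcedActor {

  private var entries: Vector[String] = Vector.empty

  // Command handler: validates commands and persists corresponding events.
  override def onCommand: Receive = {
    case Append(entry) => persist(Appended(entry)) {
      case Success(_)   => sender() ! "ok"
      case Failure(err) => sender() ! Status.Failure(err)
    }
    case Print => println(s"[$id] ${entries.mkString(", ")}")
  }

  // Event handler: updates in-memory state from persisted and replicated events.
  override def onEvent: Receive = {
    case Appended(entry) => entries = entries :+ entry
  }
}

object Main extends App {
  implicit val system: ActorSystem = ActorSystem("location-1")
  val eventLog = system.actorOf(LeveldbEventLog.props("example-log"))
  val actor = system.actorOf(Props(new ExampleActor("example-1", eventLog)))
  actor ! Append("a")
  actor ! Print
}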

Documentation

Project

Community

eventuate's People

Contributors

agido-hundt, agido-jrieks, agido-schallaboeck, danbim, dispalt, eventuate-test, fijolekprojects, gabrielgiussi, ianclegg, kongo2002, krasserm, maekl, magro, mslinn, odd, purthaler, raboof, tvaroh, volkerstampa


eventuate's Issues

One event-sourced actor per CRDT instance

At the moment, there is one EventsourcedActor per CRDT type. While this is sufficient for CRDTs that don't require precondition checks atSource, it limits write throughput for those that do require these checks. In the latter case, the EventsourcedActor must have stateSync set to true, which effectively turns off write batching for a single actor.

By distributing CRDT instances of a given type across actors, their concurrent writes can be batched again by BatchingEventLog, increasing write throughput. The cost is more memory overhead per CRDT instance. Ideally, the user can configure whether to use the one-actor-per-CRDT-instance or the (current) one-actor-per-CRDT-type approach.

Another advantage of having one EventsourcedActor per CRDT instance is that instances can be loaded into and discarded from memory on demand. With a reasonably small number of updates per CRDT instance (e.g. < 500), snapshots may not be necessary either.
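
A possible shape of the one-actor-per-CRDT-instance approach, sketched with hypothetical names (not part of this issue): a manager lazily creates one event-sourced child actor per CRDT id, so instances can be loaded on demand and stopped again when idle. Each child is assumed to use its CRDT id as aggregateId so that it only replays the events of its own instance.

import akka.actor.{Actor, ActorRef, Props}

// Commands are addressed to a CRDT instance by id (hypothetical message type).
case class CrdtCommand(crdtId: String, op: Any)

// instanceProps builds the per-instance EventsourcedActor from the CRDT id
// and the shared event log.
class CrdtManager(eventLog: ActorRef, instanceProps: (String, ActorRef) => Props) extends Actor {
  private var instances: Map[String, ActorRef] = Map.empty

  def receive: Receive = {
    case CrdtCommand(crdtId, op) =>
      // Load the instance actor on demand and cache it.
      val instance = instances.getOrElse(crdtId, {
        val ref = context.actorOf(instanceProps(crdtId, eventLog), crdtId)
        instances += crdtId -> ref
        ref
      })
      instance forward op
  }
}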

Scale writes to a single replicated aggregate

This can be achieved by allowing an EventsourcedActor (= aggregate) to consume from multiple logs and write to only one of these logs. Aggregate replicas with the same aggregate id write to different logs but consume from all of them, so that distributed writes to the same aggregate id don't need to be merged into a single log. For example, with three (optionally replicated) logs A, B and C, and three aggregate replicas r1, r2 and r3:

  • r1 writes to A and reads from A, B and C
  • r2 writes to B and reads from A, B and C
  • r3 writes to C and reads from A, B and C

This means that the replicas need to be recovered from three logs (instead of one) and deterministic (= totally ordered) replay is no longer possible. However, by re-ordering concurrently replayed events based on their vector timestamps, causal ordering can still be achieved.
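
The re-ordering step can be sketched independently of Eventuate's replay machinery (names are hypothetical): sorting replayed events by the component sum of their vector timestamps yields a linear extension of the happened-before relation, which is sufficient for causal ordering.

object CausalReplay {
  // An event replayed from one of the logs, carrying its vector timestamp.
  case class ReplayedEvent(payload: Any, vectorTimestamp: Map[String, Long])

  // If e1 causally precedes e2, every entry of e1's timestamp is <= the
  // corresponding entry of e2's and at least one is strictly smaller, so
  // e1's component sum is strictly smaller and e1 is ordered first.
  // Concurrent events are ordered arbitrarily but consistently.
  def causalOrder(events: Seq[ReplayedEvent]): Seq[ReplayedEvent] =
    events.sortBy(_.vectorTimestamp.values.sum)

  def main(args: Array[String]): Unit = {
    val e1 = ReplayedEvent("e1", Map("A" -> 1L))            // written to log A
    val e2 = ReplayedEvent("e2", Map("A" -> 1L, "B" -> 1L)) // causally after e1, written to log B
    val e3 = ReplayedEvent("e3", Map("C" -> 1L))            // concurrent, written to log C
    val ordered = causalOrder(Seq(e2, e3, e1))
    assert(ordered.indexOf(e1) < ordered.indexOf(e2))       // e1 comes before e2
  }
}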

Topic-based event routing

#45 introduces basic routing capabilities. This should be extended to allow

  • producers to publish events to application-defined topics and
  • consumers to consume events from application-defined topics

Topic-based event routing must preserve causality of events and should be possible for event routing between actors that share a single, distributed log. Routing across different logs can be done with processors.

Furthermore, it should be investigated whether routing decisions must be made persistent, so that past routings are repeatable. On the other hand, a changed routing rule is somewhat equivalent to changed event handler logic, which isn't persistent either.
#130 could be used as an implementation basis.
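
One purely hypothetical implementation direction, in line with using the existing routing capabilities as a basis (none of the names below come from this issue): a topic could be encoded as a synthetic routing destination that producers attach to persisted events (compare the custom event routing issue below) and that consumers use as their aggregateId, so that default aggregate-id routing delivers exactly the topic's events.

import akka.actor.ActorRef
import com.rbmhtechnology.eventuate.EventsourcedView

// Encode a topic as a synthetic routing destination id.
object Topics {
  def destination(topic: String): String = s"topic:$topic"
}

case class Published(payload: String)

// A consumer subscribes to a topic by using its destination id as aggregateId.
class TopicConsumer(topic: String, override val id: String, override val eventLog: ActorRef)
  extends EventsourcedView {

  override val aggregateId: Option[String] = Some(Topics.destination(topic))

  override def onCommand: Receive = {
    case _ => () // no commands in this sketch
  }

  override def onEvent: Receive = {
    case Published(payload) => println(s"[$topic] $payload")
  }
}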

Batch update notifications

Update notifications from the event log to replication servers are too fine-grained at the moment. Update notifications should be sent in larger batches, especially when load is high. Under low and moderate load, update notifications should dynamically become more fine-grained again. This should significantly reduce the number of update notifications sent from a replication server to replication clients over the wire.

EventsourcedActorIntegrationSpec failure

[info] Eventsourced actors and views
[info] - must support conditional command processing *** FAILED ***
[info]   java.lang.AssertionError: assertion failed: expected a, found delayed-2
[info]   at scala.Predef$.assert(Predef.scala:165)
[info]   at akka.testkit.TestKitBase$class.expectMsg_internal(TestKit.scala:339)
[info]   at akka.testkit.TestKitBase$class.expectMsg(TestKit.scala:315)
[info]   at akka.testkit.TestKit.expectMsg(TestKit.scala:718)
[info]   at com.rbmhtechnology.eventuate.EventsourcedActorIntegrationSpec$$anonfun$2$$anonfun$apply$mcV$sp$8.apply$mcV$sp(EventsourcedActorIntegrationSpec.scala:237)
[info]   at com.rbmhtechnology.eventuate.EventsourcedActorIntegrationSpec$$anonfun$2$$anonfun$apply$mcV$sp$8.apply(EventsourcedActorIntegrationSpec.scala:214)
[info]   at com.rbmhtechnology.eventuate.EventsourcedActorIntegrationSpec$$anonfun$2$$anonfun$apply$mcV$sp$8.apply(EventsourcedActorIntegrationSpec.scala:214)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   ...

One aggregate per actor pattern

Efficiently support the pattern of one DDD aggregate per EventsourcedActor. The example application still follows the many-aggregates-per-actor pattern but has already been re-written by @remcobeckers to follow the one-aggregate-per-actor pattern (see also this discussion). Eventuate should support efficient implementations of this pattern by providing the following enhancements:

  • the event log must be indexed by aggregate id for efficient recovery of individual aggregates
  • if an aggregate actor writes an event, the event log should only notify those aggregate actor replicas with the same aggregate id (and not all other actors subscribed to the event log). Nevertheless, EventsourcedViews or non-aggregate EventsourcedActors continue to receive all events written to the event log (to support the one view for many actors pattern, for example).
  • implement EventsourcedAggregate as a specialization of EventsourcedActor to support implementing the one-aggregate-per-actor pattern (see the sketch after this list). Update: EventsourcedActor has an optional aggregateId.
  • consider implementing aggregate services similar to CRDT services to free applications from dealing with low-level details.
  • the one aggregate per actor prototype is already implemented in a way that vector clock sizes scale with the number of replicas only instead of the potentially large number of aggregates. Ensure that an implementation of EventsourcedAggregate cannot break this property.
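
A minimal sketch of the one-aggregate-per-actor pattern with the optional aggregateId (domain types are hypothetical): replicas sharing the same aggregateId receive each other's events, and recovery only needs the events routed to that aggregate id.

import akka.actor.ActorRef
import com.rbmhtechnology.eventuate.EventsourcedActor

case class CreateOrder(orderId: String)
case class OrderCreated(orderId: String)

class OrderActor(orderId: String, override val eventLog: ActorRef)
  extends EventsourcedActor {

  override val id: String = s"order-$orderId"
  // Routing key of this aggregate: its replicas use the same aggregateId.
  override val aggregateId: Option[String] = Some(orderId)

  private var created = false

  override def onCommand: Receive = {
    case CreateOrder(oid) if !created =>
      persist(OrderCreated(oid)) { _ => sender() ! s"order $oid created" }
  }

  override def onEvent: Receive = {
    case OrderCreated(_) => created = true
  }
}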

More detailed copyright & license header

At the moment, the copyright and license header at the top of every source file is very sparse. We should add the Red Bull Media House URL to the copyright line and also add the license text proposed by the Apache 2.0 license.

Full header that should be applied:

Copyright (C) 2015 Red Bull Media House GmbH <http://www.redbullmediahouse.com> - all rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); 
you may not use this file except in compliance with the License. 
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed 
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES 
OR CONDITIONS OF ANY KIND, either express or implied. 
See the License for the specific language governing permissions and limitations under the License.

Replication/Transport encryption

As EventsourcedActors are most likely deployed in different locations and the log is replicated over the internet, it would be very beneficial to have encryption between two replication endpoints.

Multiple logs per replication endpoint

A replication endpoint should support the replication of multiple logs. Here, each log represents a partition and causality is preserved only within a partition.

In the following example, two partitions, named "log1" and "log2", are created for the local replication endpoint and connections are established to remote replication endpoints listening at 127.0.0.1:2553 and 127.0.0.1:2554. The system expects that remote endpoints also have partitions with the same names.

// Imports assume Eventuate's eventual package layout:
import com.rbmhtechnology.eventuate.{ReplicationConnection, ReplicationEndpoint}
import com.rbmhtechnology.eventuate.log.leveldb.LeveldbEventLog

val logNames = Set("log1", "log2")
val connections: Seq[ReplicationConnection] = Seq(
  ReplicationConnection("127.0.0.1", 2553),
  ReplicationConnection("127.0.0.1", 2554))

// One local log (= partition) is created per name via the log factory;
// `system` is the hosting ActorSystem.
new ReplicationEndpoint("my-endpoint-id", logNames, id =>
  LeveldbEventLog.props(id), connections)(system)

Log partitioning is not only useful for partitioning events (for example, to support multiple tenants) but can also be used to scale writes (assuming this is also supported by the underlying storage). However, having multiple logs per replication endpoint is not a prerequisite for horizontal scaling; this can also be achieved by deploying logs (i.e. partitions) on different machines, each having its own replication endpoint, for example.

Batching layer on top of event log

For batching writes of

  • multiple event sourced actors running with sync = true
  • a single or multiple event sourced actors running with sync = false

The batching layer must be optimized for low latency, with batch sizes growing (up to a maximum) under increasing load.

The batching layer has no effect when running a single event sourced actor with sync = true, because such an actor submits its next write only after the previous write has finished.

API cleanup

  • Clean separation of public from private API
  • Drop backwards compatibility to initial prototype API

Custom event routing

After #36 is implemented, events are routed to Eventsourced actors (= destinations) based on the following default rules:

  1. if destination.aggregateId is not defined, the destination will receive all events, regardless of whether event.aggregateId is defined or not.
  2. if destination.aggregateId is defined, the destination will only receive events with matching event.aggregateId.

This shall be generalized to allow custom routing to destinations. When persisting events, EventsourcedActors should be able to specify a set of additional aggregate ids as routing destination. The default routing behavior as described above should be preserved.

An EventsourcedActor should also be able to opt out of routing events to replicas with the same aggregate id (by specifying an empty set). In this case, the only consumer of the event is the emitting EventsourcedActor. Routing of events to destinations that have no aggregate id defined cannot be overridden. These destinations must always receive all events from the event log (but can of course choose to ignore them).
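
A sketch of the proposed custom routing from the caller's perspective. The parameter name customDestinationAggregateIds is an assumption (modeled on the persist signature Eventuate later provided), and the account domain is hypothetical: the event is delivered to this actor's own replicas by default routing and additionally to the aggregates listed explicitly.

import akka.actor.ActorRef
import com.rbmhtechnology.eventuate.EventsourcedActor

case class TransferMoney(from: String, to: String, amount: BigDecimal)
case class MoneyTransferred(from: String, to: String, amount: BigDecimal)

class AccountActor(accountId: String, override val eventLog: ActorRef)
  extends EventsourcedActor {

  override val id: String = s"account-$accountId"
  override val aggregateId: Option[String] = Some(accountId)

  override def onCommand: Receive = {
    case TransferMoney(from, to, amount) =>
      // Route the event to the counterpart account in addition to the defaults.
      persist(MoneyTransferred(from, to, amount), customDestinationAggregateIds = Set(to)) { _ => () }
  }

  override def onEvent: Receive = {
    case MoneyTransferred(_, _, _) => () // balance update omitted in this sketch
  }
}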

Limit vector clock size

Vector clock size should scale with the number of actual event collaborators and not with the total number of EventsourcedActors on the same distributed log. In other words, only those events that are actually handled by an EventsourcedActor contribute to its vector clock.

CRDT write throughput optimizations

Counter and MVRegister can be operated in an EventsourcedActor that is configured with sync = false, as these CmRDTs don't need to check preconditions in update phase 1. This optimization has a significant impact on throughput.
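
A sketch of this optimization (message names are hypothetical): a counter CRDT actor that overrides stateSync so that command processing does not wait for the previous write, allowing the batching layer to group many concurrent update writes.

import akka.actor.ActorRef
import com.rbmhtechnology.eventuate.EventsourcedActor

import scala.util.Success

case class UpdateOp(delta: Long)
case class UpdatedOp(delta: Long)

class CounterCrdtActor(crdtId: String, override val eventLog: ActorRef)
  extends EventsourcedActor {

  override val id: String = s"crdt-counter-$crdtId"
  override val aggregateId: Option[String] = Some(crdtId)

  // No atSource precondition depends on up-to-date state, so there is no
  // need to synchronize command processing with event persistence.
  override def stateSync: Boolean = false

  private var value: Long = 0L

  override def onCommand: Receive = {
    case UpdateOp(d) => persist(UpdatedOp(d)) {
      case Success(_) => sender() ! value
      case _          => ()
    }
  }

  override def onEvent: Receive = {
    case UpdatedOp(d) => value += d
  }
}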
