eigenflow's People

Contributors

dmitri-carpov, jonas, sheenaluu, wadewaldron

eigenflow's Issues

Flesh out README

Motivation

To better communicate what eigenflow is intended for and good at, we need more documentation that provides example use cases and highlights comparable tools.

Input

Current README.md

Output

Updated README.md

Test

  • Clear specification of the problem eigenflow is designed to handle
  • [ ] Comparison and contrast between eigenflow and at least two other similar things
  • Clear specification of what eigenflow is not intended for

Eigenflow hangs if error is thrown during EigenflowBootstrap initialization

Motivation
If an error is thrown during eigenflow initialization, the actor system is not shut down properly and the entire application remains stuck. No error is logged to the main job log, but a stacktrace can be found in the stderr log (see https://issues.ypg.com/browse/MPN-2889).
This causes several problems:

  1. The job does not restart automatically as scheduled the next day, so manual intervention is required to get the job running again.
  2. Because the failure happens so early, no useful information is logged in the main job logs. We have to dig around to find the stderr logs to see the stacktrace.
  3. Also because the failure happens so early, no notifications are generated.

Input
An error thrown during EigenflowBootstrap initialization; for an example, see the stacktrace in MPN-2889.

Output
Application shuts down.

Test
Throw an error during EigenflowBootstrap initialization.
Use ps to verify that the application is no longer running.
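
The expected behaviour can be sketched without reference to the real bootstrap code. The example below is a minimal illustration only: `guarded` and its `shutdown` callback are hypothetical names, not part of EigenflowBootstrap, and the sketch assumes that whatever resource keeps the JVM alive (for instance the actor system) can be stopped from that callback.

```scala
object BootstrapGuardSketch {
  // Hypothetical helper: runs an initialization block and, if anything is
  // thrown, invokes the supplied shutdown callback before reporting the error.
  // In a real application the callback would stop whatever keeps the JVM alive.
  def guarded[A](init: => A)(shutdown: () => Unit): Either[Throwable, A] =
    try Right(init)
    catch {
      // The issue speaks of "errors", so Throwable is caught here on purpose.
      case e: Throwable =>
        shutdown()
        Left(e)
    }
}
```

A caller could log the `Left` value to the main job log and exit with a non-zero status, which would also address the missing log entry described above.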

Events published to message bus should include "processId" inside the event.

Right now, when a message is published to the message bus, the id for the process is embedded as part of the topic name. This means that on the consumer end, if we want the process id, we have to reverse engineer the topic name.

It would be better if that information was included as part of the event so that we could extract it in the consumers.
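
As a rough illustration of what this buys the consumer, the sketch below assumes a hypothetical `Event` shape in which only the `processId` field comes from this issue; `payload` and `byProcess` are made up for the example.

```scala
object ProcessIdInEventSketch {
  // Hypothetical event shape: only processId is taken from the issue text.
  final case class Event(processId: String, payload: String)

  // Consumer side: group payloads by process without parsing any topic name.
  def byProcess(events: Seq[Event]): Map[String, Seq[String]] =
    events.groupBy(_.processId).map { case (id, es) => id -> es.map(_.payload) }
}
```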

EigenFlow Kafka Publisher should use a single topic per message type

The EigenFlow platform currently publishes to a separate topic per job. This means consumers need to know in advance what jobs exist, which is not ideal.

Instead it could publish to a single topic per event type (RunEvent, StageEvent, MetricsEvent). The events would then include the Job Id. This would mean consumers would not need to be aware of the different job types in advance.
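
One possible shape for this, as a sketch under assumptions: the event type names come from the text above, while the fields other than `jobId`, the `topicFor` function and the topic names themselves are hypothetical.

```scala
object TopicPerEventTypeSketch {
  // Every event carries the job id, so the topic no longer has to encode it.
  sealed trait Event { def jobId: String }
  final case class RunEvent(jobId: String, state: String) extends Event
  final case class StageEvent(jobId: String, stage: String) extends Event
  final case class MetricsEvent(jobId: String, metrics: Map[String, Double]) extends Event

  // Hypothetical topic names: one fixed topic per event type.
  def topicFor(event: Event): String = event match {
    case _: RunEvent     => "eigenflow.run-events"
    case _: StageEvent   => "eigenflow.stage-events"
    case _: MetricsEvent => "eigenflow.metrics-events"
  }
}
```

With this layout a consumer subscribes to three fixed topics and filters on `jobId` if it only cares about some jobs.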

Implement exponential backoff retry strategy

  • Motivation
  • Input
    • The com.mediative.eigenflow.domain.Retry type which implements potential retry strategies
  • Output
    • The above type with the option to specify an exponential backoff
  • Test
    • A sample (dummy) job implementing some trivial failing action and demonstrating that an exponential backoff strategy was followed.
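
For discussion, the delay calculation behind an exponential backoff could look like the sketch below. It is not the com.mediative.eigenflow.domain.Retry API: `ExponentialBackoff`, its parameters and `delayFor` are hypothetical names chosen for this example.

```scala
import scala.concurrent.duration._

object BackoffSketch {
  // Hypothetical strategy: the delay grows by `factor` on each attempt,
  // is capped at `maxDelay`, and retries stop after `maxAttempts`.
  final case class ExponentialBackoff(
      initial: FiniteDuration,
      factor: Double,
      maxDelay: FiniteDuration,
      maxAttempts: Int) {

    // Delay to wait before the given retry attempt (1-based),
    // or None when no further retry should be made.
    def delayFor(attempt: Int): Option[FiniteDuration] =
      if (attempt < 1 || attempt > maxAttempts) None
      else {
        val uncapped = initial.toMillis * math.pow(factor, (attempt - 1).toDouble)
        Some(math.min(uncapped, maxDelay.toMillis.toDouble).toLong.millis)
      }
  }
}
```

A dummy failing job, as asked for in the test above, could record the delays it was given and compare them with `delayFor`.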

Rename process.id to process.name

Currently both ProcessType and Process are used, and the processId field in event messages conflicts with process.id in the configuration. We need more consistent language.

Actor supervision strategy

This issue tracks a discussion of the actor supervision strategy. When an actor fails at runtime, no strategy is defined for handling the failure, which leads to a "hung" process.

Copied from a PR discussion:


there may be scenarios where recovery doesn't complete (due to failures encountered etc). this leads to two questions which need to be dealt with in the context of #14 and would need to be resolved before this PR in its current form can be merged (since its merge will close #14). if needed, the commit could be reworded if we don't want to wait for the questions below before merging this PR (resolving questions will take time). the questions are:

  • what are the scenarios (failure or otherwise) that would lead to onRecoveryCompleted not firing
  • how should we behave in such scenarios? should we still run the job from the date in question obeying the reset flag?
  • updates to the code implementing any tweaks identified above

In my opinion, the strategy for dealing with actor failures has a wider scope than the current bugfix. Therefore, I suggest opening another issue to continue this discussion.

Migrate processDate to java.time.LocalDateTime

Motivation

Having access to the java.time API for the processingDate without additional date conversions.

Input

processingDate of java.util.Date type.

Output

  • processingDate of java.time.LocalDateTime type.
  • Example code in the "Time Management" chapter updated - #16 (comment)

Test

val aStage = AnyStage withContext { context: ProcessContext =>
  context.processingDate // check if this is actually java.time.LocalDateTime
}
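
For reference, the conversions that this migration would make unnecessary in job code can be written with the standard library alone. The helper names below are hypothetical, and passing the time zone explicitly is an assumption of this sketch rather than something the issue specifies.

```scala
import java.time.{LocalDateTime, ZoneId}
import java.util.Date

object DateMigrationSketch {
  // java.util.Date is an instant; a zone is needed to read it as a local date-time.
  def toLocalDateTime(date: Date, zone: ZoneId): LocalDateTime =
    LocalDateTime.ofInstant(date.toInstant(), zone)

  // The reverse direction, for code that still expects java.util.Date.
  def toDate(dateTime: LocalDateTime, zone: ZoneId): Date =
    Date.from(dateTime.atZone(zone).toInstant())
}
```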

`StagedProcess#nextProcessingDate` does not currently have a sensible default

In the majority of cases, we just want it to be the start of the next day. This should save a lot of boilerplate in jobs derived from StagedProcess. One way to do this using only Java standard date/time APIs (extract from REPL):

@ import java.util._ 
import java.util._

@ new Date() 
res1: Date = Thu Dec 08 14:12:54 EST 2016

@ Calendar.getInstance 
res3: Calendar = java.util.GregorianCalendar[time=1481224422011,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Toronto",offset=-18000000,dstSavings=3600000,useDaylight=true,transitions=231,lastRule=java.util.SimpleTimeZone[id=America/Toronto,offset=-18000000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2016,MONTH=11,WEEK_OF_YEAR=50,WEEK_OF_MONTH=2,DAY_OF_MONTH=8,DAY_OF_YEAR=343,DAY_OF_WEEK=5,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=2,HOUR_OF_DAY=14,MINUTE=13,SECOND=42,MILLISECOND=11,ZONE_OFFSET=-18000000,DST_OFFSET=0]

@ res3 setTime res1 

@ res3.set(Calendar.AM_PM, 0) 

@ res3.set(Calendar.DST_OFFSET, 0) 

@ res3.set(Calendar.HOUR, 0) 

@ res3.set(Calendar.HOUR_OF_DAY, 0) 

@ res3.set(Calendar.MILLISECOND, 0) 

@ res3.set(Calendar.MINUTE, 0) 

@ res3.set(Calendar.MINUTE, 0) 

@ res3.set(Calendar.SECOND, 0) 

@ res3.set(Calendar.ZONE_OFFSET, 0) 

@ res3.add(Calendar.DATE, 1) 

@ res3.getTime 
res15: Date = Fri Dec 09 00:00:00 EST 2016
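
If processingDate moves to java.time.LocalDateTime, the same default becomes much shorter. This is a sketch only: `startOfNextDay` is a hypothetical helper, not an existing StagedProcess member.

```scala
import java.time.LocalDateTime

object NextDaySketch {
  // Midnight at the start of the day after `current`.
  def startOfNextDay(current: LocalDateTime): LocalDateTime =
    current.toLocalDate().plusDays(1).atStartOfDay()
}
```

For the timestamp used in the REPL session, `startOfNextDay(LocalDateTime.of(2016, 12, 8, 14, 12, 54))` yields 2016-12-09T00:00.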

EigenFlow should support a feature to terminate a job early

EigenFlow should have a feature that allows a job to terminate early. This feature would allow a job to determine that it has no need to continue through the later phases (for example in a double load scenario). In this case the job should be able to specify that rather than failing we just want to terminate the run early, and then the next run can continue as normal.
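
One way to model this, purely as a discussion aid, is to let each stage report an outcome that distinguishes "stop here, nothing is wrong" from a failure. All names in the sketch below (`StageOutcome`, `TerminateEarly`, `RunResult`, `run`) are hypothetical and do not reflect the existing eigenflow DSL.

```scala
object EarlyTerminationSketch {
  // What a single stage can report back.
  sealed trait StageOutcome
  case object Continue extends StageOutcome
  final case class TerminateEarly(reason: String) extends StageOutcome
  final case class Failed(error: Throwable) extends StageOutcome

  // What the whole run ends up as.
  sealed trait RunResult
  case object Completed extends RunResult
  final case class EndedEarly(reason: String) extends RunResult
  final case class RunFailed(error: Throwable) extends RunResult

  // Runs stages in order; an early termination skips the remaining stages
  // but is reported separately from a failure, so the next run starts normally.
  @annotation.tailrec
  def run(stages: List[() => StageOutcome]): RunResult = stages match {
    case Nil => Completed
    case stage :: rest =>
      stage() match {
        case Continue               => run(rest)
        case TerminateEarly(reason) => EndedEarly(reason)
        case Failed(error)          => RunFailed(error)
      }
  }
}
```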

Add sequenceId to messages

Motivation
Sometimes stage switches happen within one millisecond, which generates two event messages with the same timestamp. The platform should provide a sequenceId so that the order of events can be restored.

Input
Current messages without sequenceId

Output
Messages with sequenceId

Test

  • Check if unit tests cover that
  • Run test job and check sequenceId in printed messages.
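
A minimal sketch of the idea, assuming a hypothetical `Message` shape and a per-publisher counter (the issue does not say whether the sequence should be scoped per run, per job or globally):

```scala
import java.util.concurrent.atomic.AtomicLong

object SequenceIdSketch {
  // Hypothetical message shape: only sequenceId is taken from the issue text.
  final case class Message(timestamp: Long, sequenceId: Long, body: String)

  // Hands out strictly increasing sequence ids, independent of the clock.
  final class Sequencer {
    private val counter = new AtomicLong(0L)
    def stamp(timestamp: Long, body: String): Message =
      Message(timestamp, counter.incrementAndGet(), body)
  }

  // Consumer side: the sequenceId breaks ties between equal timestamps.
  def inOrder(messages: Seq[Message]): Seq[Message] =
    messages.sortBy(m => (m.timestamp, m.sequenceId))
}
```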

Process stages can't serialise `(): Unit`

Process stages can't serialise the (): Unit value. E.g. if we have

case object Test extends ProcessStage
val test = Test { _: String => Future(()) }

We get an error:

No implicit view available from Unit => String.

Solution: add the suggested implicit view?
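
A sketch of what such a view could look like, under stated assumptions: `serialise` is a hypothetical stand-in for whatever eigenflow uses internally to require an `A => String` view, and mapping `()` to the empty string is an arbitrary choice made for this example.

```scala
object UnitViewSketch {
  // Hypothetical stand-in for an API that needs to turn a stage result into a String.
  def serialise[A](value: A)(implicit view: A => String): String = view(value)

  // The suggested implicit view: Unit carries no information,
  // so it is mapped to an empty string here (an assumption of this sketch).
  implicit val unitToString: Unit => String = _ => ""
}
```

In Scala 2 an implicit function value like this also acts as an implicit conversion wherever it is in scope, so it is probably best kept close to the code that needs it.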

Bug: Start date override starts a run from the last failed stage.

Motivation: Eigenflow should provide a way to run a process with a specific start date, ignoring any run history.

Input: When the start date override is set the current behaviour is the process runs from the last failed stage.

Output: When the start date override is set the process should run from the beginning, ignoring the last failed stage.

Test:

  1. Set the start date override, run the process, cause a failure mid-run (via kill -9 if necessary).
  2. With the same start date override, run the process again. The logs should show that the process started from the beginning instead of from the last failed stage.

Nomenclature change

Motivation
Use common terms across the platform to avoid confusion caused by the ambiguous terms process, process run, process id and process name.

Input
Currently used:
Process - for the whole workflow including all runs of all processing dates
Process Run - for a run of one specific processing date
? - JVM execution, which may contain multiple Process Runs; sometimes one Process Run may be completed across multiple executions
Process Id - Id of the Process Run
Process Id - The id used to identify the Process - confusion!

Output
The following rename of terms:

Process -> Job
Process Run -> Job Instance
? -> Job Execution
Process Id for Process Run -> Instance Id
Process Id for Process -> Job Id

Generic Messages Support

Motivation
Allow sending String messages to the eigenflow message queue.

Input
The current implementation allows sending only metrics messages (Map[String, Double]).

Output

  • DSL or function for sending generic (String) messages
  • Documentation on how to use it

Test

  • Review unit tests.
  • Check documentation
  • Test functionality using a testjob
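
As a starting point for the design, the two kinds of message could share a common type. Everything in this sketch is hypothetical (`Message`, `MetricsMessage`, `GenericMessage`, `render`); only the `Map[String, Double]` and `String` payload types come from the issue.

```scala
object GenericMessageSketch {
  sealed trait Message
  // The existing kind of message: named numeric metrics.
  final case class MetricsMessage(values: Map[String, Double]) extends Message
  // The proposed kind of message: free-form text.
  final case class GenericMessage(text: String) extends Message

  // Hypothetical rendering to the string sent to the queue;
  // metrics are sorted by name so the output is deterministic.
  def render(message: Message): String = message match {
    case MetricsMessage(values) =>
      values.toSeq.sortBy(_._1).map { case (name, value) => s"$name=$value" }.mkString(",")
    case GenericMessage(text) => text
  }
}
```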

Exclude slf4j binding dependencies.

Motivation
Remove the log4j and slf4j-log4j12 transitive dependencies pulled in through kafka

Input
Eigenflow implicitly depends on slf4j bindings, which may cause logging issues.

Output
No more slf4j binding dependencies
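
In an sbt build, exclusions of this kind are typically written on the dependency itself. The fragment below is a sketch under assumptions: the exact kafka artifact eigenflow depends on and its version are not stated in this issue, so `"kafka"` and `kafkaVersion` are placeholders.

```scala
// build.sbt fragment (assumed coordinates; adjust to the actual kafka dependency)
val kafkaVersion = "x.y.z" // hypothetical placeholder, not the real version

libraryDependencies += ("org.apache.kafka" %% "kafka" % kafkaVersion)
  .exclude("log4j", "log4j")             // the logging backend itself
  .exclude("org.slf4j", "slf4j-log4j12") // the slf4j binding to log4j
```

Applications using eigenflow would then choose their own slf4j binding.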
