workflowfm / proter

A discrete event simulator for asynchronous prioritized processes

Home Page: http://docs.workflowfm.com/proter

License: Apache License 2.0

Scala 100.00%

business-process-management discrete-event-simulation resource-management simulation simulation-framework

proter's Issues

Resource Reservation

The Coordinator should allow the reservation of a resource even if a task does not start immediately. Some processes may need to do some preparation before the task (e.g. physically moving an asset to the resource) or to run multiple tasks back to back without interruption. In such cases, they don't want their resource to be hijacked in the meantime.

Unit tests for Simulation actor

Follow the test patterns for the Coordinator to create unit tests for the interaction with the Simulation actor.

Use SingleTaskSimulations or create dummy simulations.
Fake a Coordinator (directly or through a probe).
Send messages to the dummy simulations and see if they respond as expected.

Flow simulations (FlowSimulationActor) from #7 / #14 should be tested like this as well.

Avoid Long/Double conversions

We should maintain separate Distributions for Long and Double values.

For example Task.withDuration should (only) accept a Long:

proter/proter/src/main/scala/com/workflowfm/proter/Task.scala

Line 274 in 4124ffa

def withDuration(dur: Double): Task = copy(duration = Constant(dur))

Implicit Long to Double conversions may lose precision and are deprecated in recent Scala versions (see compiler deprecation warnings).
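One way to keep the two value types apart is sketched below. This is only an illustration under assumptions: `Distribution`, `ConstantLong`, `UniformLong`, `ConstantDouble` and `SketchTask` are hypothetical names, not the existing Proter classes, and `Random.between` requires Scala 2.13 or later.

```scala
import scala.util.Random

object DistributionSketch {
  // Hypothetical trait: a distribution that only ever produces values of type A.
  trait Distribution[A] {
    def sample(rng: Random): A
  }

  // Durations are whole time units, so they stay in Long throughout.
  final case class ConstantLong(value: Long) extends Distribution[Long] {
    def sample(rng: Random): Long = value
  }

  final case class UniformLong(min: Long, max: Long) extends Distribution[Long] {
    require(min <= max, "min must not exceed max")
    // Samples from the inclusive range [min, max] without going through Double.
    def sample(rng: Random): Long = rng.between(min, max + 1L)
  }

  // Costs are fractional, so they stay in Double throughout.
  final case class ConstantDouble(value: Double) extends Distribution[Double] {
    def sample(rng: Random): Double = value
  }

  // Hypothetical stand-in for Task: the types make a Double duration a compile error.
  final case class SketchTask(
      name: String,
      duration: Distribution[Long],
      cost: Distribution[Double]
  ) {
    def withDuration(dur: Long): SketchTask = copy(duration = ConstantLong(dur))
    def withCost(c: Double): SketchTask = copy(cost = ConstantDouble(c))
  }
}
```

With this shape, a call such as `withDuration(2.5)` no longer compiles, so the conversion has to be made explicitly by the caller.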

Prevent infinite loops on error

Some errors may go unnoticed, so that the simulation state does not change, but none of the termination conditions are met.

For example, a task with a resource that does not exist will cause an infinite loop.

It is not enough to find and fix conditions like this example. There needs to be a mechanism that detects an unchanged state with little computation and terminates.
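A minimal sketch of such a mechanism, assuming the state can be summarised by a cheap fingerprint: the loop below stops once the fingerprint has stayed the same for a given number of consecutive ticks. `Fingerprint`, `Outcome`, `run` and `maxIdleTicks` are hypothetical names introduced for illustration; the real Coordinator loop is not modelled here.

```scala
import scala.annotation.tailrec

object StallGuardSketch {
  // Hypothetical cheap summary of the simulation state: comparing it costs a few integer checks.
  final case class Fingerprint(time: Long, queuedEvents: Int, runningTasks: Int)

  sealed trait Outcome
  case object Finished extends Outcome
  final case class Stalled(at: Fingerprint) extends Outcome

  // Steps the state until it is done, or until the fingerprint has not changed
  // for maxIdleTicks consecutive steps.
  def run[S](initial: S, maxIdleTicks: Int)(
      step: S => S,
      fingerprint: S => Fingerprint,
      isDone: S => Boolean
  ): Outcome = {
    @tailrec
    def loop(state: S, idle: Int): Outcome =
      if (isDone(state)) Finished
      else {
        val next = step(state)
        val unchanged = fingerprint(next) == fingerprint(state)
        val idleNow = if (unchanged) idle + 1 else 0
        if (idleNow >= maxIdleTicks) Stalled(fingerprint(next))
        else loop(next, idleNow)
      }
    loop(initial, 0)
  }
}
```

The threshold is a design choice: a small value terminates quickly, while a larger one tolerates ticks that legitimately change nothing visible in the fingerprint.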

Expressing Task Permutation

This issue relates to the idea of adding tasks which start in the future. As of writing, the simulation is able to add future tasks to the coordinator (not to be confused with scala.concurrent.Future) but this is done by expressing the time at which this task starts.

The issue with the current approach is that it assumes the simulation knows when a task has finished and hence when a subsequent task can start ahead of time. This is true in the case of tasks with a ConstantGenerator duration, but if a task could take a variable length of time then this becomes complicated: Now the simulation cannot be certain when the task will finish, and it is possible for a second task to either start too early or with too much delay.

An alternative perspective on the issue is that we are implementing two possible methods for interacting with the coordinator:

The simulation adds tasks individually at the time when they should start. This is possible as the simulation is notified when a task completes, and using the complete method it can initiate new tasks at this point. This method is the one we use already.
The simulation adds all tasks at the start of the simulation. All tasks are added to the coordinator in the future using the new method of adding future tasks, and they must update dynamically as tasks complete to adapt to the time that each task took.

(note that a hybrid simulation that uses both of these methods should also be possible)

In order for task start times to update dynamically there needs to exist a way to express the rules which describe how to update this starting time.

Some ideas for possible solutions:

Every task has a boolean method, something like (List(Task))=>Boolean, which returns true when all the task's prerequisites are met and it can run. The method can be checked every tick after the originally planned starting time until eventually the task is started. The method can be defined with simple boolean logic, but it would be expressed individually for every task. Flows could implement this automatically, and if using approach 1 from above this method shouldn't need to be implemented.
When a task completes, all tasks which are dependent on it are notified. So tasks have an associated list of dependent tasks, and a method to decide if all prerequisites are met. This is similar to callbacks, and it wouldn't be necessary to include CEvents for starting tasks in the event queue since tasks are added the normal way once previous tasks finish.
The events priority queue could be ordered in some complex and dynamic manner. Maybe CEvent.compare() could return some abstract ordering which is only loosely based on time: some getTime() method could be used in CEvent compare(), and the returned value might change as time goes on. getTime() should always return time in FinishingTask and StartingSim (so their relative ordering is the same as it was up until now), but for the new AddingTask CEvent getTime() might return a different value when it is checked again, such that an AddingTask event always comes after a FinishingTask event which it is dependent on. I'm not sure if this would work, or if this is even possible.
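The first idea in the list above (a per-task boolean check) could look roughly like the sketch below. It is an illustration only: `FutureTask`, `canStart` and `startable` are hypothetical names, and tasks are identified by plain strings rather than real Task objects.

```scala
object ReadinessSketch {
  // Hypothetical future task: a planned start time plus a predicate over completed task names.
  final case class FutureTask(
      name: String,
      plannedStart: Long,
      canStart: List[String] => Boolean
  )

  // Called on every tick: returns the waiting tasks whose planned start has passed
  // and whose prerequisites are met according to their own predicate.
  def startable(now: Long, completed: List[String], waiting: List[FutureTask]): List[FutureTask] =
    waiting.filter(t => t.plannedStart <= now && t.canStart(completed))
}
```

For example, a task B planned for time 5 with `canStart = _.contains("A")` is not returned at time 5 if A is still running, but is returned at time 7 once A appears in the completed list.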

Flows (#7) comes close to implementing what we need, but currently flows are contained in the simulation only. The coordinator has no idea of how the tasks are ordered, and we don't want to force users to use flows (e.g. we want PEW to still be able to use this without going through flows). Plus, flows are a method used to implement style nr.1 (as described above), but maybe the idea can be somehow adapted to solve this issue.

Simulation "Pre-Firing"

Here's the scenario: an And containing two Thens. In terms of the order in which tasks are added to the coordinator, first A and C would be added, once the future of A completes then B is added, once C completes D is added. But I'm concerned as to what happens if both A and C complete at the same time. If I find that A has completed, how can I be certain that I can call ready()? If only A completes, then I can check with something like A.onComplete{_=> if ( ! C.isComplete ) { //add task can call ready() } }, and similar for just C completing, however if A and C complete on the same time step, then it is possible that the A.onComplete segment is activated before C is registered as completing, and so I might add B to the coordinator but not D, while I should be adding both. I can't wait for C to complete, because that might not happen until the next time step. I can't be certain that I have the latest information, maybe the thread that computes C is slower than the one that computes A so there is a delay big enough to call ready() prematurely. What can I do?

This is how I understand the control flow of the system:

If this is correct, then perhaps fundamentally the problem is that we issue out tasks in one bulk message but we receive replies for each task individually. The coordinator is no better suited to tackle the issue because it faces the same challenge, so maybe the scheduler can be modified? Edit: this is wrong

Possible solutions:

Do we want to make the Coordinator report all the tasks that finish at a given time in one go? Or maybe a message saying "I'm done computing and waiting on you now"?
Do we want to get rid of Futures as the mechanism for managing tasks within the simulation? We can have a more generic callback mechanism, with Futures as just one possibility.
We could get rid of the Futures on the simulation side, at least as far as the flows are concerned, and use explicit callbacks instead.
The Coordinator could wait for a "ready()" for each task that was finished, not just one.
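The first option (reporting everything that finishes at a given time in one go) can be sketched as follows. This is a simplified illustration, not the Coordinator's actual messaging: `Finished`, `batches`, `next` and `successors` are hypothetical names, and the flow is the And of two Thens from the scenario above (A then B, C then D).

```scala
object BatchCompletionSketch {
  // Hypothetical completion record.
  final case class Finished(task: String, time: Long)

  // Groups finished tasks by timestamp, earliest first, so that a simulation
  // would receive one message per time step instead of one per task.
  def batches(events: Seq[Finished]): Seq[(Long, Set[String])] =
    events
      .groupBy(_.time)
      .toSeq
      .sortBy(_._1)
      .map { case (time, es) => (time, es.map(_.task).toSet) }

  // Successor of each task in the example flow And(Then(A, B), Then(C, D)).
  val next: Map[String, String] = Map("A" -> "B", "C" -> "D")

  // With the whole batch in hand, all successors can be added before calling ready() once.
  def successors(batch: Set[String]): Set[String] =
    batch.flatMap(t => next.get(t).toSet)
}
```

If A and C both finish at time 3, the simulation sees the single batch `Set("A", "C")` and can add both B and D before it calls ready(), which avoids the race described above.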

Remove println from EventHandler.scala

https://github.com/PetrosPapapa/WorkflowFM-Simulator/blob/e4de993ca28c84f0df9191611d5dcd815d5b1c45/src/main/scala/com/workflowfm/simulator/events/EventHandler.scala#L29

https://github.com/PetrosPapapa/WorkflowFM-Simulator/blob/e4de993ca28c84f0df9191611d5dcd815d5b1c45/src/main/scala/com/workflowfm/simulator/events/EventHandler.scala#L40

Oops!

Separate and abstract lookahead structure from the Scheduler

Currently (5271406) the simulation sends a mutable LookaheadObj to the Scheduler. The Scheduler then maintains that structure on every round of scheduling and resets the object between rounds.

The aim is to separate the lookahead structure from the scheduling computation. The goals are:

The simulation should send an immutable object to the scheduler.
The structure of that object should be defined by a Lookahead trait that is as abstract as possible, to allow for multiple possible implementations (as a principled ADT).
The temporary bookkeeping in each iteration/scheduling round should be done locally in the scheduler, so that any mutable state is local/safe.
We should work under the assumption that the simulation can update the lookahead object at any time.

The current LookaheadObj implementation should be a subclass (ideally a case class) of the new Lookahead trait.
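A rough sketch of what such a trait might look like, under assumptions: `FutureTaskInfo`, `Lookahead`, `NoLookahead`, `Combined`, `After` and `project` are hypothetical names, tasks are identified by strings, and resource contention is ignored when estimating start times. The point is only that the structure is immutable and that the per-round bookkeeping lives inside the scheduler-side function.

```scala
import scala.annotation.tailrec

object LookaheadSketch {
  final case class FutureTaskInfo(name: String, estimatedStart: Long, duration: Long, resources: Set[String])

  // Immutable lookahead structure: a pure query over the tasks completed so far
  // (task name -> completion time).
  trait Lookahead {
    def upcoming(completed: Map[String, Long]): Seq[FutureTaskInfo]
    def +(other: Lookahead): Lookahead = Combined(this, other)
  }

  case object NoLookahead extends Lookahead {
    def upcoming(completed: Map[String, Long]): Seq[FutureTaskInfo] = Seq.empty
  }

  final case class Combined(a: Lookahead, b: Lookahead) extends Lookahead {
    def upcoming(completed: Map[String, Long]): Seq[FutureTaskInfo] =
      a.upcoming(completed) ++ b.upcoming(completed)
  }

  // One possible implementation: `name` becomes available once `prerequisite` completes.
  final case class After(prerequisite: String, name: String, duration: Long, resources: Set[String])
      extends Lookahead {
    def upcoming(completed: Map[String, Long]): Seq[FutureTaskInfo] =
      completed.get(prerequisite).map(t => FutureTaskInfo(name, t, duration, resources)).toSeq
  }

  // Scheduler-side bookkeeping: predicted completions are tracked in local values only,
  // so the lookahead object itself is never modified or reset between rounds.
  def project(lookahead: Lookahead, completed: Map[String, Long]): Seq[FutureTaskInfo] = {
    @tailrec
    def loop(known: Map[String, Long], acc: Vector[FutureTaskInfo]): Vector[FutureTaskInfo] = {
      val fresh = lookahead.upcoming(known).filterNot(t => known.contains(t.name))
      if (fresh.isEmpty) acc
      else loop(known ++ fresh.map(t => t.name -> (t.estimatedStart + t.duration)), acc ++ fresh)
    }
    loop(completed, Vector.empty)
  }
}
```

Because the simulation only ever sends a new immutable value, it can replace the lookahead at any time without interfering with a scheduling round that is already in progress.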

Allow simulations to abort running Tasks

This came up while working on #28 and I thought it would be a nice feature to have more generally.

Simulations should be able to abort any tasks that are already running for whatever reason (e.g. because they are being stopped).

Aborted tasks should release their resources and have appropriately shorter durations recorded in the metrics.

Export XES data

We should be able to export event data in the XES format from the simulation.

This would help link exported simulation data directly to process mining tools.

Graph-based task flows

When making a graphical user interface for the Proter API, I chose to represent flows as a graph, where nodes represent single instances of tasks and edges are the dependencies linking those nodes. The issue is that you can create a graph which cannot be represented by the Proter API's logic of IPAR and ITHEN; see the attached graph. That graph should be read as follows: Task 1 and Task 2 run in parallel, Task 3 runs when both are finished, and Task 4 can run as soon as Task 2 alone is finished. I cannot convert this logic into IPAR and ITHEN, and I am not sure what the best way to fix this in the Proter API would be.

One suggestion is representing the graph as a directed adjacency matrix of not only individual tasks, but also combinations of tasks. From the table attached, we can see that only the entry "Task 1 and Task 2" can reach Task 3, and either the entry "Task 2" or "Task 1 and Task 2" can reach Task 4. This table does contain the logic that Task 3 can only occur after Tasks 1 and 2 are done, but Task 4 can occur after only Task 2 is done.

I can represent the task flow in any graph representation that is convenient for the front-end; the important feature is that every graph can be represented in the Proter API.
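As an illustration of an alternative to the combination matrix, the sketch below stores one prerequisite set per task, derived from the edge list. It is not part of the Proter API: `Edge`, `prerequisites` and `enabled` are hypothetical names, tasks are plain strings, and root tasks (those with no incoming edges) are assumed to be started separately.

```scala
object TaskGraphSketch {
  // (prerequisite, dependent)
  type Edge = (String, String)

  // Maps each task with incoming edges to the set of tasks that must finish first.
  def prerequisites(edges: Seq[Edge]): Map[String, Set[String]] =
    edges.groupBy(_._2).map { case (task, es) => task -> es.map(_._1).toSet }

  // Tasks that have not finished yet and whose prerequisites have all finished.
  def enabled(edges: Seq[Edge], finished: Set[String]): Set[String] =
    prerequisites(edges).collect {
      case (task, pre) if !finished(task) && pre.subsetOf(finished) => task
    }.toSet
}
```

For the graph described above, the edges would be `("Task 1", "Task 3")`, `("Task 2", "Task 3")` and `("Task 2", "Task 4")`: once only Task 2 has finished, `enabled` returns Task 4 alone, and once both Task 1 and Task 2 have finished it returns Task 3 and Task 4.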

`TaskGenerator` should allow user controlled UUIDs for their `Tasks`

Delay explanation in simulation metadata

Adding to workflowfm/pew#23, another idea is to add metadata for the explanation of delays.

The Coordinator can provide explanations for each delay for all Tasks that remain unassigned in the queue at the end of the cycle.

e.g.:

Task A was delayed for 6 hours because task B was running in resource R.

It would be nice to record and output this information through the PiEvent metadata (or whichever way simulation analytics are managed).

The `Coordinator` is not properly waiting for tasks to complete

A "hybrid" task consisting of both simulation and computation will not get a chance to finish its computation before the Coordinator moves on.

Example

1. Simulate a task -> 2. on complete ask Coordinator the time -> 3. record some data -> 4. release

When (1) finishes, handleEvent is run in Coordinator.tick. (2) then sends a Ping. If Tack arrives at any point before (2) and (3) are completed (it seems to consistently arrive before Ping), workflowsReady will be true and we will proceed before any new tasks are registered.

Aborting a simulation does not detach&abort tasks with no resources

Replicate with a simple TimeLimit and a PrintEventHandler.

Suspect something is going wrong in TaskResource.abortSimulation, but not sure.

Remember to add resource cost in the task metrics

In the noAkka branch, I decided to make task creation easier by not calculating resource costs, keeping only the one-off task cost.

Before finishing that branch, I need to remember to add the resource cost(s) in the task metrics, probably when handling the resource attach or detach event.

Update metrics output with capacities

Existing metrics outputs (to String and CSV for instance) do not include information about resource and task capacities. This needs to be added.

Stopping a simulation

A Simulation needs to be stoppable, for instance through a stop() method.

The main motivation for this is to allow a finite time limit for the overall simulation, such that once a given timestamp is reached all active simulations will be stopped.

This might not be trivial for completely asynchronous simulations which we might not have control over. Having a requirement for them being "stoppable" does sound like a good idea generally though.

TimeLimit does not work

This issue was tackled by commit 93e67cd but it is still unresolved.

I encountered the problem when trying to implement an arrival process on a separate branch. The time limit event does not seem to do anything. Events that are after the time limit still happen. If there are no events left, then the tick() function loops continuously. (Example: when adding println(time) to tick() and setting the limit to 100, it is common to see "122" printed repeatedly in the console. The number 122 is probably due to the simulation duration that I was using at the time, but the fact remains that the time limit is clearly exceeded and the program does not terminate.)

This doesn't always happen. Occasionally the program does terminate, but it is more common for it to loop forever.

Looking into it a bit more it seems like the stop() function is never reached, this might be part of the issue??

Allow time limits per arrival

It might be useful to set a time limit for a specific arrival instead of a global one. For example, you could model individual working days by setting 8 hour limits to the arrivals.

Flows

It would be nice to have a mechanism to create multi-task simulations. This can be a simple language to compose TaskGenerators inspired by BPMN gateways.

It could include:

Then for sequence
And for parallel
All for multiple parallel tasks
Or for inclusive or (run both events, proceed when either completes)
Cond for selection based on a given condition (but what would that condition be? is there a use case for this?)
NoTask or similar for doing nothing
Wait for another simulation??!

It should be easy to compose the futures returned from SimulationActor.task().

We might need an implicit conversion from TaskGenerator to Flow. The hard part is making sure we run ready() at the right time(s).

We should subclass a FlowSimulationActor.

Everything should go under a new com.workflowfm.simulator.flows package.
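A very rough sketch of how such a language could be modelled as an ADT is given below. It is an assumption-laden illustration rather than a proposed implementation: `Flow`, `Single`, `isComplete` and `frontier` are hypothetical, tasks are plain names instead of TaskGenerators, and All, Cond and waiting on other simulations are left out.

```scala
object FlowSketch {
  sealed trait Flow
  case object NoTask extends Flow
  final case class Single(name: String) extends Flow
  final case class Then(first: Flow, second: Flow) extends Flow // sequence
  final case class And(left: Flow, right: Flow) extends Flow    // parallel, wait for both
  final case class Or(left: Flow, right: Flow) extends Flow     // parallel, proceed on either

  // Whether the flow is complete, given the names of the completed tasks.
  def isComplete(flow: Flow, done: Set[String]): Boolean = flow match {
    case NoTask     => true
    case Single(n)  => done(n)
    case Then(a, b) => isComplete(a, done) && isComplete(b, done)
    case And(a, b)  => isComplete(a, done) && isComplete(b, done)
    case Or(a, b)   => isComplete(a, done) || isComplete(b, done)
  }

  // Tasks that are allowed to run and have not completed yet.
  // Simplification: once an Or is complete, its unfinished branch is no longer reported.
  def frontier(flow: Flow, done: Set[String]): Set[String] = flow match {
    case NoTask     => Set.empty
    case Single(n)  => if (done(n)) Set.empty else Set(n)
    case Then(a, b) => if (isComplete(a, done)) frontier(b, done) else frontier(a, done)
    case And(a, b)  => frontier(a, done) ++ frontier(b, done)
    case Or(a, b)   => if (isComplete(flow, done)) Set.empty else frontier(a, done) ++ frontier(b, done)
  }
}
```

For `And(Then(Single("A"), Single("B")), Then(Single("C"), Single("D")))`, the frontier is A and C at the start, and B and D once both A and C are done. Computing the frontier from the full set of completed tasks is one way to decide when it is safe to call ready().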

AkkaManager is not a publisher

There is currently no way to subscribe to events from CoordinatorActor. This means we cannot extract any results in proter-akka at all. This is a side effect of the removal of Subakka.

We can make AkkaManager a Publisher and have CoordinatorActor just send all events as messages.

Prevent ready/tick when events from current timestamp are still being handled

We need a solution that prevents tick() from being called when there are still events to be handled in the current timestamp. Perhaps eventsToBeHandled needs to be a global variable so that ready() can check that too (in addition to waiting).

Since there might be future extensions of DiscreteEvent, we need a more structured way of implementing those, one that forces the correct behaviour and prevents issues like this.

Waiting for simulations to respond seems to not be enough, as in the case of TimeLimit where we introduced a dummy simulation name (see #41).

See also #41 (comment)

TaskGenerators are no longer needed

After the changes in #25 we don't need TaskGenerators any more, with the caveats in the comments in #24, namely having to set the creation time of the Task.

We can simplify things by sending Tasks to the Coordinator and having the simulation itself do the random choices in the ValueGenerators. This will make most of our existing simulations that use constant duration and cost simpler to code.

Remove dependency from ScalaMock

ScalaMock is not compatible with Scala 3, and its reliance on macros makes it unlikely that a compatible version will be out soon.

Our options are to either:
a. Use another library such as Mockito.
b. Revert to stub-based testing.

Upgrade core codebase to a functional implementation

Migrating everything to an FP implementation will lead to substantial improvements in the code structure, resolve all the syncing and racing issues, get rid of Futures, and lead to more flexibility in the implementation of simulations (e.g. using type classes and monadic compositionality).

We will be using Cats Effect.

Events queue limitations

The events PriorityQueue needs to change. It is not possible to remove elements from the priority queue unless they are at the head of the queue, meaning I can't delete future tasks. Of course, I could manually dequeue elements one by one until I reach the one I want to delete, and then re-add all of the other elements, but this is far from ideal. A new structure should be chosen to replace the priority queue, or alternatively future tasks should not be stored in the queue and should only be added once they come to life. I'm not sure which is better.

On a similar note, I think there will be issues with updating a future task. If the task's starting time needs to be updated, we can update the CEvent.time parameter (after making it a var), assuming we can first retrieve the CEvent from the events queue (which is currently not possible for the reasons above). However, I don't think the order of the priority queue would update after the time changes.
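One candidate replacement is a sorted set ordered by time, which supports removal of arbitrary elements and keeps its order when an event is re-inserted with a new time. The sketch below is an illustration only: `Event`, `EventQueue`, `remove` and `reschedule` are hypothetical names and do not mirror CEvent, and events are assumed to carry a unique id.

```scala
import scala.collection.immutable.TreeSet

object EventQueueSketch {
  // Hypothetical event: ids are assumed unique.
  final case class Event(time: Long, id: String)

  // Order by time first, then by id so that simultaneous events remain distinct.
  implicit val byTimeThenId: Ordering[Event] = Ordering.by((e: Event) => (e.time, e.id))

  final case class EventQueue(events: TreeSet[Event] = TreeSet.empty[Event]) {
    def add(e: Event): EventQueue = copy(events + e)

    // Removes an event anywhere in the queue (linear scan by id in this sketch).
    def remove(id: String): EventQueue = copy(events.filterNot(_.id == id))

    // Re-inserts the event with a new time, so the ordering is always up to date.
    def reschedule(id: String, newTime: Long): EventQueue =
      events.find(_.id == id).fold(this)(e => copy(events - e + e.copy(time = newTime)))

    // Earliest event together with the remaining queue, if any.
    def next: Option[(Event, EventQueue)] =
      events.headOption.map(e => (e, copy(events - e)))
  }
}
```

Updating a time by removing and re-inserting sidesteps the problem of a mutable `time` field silently breaking the order. Lookup by id is linear here; an additional id-to-event map would make it cheaper if that matters.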

Flexible resource matching

Currently each Task (or TaskGenerator) contains a list of resource names it requires. These are then picked up by the Scheduler using Task.taskResources and the resourceMap provided by the Coordinator.

It would be helpful to allow flexible resource matching, i.e. the potential to have more than 1 resource that can handle a particular task.

For example, one implementation could involve resources that have a list of capabilities. A Task might require one or more capabilities from one or more resources.

It would be even nicer if the matching function can be generalized to depend on the implementation.

This will complicate the Scheduler further as it will have to merge the appropriate (or all of?) the different combinations of resources that can handle each task and pick the optimal one.
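To make the capability idea concrete, here is a small sketch that enumerates the possible resource combinations for a task. It rests on assumptions: `Resource`, `Requirement` and `assignments` are hypothetical names, each requirement needs exactly one capability, and a resource can serve at most one requirement of the same task. Picking the optimal combination is left to the Scheduler and is not shown.

```scala
object ResourceMatchSketch {
  final case class Resource(name: String, capabilities: Set[String])
  final case class Requirement(capability: String)

  // All ways of assigning a distinct capable resource to each requirement, in requirement order.
  def assignments(reqs: List[Requirement], resources: List[Resource]): List[List[Resource]] =
    reqs match {
      case Nil => List(Nil)
      case r :: rest =>
        for {
          res    <- resources if res.capabilities(r.capability)
          others <- assignments(rest, resources.filterNot(_ == res))
        } yield res :: others
    }
}
```

With two nurses and one doctor who can also act as a nurse, a task requiring a doctor and a nurse has two candidate assignments (the doctor with either nurse). The number of combinations can grow quickly with the number of resources, which is the extra Scheduler complexity mentioned above.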

Time limited simulations

Introduce a new CEvent that triggers the entire simulation to stop. Add a Coordinator message to add such an event.

This can allow a time limit in the simulations. A possible scenario is a "steady state" type of simulation where we spawn new simulations with a given "arrival rate" and stop after a certain period of time has passed.

This depends on #28.

Lookahead improvements

The current implementation is functional, but there are a few details which I believe could be improved and that need some consideration:

The scheduler schedules Tasks. With lookahead, the simulation needs to reply with a Seq[Task] to the scheduler to express future tasks which the scheduler must consider in its schedule. This forces me to generate tasks (with TaskGenerator.create()) in the simulation but previously tasks were only generated in the coordinator. The tasks generated in the simulation are not "real", they're only there for the scheduler to use and then they are forgotten about. Perhaps this is not a problem, but it feels wrong to be running TaskGenerator.create() for the same task, multiple times, in the simulation. There are two alternatives:
- The simulation could just generate all the tasks, and instead of sending the generator to the coordinator it can just send the tasks directly. This would let the simulation generate permanent task objects that are used for lookahead AND sent to the coordinator, instead of having two twin objects with the same parameters.
- The scheduler could do its thing without the entire task object- all it needs is the id, duration estimate, resources, and corresponding simulation actor. Maybe the coordinator and the simulation could pass just these details to the scheduler, so that the simulation doesn't have to generate a whole task object every time
Completed tasks need to be tracked, so that the functions that I use for lookahead can consider past tasks as well as current tasks. Right now a list of completed tasks is maintained in the lookahead trait in simulation, but maybe this should be tracked somewhere else? e.g. the simulation could ask the coordinator for which tasks have completed (but it would need to do this every time the scheduler queries the simulation, which adds a lot of unnecessary coordinator messages to the mix)
The biggest drawback of the implementation is the numerous akka calls to Simulation every time the scheduler runs. There are two big possible improvements here:
1. Maybe it's possible for the scheduler to save a schedule that it has made in the past, and if nothing has changed since last time then it could refer to the old schedule instead of making a new one from scratch. The possible downside here is that the simulation might somehow add new information to the lookahead data, or tasks could take a different time than what was predicted, and it might be difficult to alert the scheduler that these things have happened and that a new schedule should be made.
2. Instead of calling the simulation, the scheduler could be given an object which answers the questions that are currently asked of the simulation directly. So the simulation defines a new lookahead object, and sends it to the scheduler. The scheduler just queries this object for all its lookahead needs, and if something needs to change in the lookahead then the simulation could simply send an updated lookahead object to the scheduler. The current implementation uses ask and Await constantly, so there is no use of the concurrency of akka actors. With an object, the flow is automatically sequential, and the large quantity of akka messages is removed from the implementation.

This last point also concerns the alternative "pre-loading" idea for lookahead:
If tasks are pre-loaded to the coordinator, the coordinator would also need info about the sequence in which the tasks run. This would need to be provided to the coordinator by the simulation, and maybe stored in some sort of set or map. The benefit of this is that the coordinator could speak directly with the scheduler to implement lookahead instead of the scheduler messaging the simulation many times. However, if a lookahead object is used instead, then we gain the same benefits without introducing task logic into the coordinator, meaning that the coordinator would still be oblivious to lookahead, and the simulation would still communicate everything that it needs to say to the scheduler without all of the messages. It feels like there are more negatives than there are positives to the pre-loading approach, and I think it would be better to implement something like this instead.

Simulation "Time Warp"

If the start time of a task is "NULL" in tasks.csv, it will appear as if it starts at time 1 in the timetable page.

This can happen when a task is registered with the coordinator but cannot start yet according to the scheduler due to a resource being used, and then the simulation ends prematurely due to a TimeLimit event.

Child Simulation

A Simulation should be able to create simulation children, run them on the same Coordinator and wait for them to complete before making more progress.

We would need a Simulation.sim(name: String, actor: ActorRef) type of method (or similar).

It would be nice and symmetrical to maintain this in the Coordinator, the same way that we maintain Tasks.

We can extend:

the AddSim messages,
the StartingSim event and
the simulations set

They need to include an Option[ActorRef] where we report when a (child) simulation has finished (while adding that actor to the waiting list).

Do we need simulations to have a UUID so the simulation can "ack" them when finished reacting, similarly to tasks? How do we add this without too much additional complexity? We could use the simulation name or ActorRef instead (also assumed unique), but they won't fit in waiting. If we add another waiting map, we'll have to check in that too every time we check a simulation is ready to go.

Update RandomFlowFactory

The RandomFlowFactory class needs to be updated to use FP/Cats and potentially extended with additional parameters.

Metrics shouldn't use System.currentTimeMillis()

Can we use Clock[F] instead? This may require refactoring (these specific?) metrics updates to be effectful.

Adding Tasks in the Future

Simulations should have the ability to add Tasks that will start at some point in the future to the Coordinator.
This would allow the inclusion of special events like scheduled maintenance in our simulations.

The simulation should:

be able to add tasks that start at a future time
be able to remove/edit future tasks

The coordinator should:

track when to start a future task (possibly with a new CEvent)
store and start a future task when the time comes OR ask the simulation for the task once the time has come.

Some of the possible challenges associated with this have already been discussed in #18, where we consider the difficulties of maintaining future tasks in the CEvent priority queue.

Adding a time limit should not add idle time after the last event

At the moment if all simulations finish before the time limit, then the coordinator jumps ahead to the time limit, adding idle time to all resources.

It might be useful to avoid this in some situations, perhaps with an optional flag.

In such a case, the coordinator would mark the end of the run based on the timestamp of the previous event (the one coming before the time limit hits) if there are no simulations running at that point.

Generalize Simulation Actor

Following up from #9 and some ideas from development of #7, it seems that we can significantly simplify and generalize SimulationActor.

#9 means the tasks can once again be registered directly to the Coordinator. The main SimulationActor interaction now only involves SimStarted, SimReady, AckTasks, and SimDone. This means the fundamental behaviour can be stripped down significantly.

We then need callback functionality for tasks that are completed. This was done in 5e7e631. We can factor this as a subclass CallbackSimulationActor (or something) of the generic SimulationActor.

The current behaviour for callbacks via Futures can form another subclass FutureSimulationActor.

The queue of waiting tasks should belong to the Scheduler instead of the Coordinator

Coordinator.tasks should be managed in the Scheduler trait through appropriate methods.

Some Scheduler implementations may need a different data structure other than SortedSet to maximize efficiency.

Composite Tasks

Typically, and especially in the context of lookahead, a chain of tasks may be broken if a higher priority task appears in a separate chain. Eg. consider the flow (1 > 2 > 3) + (4 > 5 > 6), where all tasks share one common resource. It could be the case that the tasks inside the brackets must happen one after another without interruption, but due to priority of tasks the second chain could start before the 1st finishes. In this example this is easily solved by making 1,2,3 High priority and 4,5,6 Medium priority, but there exist more complex scenarios where adjusting the priority like this is not possible.

A second motivation is where one may wish to have tasks of low priority followed by tasks of high priority. For example a surgery which is not urgent (a low priority task), but as soon as the surgery is completed a follow-up treatment must begin as soon as possible (a high priority task). Currently if we were to express this with priorities only, it could be the case that some other chain of tasks has taken up the resources in such a way that the surgery could start but the treatment would be blocked, and our system has no way of knowing that these tasks must happen directly one after another.

I think a good solution to this challenge is to introduce composite tasks:
A task could be marked as being part of a certain composition, and all tasks in this composition must run together. This means that for lookahead, all tasks in a composition are considered before any single task is registered, and once the first task is registered then all of the tasks in the composition should be registered before the next task is picked up from the sorted set.
This system could allow for complex amalgamations of tasks which use different resources at different times but must happen all together, like a complex shape on the resource timeline that cannot change shape, or like a puzzle-piece; Right now we can "draw" rectangles on the timeline by saying a task uses x,y,z resources, but with this feature we could describe more complex shapes by composing multiple tasks.

Coming back to the above examples, tasks 1,2,3 would be part of composite 1, and tasks 4,5,6 would be in composite 2. For the second example, the surgery and the follow-up treatment would be part of the same composition.

This wouldn't work without lookahead since two tasks at the start of two separate compositions could be scheduled, but the compositions could overlap once time progresses. With lookahead this can be avoided, so the entire composite can be guaranteed to run all the way through without disruption.

Simulations should not be forced to have unique names

The user should not need to pay attention to produce unique simulation names. We can use internal UUIDs for simulations as we do for tasks.
