lf-lang / reactor-rs
Reactor runtime implementation in Rust
License: MIT License
As discussed in lf-lang/lingua-franca#1309
When executing the guided search benchmark, it sometimes softlocks during an arbitrary iteration. This might be related to #2, but there it only locks up during startup.
Here is an execution log excerpt where it hangs up:
[2022-01-25T14:36:48Z INFO ] Worker 0: search path from [25, 26, 17] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[1]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 1: search path from [25, 27, 14] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[2]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 2: search path from [26, 26, 14] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[3]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 3: search path from [26, 26, 15] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[4]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 4: search path from [25, 28, 13] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[5]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 5: search path from [26, 27, 13] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[6]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 6: search path from [26, 28, 13] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[7]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 7: search path from [26, 28, 12] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[8]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 8: search path from [27, 28, 13] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[9]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 9: search path from [27, 28, 14] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[10]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 10: search path from [28, 27, 14] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[11]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 11: search path from [29, 26, 13] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[12]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 12: search path from [28, 28, 13] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[13]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 13: search path from [28, 28, 14] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[14]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 14: search path from [29, 27, 14] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[15]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 15: search path from [29, 26, 14] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[16]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 16: search path from [29, 27, 12] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[17]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 17: search path from [28, 28, 12] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[18]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 18: search path from [29, 27, 13] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[19]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 19: search path from [28, 26, 15] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 283): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 283): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3966544146 ns = 3966544 µs = 3966 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
[2022-01-25T14:36:48Z TRACE] - Level 10
[2022-01-25T14:36:48Z TRACE] - Executing /workers[0]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 0: search path from [23, 24, 20] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[1]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 1: search path from [23, 25, 20] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[2]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 2: search path from [25, 24, 20] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[3]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 3: search path from [25, 25, 19] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[4]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 4: search path from [24, 26, 19] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[5]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 5: search path from [24, 25, 20] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[6]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 6: search path from [23, 27, 20] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[7]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 7: search path from [24, 27, 19] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[8]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 8: search path from [25, 25, 20] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[9]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 9: search path from [25, 26, 19] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[10]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 10: search path from [25, 24, 19] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[11]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 11: search path from [26, 25, 19] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[12]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 12: search path from [27, 26, 16] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[13]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 13: search path from [27, 26, 17] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[14]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 14: search path from [25, 27, 19] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[15]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 15: search path from [25, 27, 18] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] - Executing /workers[16]/1 (level 10)
[2022-01-25T14:36:48Z INFO ] Worker 16: search path from [26, 26, 18] to [24, 24, 24]
[2022-01-25T14:36:48Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 284): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 284): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3967258312 ns = 3967258 µs = 3967 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
[2022-01-25T14:36:48Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 285): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 285): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3967310524 ns = 3967310 µs = 3967 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
[2022-01-25T14:36:48Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 286): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 286): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3967361034 ns = 3967361 µs = 3967 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
[2022-01-25T14:36:48Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 287): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 287): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3967411629 ns = 3967411 µs = 3967 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
[2022-01-25T14:36:48Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 288): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 288): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3967461870 ns = 3967461 µs = 3967 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
[2022-01-25T14:36:48Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 289): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 289): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3967512653 ns = 3967512 µs = 3967 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
The following lines then repeat forever:
[2022-01-25T14:36:48Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 285): run [6: {/manager/2}]
[2022-01-25T14:36:48Z TRACE] - Running late by 3967310524 ns = 3967310 µs = 3967 ms
[2022-01-25T14:36:48Z TRACE] - Level 6
[2022-01-25T14:36:48Z TRACE] - Executing /manager/2 (level 6)
I've been working on implementing the MatMul benchmark and have come across a bug where an action isn't scheduled, even though it should be. The important reaction is here. When data is set on line 93, the connected reactions in Worker trigger, but the reaction to next never seems to trigger, even though it is scheduled at the end of the same reaction. This problem disappears when I remove the parallel-runtime feature.
Here is the execution trace:
[2021-12-17T11:35:38Z INFO ] Assembling runner
[2021-12-17T11:35:38Z INFO ] Assembling manager
[2021-12-17T11:35:38Z INFO ] Assembling workers
[2021-12-17T11:35:38Z INFO ] Registering workers
[2021-12-17T11:35:38Z INFO ] Registering manager
[2021-12-17T11:35:38Z INFO ] Registering runner
[2021-12-17T11:35:38Z INFO ] Triggering startup...
[2021-12-17T11:35:38Z TRACE] - Level 1
[2021-12-17T11:35:38Z TRACE] - Executing /runner/0
[2021-12-17T11:35:38Z TRACE] - Executing /0
Benchmark: MatMulBenchmark
Arguments:
numIterations = 12
dataLength = 1024
[2021-12-17T11:35:38Z TRACE] - Executing /manager/0
blockThreshold = 16384
priorities = 10
numWorkers = 20
System information:
O/S Name = Linux
[2021-12-17T11:35:38Z TRACE] Pushing at (T0 + 0 ns = 0 ms, 1): run [2: {/runner/1}]
[2021-12-17T11:35:38Z TRACE] Processing event at (T0 + 0 ns = 0 ms, 1): run [2: {/runner/1}]
[2021-12-17T11:35:38Z TRACE] - Running late by 136662519 ns = 136662 µs = 136 ms
[2021-12-17T11:35:38Z TRACE] - Level 2
[2021-12-17T11:35:38Z TRACE] - Executing /runner/1 (level 2)
[2021-12-17T11:35:38Z TRACE] - Level 5
[2021-12-17T11:35:38Z TRACE] - Executing /manager/1 (level 5)
[2021-12-17T11:35:38Z TRACE] - Level 8
[2021-12-17T11:35:38Z TRACE] - Executing /workers[15]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[8]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[3]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[4]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[17]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[2]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[9]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[16]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[0]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[19]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[5]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[13]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[12]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[11]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[14]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[6]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[18]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[10]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[1]/0
[2021-12-17T11:35:38Z TRACE] - Executing /workers[7]/0
[2021-12-17T11:35:38Z TRACE] Will wait for asynchronous event without timeout
[2021-12-17T11:35:38Z INFO ] Event queue is empty forever, shutting down.
[2021-12-17T11:35:38Z INFO ] Scheduler is shutting down, at (T0 + 137417726 ns = 137 ms, 0)
[2021-12-17T11:35:38Z INFO ] Scheduler has been shut down
For all the runtime implementations so far, we've used reactor-<file-extension>. Should we do the same for the Rust implementation?
lf-lang/lingua-franca#1228 contains a test called ReadOutputOfContainedBank for which the last reaction, which has two triggers, fires twice. It should only be invoked once, however, since triggers should be deduplicated.
Sparked by lf-lang/lingua-franca#1098, it would be useful to have a function to retrieve the current number of workers, something like ReactionCtx::num_workers.
See #20
The bounded buffer benchmark doesn't work if parallelised. It quits before any output is produced by the benchmark runner.
Blue here is Rust.
Curiously, I've been toying with the merge_plans_after function and wrote a few different variants of it. You can find them on the mpa-var* branches. I did some benchmarking and accidentally discovered that this bug does not exist for variations 3 and 5. It appears to have something to do with reaction plans potentially being discarded on merging.
There was a bug in the C++ runtime, and theoretically it can also happen in Rust (I wasn't able to reproduce it with an unmodified runtime; it depends on thread interleaving).
In C++ there is a global event queue and a global mutex protecting it. The fix is to put the time reading and the pushing of the event in the same critical section.
In Rust the event queue is split: there is a Sender for pushing events to the scheduler asynchronously, while the Receiver end maintains an unsorted buffer of events that is periodically flushed into the main queue by the scheduler thread. Events pushed through the Sender have already been assigned a tag. We can assume the Sender and Receiver communicate atomically.
We could reproduce the C++ solution by introducing a mutex to guard the receiver and sender. This would however defeat part of the purpose of using channels, which is that we don't need to block the async sender thread when sending something.
Another solution would be to let the scheduler thread assign tags to asynchronous events. There are several possible problems with this:
We could use the asynchronously assigned tag as long as it is greater than the latest processed tag. If it isn't, then we're in the problematic situation described above, and we can fall back to something else:
None of these look super appealing in the general case; maybe the behaviour should be selectable.
Clippy finds a lot of issues with the current state of the code. That should be addressed.
The Philosophers benchmark I ported from C++ has an issue where execution sometimes deadlocks and never finishes, but only some of the time.
If the first iteration of the benchmark succeeds, all of them do, which suggests this is an issue that occurs during initialisation.
Attached is an execution trace of such a deadlocked run.
philosophers_fail.log
While porting the FilterBank benchmark, I noticed that it uses the ports of a bank as dependency for a reaction. This is what that looks like: https://github.com/lf-lang/lingua-franca/blob/master/benchmark/Cpp/Savina/src/parallelism/FilterBank.lf#L299.
When trying to implement this in Rust, the generated code looks like this:
// --- reaction(startup) -> banks.setF, banks.setH {= ... =}
fn react_0(&mut self,
#[allow(unused)] ctx: &mut ::reactor_rt::ReactionCtx,
banks__setF: ::reactor_rt::WritablePort<Matrix<f64>>,
banks__setH: ::reactor_rt::WritablePort<Matrix<f64>>,)
{ ... }
If my understanding is correct, then banks__setF and banks__setH should be WritablePortBanks.
Currently the repository name is reactor-rs and the crate name is reactor-rt (rt for runtime). I think this is pretty confusing and we should consider renaming the crate. As @oowekyala pointed out in #5, naming the crate reactor-rs does not make much sense either, due to the redundant rs. Probably just reactor would be a more suitable crate name, in analogy to the reactor namespace in reactor-cpp.
The C++ runtime supports sparse multiports more efficiently since lf-lang/reactor-cpp#24. The set ports can be queried more efficiently because they're saved in a vector in the multiport. This apparently has a significant performance impact on benchmarks like Big.
The Rust runtime cannot support this directly yet: setting a port does not notify any of the downstream ports, and ports do not know which multiports contain them anyway. I imagine this will be difficult to implement because of the constraints on circular references in Rust. Maybe a solution would be to introduce new data structures, prepared during initialization, to track which port can influence which multiport.
Output from GitHub Actions:
info: checking for self-updates
warning: tool `rustfmt` is already installed, remove it from `/home/runner/.cargo/bin`, then run `rustup update` to have rustup manage this tool.
Warning: warning: tool `cargo-fmt` is already installed, remove it from `/home/runner/.cargo/bin`, then run `rustup update` to have rustup manage this tool.
This suggests that the rustfmt component listed on line 59 of .github/workflows/rust.yml might be redundant. Is it?
Currently LFC is bound to a specific git revision of the crate, not a semantic version number. Using semver would allow us to publish bugfixes to the runtime without needing to upgrade LFC. But we also would need to maintain compatibility with the code generator, and also commit to the stability of the ReactionCtx and related user-facing APIs. I don't think the crate is stable enough right now to allow this.
If we start doing that, then we need a defined release cycle, possibly using beta versions for all major versions that are not directly used by a released LFC version. Otherwise we will clutter Crates.io with deprecated, unsupported releases.
This function does a lot of cloning and attempts to optimize it, but I suspect the optimizations don't do anything.