Comments (8)
I did find a workaround for this issue. Adding an intermediate task after each main task, and pointing that intermediate task at the final join task, makes the join wait for all tasks as expected:
Workflow orquesta-join.yaml:
---
version: 1.0
description: A basic workflow that demonstrate branching and join.
tasks:
  task1:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task2, task3
  task2:
    action: encore.orquesta-join-4
    next:
      - when: <% completed() %>
        do: task2_finish
  task2_finish:
    action: core.echo message="fo fum"
    next:
      - when: <% completed() %>
        do: finish
  task3:
    action: encore.orquesta-join-2
    next:
      - when: <% completed() %>
        do: task3_finish
  task3_finish:
    action: core.noop
    next:
      - when: <% completed() %>
        do: finish
  finish:
    join: all
    action: core.noop
    next:
      - do: noop
I don't know whether this needs to be a separate issue, but notice that I used <% completed() %> in all the above examples. You get an error if you try to use <% succeeded() %> and <% failed() %> in the task transitions instead.
Example:
---
version: 1.0
description: A basic workflow that demonstrate branching and join.
tasks:
  task1:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task2, task3
  task2:
    action: encore.orquesta-join-4
    next:
      - when: <% succeeded() %>
        do: task2_finish
      - when: <% failed() %>
        do: task2_finish
  task2_finish:
    action: core.echo message="fo fum"
    next:
      - when: <% succeeded() %>
        do: finish
  task3:
    action: encore.orquesta-join-2
    next:
      - when: <% succeeded() %>
        do: task3_finish
      - when: <% failed() %>
        do: task3_finish
  task3_finish:
    action: core.noop
    next:
      - when: <% succeeded() %>
        do: finish
  finish:
    join: all
    action: core.noop
    next:
      - do: noop
Error:
id: 5e665f0a49b3d26f3749e3cd
action.ref: encore.orquesta-join
parameters: None
status: failed
start_timestamp: Mon, 09 Mar 2020 15:21:46 UTC
end_timestamp: Mon, 09 Mar 2020 15:21:47 UTC
result:
  errors:
  - message: The join task "finish" is unreachable. A join task is determined to be unreachable if there are nested forks from multi-referenced tasks that join on the said task. This is ambiguous to the workflow engine because it does not know at which level should the join occurs.
    schema_path: properties.tasks.patternProperties.^\w+$
    spec_path: tasks.finish
    type: semantic
  output: null
from orquesta.
@bishopbm1 @nmaludy I have no trouble executing a workflow with a join, so let's start with a simple one and build on it. Can you post the output of st2 execution get for the workflow below?
version: 1.0
description: A basic join workflow.
tasks:
  task1:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task2, task3
  task2:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task4
  task3:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task4
  task4:
    join: all
    action: core.noop
ubuntu@cadmus:~/st2$ st2 execution get 5e66eb3a26f9d5f01c88da7b
id: 5e66eb3a26f9d5f01c88da7b
action.ref: sandbox.orquesta_join
parameters: None
status: succeeded (4s elapsed)
start_timestamp: Tue, 10 Mar 2020 01:19:54 UTC
end_timestamp: Tue, 10 Mar 2020 01:19:58 UTC
result:
  output: null
+--------------------------+------------------------+-------+-----------+-------------------------------+
| id                       | status                 | task  | action    | start_timestamp               |
+--------------------------+------------------------+-------+-----------+-------------------------------+
| 5e66eb3b188ad293e21e5749 | succeeded (0s elapsed) | task1 | core.noop | Tue, 10 Mar 2020 01:19:55 UTC |
| 5e66eb3c188ad293e21e574c | succeeded (0s elapsed) | task2 | core.noop | Tue, 10 Mar 2020 01:19:56 UTC |
| 5e66eb3c188ad293e21e574f | succeeded (1s elapsed) | task3 | core.noop | Tue, 10 Mar 2020 01:19:56 UTC |
| 5e66eb3d188ad293e21e5752 | succeeded (1s elapsed) | task4 | core.noop | Tue, 10 Mar 2020 01:19:57 UTC |
+--------------------------+------------------------+-------+-----------+-------------------------------+
from orquesta.
Also, there's a core st2 timeout of 60 seconds for action execution. What happens if you set the timeout for sleep actions to > 100 seconds?
from orquesta.
Also, because you used <% completed() %> in the task transitions, the transition fires regardless of the status of the action execution. When the sleep action times out (> 60 seconds), the action execution is returned as timeout. This timeout satisfies <% completed() %> and then triggers the join task. What happens if you change the task transitions from <% completed() %> to <% succeeded() %>?
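To make the distinction concrete, here is a minimal Python sketch of how the three transition conditions relate to an action's final status. It is only an illustrative model, not Orquesta's implementation: the status strings mirror the ones shown in the st2 execution get output in this thread, and the `FAILED_LIKE` grouping (treating a timeout as a failure) and the function names are assumptions made for the example.

```python
# Illustrative model only; not Orquesta's code.
SUCCEEDED = "succeeded"
FAILED = "failed"
TIMEOUT = "timeout"
RUNNING = "running"

# Assumption: a timeout is read as a failure.
FAILED_LIKE = {FAILED, TIMEOUT}


def succeeded(status: str) -> bool:
    """True only when the action finished successfully."""
    return status == SUCCEEDED


def failed(status: str) -> bool:
    """True when the action finished unsuccessfully (including timeout)."""
    return status in FAILED_LIKE


def completed(status: str) -> bool:
    """True for any terminal status, successful or not."""
    return succeeded(status) or failed(status)


for status in (SUCCEEDED, FAILED, TIMEOUT, RUNNING):
    print(status, completed(status), succeeded(status), failed(status))
```

Under this model a timed-out action satisfies a completed-style condition but not a succeeded-style one, which is why swapping the condition changes whether the join is reached.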
from orquesta.
@m4dcoder I understand that there is a 60-second timeout, so the workflow times out waiting for the 100-second sleep. That is OK and expected. The issue only shows up if you execute all the workflows together. Note that the master workflow, orquesta-join.yaml, and the screenshot I posted use all the workflows that were provided.
This issue only seems to appear with nested workflows that have a long-running task. It is expected that the one task times out, which should be read as a failure. However, the final finish task is not executed even with <% completed() %>, and the workflow status is failed even though the final task has do: noop, so it should exit successfully no matter what.
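The behaviour expected here can be stated as a small Python model: a join: all task should run once every inbound branch has made its transition, whatever that branch's status was. This is a hedged sketch of the expectation only. `JoinBarrier` and its `arrive` method are hypothetical names, and the sketch says nothing about how Orquesta tracks joins internally.

```python
class JoinBarrier:
    """Hypothetical model of a `join: all` task with named inbound branches."""

    def __init__(self, inbound):
        # Branches that have not yet transitioned to the join task.
        self.pending = set(inbound)

    def arrive(self, branch: str) -> bool:
        """Record that `branch` reached the join; return True once all have."""
        self.pending.discard(branch)
        return not self.pending


barrier = JoinBarrier(["task2", "task3"])
# With completed-style transitions, a failed branch still arrives at the join.
first = barrier.arrive("task2")   # task2 failed; the join is still waiting
second = barrier.arrive("task3")  # task3 failed too; the join should now run
print(first, second)
```

In this model the join is satisfied after the second arrival even though both branches failed, which is the outcome the report expects from the real workflow.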
from orquesta.
@bishopbm1 I am able to reproduce the issue: the join task does not get executed when both task2 and task3 fail, even though the transitions use <% completed() %>, which should fire on both success and failure. The issue is not present if task2 and task3 succeed.
ubuntu@cadmus:~/st2$ st2 execution get 5e67e41510939f018f19d6d3
id: 5e67e41510939f018f19d6d3
action.ref: sandbox.orquesta_join
parameters: None
status: failed (74s elapsed)
start_timestamp: Tue, 10 Mar 2020 19:01:41 UTC
end_timestamp: Tue, 10 Mar 2020 19:02:55 UTC
result:
  errors:
  - message: Execution failed. See result for details.
    result:
      errors:
      - message: Execution failed. See result for details.
        result:
          errors: []
          output: null
        task_id: task2
        type: error
      output: null
    task_id: task2
    type: error
  - message: Execution failed. See result for details.
    result:
      errors:
      - message: Execution failed. See result for details.
        result:
          errors: []
          output: null
        task_id: task2
        type: error
      output: null
    task_id: task3
    type: error
  output: null
+----------------------------+------------------------+-------+-------------------------+-------------------------------+
| id                         | status                 | task  | action                  | start_timestamp               |
+----------------------------+------------------------+-------+-------------------------+-------------------------------+
|   5e67e4154f9d87277bbfea2c | succeeded (1s elapsed) | task1 | core.noop               | Tue, 10 Mar 2020 19:01:41 UTC |
| + 5e67e4164f9d87277bbfea2f | failed (72s elapsed)   | task2 | sandbox.orquesta_join_2 | Tue, 10 Mar 2020 19:01:42 UTC |
|   5e67e4184f9d87277bbfea38 | succeeded (1s elapsed) | task1 | core.echo               | Tue, 10 Mar 2020 19:01:43 UTC |
| + 5e67e4194f9d87277bbfea3e | failed (68s elapsed)   | task2 | sandbox.orquesta_join_3 | Tue, 10 Mar 2020 19:01:45 UTC |
|   5e67e41d4f9d87277bbfea44 | succeeded (1s elapsed) | task1 | core.echo               | Tue, 10 Mar 2020 19:01:49 UTC |
|   5e67e4204f9d87277bbfea4a | timeout (60s elapsed)  | task2 | core.local              | Tue, 10 Mar 2020 19:01:52 UTC |
| + 5e67e4164f9d87277bbfea32 | failed (73s elapsed)   | task3 | sandbox.orquesta_join_4 | Tue, 10 Mar 2020 19:01:42 UTC |
|   5e67e4174f9d87277bbfea36 | succeeded (1s elapsed) | task1 | core.echo               | Tue, 10 Mar 2020 19:01:43 UTC |
| + 5e67e4194f9d87277bbfea3b | failed (70s elapsed)   | task2 | sandbox.orquesta_join_3 | Tue, 10 Mar 2020 19:01:45 UTC |
|   5e67e41d4f9d87277bbfea41 | succeeded (1s elapsed) | task1 | core.echo               | Tue, 10 Mar 2020 19:01:49 UTC |
|   5e67e41f4f9d87277bbfea48 | timeout (61s elapsed)  | task2 | core.local              | Tue, 10 Mar 2020 19:01:51 UTC |
+----------------------------+------------------------+-------+-------------------------+-------------------------------+
from orquesta.
Note that this issue only occurs when both branches fail. If one branch succeeds and the other fails, the join task is triggered.
ubuntu@cadmus:~/st2$ st2 execution get 5e67e63cfe5c0accaf6d4149
id: 5e67e63cfe5c0accaf6d4149
action.ref: sandbox.orquesta_join
parameters: None
status: succeeded (71s elapsed)
start_timestamp: Tue, 10 Mar 2020 19:10:52 UTC
end_timestamp: Tue, 10 Mar 2020 19:12:03 UTC
result:
  output: null
+----------------------------+-------------------------+--------+-------------------------+-------------------------------+
| id                         | status                  | task   | action                  | start_timestamp               |
+----------------------------+-------------------------+--------+-------------------------+-------------------------------+
|   5e67e63ddee7484d1cd13a1b | succeeded (1s elapsed)  | task1  | core.noop               | Tue, 10 Mar 2020 19:10:53 UTC |
| + 5e67e63edee7484d1cd13a1e | succeeded (17s elapsed) | task2  | sandbox.orquesta_join_2 | Tue, 10 Mar 2020 19:10:54 UTC |
|   5e67e63fdee7484d1cd13a24 | succeeded (1s elapsed)  | task1  | core.echo               | Tue, 10 Mar 2020 19:10:55 UTC |
| + 5e67e641dee7484d1cd13a2a | succeeded (14s elapsed) | task2  | sandbox.orquesta_join_3 | Tue, 10 Mar 2020 19:10:56 UTC |
|   5e67e642dee7484d1cd13a30 | succeeded (1s elapsed)  | task1  | core.echo               | Tue, 10 Mar 2020 19:10:58 UTC |
|   5e67e643dee7484d1cd13a36 | succeeded (11s elapsed) | task2  | core.local              | Tue, 10 Mar 2020 19:10:59 UTC |
| + 5e67e63edee7484d1cd13a21 | failed (67s elapsed)    | task3  | sandbox.orquesta_join_4 | Tue, 10 Mar 2020 19:10:54 UTC |
|   5e67e63fdee7484d1cd13a27 | succeeded (1s elapsed)  | task1  | core.echo               | Tue, 10 Mar 2020 19:10:55 UTC |
| + 5e67e641dee7484d1cd13a2d | failed (65s elapsed)    | task2  | sandbox.orquesta_join_5 | Tue, 10 Mar 2020 19:10:57 UTC |
|   5e67e642dee7484d1cd13a33 | succeeded (1s elapsed)  | task1  | core.echo               | Tue, 10 Mar 2020 19:10:58 UTC |
|   5e67e644dee7484d1cd13a39 | timeout (60s elapsed)   | task2  | core.local              | Tue, 10 Mar 2020 19:11:00 UTC |
|   5e67e682dee7484d1cd13a3c | succeeded (0s elapsed)  | finish | core.echo               | Tue, 10 Mar 2020 19:12:02 UTC |
+----------------------------+-------------------------+--------+-------------------------+-------------------------------+
from orquesta.
This issue will be fixed with #194.
from orquesta.