Comments (8)

bishopbm1 commented on May 25, 2024

I did find a workaround for this issue. Adding an intermediate task after each main task, and pointing that intermediate task at the final join task, makes the join wait for all tasks as expected:

Workflow orquesta-join.yaml:

---
version: 1.0

description: A basic workflow that demonstrates branching and join.

tasks:
  task1:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task2, task3

  task2:
    action: encore.orquesta-join-4
    next:
      - when: <% completed() %>
        do: task2_finish

  task2_finish:
    action: core.echo message="fo fum"
    next:
      - when: <% completed() %>
        do: finish

  task3:
    action: encore.orquesta-join-2
    next:
      - when: <% completed() %>
        do: task3_finish

  task3_finish:
    action: core.noop
    next:
      - when: <% completed() %>
        do: finish

  finish:
    join: all
    action: core.noop
    next:
      - do: noop

I don't know whether this needs to be a separate issue, but notice that I used <% completed() %> in all of the examples above. You get an error if you use separate <% succeeded() %> and <% failed() %> transitions instead.

Example:

---
version: 1.0

description: A basic workflow that demonstrates branching and join.

tasks:
  task1:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task2, task3

  task2:
    action: encore.orquesta-join-4
    next:
      - when: <% succeeded() %>
        do: task2_finish
      - when: <% failed() %>
        do: task2_finish

  task2_finish:
    action: core.echo message="fo fum"
    next:
      - when: <% succeeded() %>
        do: finish

  task3:
    action: encore.orquesta-join-2
    next:
      - when: <% succeeded() %>
        do: task3_finish
      - when: <% failed() %>
        do: task3_finish

  task3_finish:
    action: core.noop
    next:
      - when: <% succeeded() %>
        do: finish

  finish:
    join: all
    action: core.noop
    next:
      - do: noop

Error:

id: 5e665f0a49b3d26f3749e3cd
action.ref: encore.orquesta-join
parameters: None
status: failed
start_timestamp: Mon, 09 Mar 2020 15:21:46 UTC
end_timestamp: Mon, 09 Mar 2020 15:21:47 UTC
result: 
  errors:
  - message: The join task "finish" is unreachable. A join task is determined to be unreachable if there are nested forks from multi-referenced tasks that join on the said task. This is ambiguous to the workflow engine because it does not know at which level should the join occurs.
    schema_path: properties.tasks.patternProperties.^\w+$
    spec_path: tasks.finish
    type: semantic
  output: null

from orquesta.

m4dcoder commented on May 25, 2024

@bishopbm1 @nmaludy I have no trouble executing a workflow with a join, so let's start with a simple one and build on it. Can you post the output of st2 execution get for the workflow below?

version: 1.0

description: A basic join workflow.

tasks:
  task1:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task2, task3

  task2:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task4

  task3:
    action: core.noop
    next:
      - when: <% completed() %>
        do: task4

  task4:
    join: all
    action: core.noop

ubuntu@cadmus:~/st2$ st2 execution get 5e66eb3a26f9d5f01c88da7b
id: 5e66eb3a26f9d5f01c88da7b
action.ref: sandbox.orquesta_join
parameters: None
status: succeeded (4s elapsed)
start_timestamp: Tue, 10 Mar 2020 01:19:54 UTC
end_timestamp: Tue, 10 Mar 2020 01:19:58 UTC
result: 
  output: null
+--------------------------+------------------------+-------+-----------+-----------------+
| id                       | status                 | task  | action    | start_timestamp |
+--------------------------+------------------------+-------+-----------+-----------------+
| 5e66eb3b188ad293e21e5749 | succeeded (0s elapsed) | task1 | core.noop | Tue, 10 Mar     |
|                          |                        |       |           | 2020 01:19:55   |
|                          |                        |       |           | UTC             |
| 5e66eb3c188ad293e21e574c | succeeded (0s elapsed) | task2 | core.noop | Tue, 10 Mar     |
|                          |                        |       |           | 2020 01:19:56   |
|                          |                        |       |           | UTC             |
| 5e66eb3c188ad293e21e574f | succeeded (1s elapsed) | task3 | core.noop | Tue, 10 Mar     |
|                          |                        |       |           | 2020 01:19:56   |
|                          |                        |       |           | UTC             |
| 5e66eb3d188ad293e21e5752 | succeeded (1s elapsed) | task4 | core.noop | Tue, 10 Mar     |
|                          |                        |       |           | 2020 01:19:57   |
|                          |                        |       |           | UTC             |
+--------------------------+------------------------+-------+-----------+-----------------+

m4dcoder commented on May 25, 2024

Also, there's a core st2 timeout of 60 seconds for action execution. What happens if you set the timeout for sleep actions to > 100 seconds?
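
As a rough sketch of that experiment, the sleep task could pass an explicit timeout to its action. This assumes the sleep runs through core.local, whose timeout parameter defaults to 60 seconds; the task names, the 100-second sleep, and the 120-second value are illustrative assumptions, not taken from the original workflows:

```yaml
version: 1.0

description: Sketch of a sleep task with an explicit action timeout.

tasks:
  task1:
    # Assumed: the long-running step is a shell sleep run via core.local.
    action: core.local
    input:
      cmd: sleep 100
      # Assumed value; it only needs to exceed the duration of the sleep.
      timeout: 120
    next:
      - when: <% completed() %>
        do: task2

  task2:
    action: core.noop
```

With the timeout above the sleep duration, the action should finish as succeeded rather than timeout, which helps separate the timeout behavior from the join behavior.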

m4dcoder commented on May 25, 2024

Also, you used <% completed() %> in the task transitions, which means the transition fires regardless of the status of the action execution. When the sleep action runs longer than 60 seconds, the action execution is returned as timeout. This timeout satisfies <% completed() %> and then triggers the join task. What if you change the task transitions from <% completed() %> to <% succeeded() %>?
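
Applied to the simple join workflow posted earlier, that change would look roughly like the sketch below. Only the transition conditions differ (the description text is made up for illustration); with <% succeeded() %>, a failed or timed-out task2 or task3 no longer satisfies its transition to task4:

```yaml
version: 1.0

description: A basic join workflow using succeeded() transitions.

tasks:
  task1:
    action: core.noop
    next:
      - when: <% succeeded() %>
        do: task2, task3

  task2:
    action: core.noop
    next:
      # Fires only if task2 succeeded; a failure or timeout does not match.
      - when: <% succeeded() %>
        do: task4

  task3:
    action: core.noop
    next:
      # Same condition on the second branch feeding the join.
      - when: <% succeeded() %>
        do: task4

  task4:
    join: all
    action: core.noop
```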

bishopbm1 commented on May 25, 2024

@m4dcoder I understand that there is a 60-second timeout, so the workflow times out waiting for the 100-second sleep. That is OK and expected. The issue only shows up if you execute all the workflows. Note that the master workflow orquesta-join.yaml and the screenshot that I posted use all the workflows that were provided.

This issue only seems to appear if you are using nested workflows that have a long-running task. It is expected that the one task times out, which should be read as a failure. However, the final finish task is not executed even with <% completed() %>, and the workflow status is failed even though the final task has do: noop, so the workflow should exit with success no matter what.

m4dcoder commented on May 25, 2024

@bishopbm1 I am able to reproduce the issue: the join task is not executed when task2 and task3 both fail, even though the transitions use <% completed() %>, which should fire on both success and failure. The issue is not present if task2 and task3 succeed.

ubuntu@cadmus:~/st2$ st2 execution get 5e67e41510939f018f19d6d3
id: 5e67e41510939f018f19d6d3
action.ref: sandbox.orquesta_join
parameters: None
status: failed (74s elapsed)
start_timestamp: Tue, 10 Mar 2020 19:01:41 UTC
end_timestamp: Tue, 10 Mar 2020 19:02:55 UTC
result: 
  errors:
  - message: Execution failed. See result for details.
    result:
      errors:
      - message: Execution failed. See result for details.
        result:
          errors: []
          output: null
        task_id: task2
        type: error
      output: null
    task_id: task2
    type: error
  - message: Execution failed. See result for details.
    result:
      errors:
      - message: Execution failed. See result for details.
        result:
          errors: []
          output: null
        task_id: task2
        type: error
      output: null
    task_id: task3
    type: error
  output: null
+------------------------------+------------------------+-------+--------------------+-----------------+
| id                           | status                 | task  | action             | start_timestamp |
+------------------------------+------------------------+-------+--------------------+-----------------+
|   5e67e4154f9d87277bbfea2c   | succeeded (1s elapsed) | task1 | core.noop          | Tue, 10 Mar     |
|                              |                        |       |                    | 2020 19:01:41   |
|                              |                        |       |                    | UTC             |
| + 5e67e4164f9d87277bbfea2f   | failed (72s elapsed)   | task2 | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                        |       | oin_2              | 2020 19:01:42   |
|                              |                        |       |                    | UTC             |
|    5e67e4184f9d87277bbfea38  | succeeded (1s elapsed) | task1 | core.echo          | Tue, 10 Mar     |
|                              |                        |       |                    | 2020 19:01:43   |
|                              |                        |       |                    | UTC             |
|  + 5e67e4194f9d87277bbfea3e  | failed (68s elapsed)   | task2 | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                        |       | oin_3              | 2020 19:01:45   |
|                              |                        |       |                    | UTC             |
|     5e67e41d4f9d87277bbfea44 | succeeded (1s elapsed) | task1 | core.echo          | Tue, 10 Mar     |
|                              |                        |       |                    | 2020 19:01:49   |
|                              |                        |       |                    | UTC             |
|     5e67e4204f9d87277bbfea4a | timeout (60s elapsed)  | task2 | core.local         | Tue, 10 Mar     |
|                              |                        |       |                    | 2020 19:01:52   |
|                              |                        |       |                    | UTC             |
| + 5e67e4164f9d87277bbfea32   | failed (73s elapsed)   | task3 | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                        |       | oin_4              | 2020 19:01:42   |
|                              |                        |       |                    | UTC             |
|    5e67e4174f9d87277bbfea36  | succeeded (1s elapsed) | task1 | core.echo          | Tue, 10 Mar     |
|                              |                        |       |                    | 2020 19:01:43   |
|                              |                        |       |                    | UTC             |
|  + 5e67e4194f9d87277bbfea3b  | failed (70s elapsed)   | task2 | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                        |       | oin_3              | 2020 19:01:45   |
|                              |                        |       |                    | UTC             |
|     5e67e41d4f9d87277bbfea41 | succeeded (1s elapsed) | task1 | core.echo          | Tue, 10 Mar     |
|                              |                        |       |                    | 2020 19:01:49   |
|                              |                        |       |                    | UTC             |
|     5e67e41f4f9d87277bbfea48 | timeout (61s elapsed)  | task2 | core.local         | Tue, 10 Mar     |
|                              |                        |       |                    | 2020 19:01:51   |
|                              |                        |       |                    | UTC             |
+------------------------------+------------------------+-------+--------------------+-----------------+

m4dcoder commented on May 25, 2024

Note that this issue only occurs when both branches fail. If one branch succeeds and the other fails, the join task is triggered.

ubuntu@cadmus:~/st2$ st2 execution get 5e67e63cfe5c0accaf6d4149
id: 5e67e63cfe5c0accaf6d4149
action.ref: sandbox.orquesta_join
parameters: None
status: succeeded (71s elapsed)
start_timestamp: Tue, 10 Mar 2020 19:10:52 UTC
end_timestamp: Tue, 10 Mar 2020 19:12:03 UTC
result: 
  output: null
+------------------------------+-------------------------+--------+--------------------+-----------------+
| id                           | status                  | task   | action             | start_timestamp |
+------------------------------+-------------------------+--------+--------------------+-----------------+
|   5e67e63ddee7484d1cd13a1b   | succeeded (1s elapsed)  | task1  | core.noop          | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:10:53   |
|                              |                         |        |                    | UTC             |
| + 5e67e63edee7484d1cd13a1e   | succeeded (17s elapsed) | task2  | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                         |        | oin_2              | 2020 19:10:54   |
|                              |                         |        |                    | UTC             |
|    5e67e63fdee7484d1cd13a24  | succeeded (1s elapsed)  | task1  | core.echo          | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:10:55   |
|                              |                         |        |                    | UTC             |
|  + 5e67e641dee7484d1cd13a2a  | succeeded (14s elapsed) | task2  | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                         |        | oin_3              | 2020 19:10:56   |
|                              |                         |        |                    | UTC             |
|     5e67e642dee7484d1cd13a30 | succeeded (1s elapsed)  | task1  | core.echo          | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:10:58   |
|                              |                         |        |                    | UTC             |
|     5e67e643dee7484d1cd13a36 | succeeded (11s elapsed) | task2  | core.local         | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:10:59   |
|                              |                         |        |                    | UTC             |
| + 5e67e63edee7484d1cd13a21   | failed (67s elapsed)    | task3  | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                         |        | oin_4              | 2020 19:10:54   |
|                              |                         |        |                    | UTC             |
|    5e67e63fdee7484d1cd13a27  | succeeded (1s elapsed)  | task1  | core.echo          | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:10:55   |
|                              |                         |        |                    | UTC             |
|  + 5e67e641dee7484d1cd13a2d  | failed (65s elapsed)    | task2  | sandbox.orquesta_j | Tue, 10 Mar     |
|                              |                         |        | oin_5              | 2020 19:10:57   |
|                              |                         |        |                    | UTC             |
|     5e67e642dee7484d1cd13a33 | succeeded (1s elapsed)  | task1  | core.echo          | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:10:58   |
|                              |                         |        |                    | UTC             |
|     5e67e644dee7484d1cd13a39 | timeout (60s elapsed)   | task2  | core.local         | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:11:00   |
|                              |                         |        |                    | UTC             |
|   5e67e682dee7484d1cd13a3c   | succeeded (0s elapsed)  | finish | core.echo          | Tue, 10 Mar     |
|                              |                         |        |                    | 2020 19:12:02   |
|                              |                         |        |                    | UTC             |
+------------------------------+-------------------------+--------+--------------------+-----------------+

m4dcoder commented on May 25, 2024

This issue will be fixed with #194.
