Git Product home page Git Product logo

kestra-io / kestra Goto Github PK

View Code? Open in Web Editor NEW
6.4K 60.0 349.0 35.15 MB

Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.

Home Page: https://kestra.io

License: Apache License 2.0

Java 75.98% HTML 0.06% JavaScript 3.65% Vue 18.72% Dockerfile 0.02% Shell 0.03% Batchfile 0.07% SCSS 0.80% CSS 0.05% PLpgSQL 0.49% Makefile 0.13%
workflow workflow-engine orchestration scheduler data-pipeline elt etl data data-engineering data-orchestration

kestra's People

Contributors

aliczin avatar anna-geller avatar ben8t avatar brahimalm avatar brian-mulier-p avatar corentinghigny avatar dependabot[bot] avatar edwardli-coder avatar eregnier avatar fhussonnois avatar kination avatar kriko avatar loicmathieu avatar matvey-ososkov avatar mengzhuo avatar npranav10 avatar nsteinmetz avatar olla-dev avatar sagunn-echo avatar shrutimantri avatar simonpic avatar skraye avatar smantri-moveworks avatar tanay1337 avatar tchiotludo avatar teqsk1514 avatar v1nc3n4 avatar yuri-lima avatar yuri1969 avatar yvrng avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

kestra's Issues

Update a flow and changing id or namespace failed

Changing the namespace or the id will led to a 404 on the update from the webui.
It's a normal behaviour on the FlowControler we ask and update on a flow that don't exists.

This led to 2 issues :

  • 404 are not catch and loading bar keep growing, this must be capture and displayed as error.
  • How can we rename a flow ? is this a possible action ?

Since there is a lot of "foreign keys" (see #60), we need to disabled the edition of namespace and flow id for now (waiting #60)

⭐ Allow users to rename Flow namespace & id

Since there is a lot of "foreign keys" on flow namespace + flow id, we need to implement a special method to move a flow and change the name.

The ui must be like on github with its "danger zone" (it will break all app using the flow).
The move method must changed the flow namespace & id, since the operation can be very long, it must be async.

Also, when a user want to change the namespace it encounters multiple “Page not found” before understanding that it wasn’t possible. We should at least improve the error message.

Handle large log output

Seems that with Kafka runner, some very long line will kill a consuming worker silently.
No error will be in logs and the consumer is remove for Kafka silently.

Really need to investigate

Display inputs on the general page

On the main page, add a inputs table below the actual with key / value of inputs.
If possible, add a download link on any inputs of type file

Command test don't stop the process at end of the flow

running a simple flow ./kestra test flow.yaml that open a connection to a server for example (keep resource open) will hang the command test (reproduced with SlackExecution command)

We need to find a way to close resource :

  • Keep resource open for a frequent tasks can lead to some performance improvements.
  • But that can lead to overflow the destination server with many connection open and never used, also that can lead to some exception when the server close the connection.

I think the proper way will to close all connection for now.

FlowableTask : handle Exception properly

For now, some exception can be throw during evaluation of any method of FlowableTask interface.
If this exception occured, it will break the flow and maybe break the application.

We need to properly capture these exception to be resilient in case of an invalid flow

hello test

Task create temporary file like that :

        // temp file
        File tempFile = File.createTempFile(this.getClass().getSimpleName().toLowerCase() + "_", ".csv");
        ObjectOutputStream output = new ObjectOutputStream(new FileOutputStream(tempFile));

It's mostly done for now to handle in order to forward to StorageInterface with a : runContext.putFile(tempFile).getUri()

We need to find a better way to have StorageObject directly without intermediate temporary files or at least delete the temporary files.

Capture full stacktrace to logs

For now only the message exception is saved on execution.
Capture recursively cause message and add the full stacktrace is TRACE log

This is a test issue to validate my self hosted instance

id: errors
namespace: org.floworc.tests

tasks:
- id: failed
  type: org.floworc.core.tasks.scripts.Bash
  commands:
  - 'exit 1'
  errors:
  - id: 2nd
    type: org.floworc.core.tasks.debugs.Echo
    format: second {{task.id}}

The errors will not be triggers, need to be global to work or on FlowableTask

Start from source

Hello~
Is there a guide to start working via source code? I couldn't find in document...

Thanks.

Log must be realtime

For now, log are emitted only at the end of the current task.
They must be real time and emitted as soon as possible to be displayed on the web ui

StorageObject have to many ///

Currently the uri generated by StorageInterface have to many slash, example :
floworc:////org/floworc/tests/inputs/executions/test/inputs/file/application.yml.

We need to have only : floworc:///org/floworc/tests/inputs/executions/test/inputs/file/application.yml. like that : scheme://{nohost}/org/floworc

final test

Allow to configure :

  • errors
  • timeout
  • retry

for :

  • global configuration
  • per tasks type configuration
  • per namespace
  • per flow

Inconsistent metric

In some metrics, namespace is in a label "namespace" (eg: kestra_org_kestra_core_tasks_debugs_return_duration_seconds_count) and in other, it is in the label "namespace_id" (eg: kestra_executor_execution_duration_seconds_count)

Every command on cli will generate temp folders

for every command on cli, this code from TestCommand will be run :

    @SuppressWarnings("unused")
    public static Map<String, Object> propertiesOverrides() {
        try {
            Path tempDirectory = Files.createTempDirectory(TestCommand.class.getSimpleName());

            return ImmutableMap.of(
                "kestra.repository.type", "memory",
                "kestra.queue.type", "memory",
                "kestra.storage.type", "local",
                "kestra.storage.local.base-path", tempDirectory.toAbsolutePath().toString()
            );
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

and will create a temp folders, must be created only on test command

Trigger execution from webui don't display the execution when some lag

When we create the execution from the webui, the execution is created and return directly.
But the indexer could not have the time to index the execution on the backend.
This lead by a 404 on /api/v1/executions/{{executionId}} and the ui freeze here.

Difficult use case here, I think the better will be to use only the execution return during the creation and reach the updated information from the listen url (without trying to reach the url below).
This will lead to no dependency on indexer on this part of the webui.

Cleanup RunContext

Refactor the RunContext, the object is mutable right now and will not be usable in whole case.
For example : ApplicationContext is need for some method and can be null.

Maybe implement a WorkerRunContext that extends RunContext

Execution status randomly stuck in "created" state just after a flow is triggered

After triggering a flow, the Ui calls 2 http routes, first executions/{executionId}/trigger then executions/{executionId}/follow

This problem occurs sometimes when an execution is so fast that the second call (follow) is done when the execution is already terminated.

Currently the follow route, does not handle this case and always wait for the passed execution to terminate ...

What must be done :

  • Update the follow route to handle this case by checking the execution state and returning without waiting ...

Just curious about choice of language

Hi,

I wanted to reach out over email, but your site is under development. I'm curious about one thing -

Why are most of the orchestration engines I see on GitHub built using Java? This seems to be a common trait among them.

Could you please detail some of the reasons why you chose Java over, say, go or python? Is it just that you are more familiar with the language?

API route for Flow revisions

FlowRepository implements a method :

    List<Flow> findRevisions(String namespace, String id);

It would be great to be able to use it 😄 (I need to find a culprit for wasting 3 hours of my life - might be me, probably me, but it least I'll be able to sleep on both ears)

Prerequisite for #62

Add "restartable" property to TaskRun

Add a "restartable" property to TaskRun.

Currently the fact that a taskRun can be restarted for an execution is computed on Ui side ...

Ui Side : See 'Restart' component

Switch task with no possible case throw a NPE

If there isn't any possibility in the Switch FlowableTask, the task throw an NPE, and the task remain "running"

2019-12-26 15:12:01,892 INFO  mory-queue-4 l.parent-seq [execution: 3Pf6uUWsqTo2jjynz1T9As] [taskrun: 2M4fE75in0WRObmtLX8lTJ] Task parent-seq (type: Switch) started
Exception in thread "memory-queue-7" java.lang.NullPointerException
	at org.kestra.core.runners.FlowableUtils.resolveState(FlowableUtils.java:76)
	at org.kestra.core.tasks.flows.Switch.resolveState(Switch.java:58)
	at org.kestra.core.runners.AbstractExecutor.childWorkerTaskResult(AbstractExecutor.java:60)
	at org.kestra.runner.memory.MemoryExecutor.lambda$handlChild$5(MemoryExecutor.java:133)
	at java.base/java.util.stream.ReferencePipeline$11$1.accept(ReferencePipeline.java:441)
	at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
	at org.kestra.runner.memory.MemoryExecutor.handlChild(MemoryExecutor.java:136)
	at org.kestra.runner.memory.MemoryExecutor.lambda$run$1(MemoryExecutor.java:59)
	at org.kestra.runner.memory.MemoryQueue.lambda$emit$0(MemoryQueue.java:45)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

This must failed the task properly

Uniformise version number in the whole app

Actually, some files have version hardcoded :

  • org/kestra/webserver/Application.java
  • org/kestra/cli/App.java
  • package.json

We need to find a way to only rely on gradle.properties

Display revision on flow

Display a new tab on flow with a list of revision.
On click one revision, display a popup with comparison between previous & selected version.
Add a select on top in order to allow user to change revision for comparison.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.