
suryatmodulus / kestra


This project forked from kestra-io/kestra


Kestra is an infinitely scalable open source orchestration and scheduling platform for creating, running, scheduling, and monitoring millions of complex pipelines.

Home Page: https://kestra.io

License: Apache License 2.0

Java 81.33% HTML 0.07% JavaScript 3.57% Vue 13.82% Dockerfile 0.03% Shell 0.04% Batchfile 0.07% Handlebars 0.18% SCSS 0.74% Python 0.05% CSS 0.10%

kestra's Introduction

Kestra workflow orchestrator

Infinitely scalable open source orchestration & scheduling platform.


Website | Twitter | LinkedIn | Discord | Documentation



Demo

Play with our demo app!

What is Kestra?

Kestra is an infinitely scalable orchestration and scheduling platform for creating, running, scheduling, and monitoring millions of complex pipelines.

  • 🔀 Any kind of workflow: Workflows can start simple and grow into more complex systems with branching, parallel and dynamic tasks, and flow dependencies.
  • 🎓 Easy to learn: Flows are defined in a simple, descriptive YAML language; you don't need to be a developer to create a new flow.
  • 🔣 Easy to extend: Plugins are everywhere in Kestra. Many are available from the Kestra core team, and you can easily create your own.
  • 🆙 Any trigger: Kestra is event-based at heart; you can trigger an execution from the API, a schedule, detection, or events.
  • 💻 A rich user interface: The built-in web interface lets you create, run, and monitor all your flows. There is no need to deploy your flows; just edit them.
  • Enjoy infinite scalability: Kestra is built around top cloud native technologies, so you can scale to millions of executions stress-free.

Example flow:

id: my-first-flow
namespace: my.company.teams

inputs:
  - type: FILE
    name: uploaded
    description: A CSV file to be uploaded through the API or UI

tasks:
  - id: archive
    type: io.kestra.plugin.gcp.gcs.Upload
    description: Archive the file on Google Cloud Storage bucket
    from: "{{ inputs.uploaded }}"
    to: "gs://my_bucket/archives/{{ execution.id }}.csv"

  - id: csvReader
    type: io.kestra.plugin.serdes.csv.CsvReader
    from: "{{ inputs.uploaded }}"

  - id: fileTransform
    type: io.kestra.plugin.scripts.nashorn.FileTransform
    description: This task anonymizes the contactName with a custom Nashorn script (JavaScript on the JVM). It shows that you can handle custom transformations or remapping the ETL way.
    from: "{{ outputs.csvReader.uri }}"
    script: |
      if (row['contactName']) {
        row['contactName'] = "*".repeat(row['contactName'].length);
      }

  - id: avroWriter
    type: io.kestra.plugin.serdes.avro.AvroWriter
    description: This task converts the file from Kestra internal storage to Avro. Again, this is ETL, since Kestra does the conversion before loading the data into BigQuery. This gives you some control before loading and lets you reject wrong data as soon as possible.
    from: "{{ outputs.fileTransform.uri }}"
    schema: |
      {
        "type": "record",
        "name": "Root",
        "fields":
          [
            { "name": "contactTitle", "type": ["null", "string"] },
            { "name": "postalCode", "type": ["null", "long"] },
            { "name": "entityId", "type": ["null", "long"] },
            { "name": "country", "type": ["null", "string"] },
            { "name": "region", "type": ["null", "string"] },
            { "name": "address", "type": ["null", "string"] },
            { "name": "fax", "type": ["null", "string"] },
            { "name": "email", "type": ["null", "string"] },
            { "name": "mobile", "type": ["null", "string"] },
            { "name": "companyName", "type": ["null", "string"] },
            { "name": "contactName", "type": ["null", "string"] },
            { "name": "phone", "type": ["null", "string"] },
            { "name": "city", "type": ["null", "string"] }
          ]
      }

  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    description: Simply load the file generated by the avroWriter task into BigQuery
    avroOptions:
      useAvroLogicalTypes: true
    destinationTable: kestra-prd.demo.customer_copy
    format: AVRO
    from: "{{ outputs.avroWriter.uri }}"
    writeDisposition: WRITE_TRUNCATE

  - id: aggregate
    type: io.kestra.plugin.gcp.bigquery.Query
    description: Aggregate some data from loaded files
    createDisposition: CREATE_IF_NEEDED
    destinationTable: kestra-prd.demo.agg
    sql: |
      SELECT k.categoryName, p.productName, c.companyName, s.orderDate, SUM(d.quantity) AS quantity, SUM(d.unitPrice * d.quantity * r.exchange) as totalEur
      FROM `kestra-prd.demo.salesOrder` AS s
      INNER JOIN `kestra-prd.demo.orderDetail` AS d ON s.entityId = d.orderId
      INNER JOIN `kestra-prd.demo.customer` AS c ON c.entityId = s.customerId
      INNER JOIN `kestra-prd.demo.product` AS p ON p.entityId = d.productId
      INNER JOIN `kestra-prd.demo.category` AS k ON k.entityId = p.categoryId
      INNER JOIN `kestra-prd.demo.rates` AS r ON r.date = DATE(s.orderDate) AND r.currency = "USD"
      GROUP BY 1, 2, 3, 4
    timePartitioningField: orderDate
    writeDisposition: WRITE_TRUNCATE
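
The fileTransform and avroWriter steps above can be approximated outside Kestra to see what "reject wrong data as soon as possible" means in practice. The following is only a minimal Python sketch, not how Kestra implements these tasks: the function names `mask_contact_name` and `validate` and the `AVRO_TO_PYTHON` mapping are hypothetical, and only three fields of the schema are copied. It masks `contactName` the way the script does, then checks each value against the field's `["null", ...]` union.

```python
import json

# Subset of the Avro schema from the flow above (three fields copied as-is).
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Root",
  "fields": [
    { "name": "postalCode", "type": ["null", "long"] },
    { "name": "entityId", "type": ["null", "long"] },
    { "name": "contactName", "type": ["null", "string"] }
  ]
}
""")

# Hypothetical mapping from Avro primitive names to Python types.
# Simplification: an Avro "long" is 64-bit, while a Python int is unbounded.
# Only the two primitives used by the schema above are covered.
AVRO_TO_PYTHON = {"string": str, "long": int}


def mask_contact_name(row):
    """Mirror the fileTransform script: replace every character with '*'."""
    if row.get("contactName"):
        row["contactName"] = "*" * len(row["contactName"])
    return row


def validate(row, schema=SCHEMA):
    """Return the names of fields whose value does not fit the union type."""
    bad = []
    for field in schema["fields"]:
        value = row.get(field["name"])
        allowed = field["type"]  # for example ["null", "long"]
        if value is None:
            ok = "null" in allowed
        else:
            # bool is a subclass of int in Python, so exclude it explicitly.
            ok = any(
                isinstance(value, AVRO_TO_PYTHON[t]) and not isinstance(value, bool)
                for t in allowed
                if t != "null"
            )
        if not ok:
            bad.append(field["name"])
    return bad
```

For example, `validate({"postalCode": "75001"})` reports `postalCode`, because the schema declares it as a `long` and the CSV value arrives as a string. This is the kind of row you would want to convert or reject before the BigQuery load.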

Getting Started

To get a local copy up and running, please follow these simple steps.

Prerequisites

Make sure you have already installed Docker and Docker Compose, which the launch steps rely on.

Launch Kestra

  • Download the compose file and save it as docker-compose.yml. On Linux and macOS, you can run wget https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
  • Run docker-compose pull
  • Run docker-compose up -d
  • Open http://localhost:8080 in your browser
  • Follow this tutorial to create your first flow.
  • Read the documentation to understand how to
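
After `docker-compose up -d` returns, the containers may still need some time before the UI answers on http://localhost:8080. The following is a minimal Python sketch, not part of Kestra, that polls the URL until it responds. The helper names `http_probe` and `wait_until_up`, the retry count, and the delay are all assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def wait_until_up(url, attempts=30, delay=2.0, probe=http_probe, sleep=time.sleep):
    """Poll `url` up to `attempts` times.

    Returns the 1-based attempt number that succeeded, or None if the
    service never answered. `probe` and `sleep` are injectable so the
    loop can be exercised without a network.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

Calling `wait_until_up("http://localhost:8080")` after starting the stack blocks for at most about a minute with these defaults, and returns `None` if nothing answered in that time.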

Plugins

Kestra is built on a plugin system. You can find a plugin to interact with your provider, or follow a few simple steps to develop your own. These are the official plugins available:

Amazon S3 | Avro | Bash
BigQuery | Cassandra | CSV
ClickHouse | Debezium MySQL | Debezium Postgres
Debezium Microsoft SQL Server | DBT | Elasticsearch
Email | Google Cloud Storage | Google Drive
Google Sheets | Groovy | HTTP
JSON | Jython | Kafka
Kubernetes | Microsoft SQL Server | MongoDB
MySQL | Nashorn | Node
OpenPGP | Oracle | Postgres
Python | Redshift | Snowflake
SFTP | Singer | Slack
Spark | Vectorwise | Vertica
XML

This list is growing quickly as we are actively building more plugins, and we welcome contributions!

Community Support

Join our community if you need help, want to chat or have any other questions for us:

  • GitHub - Discussion forums and updates from the Kestra team
  • Twitter - For all the latest Kestra news
  • Discord - Join the conversation! Get all the latest updates and chat to the devs

Roadmap

See the open issues for a list of proposed features (and known issues) or look at the project board.

Developing locally & Contributing

We love contributions, big or small. Check out our guide on how to get started.

See our Plugin Developer Guide for developing Kestra plugins.

License

Apache 2.0 © Kestra Technologies

kestra's People

Contributors

aurelienwls, brahimalm, corentinghigny, dependabot[bot], eregnier, fdelbrayelle, kination, loris-intergalactique, movaid7, tchiotludo, v1nc3n4, yuri1969
