flyte's Introduction

Flyte is a container-native, type-safe workflow and pipeline platform, written in Golang and optimized for large-scale processing and machine learning. Workflows can be written in any language, with out-of-the-box support for Python.

Homepage

https://flyte.org
Docs: https://lyft.github.io/flyte

Introduction

Flyte is a fabric that connects disparate computation backends using a type-safe data dependency graph. It records all changes to a pipeline, making it possible to rewind time. It also stores a history of all executions and provides an intuitive UI, CLI, and REST/gRPC API to interact with the computation.

Flyte is more than a workflow engine. It provides workflows as a core concept, but it also provides a single unit of execution, the task, as a top-level concept. Multiple tasks arranged in data producer-consumer order create a workflow. Flyte workflows are pure specification and can be created using any language. Every task can also be written in any language. We provide first-class support for Python, making Flyte well suited to modern machine learning and data processing pipelines.
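To make the task and workflow relationship concrete, here is a minimal sketch in plain Python. It is not the flytekit API: `Task`, `Workflow`, `connect`, and `run` are hypothetical names invented for illustration, and the real platform compiles workflows into a language-neutral specification rather than running them in-process. The sketch only shows the idea that each task declares typed inputs and outputs, and that wiring a producer to a consumer is checked when the workflow is declared.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Task:
    """Hypothetical unit of execution with a typed interface (not flytekit)."""
    name: str
    inputs: Dict[str, type]
    outputs: Dict[str, type]
    fn: Callable[..., dict]


@dataclass
class Workflow:
    """Hypothetical workflow: tasks wired together by data dependencies."""
    name: str
    tasks: List[Task] = field(default_factory=list)
    # Each edge is (producer task, output name, consumer task, input name).
    edges: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def add(self, task: Task) -> Task:
        self.tasks.append(task)
        return task

    def connect(self, producer: Task, out_name: str,
                consumer: Task, in_name: str) -> None:
        # The type check happens here, at declaration time, before anything runs.
        out_type = producer.outputs[out_name]
        in_type = consumer.inputs[in_name]
        if out_type is not in_type:
            raise TypeError(
                f"{producer.name}.{out_name} is {out_type.__name__}, "
                f"but {consumer.name}.{in_name} expects {in_type.__name__}"
            )
        self.edges.append((producer.name, out_name, consumer.name, in_name))

    def run(self, **workflow_inputs) -> Dict[str, dict]:
        # Simplification: assumes tasks were added in dependency order.
        results: Dict[str, dict] = {}
        for task in self.tasks:
            kwargs = {}
            for prod, out_name, cons, in_name in self.edges:
                if cons == task.name:
                    kwargs[in_name] = results[prod][out_name]
            # Inputs not bound to an upstream task come from the workflow inputs.
            for in_name in task.inputs:
                if in_name not in kwargs:
                    kwargs[in_name] = workflow_inputs[in_name]
            results[task.name] = task.fn(**kwargs)
        return results


load = Task("load", {"n": int}, {"values": list},
            lambda n: {"values": list(range(n))})
total = Task("total", {"values": list}, {"sum": int},
             lambda values: {"sum": sum(values)})
shout = Task("shout", {"text": str}, {"out": str},
             lambda text: {"out": text.upper()})

wf = Workflow("demo")
wf.add(load)
wf.add(total)
wf.connect(load, "values", total, "values")
results = wf.run(n=4)
print(results["total"]["sum"])  # 0 + 1 + 2 + 3 = 6
```

Calling `wf.connect(total, "sum", shout, "text")` raises `TypeError` immediately, because an `int` output cannot feed a `str` input. The mismatch is caught while the graph is being declared, not while it is running.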

Resources

Resources that would help you get a better understanding of Flyte.

Conference Talks

  • KubeCon 2019 - Flyte: Cloud Native Machine Learning and Data Processing Platform video | deck
  • KubeCon 2019 - Running Large-Scale Stateful workloads on Kubernetes at Lyft video
  • re:Invent 2019 - Implementing ML workflows with Kubernetes and Amazon SageMaker video
  • Cloud-native machine learning at Lyft with AWS Batch and Amazon EKS video

Blog Posts

  1. Introducing Flyte: A Cloud Native Machine Learning and Data Processing Platform

Podcasts

Features

  • Used at scale in production by 500+ users at Lyft, with more than 900k workflows executed per month and more than 30 million container executions per month
  • Centralized Inventory of Tasks, Workflows and Executions
  • Single Task Execution support - Start executing a task and then convert it to a workflow
  • gRPC / REST interface to define and execute tasks and workflows
  • Type-safe construction of pipelines: each task has an interface characterized by its inputs and outputs, so illegal construction of pipelines fails during declaration rather than at runtime
  • Types that help in creating machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, and maps
  • Memoization and Lineage tracking
  • Workflow features
  • Multiple Schedules for every workflow
  • Parallel step execution
  • Extensible Backend to add customized plugin experiences
  • Arbitrary container execution
  • Branching
  • Inline Subworkflows (a workflow can be embedded within one node of the top-level workflow)
  • Distributed Remote Child workflows (a remote workflow can be triggered and statically verified at compile time)
  • Array Tasks (map some function over a large dataset, controlled execution of thousands of containers)
  • Dynamic Workflow creation and execution - with runtime type safety
  • Container-side plugins with first-class support in Python
  • Maintain an inventory of tasks and workflows
  • Record the history of all executions; executions (as long as they follow convention) are completely repeatable
  • Multi Cloud support (AWS, GCP and others)
  • Extensible core
  • Modularized
  • Automated notifications to Slack, Email, Pagerduty
  • Deep observability
  • Multi K8s cluster support
  • Comes with many systems supported out of the box on K8s, such as Spark
  • Snappy Console
  • Python CLI
  • Written in Golang and optimized for performance
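
The memoization feature listed above can be pictured with a small sketch. This is an assumption-laden illustration in plain Python, not Flyte's implementation: `MemoizingRunner`, `cache_key`, and the `version` argument are hypothetical names, and the cache here is an in-memory dictionary. The idea shown is that a task result can be reused when the task identity, its version, and its inputs are all unchanged.

```python
import hashlib
import json


class MemoizingRunner:
    """Hypothetical runner that reuses results for identical task invocations."""

    def __init__(self):
        self._cache = {}
        self.calls = 0  # how many times a task body actually executed

    @staticmethod
    def cache_key(task_name, version, inputs):
        # Assumes inputs are JSON-serializable; sort_keys makes the key
        # independent of keyword-argument order.
        payload = json.dumps(
            {"task": task_name, "version": version, "inputs": inputs},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, task_name, version, fn, **inputs):
        key = self.cache_key(task_name, version, inputs)
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = fn(**inputs)
        return self._cache[key]


def square(x):
    return x * x


runner = MemoizingRunner()
first = runner.run("square", "v1", square, x=3)
second = runner.run("square", "v1", square, x=3)  # served from the cache
third = runner.run("square", "v2", square, x=3)   # new version, runs again
print(first, second, third, runner.calls)
```

Including a version in the key means that changing a task's code (and bumping its version) invalidates old results, while repeated runs of an unchanged task with the same inputs skip execution.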

Coming Soon

  • Reactive pipelines
  • Golang CLI
  • Grafana templates (user/system observability)
  • Arbitrary flytekit-less container support
  • More integrations

Available Plugins

  • Containers
  • K8s Pods
  • AWS Batch Arrays
  • K8s Pod arrays
  • K8s Spark (native PySpark and Java/Scala)
  • Qubole Hive
  • Presto Queries
  • PyTorch Operator

Coming soon

  • Sagemaker
  • Flink-K8s

Current Usage

  • Lyft Rideshare
  • Lyft L5 autonomous
  • Juno

Changelogs

Biweekly Community Sync

  • Starting April 21, 2020, the Flyte community meets every other Tuesday at 9:00 AM PST (US West Coast time).
  • You can join the Google Meet.
  • Meeting notes are captured in Doc
  • Demo Signup Sheet

Component Repos

Repo            Language        Purpose
flyte           Kustomize, RST  deployment, documentation, issues
flyteidl        Protobuf        interface definitions
flytepropeller  Go              execution engine
flyteadmin      Go              control plane
flytekit        Python          Python SDK and tools
flyteconsole    TypeScript      admin console
datacatalog     Go              manage input & output artifacts
flyteplugins    Go              Flyte plugins
flytestdlib     Go              standard library
flytesnacks     Python          examples, tips, and tricks

Production K8s Operators

Repo   Language  Purpose
Spark  Go        Apache Spark batch
Flink  Go        Apache Flink streaming
