Orchest



Version 0.1.0

Orchest is a web-based data science tool that works on top of your filesystem, allowing you to use your editor of choice. With Orchest you get to focus on visually building and iterating on your pipeline ideas. Under the hood, Orchest runs a collection of containers to provide a scalable platform that can run on your laptop as well as on a large-scale cloud cluster.

Orchest lets you

  • Interactively build data science pipelines through its visual interface.
  • Automatically run your pipelines in parallel.
  • Develop your code in your favorite editor. Everything is filesystem based.
  • Tag the notebook cells you want to skip when running a pipeline. Perfect for prototyping, as you do not have to maintain a perfectly clean notebook.
  • Run experiments by parametrizing your pipeline. Easily try out all of your modeling ideas.
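To make the idea of running pipeline steps in parallel more concrete, here is a minimal sketch. It is not Orchest's scheduler: the step names, the `parallel_batches` and `run_pipeline` helpers, and the use of a thread pool are all assumptions made for illustration. It groups steps into batches whose dependencies are already satisfied and runs each batch concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_batches(dependencies):
    """Group steps into batches that could run at the same time.

    `dependencies` maps each step name to the set of steps it depends on.
    (Hypothetical representation, not Orchest's pipeline format.)
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        # Every step returned here has all of its dependencies finished.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


def run_pipeline(dependencies, run_step, max_workers=4):
    """Run each batch concurrently; return the steps in completion order by batch."""
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in parallel_batches(dependencies):
            # list() waits for the whole batch before the next one starts.
            list(pool.map(run_step, batch))
            finished.extend(batch)
    return finished


if __name__ == "__main__":
    # Hypothetical pipeline: "plot" and "train" both depend only on "clean".
    example = {
        "clean": {"load"},
        "train": {"clean"},
        "plot": {"clean"},
        "evaluate": {"train"},
    }
    print(parallel_batches(example))
    # [['load'], ['clean'], ['plot', 'train'], ['evaluate']]
```

In this sketch, "plot" and "train" end up in the same batch because neither depends on the other, which is the situation where running in parallel pays off.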

Installation

Requirements

  • Docker (tested on 19.03.9)

Linux/macOS/Windows (through WSL 2)

git clone https://github.com/orchest/orchest.git
cd orchest
./orchest.sh start

Note: on Windows, Docker should be configured to use WSL 2. Make sure you clone the repository inside the Linux environment. More information about Docker and WSL 2 can be found here: https://docs.docker.com/docker-for-windows/wsl/.

Quickstart

Please refer to our docs for a more comprehensive quickstart tutorial.

Build your pipeline.

Each pipeline step executes a file (.ipynb, .py, .R, .sh) in a containerized environment.
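As a rough illustration of how one step could map to one file of a supported type, the sketch below picks a command line based on the file extension. The `COMMANDS` table and the `command_for` helper are hypothetical, and the sketch builds a command for the local machine only; it does not show how Orchest actually launches steps inside containers.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a local command prefix.
# This is an assumption for illustration, not Orchest's own configuration.
COMMANDS = {
    ".py": ["python"],
    ".R": ["Rscript"],
    ".sh": ["sh"],
    ".ipynb": ["jupyter", "nbconvert", "--to", "notebook", "--execute"],
}


def command_for(step_file):
    """Return the command (as an argument list) that would execute a step file."""
    suffix = Path(step_file).suffix
    if suffix not in COMMANDS:
        raise ValueError(f"unsupported step file type: {suffix!r}")
    return COMMANDS[suffix] + [str(step_file)]


if __name__ == "__main__":
    for name in ["load.py", "clean.R", "train.ipynb", "deploy.sh"]:
        print(name, "->", " ".join(command_for(name)))
```

Returning an argument list rather than a single string keeps the result safe to pass to `subprocess.run` without involving a shell.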

Write your code.

Iteratively edit and run your code for each pipeline step with an interactive JupyterLab session.

Run your pipeline and see the results come in.

Outputs (both stdout and stderr) are directly viewable and stored on disk.
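The idea of keeping a step's stdout and stderr on disk can be sketched with the standard library alone. The `run_and_log` helper and the log path below are assumptions for illustration; where and how Orchest stores its logs is not specified here.

```python
import subprocess
import sys
from pathlib import Path


def run_and_log(command, log_path):
    """Run `command` and write both stdout and stderr to `log_path`.

    Returns the exit code of the command. (Hypothetical helper, not part
    of Orchest.)
    """
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w") as log_file:
        # stderr=subprocess.STDOUT merges the error stream into the same file.
        result = subprocess.run(command, stdout=log_file, stderr=subprocess.STDOUT)
    return result.returncode


if __name__ == "__main__":
    # A stand-in "step" that writes to both streams.
    step = [
        sys.executable,
        "-c",
        "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
    ]
    code = run_and_log(step, "logs/example-step.log")
    print("exit code:", code)
    print(Path("logs/example-step.log").read_text())
```

Because the two streams are buffered differently by the child process, the relative order of stdout and stderr lines in the file is not guaranteed.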

License

The software in this repository is licensed as follows:

  • All content residing under the "orchest-sdk/" directory of this repository is licensed under the "Apache-2.0" license as defined in "orchest-sdk/LICENSE".
  • Content outside of the above-mentioned directory is available under the "AGPL-3.0" license.

Contributing

Contributions are more than welcome! Please see our contributor guides for more details.

We love your feedback

We would love to hear what you think and potentially add features based on your ideas. Come chat with us on Slack.

Contributors

yannickperrenet, ricklamers, samkovaly
