
pipertool's Introduction


Piper is an open-source platform for prototyping data science and machine learning projects, so you can concentrate on your goals. Key features:

  1. Simple Python context experience. Helps you create and deploy pipelines without depending on any proprietary online services.
  2. Connect modules into a pipeline and run it via Docker or a virtual environment. Then build the whole infrastructure using venv, Docker, or the cloud.
  3. Reduces routine and repetitive tasks and speeds up the path from idea to production.
  4. Well-tested and reproducible. Easily extendable with your own Executor.

Piper aims to help data scientists and machine learning developers create and build the full infrastructure for their projects.

Contents

How Piper works

Quick start

Quick start: pipertool package, compose env

In the project root directory, run the following in a terminal:

  • sudo -u root /bin/bash
  • create and activate a venv
  • pip install -r requirements.txt
  • in configuration.py, set the path to the correct new directory
  • python setup.py install
  • piper --env-type compose start
  • once started, the services are available at:
      • 0.0.0.0:7585 - FastAPI
      • 0.0.0.0:9001 - Milvus Console (minioadmin/minioadmin)
  • piper --env-type compose stop
  • pip uninstall piper
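
After the start step above, it can be handy to confirm that the two listed ports actually accept connections. The following is a minimal sketch using only the Python standard library; the helper names `is_port_open` and `report` are hypothetical and not part of Piper, and the check only tests TCP reachability on the ports from the list, not whether the services behind them are healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Ports and labels as given in the quick start list above.
SERVICES = {"FastAPI": 7585, "Milvus Console": 9001}


def report(host: str = "127.0.0.1") -> dict:
    """Map each service label to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in report().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

The sketch probes 127.0.0.1 by default, on the assumption that the compose environment runs on the local machine; pass a different host to `report` otherwise.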

Quick start: running main.py directly, compose env

In the project root directory, run the following in a terminal:

  • sudo -u root /bin/bash
  • create and activate a venv
  • pip install -r requirements.txt
  • in configuration.py, set the path to the correct new directory
  • python main.py
  • press CTRL+C to stop the compose env

Installation

pip (PyPI)

Comparison to related technologies

  1. Jupyter - The de facto experimental environment for most data scientists. However, it is best suited to writing experimental code.
  2. Data engineering tools such as Airflow or Luigi - These are very popular tools for building ML pipelines. Airflow can be connected to a Kubernetes cluster or run tasks through a simple PythonOperator. The downside is that their functionality generally stops there: they do not provide ML modules out of the box. Moreover, everything you develop still has to be wrapped in a scheduler, which is not always a trivial task. Still, we like them, and we use Airflow and Luigi as possible contexts for executors.
  3. Azure ML / Amazon SageMaker / Google Cloud - Cloud platforms do let you assemble an entire system from ready-made modules and put it into operation relatively quickly. The downsides are high cost, lock-in to a specific cloud, and limited customization for specific business needs. For a large business, building the ML infrastructure in the cloud is the most logical option. We also support cloud options as possible targets for the deployment step.
  4. DataRobot / Baseten - They offer an interesting but small set of ready-made modules. In Baseten, however, all integration is assumed to happen in a Kubernetes cluster, which is not always convenient or necessary for a proof of concept. Such companies generally either do not provide an open-source framework or provide a very limited set of modules for experiments, which restricts the freedom, functionality, and applicability of these platforms. Piper, by contrast, provides an open-source framework in which you can build a truly customized pipeline from many modules. This is partly similar to the hub of models and datasets in Hugging Face.
  5. MLflow / DVC - There are also many excellent projects for tracking experiments and for serving and storing machine learning models. But they are more utilitarian and do not directly help with accelerating the construction of a machine learning MVP. We plan to add integrations between Piper and the most popular frameworks used by DS and ML specialists.

Contributing

Contributions are welcome! Please see our Contributing Guide for more details. Thanks to all our contributors!

Contributors

Mailing List

Copyright

This project is distributed under the Apache license version 2.0 (see the LICENSE file in the project root).

By submitting a pull request to this project, you agree to license your contribution under the Apache license version 2.0 to this project.

pipertool's People

Contributors

amprix, artemsmeta, georgekontsevik, laogunz, sokolegg


pipertool's Issues

redundant code in setup.py

I tried to understand some code in this project, and one line seems strange:

    dependency_links = [x.strip().replace('git+', '') for x in all_reqs if 'git+' not in x]

I think the x.strip().replace('git+', '') part is redundant and should be replaced by x.strip(), because the filter already excludes every entry that contains 'git+', so the replace call never has anything to remove.
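
The claim in this issue can be checked with a small standalone sketch. The function names `original` and `simplified` and the sample requirement lines are hypothetical, made up for illustration; only the list comprehension itself comes from the issue text.

```python
def original(all_reqs):
    # Expression as quoted in the issue (stray line continuation removed).
    return [x.strip().replace('git+', '') for x in all_reqs if 'git+' not in x]


def simplified(all_reqs):
    # Same filter, without the replace() call.
    return [x.strip() for x in all_reqs if 'git+' not in x]


# Hypothetical requirement lines, including one git+ entry.
sample = [
    "numpy>=1.20\n",
    "  git+https://example.com/org/repo.git#egg=repo\n",
    "fastapi\n",
]

assert original(sample) == simplified(sample)
print(simplified(sample))  # ['numpy>=1.20', 'fastapi']
```

Because every element that reaches the expression has already passed the `'git+' not in x` test, `replace('git+', '')` cannot change it, so both versions return the same list for any input.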
