gokart

Gokart addresses reproducibility, task dependencies, good-code constraints, and ease of use for machine learning pipelines.

Documentation for the latest release is hosted on readthedocs.

About gokart

Here are some good things about gokart.

  • The following metadata for each Task is stored separately in a pkl file with a hash value:
    • task output data
    • versions of all imported modules
    • task processing time
    • random seed used in the task
    • displayed log
    • all parameters set as class variables in the task
  • Automatically reruns the pipeline if the parameters of Tasks change.
  • Supports GCS and S3 as data stores for intermediate results of Tasks in the pipeline.
  • Task output is exchanged between tasks as intermediate files, which is memory-friendly.
  • pandas.DataFrame type and column checking during I/O.
  • The directory structure of saved files is determined automatically from the structure of the script.
  • Seeds for numpy and random are fixed automatically.
  • Lets you write code that adheres to SOLID principles as far as possible.
  • Tasks are locked via Redis even when they run in parallel.

All the features above were built for constructing machine learning batch jobs. Together they provide an excellent environment for reproducibility and team development.
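The hash-keyed pkl files and the automatic rerun on parameter changes can be illustrated with a small standard-library sketch. This is not gokart's implementation: `param_hash`, `run_cached`, and `square_sum` are hypothetical names, and the file-naming scheme is an assumption made only for illustration.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def param_hash(task_name, params):
    """Return a stable hex digest for a task name and its parameters."""
    payload = json.dumps({"task": task_name, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(task_name, params, compute, workdir):
    """Call compute(**params) unless a pickle for this exact hash exists.

    Returns (result, ran); ran is True only when compute was actually called.
    """
    path = Path(workdir) / f"{task_name}_{param_hash(task_name, params)}.pkl"
    if path.exists():
        # Same task and parameters as before: reuse the stored output.
        with path.open("rb") as f:
            return pickle.load(f), False
    # New or changed parameters: recompute and store under the new hash.
    result = compute(**params)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result, True


def square_sum(n):
    """Toy computation standing in for a task's run step."""
    return sum(i * i for i in range(n))


workdir = tempfile.mkdtemp()
first = run_cached("SquareSum", {"n": 4}, square_sum, workdir)    # computed
second = run_cached("SquareSum", {"n": 4}, square_sum, workdir)   # loaded from pkl
changed = run_cached("SquareSum", {"n": 5}, square_sum, workdir)  # recomputed
```

Because the file name depends on the parameters, changing `n` produces a different path, so the computation runs again while the earlier result stays on disk.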

Here are some non-goals and downsides of gokart.

  • Batch execution in parallel is supported, but parallel and concurrent execution of tasks in memory is not.
  • Gokart focuses on reproducibility, so I/O and data storage capacity can become bottlenecks.
  • No support for task visualization.
  • Gokart is not an experiment management tool. Management of execution results is split out into Thunderbolt.
  • Gokart does not recommend writing pipelines in TOML, YAML, JSON, or similar formats. It prefers that they be written in Python.

Getting Started

Within the activated Python environment, use the following command to install gokart.

pip install gokart

Quickstart

A minimal gokart task looks something like this:

import gokart

class Example(gokart.TaskOnKart):
    def run(self):
        self.dump('Hello, world!')

task = Example()
output = gokart.build(task)
print(output)

gokart.build returns the value that the gokart.TaskOnKart dumped. The example outputs the following.

Hello, world!

This introduces only part of gokart. There are many more useful features.

Please see the documentation.

Have a good gokart life.

Achievements

Gokart is a proven product.

Thanks

gokart is a wrapper for luigi. Thanks to luigi and dependent projects!
