nitishkthakur / automlib

AutoML Python package of ML models with built-in hyperparameter optimization and an easy-to-use API.

License: MIT License

automlib

A Python package of ML models with built-in hyperparameter optimization and an easy-to-use API.

Hyperparameter optimization is one of the most important steps in any ML pipeline. Often, it is accomplished using brute-force methods like Grid Search and Random Search. However, brute-force approaches have the following disadvantages:

  1. The choice of the next hyperparameter setting to evaluate is not informed by the results obtained from previous settings.
  2. They take a long time to find a good solution.

More recently, approaches such as Bayesian Optimization and TPE (Tree-structured Parzen Estimator) have emerged that address both of these problems. This library instead uses Particle Swarm Optimization to optimize the ML models. Particle Swarm Optimization is an optimization procedure in which the trial solutions at each step are a function of the trial solutions at the previous step. Thus, the search progresses in an informed way: more of the search is spent in regions that show better settings.
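
The update rule described above can be sketched in a few lines of plain Python. This is a minimal, generic particle swarm, not the pyswarm or automlib implementation: the function name `pso_minimize`, the coefficient values (`omega`, `phip`, `phig`) and the toy `sphere` objective are all assumptions chosen for illustration. The `population` and `maxiter` arguments only mirror the parameter names used later in this README.

```python
import random


def pso_minimize(objective, lower, upper, population=30, maxiter=100,
                 omega=0.7, phip=1.5, phig=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(lower)

    # Start each particle at a random position with zero velocity.
    positions = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
                 for _ in range(population)]
    velocities = [[0.0] * dim for _ in range(population)]

    # Each particle remembers its own best position; the swarm shares a global best.
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    leader = min(range(population), key=lambda i: personal_score[i])
    global_best = personal_best[leader][:]
    global_score = personal_score[leader]

    for _ in range(maxiter):
        for i in range(population):
            for d in range(dim):
                # New velocity = inertia + pull toward own best + pull toward swarm best.
                velocities[i][d] = (
                    omega * velocities[i][d]
                    + phip * rng.random() * (personal_best[i][d] - positions[i][d])
                    + phig * rng.random() * (global_best[d] - positions[i][d])
                )
                # Move, then clip back into the search box.
                moved = positions[i][d] + velocities[i][d]
                positions[i][d] = min(max(moved, lower[d]), upper[d])

            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i] = positions[i][:]
                personal_score[i] = score
                if score < global_score:
                    global_best = positions[i][:]
                    global_score = score

    return global_best, global_score


def sphere(x):
    # Toy objective with its minimum of 0 at the origin.
    return sum(v * v for v in x)


best_position, best_score = pso_minimize(sphere, lower=[-5.0, -5.0], upper=[5.0, 5.0])
```

Each new position depends on the particle's previous position, its own best result so far, and the swarm's best result so far, which is what makes the search informed rather than brute-force.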

In automlib, however, optimization is not exposed as a separate function or method; it is built into the model wrapper. There is no need to run optimization separately: calling the fit method is enough, and automlib optimizes the model automatically.
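
The pattern of tuning inside fit can be illustrated with a toy model. The sketch below is not automlib's code: `SelfTuningRidge` is a hypothetical one-feature ridge regressor, and it uses a small candidate list in place of Particle Swarm Optimization so that the example stays short. What it shows is only the design choice: the search runs inside `fit`, so the caller never invokes a separate tuning step.

```python
class SelfTuningRidge:
    """Toy one-feature ridge model whose fit() also picks its own penalty."""

    def __init__(self, candidates=(0.0, 0.1, 1.0, 10.0)):
        # Hypothetical penalty values to try; a stand-in for a real search space.
        self.candidates = candidates

    @staticmethod
    def _slope(x, y, penalty):
        # Closed-form ridge slope for a model y ~ slope * x (no intercept).
        return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + penalty)

    def fit(self, X, y):
        # Hold out every fourth sample to score each candidate penalty.
        train = [(a, b) for i, (a, b) in enumerate(zip(X, y)) if i % 4 != 0]
        valid = [(a, b) for i, (a, b) in enumerate(zip(X, y)) if i % 4 == 0]
        tx, ty = zip(*train)

        def validation_mse(penalty):
            slope = self._slope(tx, ty, penalty)
            return sum((b - slope * a) ** 2 for a, b in valid) / len(valid)

        # The search happens here, so callers never run a separate tuning step.
        self.best_penalty_ = min(self.candidates, key=validation_mse)
        # Refit on all data with the chosen penalty.
        self.slope_ = self._slope(X, y, self.best_penalty_)
        return self

    def predict(self, X):
        return [self.slope_ * a for a in X]


X = [float(i) for i in range(1, 21)]
y = [2.0 * a for a in X]
model = SelfTuningRidge().fit(X, y)
predictions = model.predict([3.0])
```

From the caller's side this looks like any other estimator with fit and predict, which is the convenience the paragraph above describes.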

Description

automlib currently contains the following model classes:

  • psoregressor
  • psoclassifier

Both classes wrap a lightgbm Gradient Boosting model (https://lightgbm.readthedocs.io/en/latest/) and the particle swarm optimization API from pyswarm (https://pythonhosted.org/pyswarm/). The key parameters to set on the model are listed below (reasonable defaults are already set, but one can experiment with different settings):

  • population: Number of trial solutions to pursue at each iteration (default 30).
  • maxiter: Maximum number of iterations to run before terminating the search (default 100).
  • minfunc: Minimum change in the model's MSE (Mean Squared Error) before convergence is assumed (default 1e-3).

Increasing population and maxiter generally yields better-quality models but takes longer to produce results. However, increasing population is more likely to give good results than increasing maxiter.
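
To get a feel for the cost side of this trade-off, the sketch below estimates an upper bound on how many times the objective is evaluated. It assumes the common PSO loop in which every particle is scored once at start-up and once per iteration; the README does not state this for automlib, and the helper `max_objective_evaluations` is a hypothetical name. Each evaluation may itself involve one or more model fits, and an early stop triggered by minfunc would lower the count.

```python
def max_objective_evaluations(population, maxiter):
    """Upper bound on objective evaluations: one initial pass plus one pass per iteration."""
    return population * (maxiter + 1)


defaults = max_objective_evaluations(population=30, maxiter=100)         # 3030
more_particles = max_objective_evaluations(population=60, maxiter=100)   # 6060
more_iterations = max_objective_evaluations(population=30, maxiter=200)  # 6030
```

Under this assumption, doubling population and doubling maxiter cost roughly the same, so the choice between them comes down to which one improves the search more.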

Currently, the following parameters are being tuned:

  • n_estimators
  • max_depth
  • max_features
  • subsample
  • learning_rate
  • min_samples_leaf
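
A particle swarm works on vectors of real numbers, so each particle position has to be translated into a set of hyperparameters, with count-like ones such as n_estimators rounded to integers. The sketch below shows one way this could be done. The bounds in `SEARCH_SPACE` and the helpers `bounds` and `decode` are hypothetical: the README does not state the ranges automlib searches or how it decodes positions.

```python
# Hypothetical search bounds: (lower, upper, type) per tuned hyperparameter.
SEARCH_SPACE = {
    "n_estimators": (50, 500, int),
    "max_depth": (2, 10, int),
    "max_features": (0.3, 1.0, float),
    "subsample": (0.5, 1.0, float),
    "learning_rate": (0.01, 0.3, float),
    "min_samples_leaf": (1, 50, int),
}


def bounds(space):
    """Split the search space into the lower and upper bound vectors a PSO routine expects."""
    lower = [lo for lo, _, _ in space.values()]
    upper = [hi for _, hi, _ in space.values()]
    return lower, upper


def decode(position, space):
    """Turn one particle position (a list of floats) into a hyperparameter dict."""
    params = {}
    for value, (name, (lo, hi, kind)) in zip(position, space.items()):
        clipped = min(max(value, lo), hi)  # keep the value inside its bounds
        params[name] = int(round(clipped)) if kind is int else float(clipped)
    return params


lower, upper = bounds(SEARCH_SPACE)
params = decode([123.6, 4.2, 0.8, 0.75, 0.05, 9.7], SEARCH_SPACE)
```

Here `params` holds integer values for n_estimators, max_depth and min_samples_leaf and float values for the rest, ready to be passed to a model constructor.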

How to Use

Initialize and fit model:

import automlib

# Fit automlib model (X_train and y_train are assumed to hold your training features and targets)
model = automlib.psoregressor(population=20, maxiter=30)
model.fit(X=X_train, y=y_train)

[Figure: Training Progress]

Predict with the model:

y_predicted = model.predict(X_test)

Please view 'automlib documentation for regression model.html' for more details.
