
xgboost-tuner

xgboost-tuner is a Python library for automating the tuning of XGBoost parameters.

Because XGBoost has a large number of tunable parameters, each with a sizable range of possible values, an exhaustive grid search over all of them is computationally infeasible.

The excellent article Complete Guide to Parameter Tuning in XGBoost offers an alternative approach to tuning XGBoost by tuning parameters incrementally.

This library offers two strategies to automate this tuning - an incremental approach as laid out in the article above and an alternative approach using a more computationally efficient randomized search.

In both strategies, the user can configure the parameter space of interest through keyword arguments.

Installing xgboost-tuner

PyPI

To install xgboost-tuner, execute

pip install xgboost-tuner  

Alternatively, you can download the package manually from the Python Package Index at https://pypi.python.org/pypi/xgboost-tuner, unzip it, navigate into the unpacked directory, and run:

python setup.py install

Examples

Tuning XGBoost parameters through an incremental grid search

from sklearn.datasets import load_svmlight_file
from xgboost_tuner.tuner import tune_xgb_params

train, label = load_svmlight_file('data/agaricus.txt.train')
train = train.toarray()

# Tune the parameters incrementally and limit the range for colsample_bytree and subsample
best_params, history = tune_xgb_params(
    cv_folds=3,
    label=label,
    metric_sklearn='accuracy',
    metric_xgb='error',
    n_jobs=4,
    objective='binary:logistic',
    random_state=2017,
    strategy='incremental',
    train=train,
    colsample_bytree_min=0.8,
    colsample_bytree_max=1.0,
    subsample_min=0.8,
    subsample_max=1.0
)

Tuning XGBoost parameters through randomized search

from sklearn.datasets import load_svmlight_file
from xgboost_tuner.tuner import tune_xgb_params

train, label = load_svmlight_file('data/agaricus.txt.train')
train = train.toarray()

# Tune the parameters in a randomized fashion and control the distributions for colsample_bytree and subsample
best_params, history = tune_xgb_params(
    cv_folds=3,
    label=label,
    metric_sklearn='accuracy',
    metric_xgb='error',
    n_jobs=4,
    objective='binary:logistic',
    random_state=2017,
    strategy='randomized',
    train=train,
    colsample_bytree_loc=0.5,
    colsample_bytree_scale=0.2,
    subsample_loc=0.5,
    subsample_scale=0.2
)
