Git Product home page Git Product logo

multiplaybanditlib's Introduction

BanditLib: a simple multi-armed bandit library: multiplay ver.

.-. .-')     ('-.         .-') _  _ .-') _            .-') _                   .-. .-')   
\  ( OO )   ( OO ).-.    ( OO ) )( (  OO) )          (  OO) )                  \  ( OO )  
;-----.\   / . --. /,--./ ,--,'  \     .'_   ,-.-') /     '._ ,--.      ,-.-') ;-----.\  
| .-.  |   | \-.  \ |   \ |  |\  ,`'--..._)  |  |OO)|'--...__)|  |.-')  |  |OO)| .-.  |  
| '-' /_).-'-'  |  ||    \|  | ) |  |  \  '  |  |  \'--.  .--'|  | OO ) |  |  \| '-' /_) 
| .-. `.  \| |_.'  ||  .     |/  |  |   ' |  |  |(_/   |  |   |  |`-' | |  |(_/| .-. `.  
| |  \  |  |  .-.  ||  |\    |   |  |   / : ,|  |_.'   |  |  (|  '---.',|  |_.'| |  \  | 
| '--'  /  |  | |  ||  | \   |   |  '--'  /(_|  |      |  |   |      |(_|  |   | '--'  / 
`------'   `--' `--'`--'  `--'   `-------'   `--'      `--'   `------'  `--'   `------'  

 _   .-')      _ (`-.             (`-.      ('-.  _  .-')    .-')                              .-') _  
( '.( OO )_   ( (OO  )          _(OO  )_  _(  OO)( \( -O )  ( OO ).                           ( OO ) ) 
 ,--.   ,--.)_.`     \      ,--(_/   ,. \(,------.,------. (_)---\_)  ,-.-')  .-'),-----. ,--./ ,--,'  
 |   `.'   |(__...--''      \   \   /(__/ |  .---'|   /`. '/    _ |   |  |OO)( OO'  .-.  '|   \ |  |\  
 |         | |  /  | |       \   \ /   /  |  |    |  /  | |\  :` `.   |  |  \/   |  | |  ||    \|  | ) 
 |  |'.'|  | |  |_.' |        \   '   /, (|  '--. |  |_.' | '..`''.)  |  |(_/\_) |  |\|  ||  .     |/  
 |  |   |  | |  .___.'         \     /__) |  .--' |  .  '.'.-._)   \ ,|  |_.'  \ |  | |  ||  |\    |   
 |  |   |  | |  |               \   /     |  `---.|  |\  \ \       /(_|  |      `'  '-'  '|  | \   |   
 `--'   `--' `--'                `-'      `------'`--' '--' `-----'   `--'        `-----' `--'  `--' 

1. About
2. Environment
3. Quick run
4. Misc


1. About

This is a C++ package for multi-armed bandit simulations. This package is a multi-play version of the banditlib (https://github.com/jkomiyama/banditlib)

  • Arms:
  • Arms of binary rewards are implemented.
  • Policies:
  • Multiplay Thompson sampling (MP-TS) [1]
  • Multiplay KL-UCB (MP-KL-UCB) [2]
  • CUCB [3]
  • Exp3.M [4]

2. Environment

This program supports a linux/GNU C++ environment. We do not check the compatibility with windows/MacOSX.

More formally, this program depends on:

  • C++0x: modern C++ compiler (preferably GNU C++ (g++))
  • waf (included) [5]: build script
  • cmdline.h (included) [6]: command line parser

3. Quick run

Type

./compile
./build/main -r 10

to run 10 times. The result of the runs will be written in out/example1.txt

This package also includes a simple plot tool (simpleplot.py) that is dependent on Python/Matplotlib. If your environment is g++/Python ready, try

./example.sh

4. Misc

The implementation of the beta distribution sampler is from [7]. The logo was generated by using [8].

##References [1] J. Komiyama, J. Honda, and H.Nakagawa: Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays. ICML 2015 [2] Aurélien Garivier and Olivier Cappé: The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond. COLT 2011, 359-376 [3] Wei Chen, Yajun Wang, and Yang Yuan: Combinatorial Multi-Armed Bandit: General Framework and Applications. ICML 2013, 151-159 [4] Taishi Uchiya, Atsuyoshi Nakamura, and Mineichi Kudo: Algorithms for Adversarial Bandit Problems with Multiple Plays. ALT 2010, 375-389 [5] Waf - The meta build system: https://code.google.com/p/waf/ [6] Hideyuki Tanaka: cmdline https://github.com/tanakh/cmdline [7] Joseph Mansfield: A comment on stackoverflow http://stackoverflow.com/questions/15165202/random-number-generator-with-beta-distribution [8] Text to Ascii Art Maker: http://patorjk.com/software/taag/

##Author Junpei Komiyama (junpei.komiyama atmark gmail.com)

This software is released under the MIT License, see LICENSE.txt.

Please acknowledge its use with a citation:

  • Junpei Komiyama, Junya Honda, Hiroshi Nakagawa. Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays. In Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), pages 1152-1161, July 2015

multiplaybanditlib's People

Contributors

jkomiyama avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.