BanditLib: a simple multi-armed bandit library: multiplay ver.
.-. .-') ('-. .-') _ _ .-') _ .-') _ .-. .-')
\ ( OO ) ( OO ).-. ( OO ) )( ( OO) ) ( OO) ) \ ( OO )
;-----.\ / . --. /,--./ ,--,' \ .'_ ,-.-') / '._ ,--. ,-.-') ;-----.\
| .-. | | \-. \ | \ | |\ ,`'--..._) | |OO)|'--...__)| |.-') | |OO)| .-. |
| '-' /_).-'-' | || \| | ) | | \ ' | | \'--. .--'| | OO ) | | \| '-' /_)
| .-. `. \| |_.' || . |/ | | ' | | |(_/ | | | |`-' | | |(_/| .-. `.
| | \ | | .-. || |\ | | | / : ,| |_.' | | (| '---.',| |_.'| | \ |
| '--' / | | | || | \ | | '--' /(_| | | | | |(_| | | '--' /
`------' `--' `--'`--' `--' `-------' `--' `--' `------' `--' `------'
_ .-') _ (`-. (`-. ('-. _ .-') .-') .-') _
( '.( OO )_ ( (OO ) _(OO )_ _( OO)( \( -O ) ( OO ). ( OO ) )
,--. ,--.)_.` \ ,--(_/ ,. \(,------.,------. (_)---\_) ,-.-') .-'),-----. ,--./ ,--,'
| `.' |(__...--'' \ \ /(__/ | .---'| /`. '/ _ | | |OO)( OO' .-. '| \ | |\
| | | / | | \ \ / / | | | / | |\ :` `. | | \/ | | | || \| | )
| |'.'| | | |_.' | \ ' /, (| '--. | |_.' | '..`''.) | |(_/\_) | |\| || . |/
| | | | | .___.' \ /__) | .--' | . '.'.-._) \ ,| |_.' \ | | | || |\ |
| | | | | | \ / | `---.| |\ \ \ /(_| | `' '-' '| | \ |
`--' `--' `--' `-' `------'`--' '--' `-----' `--' `-----' `--' `--'
1. About
2. Environment
3. Quick run
4. Misc
This is a C++ package for multi-armed bandit simulations. This package is a multi-play version of the banditlib (https://github.com/jkomiyama/banditlib)
- Arms:
- Arms of binary rewards are implemented.
- Policies:
- Multiplay Thompson sampling (MP-TS) [1]
- Multiplay KL-UCB (MP-KL-UCB) [2]
- CUCB [3]
- Exp3.M [4]
This program supports a linux/GNU C++ environment. We do not check the compatibility with windows/MacOSX.
More formally, this program depends on:
- C++0x: modern C++ compiler (preferably GNU C++ (g++))
- waf (included) [5]: build script
- cmdline.h (included) [6]: command line parser
Type
./compile
./build/main -r 10
to run 10 times. The result of the runs will be written in out/example1.txt
This package also includes a simple plot tool (simpleplot.py) that is dependent on Python/Matplotlib. If your environment is g++/Python ready, try
./example.sh
The implementation of the beta distribution sampler is from [7]. The logo was generated by using [8].
##References [1] J. Komiyama, J. Honda, and H.Nakagawa: Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays. ICML 2015 [2] Aurélien Garivier and Olivier Cappé: The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond. COLT 2011, 359-376 [3] Wei Chen, Yajun Wang, and Yang Yuan: Combinatorial Multi-Armed Bandit: General Framework and Applications. ICML 2013, 151-159 [4] Taishi Uchiya, Atsuyoshi Nakamura, and Mineichi Kudo: Algorithms for Adversarial Bandit Problems with Multiple Plays. ALT 2010, 375-389 [5] Waf - The meta build system: https://code.google.com/p/waf/ [6] Hideyuki Tanaka: cmdline https://github.com/tanakh/cmdline [7] Joseph Mansfield: A comment on stackoverflow http://stackoverflow.com/questions/15165202/random-number-generator-with-beta-distribution [8] Text to Ascii Art Maker: http://patorjk.com/software/taag/
##Author Junpei Komiyama (junpei.komiyama atmark gmail.com)
This software is released under the MIT License, see LICENSE.txt.
Please acknowledge its use with a citation:
- Junpei Komiyama, Junya Honda, Hiroshi Nakagawa. Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays. In Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), pages 1152-1161, July 2015