Publications
Improving the Pareto UCB1 Algorithm on the Multi-Objective Multi-Armed Bandit

Abstract - In this work, we introduce a straightforward approach for bounding the regret of Multi-Objective Multi-Armed Bandit (MO-MAB) heuristics extended from standard bandit algorithms. The proposed methodology allows us to easily build upon the regret analysis of these heuristics in the standard bandit setting. Using our approach, we improve the Pareto UCB1 algorithm, the multi-objective extension of the seminal UCB1, by performing a tighter regret analysis. The resulting Pareto UCB1* also has the advantage of being empirically usable without any approximation.
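To make the selection rule concrete, below is a minimal Python sketch of a Pareto UCB1-style round: build an optimistic vector-valued index for each arm, keep the arms whose indices are not Pareto-dominated, and pull one of them uniformly at random. This is a sketch under assumptions, not the paper's exact method: the exploration bonus shown is the plain single-objective UCB1 term, whereas the published Pareto UCB1 index also involves the number of objectives and the unknown Pareto front size (the source of the approximation that Pareto UCB1* avoids). The pull callback and all names are hypothetical.

import numpy as np

def pareto_front(points):
    # Indices of rows not Pareto-dominated by any other row
    # (maximization in every objective).
    front = []
    for i in range(len(points)):
        dominated = any(
            np.all(points[j] >= points[i]) and np.any(points[j] > points[i])
            for j in range(len(points)) if j != i
        )
        if not dominated:
            front.append(i)
    return front

def pareto_ucb1_sketch(pull, n_arms, n_objectives, horizon, seed=0):
    # `pull(a)` must return a reward vector of length `n_objectives`.
    rng = np.random.default_rng(seed)
    counts = np.zeros(n_arms)
    sums = np.zeros((n_arms, n_objectives))
    for a in range(n_arms):          # play each arm once to initialize
        sums[a] += pull(a)
        counts[a] += 1
    for t in range(n_arms, horizon):
        means = sums / counts[:, None]
        # Plain UCB1 bonus, used here as a placeholder for the
        # paper's tighter multi-objective exploration term.
        bonus = np.sqrt(2.0 * np.log(t + 1) / counts)
        ucb = means + bonus[:, None]
        a = rng.choice(pareto_front(ucb))  # uniform over the optimistic front
        sums[a] += pull(a)
        counts[a] += 1
    return sums / counts[:, None]        # empirical mean reward vectors

Drawing uniformly from the optimistic Pareto front reflects the MO-MAB objective of treating all non-dominated arms as equally desirable rather than collapsing the objectives into a single scalar.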
Bibtex:
@inproceedings{Durand1084,