Basic article information

Title: Randomized allocation with arm elimination in a bandit problem with covariates
Authors: Wei Qian; Yuhong Yang
Journal: Electronic Journal of Statistics
Print ISSN: 1935-7524
Year of publication: 2016
Volume: 10
Issue: 1
Pages: 242-270
DOI: 10.1214/15-EJS1104
Language: English
Publisher: Institute of Mathematical Statistics
Abstract: Motivated by applications in personalized web services and clinical research, we consider a multi-armed bandit problem in a setting where the mean reward of each arm is associated with some covariates. A multi-stage randomized allocation with arm elimination algorithm is proposed to combine the flexibility in reward function modeling and a theoretical guarantee of a cumulative regret minimax rate. When the function smoothness parameter is unknown, the algorithm is equipped with a histogram estimation based smoothness parameter selector using Lepski’s method, and is shown to maintain the regret minimax rate up to a logarithmic factor under a “self-similarity” condition.