Bayesian design principles for frequentist sequential learning
We develop a general theory to optimise the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimisation approach that generates “algorithmic beliefs” at each round and uses Bayesian posteriors to make decisions. The optimisation objective used to create these “algorithmic beliefs,” which we term the “Algorithmic Information Ratio,” is an intrinsic complexity measure that effectively characterises the frequentist regret of any algorithm. To the best of our knowledge, this is the first systematic approach to making Bayesian-type algorithms prior-free and applicable to adversarial settings in a generic and optimal manner. Moreover, the algorithms are simple and often efficient to implement.
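To give a concrete sense of "using Bayesian posteriors to make decisions" in a bandit setting, here is a minimal sketch of classical Thompson sampling for a Bernoulli bandit. This is a standard posterior-sampling baseline for illustration only, not the paper's Algorithmic Information Ratio method; the function name and parameters are hypothetical.

```python
import random

def thompson_sampling_bernoulli(arm_probs, horizon, seed=0):
    """Classical Thompson sampling for a Bernoulli multi-armed bandit.

    Maintains a Beta(successes + 1, failures + 1) posterior for each arm.
    Each round, a mean is sampled from every arm's posterior and the arm
    with the highest sampled mean is pulled (posterior-based decision).
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    total_reward = 0
    for _ in range(horizon):
        # Sample a plausible mean reward for each arm from its posterior.
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1)
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < arm_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    return total_reward, successes, failures
```

Over a long horizon the posterior concentrates on the best arm, so most pulls go to it; the talk's approach replaces the fixed prior/posterior update here with algorithmically constructed beliefs that yield frequentist guarantees.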
Paper
https://arxiv.org/abs/2310.00806
Speaker’s profile
Yunbei Xu recently joined the Department of Industrial Systems Engineering and Management as an Assistant Professor under the NUS Presidential Young Professorship. He holds a BSc in Pure Mathematics from Peking University, earned his PhD from Columbia Business School, advised by Assaf Zeevi, and completed his postdoctoral training at the MIT College of Computing, advised by Alexander Rakhlin. Yunbei’s research interests include the mathematics of AI, complex systems, and physics. His contributions have been recognized with the ICML Outstanding Paper Award, First Place in the INFORMS George Nicholson Student Paper Competition, and twice as a Finalist for the Applied Probability Society Best Student Paper Award.
For more information about the ESD Seminar, please email esd_invite@sutd.edu.sg