Talk Title: Data-Efficient Optimization in Reinforcement Learning

Watch Pan’s Research Lightning Talk

Talk Abstract: Optimization lies at the heart of modern machine learning and data science research. Designing data-efficient optimization algorithms that achieve low sample complexity while also converging quickly remains a challenging but important problem in machine learning. My research addresses this problem from two directions: providing theoretical analysis and understanding of optimization algorithms, and developing new algorithms with strong empirical performance in a principled way. In this talk, I will introduce our recent work on developing and improving data-efficient optimization algorithms for decision-making (reinforcement learning) problems. In particular, I will introduce the variance reduction technique in optimization and show how it can improve the data efficiency of policy gradient methods in reinforcement learning. I will present the variance-reduced policy gradient algorithm, which constructs an unbiased policy gradient estimator for the value function. I will show that it provably reduces the sample complexity of vanilla policy gradient methods such as REINFORCE and GPOMDP.

Bio: Pan Xu is a Ph.D. candidate in the Department of Computer Science at the University of California, Los Angeles. His research spans machine learning, data science, and optimization, with a focus on developing and improving large-scale nonconvex optimization algorithms for machine learning and data science applications. Pan obtained his B.S. degree in mathematics from the University of Science and Technology of China. He received the Presidential Fellowship in Data Science from the University of Virginia. He has published over 20 high-quality papers in top machine learning conferences and journals such as ICML, NeurIPS, ICLR, AISTATS, and JMLR.