Stochastic Recursive Algorithms: A Markov Chain Perspective
Virtual Informal Systems Seminar (VISS), Centre for Intelligent Machines (CIM) and Groupe d'Etudes et de Recherche en Analyse des Decisions (GERAD)
Abhishek Gupta
Electrical and Computer Engineering, The Ohio State University
Abstract:
Many stochastic optimization and empirical dynamic programming algorithms proposed in the literature approximate certain deterministic algorithms. Examples include stochastic gradient descent, empirical value iteration, and empirical Q-value iteration for discounted or average-cost MDPs. We refer to them as stochastic recursive algorithms: at every step of the iteration, an exact contraction operator is replaced with an approximate random operator. These algorithms can be viewed within the framework of iterated random maps, and thus Markov chain theory can be leveraged to study their convergence properties. In this talk, we will present some new insights into the convergence properties of stochastic recursive algorithms over infinite-dimensional spaces. We will also present applications to some reinforcement learning algorithms.
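As a rough illustration of the idea in the abstract (not the speaker's own construction), the following Python sketch compares exact value iteration with an empirical version on a small, randomly generated discounted-cost MDP. The MDP itself, the helper names (exact_bellman, empirical_bellman, run), and all parameter values are assumptions made for this example. The expectation in the Bellman operator is replaced by a sample mean over freshly drawn next states, so each empirical iterate depends only on the previous iterate and new randomness, which is what makes the sequence of iterates a Markov chain.

```python
import numpy as np


def exact_bellman(v, cost, P, gamma):
    """Exact Bellman operator: a gamma-contraction in the sup norm.

    cost has shape (S, A); P has shape (S, A, S) with P[s, a] a distribution.
    """
    return np.min(cost + gamma * (P @ v), axis=1)


def empirical_bellman(v, cost, P, gamma, n_samples, rng):
    """Random operator: the expectation over next states is replaced
    by a sample mean over n_samples simulated next states."""
    S, A = cost.shape
    q = np.empty((S, A))
    for s in range(S):
        for a in range(A):
            next_states = rng.choice(S, size=n_samples, p=P[s, a])
            q[s, a] = cost[s, a] + gamma * v[next_states].mean()
    return q.min(axis=1)


def run(n_iters=200, n_samples=500, seed=0):
    """Iterate both operators from v = 0 on a hypothetical 4-state, 2-action MDP."""
    rng = np.random.default_rng(seed)
    S, A, gamma = 4, 2, 0.9
    cost = rng.uniform(0.0, 1.0, size=(S, A))    # per-stage costs in [0, 1]
    P = rng.dirichlet(np.ones(S), size=(S, A))   # random transition kernel
    v_exact = np.zeros(S)
    v_emp = np.zeros(S)
    for _ in range(n_iters):
        v_exact = exact_bellman(v_exact, cost, P, gamma)
        # Fresh samples at every step: an iterated random map.
        v_emp = empirical_bellman(v_emp, cost, P, gamma, n_samples, rng)
    return v_exact, v_emp


if __name__ == "__main__":
    v_exact, v_emp = run()
    print("exact iterate:    ", v_exact)
    print("empirical iterate:", v_emp)
```

In this sketch the empirical iterates do not settle at a single point; they keep fluctuating around the exact fixed point, and the size of the fluctuation shrinks as n_samples grows. Describing that long-run behavior is the kind of question the Markov chain viewpoint is suited to.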
Bio:
Abhishek Gupta is an assistant professor in the ECE department at The Ohio State University. He completed his PhD in Aerospace Engineering at UIUC in 2014. His research interests are in stochastic control theory, probability theory, and game theory, with applications to transportation markets, electricity markets, and cybersecurity of control systems.