Optimization on a Budget: A Reinforcement Learning Approach
Poster ID: T13

Paul Ruvolo (1), Ian Fasel (2), Javier Movellan (1)
(1) UC San Diego, (2) UT Austin

The Levenberg-Marquardt algorithm employs a heuristic that adaptively controls the blending of gradient descent and Gauss-Newton.

Intuition: heuristic controllers might be improved using reinforcement learning by
- tuning for real-time constraints
- tuning for particular optimization domains

[Diagram labels: Reinforcement Learning Optimization Controller, Action Value Function, Function Being Optimized, Point to Evaluate, Optimization Progress]

[Figure: Performance on Smile Detection. x-axis: Optimization Budget (10 to 60). y-axis: Area under the ROC for a Validation Set (0.6 to 1). Curves: Learned Method, Default Levenberg-Marquardt.]

Our reinforcement learning method achieves large gains in performance.
Gains are realized in a wide range of optimization tasks.
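
As a point of reference for the heuristic that the poster proposes to tune, the following is a minimal Python sketch of a textbook Levenberg-Marquardt loop. It is not the authors' implementation. The names `lm_minimize`, `default_controller`, `rosenbrock_residuals`, and `rosenbrock_jacobian` are hypothetical, the test problem (the Rosenbrock function written as a sum of squares) is chosen only for illustration, and "budget" is assumed here to mean the number of trial steps allowed. The damping value `lam` sets the blend: a small `lam` gives a step close to Gauss-Newton, a large `lam` gives a short step along the negative gradient.

```python
def rosenbrock_residuals(x, y):
    # Rosenbrock function as a sum of squares: f = r1^2 + r2^2.
    return [10.0 * (y - x * x), 1.0 - x]


def rosenbrock_jacobian(x, y):
    # One row per residual: (d r_i / dx, d r_i / dy).
    return [(-20.0 * x, 10.0), (-1.0, 0.0)]


def default_controller(lam, improved, factor=10.0):
    # Classic heuristic: lean toward Gauss-Newton after a successful step,
    # fall back toward gradient descent after a failed one.
    return lam / factor if improved else lam * factor


def lm_minimize(residuals, jacobian, start, budget, lam=1e-3,
                controller=default_controller):
    """Two-parameter Levenberg-Marquardt with a pluggable damping controller.

    `budget` caps the number of trial steps (an assumed reading of the
    poster's "optimization budget").
    """
    x, y = start
    r = residuals(x, y)
    cost = sum(ri * ri for ri in r)
    history = [cost]
    for _ in range(budget):
        J = jacobian(x, y)
        # Gauss-Newton pieces: H = J^T J and g = J^T r.
        h11 = sum(jx * jx for jx, _ in J)
        h12 = sum(jx * jy for jx, jy in J)
        h22 = sum(jy * jy for _, jy in J)
        g1 = sum(jx * ri for (jx, _), ri in zip(J, r))
        g2 = sum(jy * ri for (_, jy), ri in zip(J, r))
        # Solve the damped system (H + lam * I) d = -g by Cramer's rule.
        d11, d22 = h11 + lam, h22 + lam
        det = d11 * d22 - h12 * h12
        dx = (-g1 * d22 + g2 * h12) / det
        dy = (-g2 * d11 + g1 * h12) / det
        trial = residuals(x + dx, y + dy)
        trial_cost = sum(ri * ri for ri in trial)
        improved = trial_cost < cost
        if improved:
            # Accept the step only when it lowers the cost.
            x, y, r, cost = x + dx, y + dy, trial, trial_cost
        lam = controller(lam, improved)
        history.append(cost)
    return (x, y), history


params, history = lm_minimize(rosenbrock_residuals, rosenbrock_jacobian,
                              (-1.2, 1.0), budget=60)
print("final parameters:", params)
print("final cost:", history[-1])
```

The `controller` argument is where the sketch connects to the poster's intuition: `default_controller` is the fixed multiply-or-divide rule, and a learned policy could in principle be passed in its place. How the poster's controller represents optimization progress or chooses its actions is not specified in the text, so none of that is modeled here.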