Optimization on a Budget: A Reinforcement Learning Approach
Poster ID: T13
Paul Ruvolo (1), Ian Fasel (2), Javier Movellan (1)
(1) UC San Diego, (2) UT Austin

The Levenberg-Marquardt algorithm employs a heuristic that adaptively controls the blending of gradient descent and Gauss-Newton.

Intuition: heuristic controllers can potentially be improved using reinforcement learning by
- tuning for real-time constraints
- tuning for particular optimization domains
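To make the blending heuristic concrete, here is a minimal sketch of a budgeted Levenberg-Marquardt loop in Python with NumPy. It is not the authors' implementation. The names `lm_minimize` and `default_controller`, the factor of 10, the initial damping of 1e-3, the identity-matrix damping, and the choice to count one budget unit per attempted step are all assumptions made for illustration. The `controller` argument marks the place where a learned policy could replace the default rule.

```python
import numpy as np

def default_controller(lam, improved, factor=10.0):
    # Textbook heuristic (assumed factor of 10): after a successful step lean
    # toward Gauss-Newton (smaller damping), after a failure lean toward
    # gradient descent (larger damping).
    return lam / factor if improved else lam * factor

def lm_minimize(residual, jacobian, x0, budget, controller=default_controller, lam=1e-3):
    # Minimize 0.5 * ||residual(x)||^2 using at most `budget` attempted steps.
    x = np.asarray(x0, dtype=float)
    r = residual(x)
    loss = 0.5 * float(r @ r)
    for _ in range(budget):
        J = jacobian(x)
        # Damped normal equations: lam near 0 gives the Gauss-Newton step,
        # large lam gives a short step along the negative gradient.
        step = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -(J.T @ r))
        r_new = residual(x + step)
        loss_new = 0.5 * float(r_new @ r_new)
        improved = loss_new < loss
        if improved:  # accept only steps that reduce the loss
            x, r, loss = x + step, r_new, loss_new
        lam = controller(lam, improved)
    return x, loss

# Hypothetical toy problem: fit y = a * exp(b * t) to noise-free data.
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)

def residual(x):
    return x[0] * np.exp(x[1] * t) - y

def jacobian(x):
    e = np.exp(x[1] * t)
    return np.stack([e, x[0] * t * e], axis=1)

x_fit, loss_fit = lm_minimize(residual, jacobian, x0=[1.0, 0.0], budget=50)
```

Because rejected steps are discarded, the loss never increases along a run, so a larger budget can only match or improve the result for the same controller. A learned controller would be passed in as any function with the same `(lam, improved)` signature, although a real policy would likely use richer state than this sketch exposes.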
[Diagram: Reinforcement Learning Optimization Controller interacting with the Function Being Optimized. Labels as extracted: Optimization Progress; Point to Evaluate Action Value Function]

Performance on Smile Detection

[Figure: Area under the ROC for a validation set (y-axis, 0.6 to 1) versus optimization budget (x-axis, 10 to 60). Curves: Learned Method, Default Levenberg-Marquardt]

- Our reinforcement learning method achieves large gains in performance.
- Gains are realized in a wide range of optimization tasks.
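For reference, the area under the ROC used as the validation metric can be computed directly from classifier scores and binary labels. The sketch below uses the pairwise-ranking definition, the fraction of (positive, negative) pairs ranked correctly with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical and not taken from the poster.

```python
def roc_auc(scores, labels):
    # Area under the ROC curve via pairwise comparison of positives and negatives.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical scores: one of the four positive/negative pairs is mis-ordered.
example_auc = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

This quadratic-time form is only meant to make the definition explicit. A sort-based computation would be preferable for large validation sets.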