MDPs with Non-Deterministic Policies (M70)
Mahdi Milani Fard, Joelle Pineau
School of Computer Science, McGill University

Deterministic/stochastic policy: state -> policy -> (probability distribution over) action
Non-deterministic policy: state -> policy -> {a_{s,1}, a_{s,2}, ...} -> agent's choice -> a_{s,i}
The agent's choice among the offered actions is non-deterministic.

MDP-based decision support systems: provide "choice" with performance guarantees.

Main contribution: two algorithms to optimize non-deterministic policies.
Algorithm 1 formulates the problem as a mixed integer program (exact).
Algorithm 2 formulates the problem as a heuristic search (fast, approximate).

Empirical evaluation was done on synthetic data and a medical decision-making domain.
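
To make the idea of a non-deterministic policy with a performance guarantee concrete, here is a minimal Python sketch. It is not either of the two algorithms above. It only evaluates a given set-valued policy, under the assumption that the guarantee is measured in the worst case, that is, the agent always picks the least valuable action from the offered set. The toy MDP, the names `P`, `R`, `evaluate`, and `nd_policy`, and the discount factor are all hypothetical and chosen for illustration.

```python
# Hypothetical toy MDP (not from the poster).
# P[s][a] lists (next_state, probability); R[s][a] is the immediate reward.
P = {
    0: {"a": [(0, 1.0)], "b": [(0, 1.0)], "c": [(0, 1.0)]},
    1: {"a": [(0, 1.0)]},
}
R = {
    0: {"a": 1.0, "b": 0.9, "c": 0.0},
    1: {"a": 0.0},
}


def evaluate(allowed, pick, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Value iteration restricted to the actions in allowed[s].

    pick=max gives the best value achievable within the allowed sets;
    pick=min gives the worst-case value (an assumption of this sketch:
    the agent may choose any offered action, so we bound from below).
    """
    V = {s: 0.0 for s in P}
    for _ in range(max_iter):
        new_V = {
            s: pick(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in allowed[s]
            )
            for s in P
        }
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < tol:
            break
    return V


# Optimal values: every action is allowed and the best one is taken.
all_actions = {s: set(P[s]) for s in P}
v_opt = evaluate(all_actions, max)

# A non-deterministic policy: state -> set of actions offered to the agent.
nd_policy = {0: {"a", "b"}, 1: {"a"}}
v_worst = evaluate(nd_policy, min)

# Fraction of the optimal value that is retained in the worst case.
ratios = {s: v_worst[s] / v_opt[s] for s in P}
```

In this toy example the policy offers both "a" and "b" in state 0 and still retains about 90% of the optimal value in every state, whereas also offering "c" would drop the worst-case value to zero. Choosing which actions to include so that such a bound holds is the optimization problem the two algorithms address.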