REPARTI Seminars

The REPARTI Seminars at Université Laval are held on Fridays at 11:30 a.m.
Please consult the schedule for more details.

Oct 8 2010 11:30AM

Hamid Chinaei
Laboratoire DAMAS
Département d'informatique et de génie logiciel, U. Laval

Inverse Reinforcement Learning for Dialogue Management


Reinforcement Learning (RL) is a learning and planning method that can serve as a formal optimization framework for minimizing a cost function. In particular, it can be used in dialogue management to learn dialogue plans, where we face the curse of dimensionality in dialogue situations. As a result, the learned dialogue plans are also more detailed than hand-crafted ones.

However, the numerical assignment of costs in the RL framework is itself hand-crafted. Inverse Reinforcement Learning (IRL) therefore tries to first learn the cost function from expert behaviour, that is, the cost function that the expert designers implicitly tried to optimize and that is inherent in the dialogue trajectories. In this presentation, I briefly go through reinforcement learning for dialogue management and then introduce Inverse Reinforcement Learning. Next, I describe two IRL algorithms: the first uses a complete 'oracle' model of the user and system, and the second is designed to use only logs of users interacting with a deployed system. Finally, I present the experiments we ran on toy problems, followed by a discussion.

