This is a project for the lecture Graphs in Machine Learning, taught by Michal Valko.
It is developed by Charles Reizine and Élie Michel, under the supervision of Paul Weng.
Here is the description of the problem that we address in this project:
In inverse reinforcement learning (IRL), the goal is to learn a reward function that explains demonstrations observed from a supposedly optimal policy. It is well known that this problem is ill-posed and under-constrained. In practice, even though the reward function may not be known, some preferential information may be accessible: an order over rewards, structural constraints, symmetry… Such information may also be useful, in particular, to detect that the demonstrated policy is not optimal. The goal of this project is to study cases where this additional information makes IRL a better-posed (if not well-posed) problem, to propose solving algorithms, and to evaluate them experimentally.
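The ill-posedness and the effect of preferential information can be illustrated on a toy MDP using the optimality condition of Ng & Russell (2000): a policy π is optimal for reward R iff (P_π − P_a)(I − γP_π)⁻¹ R ≥ 0 for every other action a. The sketch below (our own illustration, not code from the project; the transition matrices and the ordering constraint R[0] > R[2] are made-up assumptions) samples random rewards, keeps those consistent with a fixed demonstrated policy, and shows how an assumed reward ordering narrows the feasible set:

```python
import numpy as np

# Toy 3-state, 2-action MDP (transition probabilities are made up).
gamma = 0.9
P = np.array([
    [[0.9, 0.1, 0.0],   # action 0: P[0][s, s']
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.9]],
    [[0.1, 0.9, 0.0],   # action 1: P[1][s, s']
     [0.0, 0.1, 0.9],
     [0.1, 0.0, 0.9]],
])

def consistent(R, pi=0):
    """Ng & Russell (2000) condition: the policy taking action `pi`
    everywhere is optimal for reward R iff
    (P_pi - P_a) (I - gamma P_pi)^{-1} R >= 0 for every other action a."""
    inv = np.linalg.inv(np.eye(3) - gamma * P[pi])
    return all(np.all((P[pi] - P[a]) @ inv @ R >= -1e-9)
               for a in range(len(P)) if a != pi)

rng = np.random.default_rng(0)
# Many distinct rewards explain the same demonstrated policy
# (this is the ill-posedness of IRL)...
candidates = [R for R in rng.uniform(-1, 1, size=(2000, 3)) if consistent(R)]
# ...but an assumed ordering constraint such as R[0] > R[2]
# (hypothetical preferential information) rules part of them out.
filtered = [R for R in candidates if R[0] > R[2]]
print(len(candidates), len(filtered))
```

The fraction of consistent rewards that survive the ordering constraint gives a rough measure of how much such preferential information constrains the IRL solution set.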
We are currently working on the project report.
Python, with the modules numpy, sklearn, and matplotlib.
- A. Y. Ng & S. Russell. Algorithms for Inverse Reinforcement Learning. ICML 2000
- P. Abbeel & A. Y. Ng. Apprenticeship Learning via Inverse Reinforcement Learning. ICML 2004
- R. S. Sutton & A. G. Barto. Reinforcement Learning: An Introduction (draft). 2014-2016
- D. Slater. Deep-Q learning Pong with Tensorflow and PyGame. Blog post
See report/bibliography.bib for more resources.