A Survey Analyzing Generalization in Deep Reinforcement Learning

Reinforcement learning research has achieved significant success and attracted considerable attention by using deep neural networks to solve problems in high-dimensional state or action spaces. While deep reinforcement learning policies are currently being deployed in many fields, from medical applications to self-driving vehicles, open questions remain about the generalization capabilities of these policies. In this paper, we outline the fundamental reasons why deep reinforcement learning policies encounter overfitting problems that limit their robustness and generalization capabilities. Furthermore, we formalize and unify the diverse solution approaches for increasing generalization and overcoming overfitting in state-action value functions. We believe our study can provide a compact, systematic, and unified analysis of current advancements in deep reinforcement learning, and help to construct robust deep neural policies with improved generalization abilities.

This repository contains the outline of the survey with relevant links. See the paper (https://arxiv.org/pdf/2401.02349.pdf) for more details.

@article{korkmazrlsurvey24,
    title={A Survey Analyzing Generalization in Deep Reinforcement Learning},
    author={Ezgi Korkmaz},
    journal={https://arxiv.org/pdf/2401.02349.pdf},
    year={2024}
}

1. How to Achieve Generalization?

See the paper for a formal definition of generalization in deep reinforcement learning.

2. Roots of Overestimation in Deep Reinforcement Learning

Issues in using function approximation for reinforcement learning, 1993. [Paper]
Generalization in Reinforcement Learning: Safely approximating the value function, NeurIPS 1994. [Paper]
Double Q-learning, NeurIPS 2010. [Paper]
Deep reinforcement learning with double Q-learning, AAAI 2016. [Paper]
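
The overestimation issue that the works above study can be illustrated with a small simulation. The following is a minimal sketch, not the full Double Q-learning algorithm: it assumes every action has a true value of 0 observed through Gaussian noise, and compares a single max estimator with a double estimator that selects the action on one half of the samples and evaluates it on the other half. All function names and parameters are hypothetical and chosen for illustration only.

```python
import random
import statistics


def max_estimator(samples_per_action):
    """Single estimator: maximum over the per-action sample means."""
    return max(statistics.fmean(s) for s in samples_per_action)


def double_estimator(samples_per_action):
    """Double estimator: select the action on one half, evaluate on the other."""
    half_a = [s[: len(s) // 2] for s in samples_per_action]
    half_b = [s[len(s) // 2:] for s in samples_per_action]
    means_a = [statistics.fmean(s) for s in half_a]
    best = max(range(len(means_a)), key=means_a.__getitem__)
    return statistics.fmean(half_b[best])


def average_estimates(num_actions=10, num_samples=20, trials=2000, seed=0):
    """Average both estimators over many trials (assumed toy setting)."""
    rng = random.Random(seed)
    single_total, double_total = 0.0, 0.0
    for _ in range(trials):
        # Assumption: every action has true value 0 with unit-variance noise.
        samples = [
            [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
            for _ in range(num_actions)
        ]
        single_total += max_estimator(samples)
        double_total += double_estimator(samples)
    return single_total / trials, double_total / trials


single_bias, double_bias = average_estimates()
print(f"single estimator: {single_bias:.3f}, double estimator: {double_bias:.3f}")
```

Since the true maximum is 0 in this toy setting, the averages can be read directly as bias: the single estimator comes out clearly positive, while the double estimator stays close to zero.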

3. The Role of Exploration in Overfitting

Unifying count-based exploration and intrinsic motivation, NeurIPS 2016. [Paper]
Generalization and exploration via randomized value functions, ICML 2016. [Paper]
Deep exploration via bootstrapped DQN, NeurIPS 2016. [Paper]
VIME: variational information maximizing exploration, ICML 2017. [Paper]
Noisy networks for exploration, ICLR 2018. [Paper]
SUNRISE: A simple unified framework for ensemble learning in deep reinforcement learning, ICML 2021. [Paper]
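
As a rough illustration of the count-based idea behind some of the exploration methods listed above, the sketch below keeps tabular visit counts and returns a bonus that shrinks as a state is revisited. It is a simplified tabular stand-in, not the pseudo-count method from the listed paper; the class name `CountBonus`, the `beta` parameter, and the bonus form beta / sqrt(N(s)) are assumptions made for this example.

```python
import math
from collections import Counter


class CountBonus:
    """Tabular visit counter returning an exploration bonus beta / sqrt(N(s))."""

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = Counter()

    def bonus(self, state):
        # Count the visit first so the denominator is never zero.
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])


explorer = CountBonus(beta=1.0)
bonuses = [explorer.bonus("s0") for _ in range(4)]  # repeated visits to one state
novel_bonus = explorer.bonus("s1")                  # first visit to a new state
```

In a training loop, the bonus would typically be added to the environment reward so that rarely visited states look more attractive to the agent.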

4. Regularization

4.1. Data Augmentation

Improving generalization in reinforcement learning with mixture regularization, NeurIPS 2020. [Paper]
CURL: contrastive unsupervised representations for reinforcement learning, ICML 2020. [Paper]
Reinforcement learning with augmented data, NeurIPS 2020. [Paper]
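
To make the data augmentation idea concrete, here is a minimal sketch of random cropping, a commonly used image augmentation in this line of work. It assumes observations are 2D grids stored as lists of rows; the function names and the toy 8x8 observation are hypothetical, and a practical implementation would operate on batched image tensors instead.

```python
import random


def random_crop(image, out_size, rng):
    """Crop a random out_size x out_size window from a 2D image (list of rows)."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - out_size)
    left = rng.randint(0, width - out_size)
    return [row[left:left + out_size] for row in image[top:top + out_size]]


def augment_batch(batch, out_size, seed=0):
    """Apply an independent random crop to every observation in the batch."""
    rng = random.Random(seed)
    return [random_crop(image, out_size, rng) for image in batch]


# Toy 8x8 observation whose pixel value encodes its own position.
observation = [[r * 8 + c for c in range(8)] for r in range(8)]
crops = augment_batch([observation] * 4, out_size=6)
```

Each crop keeps the spatial layout of the original observation and only changes its offset, so the policy sees slightly shifted views of the same state.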

4.2. Direct Function Regularization

Munchausen Reinforcement Learning, NeurIPS 2020. [Paper]
Network randomization: A simple technique for generalization in deep reinforcement learning, ICLR 2020. [Paper]
Discount factor as a regularizer in Reinforcement Learning, ICML 2020. [Paper]
Contrastive behavioral similarity embeddings for generalization in reinforcement learning, ICLR 2021. [Paper]
Regularization matters in policy optimization: An empirical study on continuous control, ICLR 2021. [Paper]

4.3. The Adversarial Perspective for Deep Reinforcement Learning Generalization

Adversarial Attacks on Neural Network Policies, ICLR Workshop 2017. [Paper]
Robust Adversarial Reinforcement Learning, ICML 2017. [Paper]
Adversarial Policies: Attacking Deep Reinforcement Learning, ICLR 2020. [Paper]
Investigating Vulnerabilities of Deep Neural Policies, UAI 2021. [Paper]
Deep Reinforcement Learning Policies Learn Shared Adversarial Features Across MDPs, AAAI 2022. [Paper]
Adversarial Robust Deep Reinforcement Learning Requires Redefining Robustness, AAAI 2023. [Paper]
Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions, ICML 2023. [Paper]
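
The kind of observation perturbation studied in this line of work can be sketched with an FGSM-style sign-gradient step. The example below is only an illustration under strong assumptions: it uses a hypothetical linear Q-function, for which the gradient is available in closed form, and a margin objective between the greedy action and the runner-up rather than the losses used in the listed papers.

```python
def q_values(weights, state):
    """Linear Q-function: one weight row per action (assumed for illustration)."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def greedy_action(weights, state):
    q = q_values(weights, state)
    return max(range(len(q)), key=q.__getitem__)


def sign(x):
    return (x > 0) - (x < 0)


def fgsm_perturb(weights, state, epsilon):
    """Move each feature by epsilon to shrink the greedy action's margin."""
    q = q_values(weights, state)
    order = sorted(range(len(q)), key=q.__getitem__, reverse=True)
    best, runner_up = order[0], order[1]
    # For a linear Q-function, the gradient of Q[best] - Q[runner_up]
    # with respect to the state is the difference of the weight rows.
    grad = [wb - wr for wb, wr in zip(weights[best], weights[runner_up])]
    return [s - epsilon * sign(g) for s, g in zip(state, grad)]


weights = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2-action, 2-feature Q-function
state = [0.55, 0.45]
adversarial_state = fgsm_perturb(weights, state, epsilon=0.1)
print(greedy_action(weights, state), greedy_action(weights, adversarial_state))
```

In this toy case a perturbation bounded by 0.1 per feature is enough to change the greedy action, which is the basic sensitivity that the adversarial perspective examines.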

5. Meta-Reinforcement Learning and Meta Gradients

Discovering reinforcement learning algorithms, NeurIPS 2020.
Meta-gradient reinforcement learning with an objective discovered online, NeurIPS 2020.
Discovery of options via meta-learned subgoals, NeurIPS 2021.
Introducing symmetries to black box meta reinforcement learning, AAAI 2022.

6. Lifelong Reinforcement Learning

A meta-mdp approach to exploration for lifelong reinforcement learning, NeurIPS 2019.
Lifelong learning with a changing action set, AAAI 2020.
Lipschitz lifelong Reinforcement Learning, AAAI 2021.

7. Inverse Reinforcement Learning

Algorithms for inverse reinforcement learning, ICML 2000.
IQ-learn: Inverse soft-Q learning for imitation, NeurIPS 2021.
