Continual Learning Literature

This repository is maintained by Massimo Caccia and Timothée Lesort don't hesitate to send us an email to collaborate or fix some entries ({massimo.p.caccia , t.lesort} at gmail.com). The automation script of this repo is adapted from Automatic_Awesome_Bibliography.

For contributing to the repository please follow the process here

Outline

Classics

Catastrophic forgetting in connectionist networks , (1999) by French, Robert M. [bib]
Lifelong robot learning , (1995) by Thrun, Sebastian and Mitchell, Tom M [bib]

Argues knowledge transfer is essential if robots are to learn control with moderate learning times

Catastrophic interference in connectionist networks: The sequential learning problem , (1989) by McCloskey, Michael and Cohen, Neal J [bib]

Introduces CL and reveals the catastrophic forgetting problem

Surveys

GDumb: A Simple Approach that Questions Our Progress in Continual Learning , (2020) by Prabhu, Ameya, Torr, Philip HS and Dokania, Puneet K [bib]

introduces a super simple methods that outperforms almost all methods in all of the CL benchmarks. We need new better benchamrks

Continual learning for robotics: Definition, framework, learning strategies, opportunities and challenges , (2020) by Timothée Lesort, Vincenzo Lomonaco, Andrei Stoian, Davide Maltoni, David Filliat and Natalia Díaz-Rodríguez [bib]
Continual learning: A comparative study on how to defy forgetting in classification tasks , (2019) by Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Ales Leonardis, Gregory Slabaugh and Tinne Tuytelaars [bib]

Extensive empirical study of CL methods (in the multi-head setting)

Continual lifelong learning with neural networks: A review , (2019) by German I. Parisi, Ronald Kemker, Jose L. Part, Christopher Kanan and Stefan Wermter [bib]

An extensive review of CL

Three scenarios for continual learning , (2019) by van de Ven, Gido M and Tolias, Andreas S [bib]

An extensive review of CL methods in three different scenarios (task-, domain-, and class-incremental learning)

Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks , (2018) by Andrea Soltoggio, Kenneth O. Stanley and Sebastian Risi [bib]

Influentials

Efficient Lifelong Learning with A-GEM , (2019) by Chaudhry, Arslan, Ranzato, Marc’Aurelio, Rohrbach, Marcus and Elhoseiny, Mohamed [bib]

More efficient GEM; Introduces online continual learning

Towards Robust Evaluations of Continual Learning , (2018) by Farquhar, Sebastian and Gal, Yarin [bib]

Proposes desideratas and reexamines the evaluation protocol

Continual Learning in Practice , (2018) by Diethe, Tom, Borchert, Tom, Thereska, Eno, Pigem, Borja de Balle and Lawrence, Neil [bib]

Proposes a reference architecture for a continual learning system

Overcoming catastrophic forgetting in neural networks , (2017) by Kirkpatrick, James, Pascanu, Razvan, Rabinowitz, Neil, Veness, Joel, Desjardins, Guillaume, Rusu, Andrei A, Milan, Kieran, Quan, John, Ramalho, Tiago, Grabska-Barwinska, Agnieszka and others [bib]
Gradient Episodic Memory for Continual Learning , (2017) by Lopez-Paz, David and Ranzato, Marc-Aurelio [bib]

A model that alliviates CF via constrained optimization

Continual learning with deep generative replay , (2017) by Shin, Hanul, Lee, Jung Kwon, Kim, Jaehong and Kim, Jiwon [bib]

Introduces generative replay

An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks , (2013) by Goodfellow, I.~J., Mirza, M., Xiao, D., Courville, A. and Bengio, Y. [bib]

Investigates CF in neural networks

New Settings or Metrics

Wandering Within a World: Online Contextualized Few-Shot Learning , (2020) by Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer and Richard S. Zemel [bib]

proposes a new continual few-shot setting where spacial and temporal context can be leveraged to and unseen classes need to be predicted

Defining Benchmarks for Continual Few-Shot Learning , (2020) by Antoniou, Antreas, Patacchiola, Massimiliano, Ochal, Mateusz and Storkey, Amos [bib]

(title is a good enough summary)

Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning , (2020) by Caccia, Massimo, Rodriguez, Pau, Ostapenko, Oleksiy, Normandin, Fabrice, Lin, Min, Caccia, Lucas, Laradji, Issam, Rish, Irina, Lacoste, Alexandre, Vazquez, David and others [bib]

Proposes a new approach to CL evaluation more aligned with real-life applications, bringing CL closer to Online Learning and Open-World learning

Compositional Language Continual Learning , (2020) by Yuanpeng Li, Liang Zhao, Kenneth Church and Mohamed Elhoseiny [bib]

method for compositional continual learning of sequence-to-sequence models

Regularization Methods

Continual Learning with Bayesian Neural Networks for Non-Stationary Data , (2020) by Richard Kurle, Botond Cseke, Alexej Klushyn, Patrick van der Smagt and Stephan Günnemann [bib]

continual learning for non-stationary data using Bayesian neural networks and memory-based online variational Bayes

Improving and Understanding Variational Continual Learning , (2019) by Siddharth Swaroop, Cuong V. Nguyen, Thang D. Bui and Richard E. Turner [bib]

Improved results and interpretation of VCL.

Uncertainty-based Continual Learning with Adaptive Regularization , (2019) by Ahn, Hongjoon, Cha, Sungmin, Lee, Donggyu and Moon, Taesup [bib]

Introduces VCL with uncertainty measured for neurons instead of weights.

Functional Regularisation for Continual Learning with Gaussian Processes , (2019) by Titsias, Michalis K, Schwarz, Jonathan, Matthews, Alexander G de G, Pascanu, Razvan and Teh, Yee Whye [bib]

functional regularisation for Continual Learning: avoids forgetting a previous task by constructing and memorising an approximate posterior belief over the underlying task-specific function

Task Agnostic Continual Learning Using Online Variational Bayes , (2018) by Chen Zeno, Itay Golan, Elad Hoffer and Daniel Soudry [bib]

Introduces an optimizer for CL that relies on closed form updates of mu and sigma of BNN; introduce label trick for class learning (single-head)

Overcoming Catastrophic Interference using Conceptor-Aided Backpropagation , (2018) by Xu He and Herbert Jaeger [bib]

Conceptor-Aided Backprop (CAB): gradients are shielded by conceptors against degradation of previously learned tasks

Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence , (2018) by Chaudhry, Arslan, Dokania, Puneet K, Ajanthan, Thalaiyasingam and Torr, Philip HS [bib]

Formalizes the shortcomings of multi-head evaluation, as well as the importance of replay in single-head setup. Presenting an improved version of EWC.

Variational Continual Learning , (2018) by Cuong V. Nguyen, Yingzhen Li, Thang D. Bui and Richard E. Turner [bib]
Progress & compress: A scalable framework for continual learning , (2018) by Schwarz, Jonathan, Luketina, Jelena, Czarnecki, Wojciech M, Grabska-Barwinska, Agnieszka, Teh, Yee Whye, Pascanu, Razvan and Hadsell, Raia [bib]

A new P\&C architecture; online EWC for keeping the knowledge about the previous task, knowledge for keeping the knowledge about the current task (Multi-head setting, RL)

Facilitating Bayesian Continual Learning by Natural Gradients and Stein Gradients , (2018) by Chen, Yu, Diethe, Tom and Lawrence, Neil [bib]

Improves on VCL

Overcoming catastrophic forgetting in neural networks , (2017) by Kirkpatrick, James, Pascanu, Razvan, Rabinowitz, Neil, Veness, Joel, Desjardins, Guillaume, Rusu, Andrei A, Milan, Kieran, Quan, John, Ramalho, Tiago, Grabska-Barwinska, Agnieszka and others [bib]
Memory Aware Synapses: Learning what (not) to forget , (2017) by Rahaf Aljundi, Francesca Babiloni, Mohamed Elhoseiny, Marcus Rohrbach and Tinne Tuytelaars [bib]

Importance of parameter measured based on their contribution to change in the learned prediction function

Continual Learning Through Synaptic Intelligence , (2017) by *Zenke, Friedeman, Poole, Ben and Ganguli, Surya * [bib]

Synaptic Intelligence (SI). Importance of parameter measured based on their contribution to change in the loss.

Distillation Methods

Overcoming Catastrophic Forgetting With Unlabeled Data in the Wild , (2019) by Lee, Kibok, Lee, Kimin, Shin, Jinwoo and Lee, Honglak [bib]

Introducing global distillation loss and balanced finetuning; leveraging unlabeled data in the open world setting (Single-head setting)

Large scale incremental learning , (2019) by Wu, Yue, Chen, Yinpeng, Wang, Lijuan, Ye, Yuancheng, Liu, Zicheng, Guo, Yandong and Fu, Yun [bib]

Introducing bias parameters to the last fully connected layer to resolve the data imbalance issue (Single-head setting)

Lifelong learning via progressive distillation and retrospection , (2018) by Hou, Saihui, Pan, Xinyu, Change Loy, Chen, Wang, Zilei and Lin, Dahua [bib]

Introducing an expert of the current task in the knowledge distillation method (Multi-head setting)

End-to-end incremental learning , (2018) by Castro, Francisco M, Marin-Jimenez, Manuel J, Guil, Nicolas, Schmid, Cordelia and Alahari, Karteek [bib]

Finetuning the last fully connected layer with a balanced dataset to resolve the data imbalance issue (Single-head setting)

Learning without forgetting , (2017) by Li, Zhizhong and Hoiem, Derek [bib]

Functional regularization through distillation (keeping the output of the updated network on the new data close to the output of the old network on the new data)

icarl: Incremental classifier and representation learning , (2017) by Rebuffi, Sylvestre-Alvise, Kolesnikov, Alexander, Sperl, Georg and Lampert, Christoph H [bib]

Binary cross-entropy loss for representation learning & exemplar memory (or coreset) for replay (Single-head setting)

Rehearsal Methods

Efficient Lifelong Learning with A-GEM , (2019) by Chaudhry, Arslan, Ranzato, Marc’Aurelio, Rohrbach, Marcus and Elhoseiny, Mohamed [bib]

More efficient GEM; Introduces online continual learning

Orthogonal Gradient Descent for Continual Learning , (2019) by Mehrdad Farajtabar, Navid Azizan, Alex Mott and Ang Li [bib]

projecting the gradients from new tasks onto a subspace in which the neural network output on previous task does not change and the projected gradient is still in a useful direction for learning the new task

Gradient based sample selection for online continual learning , (2019) by Aljundi, Rahaf, Lin, Min, Goujaud, Baptiste and Bengio, Yoshua [bib]

sample selection as a constraint reduction problem based on the constrained optimization view of continual learning

Online Continual Learning with Maximal Interfered Retrieval , (2019) by Aljundi, Rahaf, Caccia, Lucas, Belilovsky, Eugene, Caccia, Massimo, Lin, Min, Charlin, Laurent and Tuytelaars, Tinne [bib]

Controlled sampling of memories for replay to automatically rehearse on tasks currently undergoing the most forgetting

Online Learned Continual Compression with Adaptative Quantization Module , (2019) by Caccia, Lucas, Belilovsky, Eugene, Caccia, Massimo and Pineau, Joelle [bib]

Uses stacks of VQ-VAE modules to progressively compress the data stream, enabling better rehearsal

Experience replay for continual learning , (2019) by Rolnick, David, Ahuja, Arun, Schwarz, Jonathan, Lillicrap, Timothy and Wayne, Gregory [bib]
Gradient Episodic Memory for Continual Learning , (2017) by Lopez-Paz, David and Ranzato, Marc-Aurelio [bib]

A model that alliviates CF via constrained optimization

icarl: Incremental classifier and representation learning , (2017) by Rebuffi, Sylvestre-Alvise, Kolesnikov, Alexander, Sperl, Georg and Lampert, Christoph H [bib]

Binary cross-entropy loss for representation learning & exemplar memory (or coreset) for replay (Single-head setting)

Generative Replay Methods

Brain-Like Replay For Continual Learning With Artificial Neural Networks , (2020) by van de Ven, Gido M, Siegelmann, Hava T and Tolias, Andreas S [bib]
Learning to remember: A synaptic plasticity driven framework for continual learning , (2019) by Ostapenko, Oleksiy, Puscas, Mihai, Klein, Tassilo, Jahnichen, Patrick and Nabi, Moin [bib]

introdudes Dynamic generative memory (DGM) which relies on conditional generative adversarial networks with learnable connection plasticity realized with neural masking

Generative Models from the perspective of Continual Learning , (2019) by Lesort, Timoth{'e}e, Caselles-Dupr{'e}, Hugo, Garcia-Ortiz, Michael, Goudou, Jean-Fran{\c c}ois and Filliat, David [bib]

Extensive evaluation of CL methods for generative modeling

Marginal replay vs conditional replay for continual learning , (2019) by Lesort, Timoth{'e}e, Gepperth, Alexander, Stoian, Andrei and Filliat, David [bib]

Extensive evaluation of generative replay methods

Generative replay with feedback connections as a general strategy for continual learning , (2018) by Michiel van der Ven and Andreas S. Tolias [bib]

smarter Generative Replay

Continual learning with deep generative replay , (2017) by Shin, Hanul, Lee, Jung Kwon, Kim, Jaehong and Kim, Jiwon [bib]

Introduces generative replay

Dynamic Architectures or Routing Methods

ORACLE: Order Robust Adaptive Continual Learning , (2019) by Jaehong Yoon and Saehoon Kim and Eunho Yang and Sung Ju Hwang [bib]
Random Path Selection for Incremental Learning , (2019) by Jathushan Rajasegaran and Munawar Hayat and Salman H. Khan and Fahad Shahbaz Khan and Ling Shao [bib]

Proposes a random path selection algorithm, called RPSnet, that progressively chooses optimal paths for the new tasks while encouraging parameter sharing and reuse

Incremental Learning through Deep Adaptation , (2018) by Amir Rosenfeld and John K. Tsotsos [bib]
Continual Learning in Practice , (2018) by Diethe, Tom, Borchert, Tom, Thereska, Eno, Pigem, Borja de Balle and Lawrence, Neil [bib]

Proposes a reference architecture for a continual learning system

Progressive Neural Networks , (2016) by {Rusu}, A.~A., {Rabinowitz}, N.~C., {Desjardins}, G., {Soyer}, H., {Kirkpatrick}, J., {Kavukcuoglu}, K., {Pascanu}, R. and {Hadsell}, R. [bib]

Each task have a specific model connected to the previous ones

Hybrid Methods

Continual learning with hypernetworks , (2020) by Johannes von Oswald, Christian Henning, João Sacramento and Benjamin F. Grewe [bib]

Learning task-conditioned hypernetworks for continual learning as well as task embeddings; hypernetwors offers good model compression.

Compacting, Picking and Growing for Unforgetting Continual Learning , (2019) by Hung, Ching-Yi, Tu, Cheng-Hao, Wu, Cheng-En, Chen, Chien-Hung, Chan, Yi-Ming and Chen, Chu-Song [bib]

Approach leverages the principles of deep model compression, critical weights selection, and progressive networks expansion. All enforced in an iterative manner

Continual Few-Shot Learning

Wandering Within a World: Online Contextualized Few-Shot Learning , (2020) by Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer and Richard S. Zemel [bib]

proposes a new continual few-shot setting where spacial and temporal context can be leveraged to and unseen classes need to be predicted

Defining Benchmarks for Continual Few-Shot Learning , (2020) by Antoniou, Antreas, Patacchiola, Massimiliano, Ochal, Mateusz and Storkey, Amos [bib]

(title is a good enough summary)

Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning , (2020) by Caccia, Massimo, Rodriguez, Pau, Ostapenko, Oleksiy, Normandin, Fabrice, Lin, Min, Caccia, Lucas, Laradji, Issam, Rish, Irina, Lacoste, Alexandre, Vazquez, David and others [bib]

Proposes a new approach to CL evaluation more aligned with real-life applications, bringing CL closer to Online Learning and Open-World learning

Learning from the Past: Continual Meta-Learning via Bayesian Graph Modeling , (2019) by Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Mahsa Baktashmotlagh and Yang Yang [bib]
Online Meta-Learning , (2019) by Finn, Chelsea, Rajeswaran, Aravind, Kakade, Sham and Levine, Sergey [bib]

defines Online Meta-learning; propsoses Follow the Meta Leader (FTML) (~ Online MAML)

Reconciling meta-learning and continual learning with online mixtures of tasks , (2019) by Jerfel, Ghassen, Grant, Erin, Griffiths, Tom and Heller, Katherine A [bib]

Meta-learns a tasks structure; continual adaptation via non-parametric prior

Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL , (2019) by Anusha Nagabandi, Chelsea Finn and Sergey Levine [bib]

Formulates an online learning procedure that uses SGD to update model parameters, and an EM with a Chinese restaurant process prior to develop and maintain a mixture of models to handle non-stationary task distribution

Task Agnostic Continual Learning via Meta Learning , (2019) by Xu He, Jakub Sygnowski, Alexandre Galashov, Andrei A. Rusu, Yee Whye Teh and Razvan Pascanu [bib]

Introduces What & How framework; enables Task Agnostic CL with meta learned task inference

Meta-Continual Learning

La-MAML: Look-ahead Meta Learning for Continual Learning , (2020) by Gunshi Gupta, Karmesh Yadav and Liam Paull [bib]

Proposes an online replay-based meta-continual learning algorithm with learning-rate modulation to mitigate catastrophic forgetting

Learning to Continually Learn , (2020) by Beaulieu, Shawn, Frati, Lapo, Miconi, Thomas, Lehman, Joel, Stanley, Kenneth O, Clune, Jeff and Cheney, Nick [bib]

Follow-up of OML. Meta-learns an activation-gating function instead.

Meta-Learning Representations for Continual Learning , (2019) by Javed, Khurram and White, Martha [bib]

Introduces Learns how to continually learn (OML) i.e. learns how to do online updates without forgetting.

Meta-learnt priors slow down catastrophic forgetting in neural networks , (2019) by Spigler, Giacomo [bib]

Learning MAML in a Meta continual learning way slows down forgetting

Learning to learn without forgetting by maximizing transfer and minimizing interference , (2018) by Riemer, Matthew, Cases, Ignacio, Ajemian, Robert, Liu, Miao, Rish, Irina, Tu, Yuhai and Tesauro, Gerald [bib]

Lifelong Reinforcement Learning

Continual learning for robotics: Definition, framework, learning strategies, opportunities and challenges , (2020) by Timothée Lesort, Vincenzo Lomonaco, Andrei Stoian, Davide Maltoni, David Filliat and Natalia Díaz-Rodríguez [bib]
Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL , (2019) by Anusha Nagabandi, Chelsea Finn and Sergey Levine [bib]

Formulates an online learning procedure that uses SGD to update model parameters, and an EM with a Chinese restaurant process prior to develop and maintain a mixture of models to handle non-stationary task distribution

Experience replay for continual learning , (2019) by Rolnick, David, Ahuja, Arun, Schwarz, Jonathan, Lillicrap, Timothy and Wayne, Gregory [bib]

Continual Generative Modeling

Continual Unsupervised Representation Learning , (2019) by Dushyant Rao, Francesco Visin, Andrei A. Rusu, Yee Whye Teh, Razvan Pascanu and Raia Hadsell [bib]

Introduces unsupervised continual learning (no task label and no task boundaries)

Generative Models from the perspective of Continual Learning , (2019) by Lesort, Timoth{'e}e, Caselles-Dupr{'e}, Hugo, Garcia-Ortiz, Michael, Goudou, Jean-Fran{\c c}ois and Filliat, David [bib]

Extensive evaluation of CL methods for generative modeling

Lifelong Generative Modeling , (2017) by Ramapuram, Jason, Gregorova, Magda and Kalousis, Alexandros [bib]

Applications

CLOPS: Continual Learning of Physiological Signals , (2020) by Kiyasseh, Dani, Zhu, Tingting and Clifton, David A [bib]

a healthcare-specific replay-based method to mitigate destructive interference during continual learning

LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning , (2020) by Fan-Keng Sun, Cheng-Hao Ho and Hung-Yi Lee [bib]
Compositional Language Continual Learning , (2020) by Yuanpeng Li, Liang Zhao, Kenneth Church and Mohamed Elhoseiny [bib]

method for compositional continual learning of sequence-to-sequence models

Unsupervised real-time anomaly detection for streaming data , (2017) by Ahmad, Subutai, Lavin, Alexander, Purdy, Scott and Agha, Zuha [bib]

HTM applied to real-world anomaly detection problem

Continuous online sequence learning with an unsupervised neural network model , (2016) by Cui, Yuwei, Ahmad, Subutai and Hawkins, Jeff [bib]

HTM applied to a prediction problem of taxi passenger demand

Thesis

Continual Learning: Tackling Catastrophic Forgetting in Deep Neural Networks with Replay Processes , (2020) by Timothée Lesort [bib]
Continual Learning with Deep Architectures , (2019) by Vincenzo Lomonaco [bib]
Continual Learning in Neural Networks , (2019) by Aljundi, Rahaf [bib]
Continual learning in reinforcement environments , (1994) by Ring, Mark Bishop [bib]

Workshops

Workshop on Continual Learning at ICML 2020 , (2020) by Rahaf Aljundi, Haytham Fayek, Eugene Belilovsky, David Lopez-Paz, Arslan Chaudhry, Marc Pickett, Puneet Dokania, Jonathan Schwarz, Sayna Ebrahimi [bib]
4th Lifelong Machine Learning Workshop at ICML 2020 , (2020) by Shagun Sodhani, Sarath Chandar, Balaraman Ravindran and Doina Precup [bib]

pythonfirst / continual_learning_papers Goto Github PK

continual_learning_papers's Introduction