LLM-Continual-Learning-Papers

License: MIT

Must-read Papers on Large Language Model (LLM) Continual Learning


  1. Towards Continual Knowledge Learning of Language Models

    Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Stanley Jungkyu Choi, Minjoon Seo. [abs]. ICLR 2022.

  2. Continual Pre-Training Mitigates Forgetting in Language and Vision

    Andrea Cossu, Tinne Tuytelaars, Antonio Carta, Lucia Passaro, Vincenzo Lomonaco, Davide Bacciu. [abs]. Preprint 2022.05.

  3. Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora

    Xisen Jin, Dejiao Zhang, Henghui Zhu, Wei Xiao, Shang-Wen Li, Xiaokai Wei, Andrew Arnold, Xiang Ren. [abs]. NAACL 2022.

  4. Continual Training of Language Models for Few-Shot Learning

    Zixuan Ke, Haowei Lin, Yijia Shao, Hu Xu, Lei Shu, Bing Liu. [abs]. EMNLP 2022.

  5. Continual Pre-training of Language Models

    Zixuan Ke, Yijia Shao, Haowei Lin, Tatsuya Konishi, Gyuhak Kim, Bing Liu. [abs]. ICLR 2023.

  6. Progressive Prompts: Continual Learning for Language Models

    Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Amjad Almahairi. [abs]. ICLR 2023.

  7. A Unified Continual Learning Framework with General Parameter-Efficient Tuning

    Qiankun Gao, Chen Zhao, Yifan Sun, Teng Xi, Gang Zhang, Bernard Ghanem, Jian Zhang. [abs]. ICCV 2023.

  8. Semiparametric Language Models Are Scalable Continual Learners

    Guangyue Peng, Tao Ge, Si-Qing Chen, Furu Wei, Houfeng Wang. [abs]. Preprint 2023.02.

  9. Continual Pre-Training of Large Language Models: How to (re)warm your model?

    Kshitij Gupta, Benjamin Thérien, Adam Ibrahim, Mats L. Richter, Quentin Anthony, Eugene Belilovsky, Irina Rish, Timothée Lesort. [abs]. ICML 2023 Workshop.

  10. ConPET: Continual Parameter-Efficient Tuning for Large Language Models

    Chenyang Song, Xu Han, Zheni Zeng, Kuai Li, Chen Chen, Zhiyuan Liu, Maosong Sun, Tao Yang. [abs]. Preprint 2023.09.

  11. TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

    Xiao Wang, Yuansen Zhang, Tianze Chen, Songyang Gao, Senjie Jin, Xianjun Yang, Zhiheng Xi, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xuanjing Huang. [abs]. Preprint 2023.10.

  12. A Study of Continual Learning Under Language Shift

    Evangelia Gogoulou, Timothée Lesort, Magnus Boman, Joakim Nivre. [abs]. Preprint 2023.11.

  13. Scalable Language Model with Generalized Continual Learning

    ICLR 2024 Conference Submission1284 Authors. [openreview]. Preprint 2023.

  14. Efficient Continual Pre-training for Building Domain Specific Large Language Models

    ICLR 2024 Conference Submission4091 Authors. [openreview]. Preprint 2023.

Contributors

demoleiwang
