continualai / avalanche

Avalanche: an End-to-End Library for Continual Learning based on PyTorch.

Home Page: http://avalanche.continualai.org

License: MIT License

Languages: Python 80.46%, Batchfile 0.01%, Shell 0.13%, Jupyter Notebook 19.40%, Dockerfile 0.01%
Topics: continual-learning, deep-learning, pytorch, lifelong-learning, framework, benchmarks, strategies, metrics, continualai, evaluation

avalanche's Introduction

Avalanche: an End-to-End Library for Continual Learning

Avalanche Website | Getting Started | Examples | Tutorial | API Doc | Paper | Twitter


Avalanche is an end-to-end Continual Learning library based on PyTorch, born within ContinualAI with the unique goal of providing a shared and collaborative open-source (MIT licensed) codebase for fast prototyping, training and reproducible evaluation of continual learning algorithms.

โš ๏ธ Looking for continual learning baselines? In the CL-Baseline sibling project based on Avalanche we reproduce seminal papers results you can directly use in your experiments!

Avalanche can help Continual Learning researchers in several ways:

  • Write less code, prototype faster & reduce errors
  • Improve reproducibility, modularity and reusability
  • Increase code efficiency, scalability & portability
  • Augment impact and usability of your research products

The library is organized into five main modules:

  • Benchmarks: This module maintains a uniform API for data handling: mostly generating a stream of data from one or more datasets. It contains all the major CL benchmarks (similar to what has been done for torchvision).
  • Training: This module provides all the necessary utilities concerning model training. This includes simple and efficient ways to implement new continual learning strategies, as well as a set of pre-implemented CL baselines and state-of-the-art algorithms you can use for comparison!
  • Evaluation: This module provides all the utilities and metrics that can help evaluate a CL algorithm with respect to all the factors we believe to be important for a continually learning system. It also includes advanced logging and plotting features, including native TensorBoard support.
  • Models: This module provides utilities to implement model expansion and task-aware models, along with a set of pre-trained models and popular architectures that can be used for your continual learning experiments (similar to what has been done in torchvision.models).
  • Logging: It includes advanced logging and plotting features, including native stdout, file and TensorBoard support (how cool is it to have a complete, interactive dashboard tracking your experiment metrics in real time with a single line of code? See the sketch right after this list).
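
As a rough illustration of how the Evaluation and Logging modules plug into a strategy, here is a minimal sketch based on the public Avalanche API (metric helpers, loggers and EvaluationPlugin); exact module paths and keyword arguments may differ slightly between Avalanche versions, so treat it as a sketch rather than a definitive snippet.

from avalanche.evaluation.metrics import accuracy_metrics, loss_metrics
from avalanche.logging import InteractiveLogger, TensorboardLogger
from avalanche.training.plugins import EvaluationPlugin

# Collect accuracy and loss at several granularities and send them
# both to stdout and to TensorBoard.
eval_plugin = EvaluationPlugin(
    accuracy_metrics(minibatch=True, epoch=True, experience=True, stream=True),
    loss_metrics(minibatch=True, epoch=True, experience=True, stream=True),
    loggers=[InteractiveLogger(), TensorboardLogger()],
)

# The plugin can then be passed to any strategy, e.g.:
# cl_strategy = Naive(model, optimizer, criterion, evaluator=eval_plugin)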

Avalanche is the first experiment of an end-to-end library for reproducible continual learning research & development, where you can find benchmarks, algorithms, evaluation metrics and much more, all in the same place.

Let's make it together 🧑‍🤝‍🧑 a wonderful ride! 🎈

Check out below how you can start using Avalanche! 👇

Quick Example

import torch
from torch.nn import CrossEntropyLoss
from torch.optim import SGD

from avalanche.benchmarks.classic import PermutedMNIST
from avalanche.models import SimpleMLP
from avalanche.training import Naive


# Config
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model
model = SimpleMLP(num_classes=10)

# CL Benchmark Creation
perm_mnist = PermutedMNIST(n_experiences=3)
train_stream = perm_mnist.train_stream
test_stream = perm_mnist.test_stream

# Prepare for training & testing
optimizer = SGD(model.parameters(), lr=0.001, momentum=0.9)
criterion = CrossEntropyLoss()

# Continual learning strategy
cl_strategy = Naive(
    model, optimizer, criterion, train_mb_size=32, train_epochs=2,
    eval_mb_size=32, device=device)

# train and test loop over the stream of experiences
results = []
for train_exp in train_stream:
    cl_strategy.train(train_exp)
    results.append(cl_strategy.eval(test_stream))

Current Release

Avalanche is a framework in constant development. Thanks to the support of the ContinualAI community and its active members, we are quickly extending its features and improving its usability based on the demands of our research community!

At the moment, Avalanche is in Beta. We support several Benchmarks, Strategies and Metrics which, we believe, make it the best tool out there for your continual learning research! 💪

You can install Avalanche by running pip install avalanche-lib.
This will install the core Avalanche package. You can install Avalanche with extra packages to enable more functionalities.
Look here for a more complete guide on the different ways available to install Avalanche.
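
After installation, a quick sanity check in Python (assuming the package exposes a __version__ attribute, as recent releases do):

import avalanche
print(avalanche.__version__)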

Getting Started

We know that learning a new tool may be tough at first. This is why we made Avalanche as easy as possible to learn, with a set of resources that will help you along the way. For example, you may start with our 5-minute guide, which will teach you the basics of Avalanche and how you can use it in your research project.

We have also prepared for you a large set of examples & snippets you can plug directly into your code and play with.

Having completed these two sections, you will already feel like you have superpowers ⚡. This is why we have also created an in-depth tutorial that covers all the aspects of Avalanche in detail and will make you a true Continual Learner! 👩‍🎓

Cite Avalanche

If you use Avalanche in your research project, please remember to cite our JMLR-MLOSS paper https://jmlr.org/papers/v24/23-0130.html. This will help us make Avalanche better known in the machine learning community, ultimately making it a better tool for everyone:

@article{JMLR:v24:23-0130,
  author  = {Antonio Carta and Lorenzo Pellegrini and Andrea Cossu and Hamed Hemati and Vincenzo Lomonaco},
  title   = {Avalanche: A PyTorch Library for Deep Continual Learning},
  journal = {Journal of Machine Learning Research},
  year    = {2023},
  volume  = {24},
  number  = {363},
  pages   = {1--6},
  url     = {http://jmlr.org/papers/v24/23-0130.html}
}

You can also cite the previous CLVision @ CVPR2021 workshop paper: "Avalanche: an End-to-End Library for Continual Learning".

@InProceedings{lomonaco2021avalanche,
    title={Avalanche: an End-to-End Library for Continual Learning},
    author={Vincenzo Lomonaco and Lorenzo Pellegrini and Andrea Cossu and Antonio Carta and Gabriele Graffieti and Tyler L. Hayes and Matthias De Lange and Marc Masana and Jary Pomponi and Gido van de Ven and Martin Mundt and Qi She and Keiland Cooper and Jeremy Forest and Eden Belouadah and Simone Calderara and German I. Parisi and Fabio Cuzzolin and Andreas Tolias and Simone Scardapane and Luca Antiga and Subutai Amhad and Adrian Popescu and Christopher Kanan and Joost van de Weijer and Tinne Tuytelaars and Davide Bacciu and Davide Maltoni},
    booktitle={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition},
    series={2nd Continual Learning in Computer Vision Workshop},
    year={2021}
}

Maintained by ContinualAI Lab

Avalanche is the flagship open-source collaborative project of ContinualAI: a non-profit research organization and the largest open community on Continual Learning for AI.

Do you have a question, do you want to report an issue or simply ask for a new feature? Check out the Questions & Issues center. Do you want to improve Avalanche yourself? Follow these simple rules on How to Contribute.

The Avalanche project is maintained by the collaborative research team ContinualAI Lab and used extensively by the Units of the ContinualAI Research (CLAIR) consortium, a research network of the major continual learning stakeholders around the world.

We are always looking for new awesome members willing to join the ContinualAI Lab, so check out our official website if you want to learn more about us and our activities, or contact us.

Learn more about the Avalanche team and all the people who made it great!


avalanche's People

Contributors

albinsou, andreacossu, andrearosasco, antoniocarta, ashok-arjun, continualai-bot, danielanthes, digantamisra98, ffeng1996, geremiapompei, ggraffieti, hamedhemati, imprashr, jd730, julioushurtado, linzhiqiu, lrzpellegrini, mathieu4141, mattdl, nicklucche, niniack, patrickzh, pkraison, rmassidda, scumatteo, timmhess, tomveniat, travela, tyler-hayes, vlomonaco


avalanche's Issues

Add MAC metric

It would be nice to also have a MAC (multiply-accumulate operations) metric. It's difficult to compute in native PyTorch for every possible layer, but it would be a nice, hardware-independent metric.

Any idea on how to do this easily in PyTorch?
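
One possible direction, sketched below, is to count multiply-accumulate operations with forward hooks on the most common layer types; this is a rough, hardware-independent estimate and not an existing Avalanche metric (names and layer coverage are illustrative):

import torch
import torch.nn as nn

def count_macs(model, input_shape=(1, 3, 32, 32)):
    macs = 0

    def hook(module, inputs, output):
        nonlocal macs
        if isinstance(module, nn.Linear):
            macs += module.in_features * module.out_features
        elif isinstance(module, nn.Conv2d):
            # MACs per output element = (in_channels / groups) * kh * kw
            out_elems = output.shape[1] * output.shape[2] * output.shape[3]
            per_elem = (module.in_channels // module.groups) * \
                module.kernel_size[0] * module.kernel_size[1]
            macs += out_elems * per_elem

    handles = [m.register_forward_hook(hook) for m in model.modules()]
    with torch.no_grad():
        model(torch.zeros(input_shape))
    for h in handles:
        h.remove()
    return macs

Extending the hook to other layer types (BatchNorm, attention, etc.) is where the per-layer difficulty mentioned above comes in.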

Add Timing Metric

We should add a simple metric that keeps track of the elapsed time of the experiment.
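
A minimal sketch of such a tracker (illustrative names, not an existing Avalanche class):

import time

class ElapsedTime:
    def __init__(self):
        self._start = None
        self.elapsed = 0.0

    def start(self):
        self._start = time.perf_counter()

    def stop(self):
        self.elapsed += time.perf_counter() - self._start
        self._start = None

The strategy (or an evaluation plugin) would call start() before training on an experience and stop() right after, accumulating the total elapsed time of the experiment.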

Automatic testing, style checking and deployment

We should release Avalanche with a proper Continuous Integration system.

The CI setup should include:

  • Build & Test (of master and pull requests)
  • Package deployment (pypi?, conda?)
  • Docs deployment

@ggraffieti is already working on #3 for docs: I think that it'll be a good starting point.

Major obstacles are:

  • Defining a complete (Travis?) CI setup
  • Define tests for each and every part of Avalanche!!
  • Define the release channels

I'm creating this issue as "low priority", but we should definitely not release the 0.1.0 version of Avalanche before a decent CI setup has been defined.

Add generic New Classes scenario manager

This is already implemented in my private codebase; I'm working on porting it to Avalanche.

This class will allow the user to create a NC (New Classes) scenario given a couple of generic train and test Datasets.

The user will be able to create a manager instance that will be an iterable. This iterable will output the incremental "task"s or "batch"es (terminology to be defined) and will also allow the user to execute more complex task/batch management operations.

This is very similar to the current loader being implemented in the Avalanche codebase, but will allow the user to plug in his/her own dataset. Also, being extremely generic, this will speed up the integration of new datasets.

The code I've already implemented in my private codebase works fine, but is complex; I'm working on slimming it down a bit. Here is a list of already implemented features (a condensed sketch of the core idea follows the list below); please feel free to comment if you feel we need even more features!

Features (already implemented):

  • Get current/cumulative/task-specific train/test datasets
  • Variable number of tasks (or "incremental batches" for task-free scenarios)
  • Allow the user to customize the number of classes in each task
  • Class shuffling given a seed
  • Ability to define a fixed class order (for results reproducibility)
  • Remapping original class IDs to range(0, n_classes) (very useful when creating confusion matrices and for algorithms based on dynamic head expansion, which require class indices in ascending order)
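
As a condensed sketch of the core idea (class shuffling with a seed, optional fixed class order, per-task subsets), using plain PyTorch/NumPy utilities; the function and argument names are illustrative and do not match the final Avalanche API:

import numpy as np
from torch.utils.data import Subset

def nc_tasks(dataset, targets, n_tasks, seed=0, fixed_class_order=None):
    targets = np.asarray(targets)
    classes = np.unique(targets)
    order = (np.array(fixed_class_order) if fixed_class_order is not None
             else np.random.RandomState(seed).permutation(classes))
    for task_classes in np.array_split(order, n_tasks):
        idx = np.where(np.isin(targets, task_classes))[0]
        # Each yielded item is one incremental "task"/"batch" of the stream.
        yield Subset(dataset, idx.tolist()), task_classes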

Side features (already implemented):

  • Wrapper Dataset class that makes any Dataset sliceable and fancy-indexable
  • Wrapper Dataset class that makes any Dataset transformable (like the ones in torchvision)

To be defined (even in future development phases):

  • Terminology: the manager will output "task"s, that is, training/test sets made only of patterns of certain classes. The question is: is the "task" terminology OK, or is it too closely tied to task-oriented scenarios? Consider that, apart from the terminology considerations, users will be able to use this manager both in multi-task and task-free (single incremental task) scenarios...

Feel free to comment.

Keep up the excellent work you've been doing!

CWR* and AR1 optimizer initialization

CWR* and AR1 can either use an optimizer passed by the caller, or create one with the lr and momentum parameters.

I think the optimizer should not be an argument, because right now these strategies are silently overriding the user's choice. Let me know if there is a reason for this; otherwise, I will remove the parameter.
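
One way to keep both options without silently overriding the user's choice, sketched below (the class name is illustrative, not the actual CWR*/AR1 code): build a default SGD optimizer only when the caller does not provide one.

from torch.optim import SGD

class AR1Like:
    def __init__(self, model, optimizer=None, lr=0.001, momentum=0.9):
        # Respect a user-provided optimizer; create a default one only as a fallback.
        self.optimizer = optimizer if optimizer is not None else \
            SGD(model.parameters(), lr=lr, momentum=momentum)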

Metrics and EvalProtocol API

Metrics and EvalProtocol are a little bit unclear to me.

  • What is EvalProtocol's job? Most of the code implements Tensorboard logging operations, but the name hints at something more than that.
  • Right now metrics do not have a uniform API, and each one takes different arguments for its compute method. Each time we add a new metric, we also have to add a new if case inside EvalProtocol's get_results.

I would prefer a generic EvalProtocol that controls printing and logging and only delegates the computations to the metrics (e.g. instead of printing inside compute, EvalProtocol calls the metric's __str__ method). I would also prefer to be able to choose where to print the metrics (output file, tensorboard, stdout).
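
As a rough illustration of the proposed direction (purely hypothetical names, not the current API): every metric exposes the same update/result interface, and the protocol only dispatches and logs.

class Metric:
    def update(self, **state):
        raise NotImplementedError

    def result(self):
        raise NotImplementedError

class GenericEvalProtocol:
    def __init__(self, metrics, loggers):
        self.metrics, self.loggers = metrics, loggers

    def after_step(self, **state):
        for m in self.metrics:
            m.update(**state)
        results = {type(m).__name__: m.result() for m in self.metrics}
        for logger in self.loggers:
            # A logger can target stdout, an output file or TensorBoard.
            logger.log(results)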

Add Tensorboard Logging Object

We need a custom TensorBoard logging class to pass to the evaluation protocol class. This class is important to give the user more control over TensorBoard logging, regardless of the chosen metrics.
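
A minimal sketch of such a wrapper around torch.utils.tensorboard.SummaryWriter (class and method names are illustrative, not the final API):

from torch.utils.tensorboard import SummaryWriter

class TensorboardLogging:
    def __init__(self, log_dir="./tb_logs"):
        self.writer = SummaryWriter(log_dir=log_dir)

    def log_scalar(self, tag, value, step):
        self.writer.add_scalar(tag, value, step)

    def close(self):
        self.writer.close()

The evaluation protocol would receive an instance of this class and forward metric values to it, independently of which metrics the user selected.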

pytorchcv ImportError

I created a new environment for the project.
Using the environment.yml file I was not able to install pytorchcv through conda.

I used pip to install pytorchcv 0.0.58.

However, the example in examples/getting_started.py is not working for me.

ImportError: cannot import name 'DwsConvBlock' from 'pytorchcv.models.mobilenet' (/home/carta/anaconda3/envs/avalanche-env/lib/python3.8/site-packages/pytorchcv/models/mobilenet.py)

Maybe I need a different version of the package?
@vlomonaco or anyone else that is able to run the examples, can you tell me your pytorchcv version?
You can check it with pip list | grep pytorchcv.

I do not know the library but from a quick look at the implementation it seems that they refactored their code, changing the name and location of the convolutional blocks.

Add generic New Instances scenario manager

Similar to the generic New Classes manager, a New Instances manager will allow us to streamline the creation of New Instances (NI) benchmarks (a rough sketch of the class-balancing idea follows the feature lists below).

The NI Manager should mainly focus on the SIT scenario.

Key features:

  • Support for PyTorch Datasets
  • Support for the SIT scenario
  • Different options to balance class distribution among batches

The NI manager should also include features found in the NCScenario. For instance:

  • Getter for current/past/growing/future train and test sets
  • List of already encountered classes (considering this is an NI scenario, a counter of the number of patterns for each class should be exposed to the user)
  • Customizable size (n. of patterns) of incremental batches
  • ...
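
A rough sketch of the class-balancing idea (each class's instances are shuffled with a seed and split evenly across the incremental batches); the function and argument names are illustrative:

import numpy as np
from torch.utils.data import Subset

def ni_batches(dataset, targets, n_batches, seed=0):
    rng = np.random.RandomState(seed)
    targets = np.asarray(targets)
    batch_indices = [[] for _ in range(n_batches)]
    for c in np.unique(targets):
        idx = rng.permutation(np.where(targets == c)[0])
        for b, chunk in enumerate(np.array_split(idx, n_batches)):
            batch_indices[b].extend(chunk.tolist())
    return [Subset(dataset, idx) for idx in batch_indices]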

Strategy interface proposal

I am starting to play around with the Strategy class and I would like to propose some changes:

  • We need a method to change the loss function and add regularizers. Currently, if you want to define a regularization-based method you must redefine the train method with the entire training loop, which causes a lot of code duplication. I would like to have a compute_loss method that gets called inside train (see the sketch at the end of this issue).
  • The callbacks (after_train, before_train, ...) are all implemented as abstract methods. This means that each new strategy must define these methods itself, and most strategies will implement them as empty methods. Can't we give them a default empty implementation?
  • The multi_head attribute is not used. What is it for?

Finally, I think that we should try to separate logging code (tensorboard, print statements) from training code. The EvalProtocol should be the only one doing the logging. However, this is less urgent right now.

I can do the changes, but first I wanted to discuss them with you.
Notice that already existing code is not affected by these changes.
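
For reference, a purely illustrative sketch of the proposed shape of the class (not the actual Avalanche Strategy code): no-op default callbacks and a compute_loss hook called inside train.

class BaseStrategy:
    def __init__(self, model, optimizer, criterion):
        self.model, self.optimizer, self.criterion = model, optimizer, criterion

    # Default no-op callbacks: subclasses override only what they need.
    def before_train(self): pass
    def after_train(self): pass

    def compute_loss(self, logits, targets):
        # Regularization-based strategies override this instead of `train`.
        return self.criterion(logits, targets)

    def train(self, loader):
        self.before_train()
        for x, y in loader:
            self.optimizer.zero_grad()
            loss = self.compute_loss(self.model(x), y)
            loss.backward()
            self.optimizer.step()
        self.after_train()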

Fix Strategy API to suit the new Datasets API

The Strategy API should be changed to suit the new Datasets API, i.e. it should be possible to train on a batch_info object. Since all the strategies will break because of this, we can create a new Strategy file for now!

Duplicated ICifar100 dataset

ICifar100 dataset is present both in benchmarks/cdata_loaders/cifar_split.py and in benchmarks/cdata_loaders/icifar100.py. The versions however are different.
Is there a reason to keep two different versions? Otherwise we should either keep the correct version or merge them into icifar100.py.

Moreover, in cifar_split.py there is a typo in the get_grow_test_set method, which should be named get_growing_testset in order to be compliant with the Avalanche interface.

Figures are never closed in CM metrics

Matplotlib complains that figures are left open when creating images of Confusion Matrices.

The exact warning is:

avalanche/evaluation/metrics.py:292: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam figure.max_open_warning).

We should consider closing them in a "finally" block.
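
A minimal sketch of the fix (assuming the Agg backend when rendering to an array; the exact conversion code in metrics.py may differ):

import numpy as np
import matplotlib.pyplot as plt

def confusion_matrix_image(cm):
    fig, ax = plt.subplots()
    try:
        ax.imshow(cm)
        fig.canvas.draw()
        # Grab the rendered canvas as an RGB array before releasing the figure.
        image = np.asarray(fig.canvas.buffer_rgba())[..., :3].copy()
    finally:
        plt.close(fig)  # always release the figure, even if rendering fails
    return image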

More clear visualization of build results

When a pull request is opened, it is built by Travis CI to evaluate the correctness of the code. Right now Travis only signals whether a build passed or failed, without any further information (e.g. why it failed, what the errors are, etc.). It would be nice to have more feedback from Travis, in order to immediately know why a build failed without the need to open Travis and inspect the console.

  • Search for CI bots that display the errors in the pull request discussion.
  • Split the build into several parts (linting, tests, documentation) in order to know exactly what went wrong.
  • Have a look at circleCI or other alternative build systems.

New Usage Examples

Add additional usage examples in the "examples" directory as a showcase of the avalanche functionalities.

Disk Usage Metric

We need to add a metric related to disk usage: it would be nice to have both I/O usage and a check on the size of an additional directory that each strategy can use to store things that are not kept in RAM.
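
A rough sketch of both pieces (directory size via os.walk, system-wide I/O counters via psutil, which would be an extra dependency):

import os
import psutil  # assumed as an optional dependency, used only for I/O counters

def directory_size_bytes(path):
    total = 0
    for root, _, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def io_counters():
    # System-wide read/write byte counters since boot.
    counters = psutil.disk_io_counters()
    return counters.read_bytes, counters.write_bytes

The metric would sample these values before and after each experience and report the difference.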

Add Features statistics in Tensorboard

We need to add some feature statistics in TensorBoard.
The user should be able to specify them during the creation of the TensorBoard object. For now, the TensorboardLogging object is created in eval_protocol.py.

CIFAR-10 Dataset

We need to add the CIFAR-10 dataset, split into incremental batches of 2 classes each.
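
For reference, this is now covered by the SplitCIFAR10 classic benchmark; a minimal usage sketch, assuming the current classic-benchmarks API (argument names may differ across versions):

from avalanche.benchmarks.classic import SplitCIFAR10

# 5 experiences of 2 classes each (10 classes in total).
benchmark = SplitCIFAR10(n_experiences=5)
for experience in benchmark.train_stream:
    print(experience.current_experience, experience.classes_in_this_experience)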

EvalProtocol not working on Split MNIST

I tried to run getting_started.py with Split MNIST instead of Permuted MNIST, but EvalProtocol is crashing.
It is probably a problem caused by the missing classes in each task, but I haven't looked deeply at the code yet.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/avalanche/examples/getting_started.py in <module>
     59
     60     # testing
---> 61     results.append(clmodel.test(test_full))

~/avalanche/avalanche/training/strategies/strategy.py in test(self, test_set)
    183             self.after_task_test()
    184
--> 185         self.eval_protocol.update_tb_test(res, self.batch_processed)
    186
    187         self.after_test()

~/avalanche/avalanche/evaluation/eval_protocol.py in update_tb_test(self, res, step)
    100             in_out_scalars = {
    101                 "in_class": np.average(in_class_diff),
--> 102                 "out_class": np.average(out_class_diff)
    103             }
    104

<__array_function__ internals> in average(*args, **kwargs)

~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in average(a, axis, weights, returned)
    391
    392     if weights is None:
--> 393         avg = a.mean(axis)
    394         scl = avg.dtype.type(a.size/avg.size)
    395     else:

~/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
    149             is_float16_result = True
    150
--> 151     ret = umr_sum(arr, axis, dtype, out, keepdims)
    152     if isinstance(ret, mu.ndarray):
    153         ret = um.true_divide(

ValueError: operands could not be broadcast together with shapes (4,) (6,)

Add Progressive network strategy

I think it could be useful to add this quite popular approach. It could highlight pros and cons when monitored through the Memory and CPU Usage metrics.

Add, improve and consolidate API documentation

Before releasing our project to the public, the API should be well documented and easy to understand.

  • In many parts of the project the documentation is totally lacking or very poor. Every class, function and public method should be well documented, with a complete description of parameters, return values, etc.
  • In some parts of the project the documentation exists, but it's incomplete or should be improved.
  • The documentation style is inconsistent: some parts are written in the reST (reStructuredText) format, others in the Google style format. We've chosen reST as the preferred code documentation style, so converting all the apidoc to reST is necessary.

Install Avalanche Pip

We need to create the instructions to install the avalanche package directly from the master branch on GitHub, so we don't have to package it every time.
