continvvm / continuum
A clean and simple data loading library for Continual Learning
Home Page: https://continuum.readthedocs.io
License: MIT License
The "train" argument should be removed from the loader,
and the init of the PyTorch dataset should take no arguments.
I tested your snippet (provided below) on two devices and received this error:
from torch.utils.data import DataLoader
from continuum import ClassIncremental
from continuum.datasets import MNIST

clloader = ClassIncremental(
    MNIST("my/data/path", download=True),
    increment=1,
    initial_increment=5,
    train=True  # a different loader for test
)

print(f"Number of classes: {clloader.nb_classes}.")
print(f"Number of tasks: {clloader.nb_tasks}.")

for task_id, train_dataset in enumerate(clloader):
    train_dataset, val_dataset = split_train_val(train_dataset)
    train_loader = DataLoader(train_dataset)
    val_loader = DataLoader(val_dataset)
and the error is:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-c5312f5c2b3b> in <module>()
1 for task_id, train_dataset in enumerate(clloader):
----> 2 train_dataset, val_dataset = split_train_val(train_dataset)
3 train_loader = DataLoader(train_dataset)
4 val_loader = DataLoader(val_dataset)
/usr/local/lib/python3.6/dist-packages/continuum/task_set.py in split_train_val(dataset, val_split)
118
119 x, y = dataset.x, dataset.y
--> 120 train_dataset = TaskSet(x[train_indexes], y[train_indexes], dataset.trsf, dataset.open_image)
121 val_dataset = TaskSet(x[val_indexes], y[val_indexes], dataset.trsf, dataset.open_image)
122
AttributeError: 'TaskSet' object has no attribute 'open_image'
May I ask what the problem is?
Should be:
InMemoryDataset(x,y)
not
InMemoryDataset(x_train,y_train, x_test, y_test)
I think it would make it clearer
The name Continuum is already taken on PyPI by a CI project from 2014.
We need a name for PyPI. Alternative choices:
lifelong
deepcontinuum
torchcontinuum
EDIT: the owner of Continuum should give us the name soon.
It was chosen so as to force balance between classes, but in that case we cannot detect class imbalance within tasks.
NB: This issue should be solved in #29
Only the dataset is mentioned, not the scenarios, no?
Traceback (most recent call last):
  File "main_cl_cifar10_100_scl_prototype_inference_end_to_end.py", line 23, in <module>
    from continuum.task_set import split_train_val
  File "/home/mohammad/.local/lib/python3.6/site-packages/continuum/task_set.py", line 8, in <module>
    from continuum.viz import plot
ImportError: cannot import name 'plot'
For example, a camera filming objects: we cannot choose the data order (or, if we want to, we need to save the data in the algorithm).
It would be a kind of real-life loader, where you don't have much control over the data.
There is an error when download is set to False
Core50("/data/douillard/CORe50", download=False, train=True)
However, it works with
Core50("/data/douillard/CORe50", download=True, train=True)
Need #8
The exception is raised if "not nb_tasks.is_integer()" and the message is:
"The tasks won't have an equal number of classes"
f" with {len(self.class_order)} and increment {increment}"
I suspect a mismatch between exception condition and message :)
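A minimal sketch of how the condition and the message could be made to agree. It assumes the number of tasks is computed as len(class_order) / increment, which the quoted condition suggests but the thread does not confirm. `check_increment` is a hypothetical helper for illustration, not continuum's actual code.

```python
def check_increment(class_order, increment):
    """Return the number of tasks, or raise if the classes cannot be split evenly.

    Hypothetical helper: the name and the exact message are assumptions.
    """
    nb_tasks = len(class_order) / increment  # float division, so is_integer() exists
    if not nb_tasks.is_integer():
        # The message now describes exactly what the condition tests.
        raise ValueError(
            f"{len(class_order)} classes cannot be split into tasks of "
            f"{increment} classes each ({nb_tasks:.2f} tasks)."
        )
    return int(nb_tasks)
```

With ten classes, an increment of 2 passes the check while an increment of 3 raises.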
Hi,
Could you provide a snippet for loading data for the permuted MNIST scenario? I assumed that the following one does this. Is that right?
from continuum.datasets import MNIST
from continuum import InstanceIncremental

clloader = InstanceIncremental(MNIST(args.data, download=True),
                               nb_tasks=10,
                               train=True)
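For reference, permuted MNIST is usually described as applying one fixed pixel permutation per task while keeping the labels unchanged. Whether the snippet above does that is not settled in this thread. The sketch below shows the idea with NumPy only, independent of continuum. The function names are hypothetical and the images are random stand-ins for MNIST.

```python
import numpy as np


def make_permutations(nb_tasks, nb_pixels, seed=0):
    """Draw one fixed pixel permutation per task; task 0 keeps the original order."""
    rng = np.random.default_rng(seed)
    perms = [np.arange(nb_pixels)]
    for _ in range(nb_tasks - 1):
        perms.append(rng.permutation(nb_pixels))
    return perms


def permute_images(x, perm):
    """Apply the same pixel permutation to every image in x (shape N x H x W)."""
    flat = x.reshape(len(x), -1)
    return flat[:, perm].reshape(x.shape)


# Random stand-in for MNIST images (assumption: 28x28 grayscale, uint8).
x = np.random.default_rng(1).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
perms = make_permutations(nb_tasks=3, nb_pixels=28 * 28)
tasks = [permute_images(x, p) for p in perms]
```

Each entry of `tasks` holds the same images with the pixels shuffled in a task-specific but fixed way, so the labels can be reused as-is.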
Hi, I am investigating how continual learning can be implemented in my NLP classification project. I started with MultiNLI and tried to inspect the contents of the train_loader with a for loop, but an error occurred:
dataset = MultiNLI('./data')
scenario = InstanceIncremental(dataset)

for task_id, train_taskset in enumerate(scenario):
    print(task_id, dir(train_taskset))
    train_taskset, val_taskset = split_train_val(train_taskset, val_split=0.1)
    train_loader = DataLoader(train_taskset, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_taskset, batch_size=32, shuffle=True)
    for x, y, t in train_loader:
        print(x, y, t)
For full code:
https://colab.research.google.com/drive/1R8rYCo-0wzoiIUTE64Pko-GtfwbGGQ9C#scrollTo=R2SHCM83-Omg
For now, rehearsal samples are mixed with the others. We need a way to differentiate them.
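One possible way to tell them apart, sketched with NumPy only: keep a boolean flag per sample alongside the data. `merge_with_rehearsal` and the `is_rehearsal` array are hypothetical names, not part of continuum.

```python
import numpy as np


def merge_with_rehearsal(x_new, y_new, x_mem, y_mem):
    """Concatenate new and rehearsal samples and return a flag marking the latter."""
    x = np.concatenate([x_new, x_mem])
    y = np.concatenate([y_new, y_mem])
    is_rehearsal = np.concatenate([
        np.zeros(len(x_new), dtype=bool),  # samples of the current task
        np.ones(len(x_mem), dtype=bool),   # samples replayed from memory
    ])
    return x, y, is_rehearsal


x, y, is_rehearsal = merge_with_rehearsal(
    np.zeros((3, 2)), np.array([0, 1, 1]),
    np.ones((2, 2)), np.array([5, 6]),
)
rehearsal_labels = y[is_rehearsal]
```

The flag can then be used to weight losses differently or to filter the memory out when computing statistics on the current task.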
Hey there, thanks for your work!
imageio is used in only one place, in viz.py, and isn't listed as a dependency in the setup.py. Therefore installing continuum via pip install continuum works fine, but importing it raises an error.
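Besides listing imageio in the install requirements, a common pattern for a dependency used in a single place is to import it lazily, so that importing the package itself does not fail. A minimal sketch follows. `load_optional` is a hypothetical helper, and where it would be called inside viz.py is an assumption.

```python
import importlib


def load_optional(module_name, feature):
    """Import an optional dependency on first use, with a readable error if missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            f"Install it with `pip install {module_name}`."
        ) from err
```

A plotting function could then call `imageio = load_optional("imageio", "GIF export")` in its body, so the error only appears when that feature is actually used.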
Tests for transformed are currently quite slow because they apply transformations to the whole MNIST dataset.
We can speed it up by either:
We may have to create some dummy data for large datasets such as ImageNet.
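A minimal sketch of what such dummy data could look like, using NumPy only. The function name, the default image shape, and the samples-per-class layout are assumptions for illustration.

```python
import numpy as np


def make_dummy_dataset(nb_classes, samples_per_class, image_shape=(32, 32, 3), seed=0):
    """Create a small random image dataset with perfectly balanced classes."""
    rng = np.random.default_rng(seed)
    nb_samples = nb_classes * samples_per_class
    x = rng.integers(0, 256, size=(nb_samples, *image_shape), dtype=np.uint8)
    y = np.repeat(np.arange(nb_classes), samples_per_class)
    return x, y


x, y = make_dummy_dataset(nb_classes=10, samples_per_class=5)
```

A few samples per class is typically enough to exercise the task-splitting logic without downloading or transforming a full dataset.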
Need #8
Are you planning on adding other datasets?
Need to support:
For at least the datasets VOC and ADE20k.
When trying to run the example code, I get the following error:
from torch.utils.data import DataLoader
from continuum import ClassIncremental, split_train_val
from continuum.datasets import MNIST

clloader = ClassIncremental(
    MNIST("my/data/path", download=True),
    increment=1,
    initial_increment=5,
    train=True  # a different loader for test
)

print(f"Number of classes: {clloader.nb_classes}.")
print(f"Number of tasks: {clloader.nb_tasks}.")

for task_id, train_dataset in enumerate(clloader):
    train_dataset, val_dataset = split_train_val(train_dataset, val_split=0.1)
    train_loader = DataLoader(train_dataset)
    val_loader = DataLoader(val_dataset)
    for x, y in train_loader:
        print("Never gets here?")
        exit()
    # Do your cool stuff here
Number of classes: 10.
Number of tasks: 6.
Traceback (most recent call last):
File "foo.py", line 20, in <module>
for x, y in train_loader:
File "/home/fabrice/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
data = self._next_data()
File "/home/fabrice/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/home/fabrice/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/fabrice/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/fabrice/Source/SSCL/utils/continuum/continuum/task_set.py", line 96, in __getitem__
t = self.t[index]
TypeError: 'Compose' object does not support indexing
listing data index of each task from the beginning (for NI or NIC scenarios)
NIC scenarios
Hi @arthurdouillard,
Could you please tell me whether your data loader API supports the CIFAR-10/100 experiment? This is an experiment I found in a paper called "continual learning with hyper-networks". BTW, I found a module named Fellowship that seems to provide such a combination capability, but I am not sure. Could you verify it?
One possible solution is to have a binary matrix (nb_tasks * len(y)) that indicates which task(s) own each data point...
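A minimal sketch of that matrix, assuming each data point comes with the list of task ids it belongs to. `build_task_matrix` and its input format are hypothetical.

```python
import numpy as np


def build_task_matrix(task_ids_per_sample, nb_tasks):
    """Build a (nb_tasks x nb_samples) binary matrix.

    Entry [t, i] is 1 if sample i belongs to task t, else 0.
    """
    matrix = np.zeros((nb_tasks, len(task_ids_per_sample)), dtype=np.uint8)
    for i, task_ids in enumerate(task_ids_per_sample):
        matrix[list(task_ids), i] = 1
    return matrix


# Four samples; samples 1 and 3 are shared between two tasks.
membership = build_task_matrix([[0], [0, 1], [2], [1, 2]], nb_tasks=3)
indexes_task_1 = np.flatnonzero(membership[1])
```

Selecting the data of task t then reduces to indexing x and y with `np.flatnonzero(membership[t])`, and a sample can appear in several tasks.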
An example in the parameters description could be nice