xyutao / fscil
Official repository for Few-Shot Class-Incremental Learning (FSCIL)
Hi, I have read your paper "Few-Shot Class-Incremental Learning". It is a very interesting piece of groundbreaking work, and the experimental results are impressive. We are also working on few-shot class-incremental learning and hope to compare against your method in our experiments, so we would be very grateful if you could share your code. Mainly the following two parts:
The undisclosed parts of the code on GitHub, such as "tools.ng_anchor", "tools.loss", and "tools.plot".
The implementation code of the three few-shot class-incremental comparison methods in Table 1. Would you please send me the code? It would speed up my experiments. Thank you very much!
Hello, I have a question about the role of edge 'E' in graph G indicated in the paper.
Is the collection of the edge 'E' (and age 'a') used in the learning process? I interpret that edge 'E' is not used in the learning process anywhere. Was it just used for plotting?
Thank you for your great work.
On page 4, it is mentioned that the variance Λ_j is estimated using the feature vectors whose winner is j.
Can you elaborate on the meaning of "winner"? How exactly did you define the winner used for the variance?
Thank you for reading, and stay safe.
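In classic neural-gas training (my reading, not confirmed by the authors), the "winner" for a feature vector is simply the node whose centroid is nearest to it, and Λ_j would then be estimated from the features assigned to node j. A minimal NumPy sketch of that interpretation; the function name and layout are hypothetical:

```python
import numpy as np

def winner_variance(features, centroids):
    """For each node j, estimate a diagonal variance from the feature
    vectors whose nearest centroid (their "winner") is node j.
    One plausible reading of the paper, not the official code."""
    # Pairwise distances: (N, K) for N features and K node centroids.
    d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    winners = d.argmin(axis=1)                  # winner index per feature
    variances = np.zeros_like(centroids)
    for j in range(len(centroids)):
        assigned = features[winners == j]       # features won by node j
        if len(assigned) > 1:
            variances[j] = assigned.var(axis=0) # diagonal Λ_j estimate
    return winners, variances
```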
Hello, is there any related code missing, such as "tools.ng_anchor", "tools.loss", and "tools.plot"?
The work really looks great. The only problem is applying the same method to our own data for a classification task; I am unable to understand the data-preparation part for training.
It would be very helpful if you could take a new dataset (say, 2 classes: dog and cat) and then incrementally train the network on another 2 classes (say, horse and cow), so that the resulting model can classify all 4 classes.
Looking forward to your reply.
For CUB200, the ImageNet-pretrained ResNet18 model needs to be loaded for initialization. However, ImageNet contains prior knowledge about birds, and some of its images overlap with CUB-200. I am worried about whether this operation is consistent with the few-shot setting.
For datasets like CUB, do you have a test.txt? If not, how did you gather your test dataset (is it all the test samples for each class in each session)?
Thank you for your good research.
I'm practicing implementation, using the code from your research as an example, and a question came up. When verifying the accuracy of the base graph I built, I wondered what accuracy it should reach, at a minimum, for the base graph to be considered well constructed.
By any chance, could you tell me the accuracy you obtained on the base dataset when building the graph for the base classes (G1)?
Hi, I am interested in this few-shot class-incremental learning setting. I have the following question regarding dataloader.py:
fscil/dataloader/dataloader.py
Line 31 in 6dd827f
Hi @xyutao,
Thanks for the great work. I have a question about the ablation study "Comparison between exemplars and NG nodes": does "Memory" represent the number of exemplars and the number of NG nodes used for knowledge representation, respectively? If so, I am confused about how the two can be compared. In my opinion, since the representation types are different, the definition of "Memory" may not be aligned between the two settings; for example, the unit storage cost of an exemplar and of a node may differ, so comparing these two settings may be unfair, or they may not be comparable at all. Could you elaborate on this ablation setting, such as the motivation and the implementation details?
I really appreciate any help you can provide.
Hello,
Thank you for your interesting work.
I have a question concerning the full-shot setting. When you compared TOPIC with other state-of-the-art approaches, how many exemplars did you use for those approaches? And did you choose the exemplars randomly or by herding?
Thank you.
Thanks for your work. I have just entered the few-shot incremental field and am not familiar with the datasets. Could you please release datasets.py and the dataloader first, for reference? Thank you very much.
Hi, thanks for all of your help.
Do you have the exact numbers for the accuracies in Fig 4 of the paper?
Dear Xiaoyu,
I would like to congratulate you on your very interesting and novel problem of few-shot class-incremental learning. I am interested in working further in this direction, and I was wondering whether it would be possible to release the pre-trained network weights of your QuickNet and ResNet18 for every dataset, so that fair comparisons can be made by new methods that use this work as a reference point.
Hi, do you use the torchvision ResNet-18 architecture for all the datasets?
I am a bit confused, since CIFAR-100 images are 32x32 while miniImageNet images are 84x84.
Did you paste the wrong data for CIFAR100, ResNet18, 5-way 5-shot (Fig. 4(b)) in the README? It looks the same as the miniImageNet one: miniImageNet, ResNet18, 5-way 5-shot (Fig. 4(d)).
Can I ask how you calculate the accuracy for each session? I ran resnet18-ft-cnn.sh for CIFAR100 without -resume-from './params/CIFAR100/resnet18/baseline/baseline_resnet18_cifar_64.10.params' and got a best first-session (base session) accuracy of 70.23%, which is higher than the roughly 64% first-session accuracy of Ft-CNN in Fig. 4(b). I am not sure why there is such a big gap.
Hello, I think your work is great. I don’t know if there is any intention to publish the code.
Hi,
For your experiments, e.g. Figure 4 and Table 2, how many nodes did you save, and what is the memory overhead of saving that many nodes? Also, for iCaRL and EEIL, how many exemplars did you save?
I wonder how to get the results for sessions > 2. We train on the 25 images for some epochs; do we evaluate the model after each epoch and report the best result, or only after the last epoch?
Hello, I am reproducing the method in the paper (using the CIFAR100 dataset and the quick base net), but I am a little confused:
What is the centroid vector in the NG network you chose? Is it the output of the softmax layer? If I choose the softmax output as my centroid vector (100 dims), the values of the diagonal matrix Λ are very small, like 0.000000673, so Λ^-1 (the inverse of Λ) is very large, and the problem occurs.
What is the approximate value of the AL loss? Why are my AL loss values so extremely large?
About the MML loss: ''Given the new class training set D(t) and NG G(t), for a training sample (x, y) ∈ D(t), we extract its feature vector f = f(x; θ(t)) and feed f to the NG. We hope f matches the node v_j whose label c_j = y, and d(f, m_j) ≪ d(f, m_i), i ≠ j, so that x is more probable to be correctly classified.''
f is the feature of a new-class sample. How can I find a node whose label equals f's label y, given that f comes from the new classes' data?
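For what it's worth, one common reading of such label-matched losses is that the NG already contains nodes labelled with the new classes (e.g. nodes inserted when the session starts), and the matched node is the nearest node among those carrying label y. A hedged sketch; the function name and the hinge form are my assumptions, not the authors' implementation:

```python
import numpy as np

def min_margin_match(f, centroids, labels, y, margin=1.0):
    """Match f to the nearest node whose label equals y, and penalize it
    for being farther from f than the nearest node of any other label.
    Assumes nodes labelled with the new classes already exist in the NG."""
    d = np.linalg.norm(centroids - f, axis=1)   # distance of f to every node
    same = labels == y
    d_pos = d[same].min()                       # matched node v_j with c_j = y
    d_neg = d[~same].min()                      # closest node of another label
    # Hinge: encourage d(f, m_j) + margin <= d(f, m_i) for i != j.
    return max(0.0, d_pos - d_neg + margin)
```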
fscil/model/cifar_resnet_v1.py
Line 217 in e76d37c
Hi,
It seems that you are using ResNet-20 for CIFAR100 (1 + 3×2 + 3×2 + 3×2 + 1 = 20). Have I misunderstood it?
I had the honor of reading your work! I have a simple, maybe stupid, question about the mechanism of neural gas. Most CIL works set a parameter for a fixed memory capacity. I wonder how the memory size for old data is controlled in neural gas (e.g. via node deletion)?
Thank you for taking the time to read this!
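Not an answer from the authors, but in classic (growing) neural gas, memory is bounded implicitly: edges age on every adaptation step, over-age edges are pruned, and nodes left without any edge are deleted. A toy sketch of that pruning step; the data layout and names are my own, purely illustrative:

```python
def prune_neural_gas(nodes, edges, ages, max_age=50):
    """nodes: {node_id: centroid}; edges: set of frozenset({i, j});
    ages: {edge: age}. Drop over-age edges, then isolated nodes."""
    # Remove edges whose age exceeds the threshold.
    edges = {e for e in edges if ages.get(e, 0) <= max_age}
    # A node survives only if it still participates in some edge.
    connected = {i for e in edges for i in e}
    nodes = {i: c for i, c in nodes.items() if i in connected}
    return nodes, edges
```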
Thanks for your great work. I am interested in it and am attempting to implement it in PyTorch, but I have run into several problems. I would appreciate it if you could answer my questions.
Q1: At session t=1, how do you initialize the centroid vector of each NG node? With k-means, or randomly?
Q2: When calculating the anchor loss, you extract a subgraph of G(t). Is there any restriction on the subgraph? G(t) has many subgraphs, so which one should be chosen to calculate the anchor loss?
Thank you very much!
Hello, is there any related code missing, such as "tools.loss" and "tools.plot"? When will the full code be released?
I took 100 CUB classes as base classes and trained the ResNet-18 network. I followed the same setting (50 epochs) mentioned in the paper, and it gives 69% base-task accuracy. But when I start from the same base classes using the NCM method (as in UCIR, where cosine normalization is applied to the class weights and the features are L2-normalized), the base-task performance is around 74%. In the paper, the base-task performance is given as 68.8% for both methods. Can you explain how that is possible, or whether a different training setting is needed for the NCM method on the base classes?
Thank you.
Hi @xyutao, thanks for your amazing work on FSCIL! I can't seem to find how the training image indices for few-shot training of new classes are chosen. Is it random? I'm trying to do FSCIL on a new dataset. If it is random, is there a specific seed being used as standard practice?
Do you report the average incremental accuracy [1], i.e. the weighted average accuracy over only those classes that have already been trained, as in the code of SDC [2] (https://github.com/yulu0724/SDC-IL/blob/master/test.py)?
```
if k == 0:
    acc_ave += acc * (float(args.base) / (args.base + task_id * num_class_per_task))
else:
    acc_ave += acc * (float(num_class_per_task) / (args.base + task_id * num_class_per_task))
```
[1] R. Aljundi, P. Chakravarty, and T. Tuytelaars. Expert gate: Lifelong learning with a network of experts. In CVPR, pages 3366–3375, 2017.
[2] L. Yu et al. Semantic drift compensation for class-incremental learning. In CVPR, 2020.
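For reference, the weighted average computed by the SDC snippet above can be written as a small standalone function (the names `base` and `num_class_per_task` follow that snippet; this is a sketch, not this repo's evaluation code):

```python
def average_incremental_accuracy(accs, base, num_class_per_task):
    """Weighted average of per-task accuracies at the final session:
    each task is weighted by its share of all classes trained so far.
    accs[0] is the base-task accuracy; accs[k] for k > 0 are new tasks."""
    task_id = len(accs) - 1                       # index of the last new task
    total = base + task_id * num_class_per_task   # classes trained so far
    acc_ave = 0.0
    for k, acc in enumerate(accs):
        weight = base if k == 0 else num_class_per_task
        acc_ave += acc * (float(weight) / total)
    return acc_ave
```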
When will the complete code be available?
Meanwhile, can you provide some instructions on how to run your code? For example: environment configuration, data preparation, running scripts, and result analysis. Besides, could you provide a PyTorch version of the code? Perhaps most people are not familiar with MXNet. Thank you very much.
Hi, thanks for sharing the training sets. I want to know how the 5-way 5-shot evaluation stage is constructed. Do you sample 5 ways and 5 shots from the training set (all classes and samples of the specific session)?
Please correct me if I'm wrong.
So let's say we are at session 2 (adding 5 classes) and there are 60+5 classes.
The 5 ways are sampled from the 65 classes, and the 5 shots are sampled from the training set (60 × initial samples + 5×5). -> This is an episode.
Then, for each episode, are all queries from the remaining samples (those not included in the training set) evaluated?
Thanks for your fantastic work, but I have a question about the validation dataset: did you split the training set into two parts (train and val) when training the base model, and how many validation samples per class did you use in the base and new sessions?