Yue Wang's personal website
rfs's Issues
why does imagenet need dropblock size 5 but cifar need dropblock size 2?
def get_resnet_rfs_model_mi(model_opt: str,
                            avg_pool=True,
                            drop_rate=0.1,
                            dropblock_size=5,
                            num_classes=64,
                            ) -> tuple[nn.Module, dict]:
    """
    Get resnet_rfs model according to the string model_opt
    e.g. model_opt = resnet12

    ref:
        - https://github.com/WangYueFt/rfs/blob/f8c837ba93c62dd0ac68a2f4019c619aa86b8421/models/util.py#L7
    """
    model_hps: dict = {'avg_pool': avg_pool,
                       'drop_rate': drop_rate,
                       'dropblock_size': dropblock_size,
                       'num_classes': num_classes}
    model: nn.Module = model_dict[model_opt](avg_pool=avg_pool,
                                             drop_rate=drop_rate,
                                             dropblock_size=dropblock_size,
                                             num_classes=num_classes)
    return model, model_hps

def get_resnet_rfs_model_cifarfs_fc100(model_opt: str,
                                       num_classes,
                                       avg_pool=True,
                                       drop_rate=0.1,
                                       dropblock_size=2,
                                       ) -> tuple[nn.Module, dict]:
    """
    ref:
        - https://github.com/WangYueFt/rfs/blob/f8c837ba93c62dd0ac68a2f4019c619aa86b8421/models/util.py#L7
    """
    model_hps: dict = {'avg_pool': avg_pool,
                       'drop_rate': drop_rate,
                       'dropblock_size': dropblock_size,
                       'num_classes': num_classes}
    model: nn.Module = model_dict[model_opt](avg_pool=avg_pool,
                                             drop_rate=drop_rate,
                                             dropblock_size=dropblock_size,
                                             num_classes=num_classes)
    return model, model_hps
About Multi-way and Multi-task comparison
In section 4.8, your paper compares the multi-task and multi-way classification pretraining settings to show that multi-way is better. I am a little confused about the setting for multi-task pretraining. Does that mean we split the data from one batch to form several classification tasks, e.g. 4NK images split into 4 N-way K-shot tasks? I would appreciate it if you could detail the settings for multi-task pretraining.
Issue with validation set for cifarfs?
pre-trained models
Hi,
Would it be possible to make available the pre-trained models, with the different data-sets?
e.g.
- checkpoint of model trained with union data-sets on mini-ImageNet
- checkpoint of model trained with MAML
etc.
Or whatever you have available.
Dumping them in a dropbox/gdrive would be amazing!
error using meta split train
error:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/brandomiranda/data/miniImageNet_rfs/miniImageNet/miniImageNet_category_split_train.pickle'
what is the contrastive sampling pre-processing for?
I saw this:
Line 52 in f8c837b
I don't recall reading about that in the paper, and it doesn't seem standard. Can you clarify what this is?
All checkpoints from authors
Hi,
I was wondering if the remaining checkpoints could be provided. I'm particularly interested in the cifarfs checkpoints.
Where can I get this file teacher.pth?
when I perform this command:
python train_distillation.py -r 0.5 -a 0.5 --path_t /path/to/teacher.pth --trial born1 --model_path /path/to/save --tb_path /path/to/tensorboard --data_root /path/to/data_root
The error is :
NotImplementedError: model to not supported in dataset mini-imagenet:
I think the reason is that I cannot find this file: teacher.pth.
Is this file (teacher.pth) generated automatically?
Where is the tb_logger?
Hi,
I think to reproduce your results from your code we need:
Line 9 in e31e209
the tensorboard_logger.
Can you share that please?
Thanks!
transformation used for tiered imagenet
Thanks for the great work!
I noticed the default transformation is A, which is also used in the mini-imagenet example run script. I am just wondering, is it the same for tiered imagenet? I ask because I saw that transforms B and C in transform_cfg.py are never used in the code.
Thanks!
resnet vs resnet_new?
Hi,
What is the difference?
Tiered ImageNet High-Level Categories and Train-Phase-Val
Hello,
In the data files you released for Tiered ImageNet there are no labels for the high-level categories, only the low-level ones. How can I get them?
It also seems, that in comparison to the original tiered imagenet release, there has been a relabeling of some sort, since the datasets don't exactly match. This means that I can't infer the high-level classes using that repository.
Additionally, what exactly is the "train-phase-val.pkl" partition made of? It seems that it doesn't come from tieredImageNet (Since there's no parallel in the original).
Pretrained Model
Hi, Yue! Will you release the pre-trained models for tieredImageNet, both with trainval and with train only? When I use the option --use_trainval, the training process seems weird: in the testing stage, every 100 episodes, the accuracy is 0.1-0.2. I use your code train_supervised. It performs well using miniImageNet data.
Thanks.
How to get minisample.pth ?
Is this model trained by yourself? How did you get it?
providing the pickle files available and the rest of the data
Hi,
can you provide code to be able to get the data from this repo? It's not exactly reproducible in its current format.
about the training
In the training phase:
1. first, merge everything into one big, full data-set
2. second, train an ordinary ResNet classification model
3. finally, discard the softmax layer and keep the embedding
Is that all of the training?
To test the few-shot problem, suppose we are given 20 classes with 10 samples each.
Construct a test N-way K-shot meta-task as follows:
1. randomly pick N=5 classes, and randomly pick K=5 samples from each of those classes as the train set, 25 in total
2. randomly pick, e.g., 3 samples from each class to construct a test set of 15 samples
3. now we have a meta-task: train a linear classifier on features extracted from the train set with the embedding model above, then calculate the error on the test set.
Repeat the above 3 steps M times, then merge the results to get a final test error.
Is that all of the testing?
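The episode construction described in steps 1 and 2 can be sketched as follows. This is only an illustration of the question's own numbers (20 classes, 10 samples each, N=5, K=5, 3 test samples per class); `sample_episode` and `toy_data` are hypothetical names, not taken from the repo, and the linear classifier of step 3 is left out.

```python
import random

def sample_episode(data, n_way=5, k_shot=5, n_query=3, rng=random):
    """Sample one N-way K-shot episode.

    data maps a class id to the list of samples of that class.
    Returns (support, query) as lists of (sample, episode_label) pairs.
    """
    # Step 1: pick N classes at random.
    classes = rng.sample(sorted(data), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw support and query samples together so they never overlap.
        picked = rng.sample(data[cls], k_shot + n_query)
        support += [(x, episode_label) for x in picked[:k_shot]]
        # Step 2: the remaining draws form the test (query) set.
        query += [(x, episode_label) for x in picked[k_shot:]]
    return support, query

# 20 classes with 10 samples each, as in the question above.
toy_data = {c: [f"img_{c}_{i}" for i in range(10)] for c in range(20)}
support, query = sample_episode(toy_data, rng=random.Random(0))
```

Repeating this call M times and averaging the per-episode error would correspond to the final step described above.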
Pretrained Model
Hello, could you provide the weight of pretrained model?
Best wishes for you
Dataset preparation
Hi Wang,
Thank you for sharing this repo.
I have a question regarding the dataset, if I have a dataset is split into:
- train
- valid
- test
So they are the same selected files in this snapshot:
But what about the other three files?
- miniImageNet_category_split_train_phase_val.pickle
- miniImageNet_category_split_train_phase_trainval.pickle
- miniImageNet_category_split_train_phase_test.pickle
I need to train the model on my own image classification dataset, but I only have the usual split. What do the other files represent, so I can prepare my dataset?
Thanks
Why using a different Conv4?
Dear author,
As far as I know, Conv4 in much of the literature refers to four convolutional blocks, each of which consists of a convolutional layer, a batch normalization layer, a Leaky ReLU layer and a 2×2 max-pooling layer. Why do you remove the max-pooling layer of the last layer and use self.avgpool = nn.AdaptiveAvgPool2d(1)?
Looking forward to your reply.
The performance issue in dataloader
In the __getitem__ method, the authors call min(self.labels), which could be extremely slow...
Also, it assumes that the label in self.labels is "continuous", which may bring unexpected bugs when changing to another dataset.
I suggest converting self.labels into the corresponding ordinal number to avoid the performance issue and possible bugs.
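A minimal sketch of that suggestion, assuming the labels are a plain list of integers: the mapping is computed once up front, so nothing like min(...) has to run on every item access. `to_ordinal` and `raw_labels` are hypothetical names used only for illustration.

```python
def to_ordinal(labels):
    """Map arbitrary integer labels to 0..C-1 in a single pass."""
    # The sorted unique labels define the ordinal index of each class.
    classes = sorted(set(labels))
    index_of = {label: i for i, label in enumerate(classes)}
    return [index_of[label] for label in labels], classes

raw_labels = [64, 64, 70, 66, 70]  # non-contiguous on purpose
ordinal_labels, classes = to_ordinal(raw_labels)
```

Because the mapping goes through a dictionary, it does not rely on the labels being contiguous, which is the second concern raised above.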
the difference accuracy with baseline
Hi authors,
Thanks for releasing your codes!
Can I ask what the difference is between your method and "A Closer look at few-shot learning"? Both methods train a pre-trained model and use the representation for few-shot testing (ignoring the distillation part).
The accuracy is quite different. "A Closer Look" reports 42.11±0.71 for the resnet18 baseline on miniImagenet, while your paper reports 62.02 for resnet12 and 57.56 for resnet18.
(the difference between the two papers shouldn't be that large)
Thanks.
A potential issue that might affect comparison fairness
Lines 126 to 131 in f8c837b
In the experiment, with num_workers=3 in meta_testloader, 3 workers are assigned to fetch data. However, these workers have the same seed for np.random, which means they will sample 3 similar tasks: the 3 tasks have the same classes, the same support images and the same query images. So it might be unfair to other methods, although it is irrelevant to the performance. By the way, setting worker_init_fn for each worker, or num_workers<=1, might be a remedy.
Results about Meta-Dataset
Thank you for your repo.
Could you provide the code about the results of Meta-Dataset?
Meta-Dataset
Hi @HobbitLong
First, thank you for sharing this repo.
I was wondering if you are planning on providing the code for the Meta-Dataset results,
Thank you.
why have custom resnets and not the pytorch resnets
@WangYueFt @HobbitLong I was curious, why did you implement your own resnets instead of using the ones already available in pytorch? is there anything bad about those for meta-learning/few-shot learning?
License missing
Hi,
could you clarify under which license is this code being released?
Thanks.
about training on tieredImageNet
Thanks for your excellent work!
When I train on the tieredImageNet dataset, the train_phase_val accuracy is only 0.5%, but the training accuracy is normal. Is that right?
what does the pretrain=True flag do in the mini-imagenet data loader?
@WangYueFt @HobbitLong what does the pretrain=True flag do?
e.g.
if self.pretrain:
    self.file_pattern = 'miniImageNet_category_split_train_phase_%s.pickle'
else:
    self.file_pattern = 'miniImageNet_category_split_%s.pickle'
Line 42 in f8c837b
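For reference, here is a small sketch of which file name each branch of the snippet resolves to. It only illustrates the string formatting; what the two families of pickle files contain is not answered here. `split_filename` is a hypothetical helper, and the partition names passed to it are taken from file names quoted elsewhere on this page.

```python
def split_filename(partition, pretrain=True):
    """Return the pickle file name selected for a given partition."""
    if pretrain:
        pattern = 'miniImageNet_category_split_train_phase_%s.pickle'
    else:
        pattern = 'miniImageNet_category_split_%s.pickle'
    return pattern % partition

pretrain_file = split_filename('val', pretrain=True)
meta_file = split_filename('train', pretrain=False)
```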
How to deal with this code warning?
UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1603729047590/work/torch/csrc/utils/tensor_numpy.cpp:141.)
img = torch.from_numpy(pic.transpose((2, 0, 1))).contiguous()
Is the warning coming from /rfs/dataset/miniimagenet? The conversion between NumPy and PIL is repeated, but I do not know how to deal with it.
resnet12 does not work with cifar-fs data set given here
issue:
gamma = (1 - keep_rate) / self.block_size ** 2 * feat_size ** 2 / (feat_size - self.block_size + 1) ** 2
ZeroDivisionError: float division by zero
python-BaseException
from:
feat_size - self.block_size + 1
PyDev console: starting.
0
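The arithmetic behind the error can be reproduced outside the model. The sketch below evaluates the gamma expression from the traceback, assuming a 4x4 feature map reaches the DropBlock layer; that feature size is an assumption for illustration, not something stated in the report. With block_size=5 the term feat_size - block_size + 1 is 0, while block_size=2 gives a valid value. `checked_dropblock_gamma` is a hypothetical guard, not part of the repo.

```python
def dropblock_gamma(keep_rate, block_size, feat_size):
    """Same expression as in the traceback above."""
    return ((1 - keep_rate) / block_size ** 2
            * feat_size ** 2 / (feat_size - block_size + 1) ** 2)

def checked_dropblock_gamma(keep_rate, block_size, feat_size):
    """Hypothetical guard: fail early with a readable message."""
    if block_size > feat_size:
        raise ValueError(
            f"block_size={block_size} does not fit a "
            f"{feat_size}x{feat_size} feature map")
    return dropblock_gamma(keep_rate, block_size, feat_size)

# Assumed 4x4 feature map, keep_rate = 1 - drop_rate = 0.9.
gamma_small_block = dropblock_gamma(0.9, block_size=2, feat_size=4)
try:
    dropblock_gamma(0.9, block_size=5, feat_size=4)
    large_block_fails = False
except ZeroDivisionError:
    large_block_fails = True
```

Under this assumption, a block size of 5 cannot be used on small CIFAR-sized feature maps, which may be why a smaller dropblock_size is used for those datasets.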
About comparing supervised vs Moco&CMC
In table 3 of your paper, "supervised" model uses ResNet-50 as backbone, with accuracy of 73.81% in miniImagenet 5-way. However, "Ours-simple" model in table 1 with ResNet-12 reports 79.64%.
But as you mentioned in section 4.9, results should improve with better backbone networks.
Can you explain why the accuracy with ResNet-50 is lower than ResNet-12?
are you using avg pool in your expts?
Why are you augmenting your support set and why is it not cheating?
Why do you have this?
Line 48 in f8c837b
My impression was that when one does an n-way, k-shot task, one only uses k shots, but this number increases the shots. Wouldn't this be cheating?
args.n_aug_support_samples = 5
...
support_xs.size()=torch.Size([125, 3, 84, 84])
query_xs.size()=torch.Size([75, 3, 84, 84])
No aug in support set
args.n_aug_support_samples = 1
...
support_xs.size()=torch.Size([25, 3, 84, 84])
query_xs.size()=torch.Size([75, 3, 84, 84])
perhaps this is why I can't reproduce and the values reported in the paper are larger than mine -- even when I use the rfs mini imagenet checkpoint.
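The tensor sizes printed above are consistent with the following bookkeeping, sketched under the assumption of a 5-way 5-shot episode with 15 queries per class (the 15 is inferred from 75 / 5 and is not stated in the post). `episode_sizes` is a hypothetical helper.

```python
def episode_sizes(n_way, k_shot, n_queries, n_aug_support_samples):
    """Number of support and query images in one episode."""
    # Each support image is repeated n_aug_support_samples times.
    support = n_way * k_shot * n_aug_support_samples
    query = n_way * n_queries
    return support, query

with_aug = episode_sizes(5, 5, 15, n_aug_support_samples=5)
without_aug = episode_sizes(5, 5, 15, n_aug_support_samples=1)
```

In this reading, the augmented support set contains more tensors but they are derived from the same 25 source images; whether that matches the intent of the code is for the authors to confirm.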
Doubt regarding training settings
So according to the paper, the weights for the distillation and classification losses are 0.5 and 0.5, but in the script provided they are 1 and 0.5. Which one is better?
For the tieredImageNet, CIFAR-FS and FC100 datasets, is the batch size kept the same at 64?
Could you provide the scripts for the datasets other than mini-imagenet?
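To make the weighting question concrete, here is a minimal sketch of a two-term distillation objective: a weighted sum of a cross-entropy term and a temperature-scaled KL term. It is a generic formulation, not necessarily the exact loss in train_distillation.py; the temperature of 4, the function names, and which weight corresponds to which command-line flag are all assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(student_logits, target):
    """Classification loss against the ground-truth class index."""
    return -math.log(softmax(student_logits)[target])

def kd_divergence(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s))
    return kl * temperature ** 2

def total_loss(student_logits, teacher_logits, target, w_cls, w_kd):
    """Weighted sum of the classification and distillation terms."""
    return (w_cls * cross_entropy(student_logits, target)
            + w_kd * kd_divergence(student_logits, teacher_logits))

student, teacher = [2.0, 0.5, -1.0], [3.0, 0.0, -2.0]
loss_paper = total_loss(student, teacher, 0, w_cls=0.5, w_kd=0.5)
loss_script = total_loss(student, teacher, 0, w_cls=0.5, w_kd=1.0)
```

With the same inputs, the two settings differ only in how strongly the distillation term contributes; the sketch says nothing about which gives better accuracy.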
How to reproduce the result on the paper?
Thanks for your excellent work! I encountered some problems when I wanted to reproduce your work on miniImageNet.
first I run the command:
CUDA_VISIBLE_DEVICES=0 python train_supervised.py --trial pretrain --model_path ./checkpoint --tb_path ./tensorboard --data_root ./mini
then I run next command:
CUDA_VISIBLE_DEVICES=2 python train_distillation.py -r 0.5 -a 1.0 --path_t ./checkpoint/resnet12_miniImageNet_lr_0.05_decay_0.0005_trans_A_trial_pretrain/ckpt_epoch_90.pth
--trial born1 --model_path ./checkpoint --tb_path ./tensorboard --data_root ./mini
I used the model trained with distillation and ran it 3 times:
I get the accuracy on 1-shot:
I get the accuracy on 5-shot:
Although this is a very good result, it is still lower than the result in the paper: 1-shot: 64.82, 5-shot: 82.14
What is the model_s in 1-generation?
In 0-generation, model_s is a randomly initialized model. model_t is the vanilla model trained with CELoss.
In 1-generation, model_t is the model_s in 0-generation, but what is the model_s? Is it also randomly initialized?
Three times learning rate decay.
Thank you very much for your work. I want to know at which epochs the three learning rate decays are performed on the tieredImageNet, CIFAR-FS and FC100 datasets?
does union training use all the examples at once?
@WangYueFt @HobbitLong e.g. when you train on the union of the meta-train set, do you use all the images?
i.e. consider mini-imagenet with 64 labels and 600 examples each. Do you effectively train a classification problem with 64 labels 64*600 total examples?