vila-lab / sre2l
(NeurIPS 2023 spotlight) Large-scale Dataset Distillation/Condensation, 50 IPC (Images Per Class) achieves the highest 60.8% on original ImageNet-1K val set.
Hi, I really think this is great work!
However, I ran into a problem when trying to reproduce your method.
I have successfully run the recover and relabel steps and generated the syn_data and the soft labels (i.e. many files like batch_0.tar...). When I run train.sh (I have already changed the PyTorch source code following your instructions), it fails with "Caught KeyError in DataLoader worker process 0". It seems the corresponding img_idx cannot be found in img2batch_idx_list (relabel/utils_fkd.py, line 143).
The error is as follows:
Epoch: 0
Traceback (most recent call last):
File "/export/home2/jiyuan/SRe2L/train/train_FKD.py", line 360, in <module>
main()
File "/export/home2/jiyuan/SRe2L/train/train_FKD.py", line 179, in main
train(model, args, epoch)
File "/export/home2/jiyuan/SRe2L/train/train_FKD.py", line 219, in train
for batch_idx, batch_data in enumerate(args.train_loader):
File "/export/home2/jiyuan/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
data = self._next_data()
File "/export/home2/jiyuan/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
return self._process_data(data)
File "/export/home2/jiyuan/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
data.reraise()
File "/export/home2/jiyuan/anaconda3/lib/python3.9/site-packages/torch/_utils.py", line 644, in reraise
raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/export/home2/jiyuan/anaconda3/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/export/home2/jiyuan/anaconda3/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 62, in fetch
mix_index, mix_lam, mix_bbox, soft_label = self.dataset.load_batch_config(possibly_batched_index[0])
File "/export/home2/jiyuan/SRe2L/train/../relabel/utils_fkd.py", line 143, in load_batch_config
batch_idx = self.img2batch_idx_list[self.epoch][img_idx]
KeyError: 7542
Could you help me figure it out? Hope for your feedback!
Thanks.
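The traceback shows the lookup `self.img2batch_idx_list[self.epoch][img_idx]` failing, so the mapping for that epoch has no entry for image index 7542. One possible cause (an assumption, not confirmed in this thread) is that the training folder holds a different set of images than the one used when the soft labels were generated. Below is a minimal sketch for checking coverage, assuming each epoch's mapping is a dict from image index to batch index. `find_unmapped_indices` is a hypothetical helper, not part of the repository, and the data is a toy stand-in.

```python
from typing import Dict, List


def find_unmapped_indices(img2batch_idx: Dict[int, int], num_images: int) -> List[int]:
    """Return every index in range(num_images) that has no entry in the mapping."""
    return [idx for idx in range(num_images) if idx not in img2batch_idx]


# Toy stand-in (assumption): soft labels were generated for 8 images in
# batches of 4, but the training folder holds 10 images.
batch_size = 4
epoch_map = {img_idx: img_idx // batch_size for img_idx in range(8)}

missing = find_unmapped_indices(epoch_map, num_images=10)
print(missing)  # [8, 9]
```

If a check like this reports missing indices for a real run, comparing the image count and batch size used for relabeling against those used for training would be a reasonable next step.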
I am trying to run train_FKD.py with train.sh, but I could not find the file rn18_bn0.01_[4K]_x_l2_x_tv.crop for the --train-dir argument in the dataset link (https://mbzuaiac-my.sharepoint.com/personal/zeyuan_yin_mbzuai_ac_ae/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fzeyuan%5Fyin%5Fmbzuai%5Fac%5Fae%2FDocuments%2Fproject%5Fshare%2FSRe2L&ga=1). Can you show me where I can find it?
Thank you zeyuanyin.
Thank you so much for sharing this excellent work!
I noticed that you have conducted experiments on CIFAR-100. Are the hyperparameter settings the same as for ImageNet-1K or Tiny-ImageNet?
Looking forward to your response!
Thanks a lot for your paper "Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale From A New Perspective". It gave me great inspiration.
The paper mentions a custom model, "BN-VIT", for the recovery process, but gives little detail about how this model is used to generate the dataset.
I'm interested in the design of this model and would like to know why BN is used instead of LN, and what the benefits are.
I would also be very grateful if you could share this BN-VIT model so that I can use it to verify the quality of the generated dataset.
When I run the training code (train_FKD.py) as provided, it fails in the __getitem__ method of ImageFolder_FKD_MIX because the corresponding batch_config is missing. I understand that this config is loaded via load_batch_config, which is why _MapDatasetFetcher is modified in the first place. However, the docs provide only a code snippet, not an actual procedure. How do I overwrite the original PyTorch source code with the new code provided? Do I need to run a locally compiled version of PyTorch for this to work? Is there a solution that requires only a few additional lines? Thank you.
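One way to get a "few additional lines" solution, in principle, is to replace the fetcher's `fetch` method at runtime instead of editing the installed package. The sketch below shows only that general monkey-patching technique on toy stand-in classes. `StockFetcher`, `ToyDataset`, and `patch_fetch` are hypothetical names, the config contents are invented, and the wrapper is not the repository's modified `fetch`.

```python
import functools


class StockFetcher:
    """Stand-in for a library fetcher class that we do not want to edit in place."""

    def __init__(self, dataset):
        self.dataset = dataset

    def fetch(self, indices):
        return [self.dataset[i] for i in indices]


class ToyDataset:
    """Toy dataset that refuses to serve items until a batch config is loaded."""

    def __init__(self, items):
        self.items = items
        self.batch_config = None

    def load_batch_config(self, first_index):
        # Hypothetical config; a real one would be read from disk.
        self.batch_config = {"first_index": first_index}
        return self.batch_config

    def __getitem__(self, idx):
        if self.batch_config is None:
            raise ValueError("config is not loaded")
        return self.items[idx]


def patch_fetch(fetcher_cls):
    """Swap fetcher_cls.fetch for a wrapper that loads the batch config first."""
    original_fetch = fetcher_cls.fetch

    @functools.wraps(original_fetch)
    def fetch_with_config(self, indices):
        # Datasets without load_batch_config keep the stock behaviour.
        if not hasattr(self.dataset, "load_batch_config"):
            return original_fetch(self, indices)
        config = self.dataset.load_batch_config(indices[0])
        return original_fetch(self, indices), config

    fetcher_cls.fetch = fetch_with_config


patch_fetch(StockFetcher)
batch, config = StockFetcher(ToyDataset(["a", "b", "c"])).fetch([1, 2])
print(batch, config)  # ['b', 'c'] {'first_index': 1}
```

With PyTorch, the class to patch would presumably be `_MapDatasetFetcher` in `torch/utils/data/_utils/fetch.py`. Whether a runtime patch reaches DataLoader worker processes depends on the multiprocessing start method, and this sketch is untested against the repository's training script.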
Your work is very impressive. After checking the datasets you released, I found that the only Tiny-ImageNet data is for E50 (tiny_rn18E50_[1K].Aug.zip). Would you be willing to release the Tiny-ImageNet E100 data as well? Thank you very much!
Hi, this is good work, but I ran into an issue while running the code.
When I run the training code (train_FKD.py) as provided, it fails in the __getitem__ method of ImageFolder_FKD_MIX because the corresponding batch_config is missing.
The detailed output is as follows:
======= FKD: dataset info ======
path: /home/xxx/SRe2L/relabel/FKD_cutmix_fp16/
num img: 50000
batch size: 1024
max epoch: 300
================================
300
load data successfully
=> loading student model 'resnet18'
Epoch: 0
Traceback (most recent call last):
File "train_FKD.py", line 362, in <module>
main()
File "train_FKD.py", line 181, in main
train(model, args, epoch)
File "train_FKD.py", line 221, in train
for batch_idx, batch_data in enumerate(args.train_loader):
File "/home/xxx/anaconda3/envs/py37xxx/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
data = self._next_data()
File "/home/xxx/anaconda3/envs/py37xxx/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
return self._process_data(data)
File "/home/xxx/anaconda3/envs/py37xxx/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
data.reraise()
File "/home/xxx/anaconda3/envs/py37xxx/lib/python3.8/site-packages/torch/_utils.py", line 429, in reraise
raise self.exc_type(msg)
Original Traceback (most recent call last):
File "/home/xxx/anaconda3/envs/py37xxx/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/home/xxx/anaconda3/envs/py37xxx/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/xxx/anaconda3/envs/py37xxx/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/xxx/SRe2L/validate/../relabel/utils_fkd.py", line 143, in __getitem__
raise ValueError('config is not loaded')
ValueError: config is not loaded
How can I change it? Thanks.
Thank you for your excellent work! Could you share your code for continual learning based on GDumb?
Hi. Thank you for providing such wonderful work.
In this project, synthesized samples for the ImageNet-1K dataset are provided only for ResNet-18.
I tried to generate synthetic samples from ResNet50 myself, but the runtime is too long for me to complete the generation.
Do you plan to release the synthetic samples generated from ResNet50?
These samples would be helpful for my research, so I would appreciate it if you could provide them.
I'm looking forward to your reply.
Thank you!
Could you please upload the pretrained checkpoint for Tiny-ImageNet? Thanks.