
fewshot_detection's Introduction

Few-shot Object Detection via Feature Reweighting

Implementation for the paper:

Few-shot Object Detection via Feature Reweighting, ICCV 2019

Bingyi Kang*, Zhuang Liu*, Xin Wang, Fisher Yu, Jiashi Feng and Trevor Darrell (* equal contribution)

Our code is based on https://github.com/marvis/pytorch-yolo2 and developed with Python 2.7 & PyTorch 0.3.1.

Detection Examples (3-shot)

Sample novel class detection results with 3-shot training bounding boxes, on PASCAL VOC.

Model

The architecture of our proposed few-shot detection model. It consists of a meta feature extractor and a reweighting module. The feature extractor follows the one-stage detector architecture and directly regresses the objectness score (o), bounding box location (x, y, h, w) and classification score (c). The reweighting module is trained to map support samples of N classes to N reweighting vectors, each responsible for modulating the meta features to detect the objects from the corresponding class. A softmax based classification score normalization is imposed on the final output.
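
To make the reweighting concrete, here is a minimal sketch of the channel-wise modulation described above, with assumed shapes rather than the repo's exact code:

import torch

B, C, H, W, N = 4, 1024, 13, 13, 15           # batch, channels, grid size, classes
meta_features = torch.randn(B, C, H, W)       # from the meta feature extractor
reweighting = torch.randn(N, C)               # one C-dim vector per class, from the reweighting module

# Modulate the shared meta features once per class (channel-wise multiplication),
# giving N class-specific feature maps.
class_features = meta_features.unsqueeze(1) * reweighting.view(1, N, C, 1, 1)
print(class_features.shape)                   # torch.Size([4, 15, 1024, 13, 13])

Each of the N class-specific feature maps then goes through the shared detection prediction module.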

Abstract

Conventional training of a deep CNN based object detector demands a large number of bounding box annotations, which may be unavailable for rare categories. In this work we develop a few-shot object detector that can learn to detect novel objects from only a few annotated examples. Our proposed model leverages fully labeled base classes and quickly adapts to novel classes, using a meta feature learner and a reweighting module within a one-stage detection architecture. The feature learner extracts meta features that are generalizable to detect novel object classes, using training data from base classes with sufficient samples. The reweighting module transforms a few support examples from the novel classes to a global vector that indicates the importance or relevance of meta features for detecting the corresponding objects. These two modules, together with a detection prediction module, are trained end-to-end based on an episodic few-shot learning scheme and a carefully designed loss function. Through extensive experiments we demonstrate that our model outperforms well-established baselines by a large margin for few-shot object detection, on multiple datasets and settings. We also present analysis on various aspects of our proposed model, aiming to provide some inspiration for future few-shot detection works.

Training our model on VOC

  • $PROJ_ROOT : project root
  • $DATA_ROOT : dataset root

Prepare dataset

  • Get The Pascal VOC Data
cd $DATA_ROOT
wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
  • Generate Labels for VOC
wget http://pjreddie.com/media/files/voc_label.py
python voc_label.py
cat 2007_train.txt 2007_val.txt 2012_*.txt > voc_train.txt
  • Generate per-class Labels for VOC (used for meta input)
cp $PROJ_ROOT/scripts/voc_label_1c.py $DATA_ROOT
cd $DATA_ROOT
python voc_label_1c.py
  • Generate few-shot image lists (to use our few-shot datasets)
cd $PROJ_ROOT
python scripts/convert_fewlist.py 

If you want to generate new few-shot datasets, change the DROOT variable in scripts/gen_fewlist.py to $DATA_ROOT:

python scripts/gen_fewlist.py # the sampled lists might differ from ours
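
For reference, a rough sketch of what a gen_fewlist-style sampler does; the file names and directory layout below are placeholders, not necessarily the script's actual ones:

import random

def gen_fewlist(classes, k, droot, seed=0):
    # For each class, sample k training images from its per-class list
    # (produced by voc_label_1c.py) and write them to a k-shot list file.
    random.seed(seed)
    for cls in classes:
        with open('%s/voclist/%s_train.txt' % (droot, cls)) as f:
            images = f.read().splitlines()
        shots = random.sample(images, k)
        with open('%s/voclist/%s_train_%dshot.txt' % (droot, cls, k), 'w') as f:
            f.write('\n'.join(shots) + '\n')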

Base Training

  • Modify the cfg for PASCAL data: edit the data/metayolo.data file
metayolo=1
metain_type=2
data=voc
neg = 1
rand = 0
novel = data/voc_novels.txt             // file contains novel splits
novelid = 0                             // which split to use
scale = 1
meta = data/voc_traindict_full.txt
train = $DATA_ROOT/voc_train.txt
valid = $DATA_ROOT/2007_test.txt
backup = backup/metayolo
gpus=1,2,3,4
  • Download Pretrained Convolutional Weights
wget http://pjreddie.com/media/files/darknet19_448.conv.23
  • Train The Model
python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23
  • Evaluate the Model
python valid_ensemble.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg path/to/weightfile
python scripts/voc_eval.py results/path/to/comp4_det_test_
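
For reference, scripts/voc_eval.py evaluates the comp4_det_test_* results with standard PASCAL VOC average precision. A condensed version of the usual all-points-interpolated AP is below (the script may use the 11-point VOC2007 metric instead):

import numpy as np

def voc_ap(rec, prec):
    # rec, prec: recall/precision arrays sorted by descending detection score
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    for i in range(mpre.size - 1, 0, -1):        # build the precision envelope
        mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
    idx = np.where(mrec[1:] != mrec[:-1])[0]     # points where recall changes
    return np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1])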

Few-shot Tuning

  • Modify the cfg for PASCAL data: edit the data/metatune.data file
metayolo=1
metain_type=2
data=voc
tuning = 1
neg = 0
rand = 0
novel = data/voc_novels.txt                 
novelid = 0
max_epoch = 2000
repeat = 200
dynamic = 0
scale=1
train = $DATA_ROOT/voc_train.txt
meta = data/voc_traindict_bbox_5shot.txt
valid = $DATA_ROOT/2007_test.txt
backup = backup/metatune
gpus  = 1,2,3,4
  • Train The Model
python train_meta.py cfg/metatune.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg path/to/base/weightfile
  • Evaluate the Model
python valid_ensemble.py cfg/metatune.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg path/to/tuned/weightfile
python scripts/voc_eval.py results/path/to/comp4_det_test_
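
For orientation, the support (meta) inputs listed in the meta files are images paired with a bounding-box mask, matching the reweighting net's 4-channel input (416 x 416 x 4) visible in the training logs quoted in the issues below. A sketch of building one support sample, under that assumption:

import torch

img = torch.rand(3, 416, 416)                 # support image, RGB in [0, 1]
mask = torch.zeros(1, 416, 416)
x1, y1, x2, y2 = 120, 80, 260, 300            # placeholder box coordinates
mask[0, y1:y2, x1:x2] = 1.0                   # 1 inside the target object's box
metax = torch.cat([img, mask], dim=0)         # shape (4, 416, 416)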

Citation

@inproceedings{kang2019few,
  title={Few-shot Object Detection via Feature Reweighting},
  author={Kang, Bingyi and Liu, Zhuang and Wang, Xin and Yu, Fisher and Feng, Jiashi and Darrell, Trevor},
  booktitle={ICCV},
  year={2019}
}

fewshot_detection's Issues

Where is the query set?

hello, @bingykang ,

I am very interested in this work. I have run voc_label_1c.py and got the support set for testing. However, I wonder how I can get the query set for testing. You mentioned that you used 3 splits for meta training and testing, and I want to be consistent with your experimental settings. Can you help me solve this problem? Thanks very much!

Best regards,
Yukuan Yang

RuntimeError: argument 1 (padding) must be tuple of int but got tuple of (float, float)

When I run

python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23

A runtime error occurred:

2019-10-18 23:20:07 epoch 0/353, processed 0 samples, lr 0.000033
Traceback (most recent call last):
  File "train_meta.py", line 325, in <module>    train(epoch)
  File "train_meta.py", line 218, in train    output = model(data, metax, mask)  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__    result = self.forward(*input, **kwargs)  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 73, in forward    outputs = self.parallel_apply(replicas, inputs, kwargs)  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 83, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
    raise output
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 42, in _worker
    output = module(*input, **kwargs)
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home1/ubuntu/project/Fewshot_Detection/darknet_meta.py", line 199, in forward
    dynamic_weights = self.meta_forward(metax, mask)
  File "/home1/ubuntu/project/Fewshot_Detection/darknet_meta.py", line 122, in meta_forward
    metax = model(metax)
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
  File "/home/ubuntu/home1/software/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 89, in conv2d
    torch.backends.cudnn.deterministic, torch.backends.cudnn.enabled)
RuntimeError: argument 1 (padding) must be tuple of int but got tuple of (float, float)

Do you know what's wrong? Thanks
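
A likely cause: the traceback shows Python 3.6, while the repo targets Python 2.7, where / on integers is integer division. If a convolution padding is computed as (kernel_size - 1) / 2 somewhere in the model-building code (a hypothetical but common pattern in pytorch-yolo2 derivatives), it becomes a float under Python 3:

kernel_size = 3
pad = (kernel_size - 1) / 2    # Python 2: 1 (int); Python 3: 1.5 (float) -> this error
pad = (kernel_size - 1) // 2   # integer division: 1 under both versions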

Memory to train

Could you please let me know how much memory your model needs for training?
My GPU has 12GB, but I fail to start training even though I set nw to 0 or 1 and reduced the YOLO batch size to 16 or 8.

RuntimeError: The size of tensor a (13) must match the size of tensor b (70135) at non-singleton dimension 3

I am trying to run the code in Google Colab and I am getting this error. I had a similar issue in cfg.py, but I solved that one.
Below is the output and error I get after running train_meta.py:
!python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23

/content/Fewshot_Detection/data/coco.names
('save_interval', 10)
['bird', 'bus', 'cow', 'motorbike', 'sofa']
('base_ids', [0, 1, 3, 4, 6, 7, 8, 10, 11, 12, 14, 15, 16, 18, 19])
logging to backup/metayolofix_novel0_neg1
('class_scale', 1)
layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32
2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64
3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64
4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
5 conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64
6 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
7 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128
8 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
9 conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128
10 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
11 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256
12 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
13 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256
14 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
15 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256
16 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
17 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512
18 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
19 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512
20 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
21 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512
22 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
23 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024
24 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024
25 route 16
26 conv 64 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 64
27 reorg / 2 26 x 26 x 64 -> 13 x 13 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 13 x 13 x1280 -> 13 x 13 x1024
30 dconv 1024 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x1024
31 conv 30 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 30
32 detection

layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 4 -> 416 x 416 x 32
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32
2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64
3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64
4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
5 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128
6 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
7 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256
8 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
9 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512
10 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
11 max 2 x 2 / 2 13 x 13 x1024 -> 6 x 6 x1024
12 conv 1024 3 x 3 / 1 6 x 6 x1024 -> 6 x 6 x1024
13 glomax 6 x 6 / 1 6 x 6 x1024 -> 1 x 1 x1024
1 14554 80200 32
10
===> Number of samples (before filtring): 4952
===> Number of samples (after filtring): 4952
('num classes: ', 15)
factor: 3.0
===> Number of samples (before filtring): 14554
===> Number of samples (after filtring): 14554
('num classes: ', 15)
2020-07-03 08:55:33 epoch 0/177, processed 0 samples, lr 0.000033
/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py:1351: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
File "train_meta.py", line 328, in
train(epoch)
File "train_meta.py", line 223, in train
loss = region_loss(output, target)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "/content/Fewshot_Detection/region_loss.py", line 294, in forward
pred_boxes[0] = x.data + grid_x
RuntimeError: The size of tensor a (13) must match the size of tensor b (70135) at non-singleton dimension 3
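
A likely cause: in pytorch-yolo2-style decoding, the sigmoid outputs have shape (bs, nA, nH, nW) while the grid offsets are a flat vector with the same number of elements; PyTorch 0.3 fell back to deprecated same-numel pointwise behavior, but newer versions require broadcastable shapes. A sketch (assumed shapes, not the repo's exact code) that flattens both sides explicitly:

import torch

bs, nA, nH, nW = 2, 5, 13, 13
x = torch.sigmoid(torch.randn(bs, nA, nH, nW))          # predicted x offsets
grid_x = torch.linspace(0, nW - 1, nW).repeat(nH, 1)    # (nH, nW)
grid_x = grid_x.repeat(bs * nA, 1, 1).view(-1)          # (bs*nA*nH*nW,)
pred_x = x.contiguous().view(-1) + grid_x               # shapes now match exactly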

Details about the base model and the fine-tuned model

I use two GPUs, and the other configurations are the same as the author's. Why is my performance poor?

  • [figure] Evaluation of the base model after 500 epochs.

  • [figure] Evaluation after 5 epochs of fine-tuning.

  • [figure] Evaluation after 10 epochs of fine-tuning.

    @bingykang

Has anyone run detect.py?

Hi,
I've trained the model following the instructions; after fine-tuning on the novel classes, the AP results are as follows:

AP for aeroplane = 0.6602
AP for bicycle = 0.4766
AP for bird = 0.3573
AP for boat = 0.4778
AP for bottle = 0.3153
AP for bus = 0.2155
AP for car = 0.6892
AP for cat = 0.8144
AP for chair = 0.3652
AP for cow = 0.3928
AP for diningtable = 0.5538
AP for dog = 0.6887
AP for horse = 0.6958
AP for motorbike = 0.4364
AP for person = 0.6416
AP for pottedplant = 0.2839
AP for sheep = 0.5383
AP for sofa = 0.3712
AP for train = 0.7394
AP for tvmonitor = 0.6614
~~~~~~~~
Mean AP = 0.5187
Mean Base AP = 0.5734
Mean Novel AP = 0.3546

Then, I want to make predictions using the pre-trained weights, but I found that darknet_dynamic.cfg sets classes=1. I modified it to classes=20 during inference in order to make predictions on VOC, but the result is worse.
[figure: sample predictions]

How can I use the pre-trained weights to make correct predictions?

thanks:)

About loading the model

Traceback (most recent call last):
File "train_meta.py", line 87, in
model.load_weights(weightfile)
File "/home/zjp/Fewshot_Detection/darknet_meta.py", line 381, in load_weights
start = load_conv_bn(buf, start, model[0], model[1])
File "/home/zjp/Fewshot_Detection/cfg.py", line 461, in load_conv_bn
conv_model.weight.data.copy_(torch.from_numpy(buf[start:start+num_w])); start = start + num_w
RuntimeError: The size of tensor a (3) must match the size of tensor b (864) at non-singleton dimension 3

Sorry, when I loaded the pretrained model darknet19_448.conv.23, the above problem came up.
Thank you!
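
A likely cause: load_conv_bn copies a flat numpy buffer into a 4-D conv weight, which PyTorch 0.3 allowed with only a deprecation warning but newer versions reject (the same error shows up in two later issues here). A self-contained sketch of the usual fix, reshaping before the copy:

import numpy as np
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 32, 3, bias=False)                     # first darknet conv: 32x3x3x3
buf = np.random.rand(32 * 3 * 3 * 3).astype(np.float32)    # flat darknet weight buffer
num_w = conv.weight.numel()
# PyTorch 0.3 broadcast the flat buffer with a warning; newer versions need
# an explicit reshape to the weight's 4-D shape before copy_:
conv.weight.data.copy_(torch.from_numpy(buf[:num_w]).view_as(conv.weight.data))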

CUDA out of memory

I have finished training the model, but when I tried to evaluate it, it printed 'CUDA out of memory'. I have 2 GPUs and 32GB. But I can't use 2 GPUs to evaluate; it evaluates with 1 GPU every time, even though I changed gpus to "0,1" in valid_ensemble.py. Do you have any suggestions or solutions?

RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

Sorry for troubling you. When I run python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23, a runtime error occurred:
Traceback (most recent call last):
File "train_meta.py", line 325, in
train(epoch)
File "train_meta.py", line 218, in train
output = model(data, metax, mask)
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 71, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/m/Fewshot_Detection-master/darknet_meta.py", line 199, in forward
dynamic_weights = self.meta_forward(metax, mask)
File "/home/m/Fewshot_Detection-master/darknet_meta.py", line 122, in meta_forward
metax = model(metax)
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 282, in forward
self.padding, self.dilation, self.groups)
File "/home/m/.local/lib/python2.7/site-packages/torch/nn/functional.py", line 90, in conv2d
return f(input, weight, bias)
RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

About training on your own data set

I ran into this question in region_loss.py when training with my own dataset: I don't know what flags and ratio stand for. I hope I can get some help from you, thank you.
[figure: screenshot]

In the few-shot fine-tuning phase, As only k labeled bounding boxes are available for the novel classes, we also include k boxes for each base class.

Paper: "The second phase is few-shot fine-tuning. In this phase, we train the model on both base and novel classes. As only k labeled bounding boxes are available for the novel classes, to balance between samples from the base and novel classes, we also include k boxes for each base class." (Section 3.2, Learning Scheme)
But in the code, the feature extractor is still trained on the full dataset (train = /home/bykang/voc/voc_train.txt) in the few-shot fine-tuning phase. Do you have any suggestions or solutions? Can you help me? Thanks.

Can't find the data/metayolo.data file

I couldn't find the metayolo.data file in the data directory, but I found it in the cfg directory. Should I copy this file from cfg to data?

Does CUDA have specific requirements?

Excuse me, and thanks for your code. When I run base training, I hit the following:

===> Number of samples (before filtring): 4952
===> Number of samples (after filtring): 4952
('num classes: ', 15)
factor: 3.0
THCudaCheck FAIL file=/pytorch/torch/lib/THC/THCGeneral.c line=70 error=38 : no CUDA-capable device is detected
Traceback (most recent call last):
File "train_meta.py", line 142, in
model = torch.nn.DataParallel(model).cuda()
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 216, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 146, in _apply
module._apply(fn)
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 146, in _apply
module._apply(fn)
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 146, in _apply
module._apply(fn)
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 146, in _apply
module._apply(fn)
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 152, in _apply
param.data = fn(param.data)
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 216, in
return self._apply(lambda t: t.cuda(device))
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/_utils.py", line 69, in cuda
return new_type(self.size()).copy
(self, async)
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/cuda/init.py", line 384, in _lazy_new
_lazy_init()
File "/home/wangning/anaconda3/envs/python27/lib/python2.7/site-packages/torch/cuda/init.py", line 142, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/torch/lib/THC/THCGeneral.c:70'

My CUDA version is 9.1; does it not work with Python 2.7 and torch 0.3.1? Looking forward to your reply. Thank you.

Training on COCO

I've converted the COCO dataset into VOC style, generated the flag txt in ImageSets for each class, and rewrote the label and label_1c scripts to generate the label txts for COCO.

I don't think it's as easy as it sounds.

Actually, the data split for COCO is also released in the "data" folder. Just change the dataset config and the number of classes, and everything should be good.

Though you provide process_coco.py, it doesn't work, and I think it's missing the flag txt for each class.

And the batchsize setting:

batch=64
subdivisions=8

fills up GPU memory and raises an out-of-memory error.

Now I change the batch_size setting:

batch=8
subdivisions=8

and even with the smallest batch size:

batch=4
subdivisions=4

It can run the forward pass successfully, but I met the same out-of-memory error in the backward pass:

THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "train_meta.py", line 344, in <module>
    train(epoch)
  File "train_meta.py", line 242, in train
    loss.backward()
  File "/home/aringsan/anaconda2/envs/pytorch2/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/aringsan/anaconda2/envs/pytorch2/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58

I want to know your COCO settings and your hardware. I use 4 Titan XPs, each with 12GB of memory.

Thanks
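
For context, in darknet-style cfgs the batch is processed in batch/subdivisions-sized chunks with gradients accumulated before each weight update, so batch=8 with subdivisions=8 means single-image chunks. A generic, self-contained PyTorch sketch of the same accumulation idea (not the repo's training loop):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
subdivisions = 8
optimizer.zero_grad()
for _ in range(subdivisions):                     # one effective batch, in chunks
    data = torch.randn(4, 10)                     # batch/subdivisions samples each
    loss = model(data).pow(2).mean() / subdivisions
    loss.backward()                               # gradients accumulate across chunks
optimizer.step()                                  # one update per effective batch
optimizer.zero_grad()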

Hi, few-shot tuning

Thank you very much for your code, which is very interesting. But I have a question.
This is the result of my base training:
[figure: base training results]
This is the result of my few-shot tuning:
[figure: few-shot tuning results]
The mAP on the novel classes is better, but the base classes are worse. Is there something wrong?

Issue training base model

I have been trying to train a base model for some time now.

I have had issues with the version of PyTorch the code was built on: 0.3.1 would not work with CUDA versions past 8.0, but my GeForce RTX 2080 would not work with CUDA versions below 9.0.

I managed to have the code base work with PyTorch 0.4.0 and 0.4.1, with CUDA 10.1.

I have two GPUs, each with 10986MB. I managed to have the base training run for many epochs, but then my whole machine would shut down all of a sudden partway through training. I suspect this is because of my RAM.

I did have to reduce the batch size and subdivisions, to get the training to start.

But this is all to say that I am not able to get a base model, and I am wondering if there is anyone who has a model to share?

I will commit my code for PyTorch >= 0.4.0 soon, on my fork, but it would be so nice to have weights I could use.

result is 0

What should weightfile be? I tried to use a single weight file, for example 000080.weights, to evaluate the model, but the result for each class is 0. And when I tried to use all 8 weight files via the weight-file path, the error is "shape '[32, 3, 3, 3]' is invalid for input of size 85". I don't have any solution.

RuntimeError: The expanded size of the tensor (3) must match the existing size (864) at non-singleton dimension 3. Target sizes: [32, 3, 3, 3]. Tensor sizes: [864]

Sorry for troubling you. When I run train_meta.py and load the weight file, a runtime error occurred:

logging to backup/metayolo_novel0_neg1
class_scale 1

RuntimeError Traceback (most recent call last)
in <module>()
14 region_loss = model.loss
15
---> 16 model.load_weights(weightfile)
17 model.print_network()

~/lkj项目/FSD_yolo/darknet_meta.py in load_weights(self, weightfile)
376 batch_normalize = int(block['batch_normalize'])
377 if batch_normalize:
--> 378 start = load_conv_bn(buf, start, model[0], model[1])
379 else:
380

~/lkj项目/FSD_yolo/cfg.py in load_conv_bn(buf, start, conv_model, bn_model)
453 bn_model.running_mean.copy_(torch.from_numpy(buf[start:start+num_b])); start = start + num_b
454 bn_model.running_var.copy_(torch.from_numpy(buf[start:start+num_b])); start = start + num_b
--> 455 conv_model.weight.data.copy_(torch.from_numpy(buf[start:start+num_w])); start = start + num_w
456 return start
457

RuntimeError: The expanded size of the tensor (3) must match the existing size (864) at non-singleton dimension 3. Target sizes: [32, 3, 3, 3]. Tensor sizes: [864]

Do you know what's wrong with this? Thank you so much.

Some questions about memory usage when training base model

When I trained the base model, I observed that the memory usage kept rising until it reached the maximum limit and the process was killed. My RAM is 250GB and I used four 16GB Tesla V100 GPUs. I tried to decrease the batch size but it did not make any difference. I wonder how much memory you used during base training and whether I did something wrong. Thank you.

The model does not start training?

I tried to start training following your instructions. However, after loading the model, it just sits there and does nothing:
[figure: console screenshot]
and the code is here:
[figure: code screenshot]
Do you know this problem? Thank you so much!

Loss becomes nan after few batches

[figure: training log screenshot]

Hello, the number of proposals drops to zero after a few iterations, and later the loss becomes nan. The VOC dataset is used for training. Has anyone run into the same problem?

Problems with programming details.

Sorry for troubling you, but I don't know how to compute the loss for the meta-model and the feature extractor.

In my understanding, after feature reweighting we'll have a prediction tensor with shape (B, N, 13, 13, A, (5+N)), where B is the batch size, N the number of classes and A the number of anchors. If so, should I split my ground truth into N vectors according to class and compute the loss for each of the N channels of the prediction?
And the second question: is the loss for the meta-model and the feature extractor the same one?

I look forward to your reply, thanks.
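
For reference, the "softmax based classification score normalization" quoted in the Model section above can be sketched as follows, with assumed shapes: each class-specific output carries a single class score, normalized across the N class-specific outputs:

import torch
import torch.nn.functional as F

B, N, A, H, W = 2, 15, 5, 13, 13
outputs = torch.randn(B, N, A, 5 + 1, H, W)    # one (x, y, w, h, o, c) map per class
cls_scores = outputs[:, :, :, 5, :, :]         # raw class scores, shape (B, N, A, H, W)
cls_probs = F.softmax(cls_scores, dim=1)       # normalize across the N classes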

Questions about the implementation.

Hi, thanks for sharing your work, but I have two questions.

  1. In the training phase, we concatenate masks and images before feeding them to the meta model. However, the mask information comes from the ground-truth labels, which we won't have in the testing phase. So at test time the inputs to the meta model are totally different; how can we solve this, or do I have a misunderstanding?

  2. In the testing phase, after we get the N sets of reweighting coefficients, how do we know which of the N sets we should use for a given test sample?

Hope you can help me clarify these questions, thanks.

Are the experimental results averaged?

I wonder whether the results in the paper come from a single few-shot sample set or are averaged over several.
If averaged, how many runs with shuffled samples did you use?

Best regards,

Base Training error

Sorry for troubling you. I followed your instructions, and when I run train_meta.py a runtime error occurs, as shown below:
[figure: error screenshot]

Could you tell me how to solve it?

Question about the denotations in experimental results

[figure: table of experimental results]

@bingykang Would you mind elaborating a little more on the results, in terms of how to interpret them?
For instance, in the second row, what do "0.5:0.95", "S M L", and "1 10 100" refer to? Sorry, I couldn't seem to find an explanation of these notations in the paper. Let me know if I missed something. Thanks.

FileNotFoundError: [Errno 2] No such file or directory: 'backup/metayolo_novel0_neg1'

Does anyone have the backup folder in their setup?

python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23
/home/myh/Documents/program/few-shot-learning/Fewshot_Detection-master/data/coco.names
save_interval 10
['bird', 'bus', 'cow', 'motorbike', 'sofa']
base_ids [0, 1, 3, 4, 6, 7, 8, 10, 11, 12, 14, 15, 16, 18, 19]
logging to backup/metayolo_novel0_neg1
Traceback (most recent call last):
  File "train_meta.py", line 79, in <module>
    os.mkdir(backupdir)
FileNotFoundError: [Errno 2] No such file or directory: 'backup/metayolo_novel0_neg1'

Dimension problem in test()

Hi, thanks for your sharing.
Because I want to evaluate the training results each epoch, I uncommented test(epoch) in train_meta.py, but I got an error like this:
[figure: error screenshot]
And I found that the shape of the output of model(data, metax, mask).data is (480, 30, 13, 13) while the shape of target is (32, 15, 250), so the index error happens. (I changed batch_size to 32.)
Do you know how to solve this error? Thanks

Problem loading weights in COCO training

I've prepared the COCO dataset (I think it's not easy: I generated an image-to-class flag txt for each class and changed many things, and I spent a long time modifying the label and label_1c scripts to fit the COCO dataset); it should work.

But now I've met a problem when loading weights.
I run python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23, but get:

/mnt/Disk1/liangzh/code/Fewshot_Detection_coco/data/coco.names
('save_interval', 2)
['airplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'dining table', 'dog', 'horse', 'motorcycle', 'person', 'potted plant', 'sheep', 'couch', 'train', 'tv']
('base_ids', [7, 9, 10, 11, 12, 13, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 59, 61, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79])
logging to backup/metayolo_novel0_neg1
('class_scale', 1)
Traceback (most recent call last):
  File "train_meta.py", line 87, in <module>
    model.load_weights(weightfile)
  File "/mnt/Disk1/liangzh/code/Fewshot_Detection_coco/darknet_meta.py", line 378, in load_weights
    start = load_conv_bn(buf, start, model[0], model[1])
  File "/mnt/Disk1/liangzh/code/Fewshot_Detection_coco/cfg.py", line 461, in load_conv_bn
    conv_model.weight.data.copy_(torch.from_numpy(buf[start:start+num_w])); start = start + num_w
RuntimeError: The size of tensor a (3) must match the size of tensor b (864) at non-singleton dimension 3

Thanks

out of memory for base training

I am reproducing the results using the instructions provided in the README file.

I am training the base model on one GeForce GTX 1080 Ti with 12GB of memory, and I modified batch_size=32.

After about 20 epochs, a CUDA runtime error occurs:

2020-05-29 13:14:00 epoch 20/177, processed 291080 samples, lr 0.000333
291112: nGT 77, recall 66, proposals 235, loss: x 2.222131, y 2.640358, w 2.185382, h 1.743314, conf 52.697956, cls 99.193832, total 160.682968
291144: nGT 77, recall 62, proposals 243, loss: x 1.478266, y 1.245305, w 2.208532, h 0.684470, conf 43.636837, cls 76.594849, total 125.848259
291176: nGT 70, recall 63, proposals 243, loss: x 1.873798, y 1.179447, w 1.839549, h 1.049649, conf 52.927620, cls 101.017876, total 159.887939
291208: nGT 75, recall 67, proposals 175, loss: x 1.820341, y 1.697263, w 1.052775, h 0.799489, conf 50.626663, cls 113.858749, total 169.855286
291240: nGT 105, recall 93, proposals 253, loss: x 3.521058, y 2.495901, w 3.214825, h 2.059216, conf 74.303398, cls 172.366638, total 257.961029
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "train_meta.py", line 325, in <module>
    train(epoch)
  File "train_meta.py", line 223, in train
    loss.backward()
  File "/home/super/anaconda3/envs/torch0.3.1/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/super/anaconda3/envs/torch0.3.1/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:5

How can I solve this problem?
Thanks :)

Size mismatch, unable to train base model

Hi, sorry to bother you! I ran into the following error when trying to train the base model. I am using PyTorch 0.3.1 and Python 2.7. I attached the full stdout log and the code I modified to print out the sizes.

(featurereweight) quan@Bayes:~/few_shot/Fewshot_Detection$ python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23
/home/quan/few_shot/Fewshot_Detection/data/coco.names
('save_interval', 10)
['bird', 'bus', 'cow', 'motorbike', 'sofa']
('base_ids', [0, 1, 3, 4, 6, 7, 8, 10, 11, 12, 14, 15, 16, 18, 19])
logging to backup/metayolo_novel0_neg1
('class_scale', 1)
/home/quan/few_shot/Fewshot_Detection/cfg.py:455: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
conv_model.weight.data.copy_(torch.from_numpy(buf[start:start+num_w])); start = start + num_w
layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32
2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64
3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64
4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
5 conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64
6 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
7 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128
8 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
9 conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128
10 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
11 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256
12 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
13 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256
14 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
15 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256
16 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
17 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512
18 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
19 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512
20 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
21 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512
22 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
23 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024
24 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024
25 route 16
26 conv 64 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 64
27 reorg / 2 26 x 26 x 64 -> 13 x 13 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 13 x 13 x1280 -> 13 x 13 x1024
30 dconv 1024 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x1024
31 conv 30 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 30
32 detection

layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 4 -> 416 x 416 x 32
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32
2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64
3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64
4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
5 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128
6 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
7 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256
8 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
9 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512
10 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
11 max 2 x 2 / 2 13 x 13 x1024 -> 6 x 6 x1024
12 conv 1024 3 x 3 / 1 6 x 6 x1024 -> 6 x 6 x1024
13 glomax 6 x 6 / 1 6 x 6 x1024 -> 1 x 1 x1024
1 14554 80200 64
10
===> Number of samples (before filtring): 4952
===> Number of samples (after filtring): 4952
('num classes: ', 15)
factor: 3.0
===> Number of samples (before filtring): 14554
===> Number of samples (after filtring): 14554
('num classes: ', 15)
2019-12-02 20:30:00 epoch 0/353, processed 0 samples, lr 0.000033
('nA', 5)
('nC', 1)
('nH', 13)
('nW', 13)
('bs', 64)
('cs', 15)
('output.shape', (1280L, 30L, 13L, 13L))
('cls.shape', (1280L, 5L, 6L, 13L, 13L))
('cls.shape', (1280L, 5L, 13L, 13L))
Traceback (most recent call last):
File "train_meta.py", line 325, in
train(epoch)
File "train_meta.py", line 221, in train
loss = region_loss(output, target)
File "/home/quan/miniconda3/envs/featurereweight/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/quan/few_shot/Fewshot_Detection/region_loss.py", line 277, in forward
cls = cls.view(bs, cs, nA * nC * nH * nW).transpose(1, 2).contiguous().view(bs * nA * nC * nH * nW, cs)
RuntimeError: invalid argument 2: size '[64 x 15 x 845]' is invalid for input with 1081600 elements at /opt/conda/conda-bld/pytorch_1518238581238/work/torch/lib/TH/THStorage.c:41

--- Please find below the sequence of print statements:

    print('nA', nA)
    print('nC', nC)
    print('nH', nH)
    print('nW', nW)
    print('bs', bs)
    print('cs', cs)

    print('output.shape', output.shape)
    cls = output.view(output.size(0), nA, (5 + nC), nH, nW)

    print('cls.shape', cls.shape)
    cls = cls.index_select(2, Variable(torch.linspace(5, 5 + nC - 1, nC).long().cuda())).squeeze()

    print('cls.shape', cls.shape)
    cls = cls.view(bs, cs, nA * nC * nH * nW).transpose(1, 2).contiguous().view(bs * nA * nC * nH * nW, cs)

    print('cls.shape', cls.shape)

The classes used in base training

Excuse me: does '/home/bykang/voc/voc_train.txt' contain 20 classes? It includes all the novel samples. Does it need to be 15 classes? Should all the 'train' samples be the same as the 'meta' samples? @bingykang

Training for 16:9 aspect ratio

I can see that the repository code has been trained on 448x448 inputs, and this works fine on PASCAL VOC.

Now, if I want to adapt the code to another dataset with a 16:9 aspect ratio, I can modify the input width and height to roughly 768 x 448.

But because of this, the final output layer of the reweighting network becomes 1x0x1024:

layer filters size input output
0 conv 32 3 x 3 / 1 768 x 448 x 4 -> 768 x 448 x 32
1 max 2 x 2 / 2 768 x 448 x 32 -> 384 x 224 x 32
2 conv 64 3 x 3 / 1 384 x 224 x 32 -> 384 x 224 x 64
3 max 2 x 2 / 2 384 x 224 x 64 -> 192 x 112 x 64
4 conv 128 3 x 3 / 1 192 x 112 x 64 -> 192 x 112 x 128
5 max 2 x 2 / 2 192 x 112 x 128 -> 96 x 56 x 128
6 conv 256 3 x 3 / 1 96 x 56 x 128 -> 96 x 56 x 256
7 max 2 x 2 / 2 96 x 56 x 256 -> 48 x 28 x 256
8 conv 512 3 x 3 / 1 48 x 28 x 256 -> 48 x 28 x 512
9 max 2 x 2 / 2 48 x 28 x 512 -> 24 x 14 x 512
10 conv 1024 3 x 3 / 1 24 x 14 x 512 -> 24 x 14 x1024
11 max 2 x 2 / 2 24 x 14 x1024 -> 12 x 7 x1024
12 conv 1024 3 x 3 / 1 12 x 7 x1024 -> 12 x 7 x1024
13 glomax 12 x 12 / 1 12 x 7 x1024 -> 1 x 0 x1024

What is the right way to handle this change?
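
One possible workaround, under the assumption that the "glomax" layer is a global max pool with a hard-coded kernel: replace it with an adaptive pool so any aspect ratio collapses to 1x1:

import torch
import torch.nn as nn

feat = torch.randn(1, 1024, 7, 12)         # final 16:9-ish feature map (H=7, W=12)
glomax = nn.AdaptiveMaxPool2d(1)           # collapses any HxW to 1x1
print(glomax(feat).shape)                  # torch.Size([1, 1024, 1, 1]), not 1x0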

out of memory for fine tuning

I am reproducing the result using the instruction provided in the README file.

I was able to train the base model and obtain an AP of 0.6862, which matches what the paper reports. However, when I tried to run the fine-tuning command, the process exits with an out-of-memory error during the backward pass.

I am training with 4 GeForce GTX 1080 Ti GPUs with roughly 12GB of memory each. Did you use GPUs with more memory, or is something weird happening?
