ssd's Introduction

High quality, fast, modular reference implementation of SSD in PyTorch 1.0

This repository implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to serve as a code base for research based on SSD.

Example SSD output (vgg_ssd300_voc0712).

[Plots: training losses, learning rate schedule, and evaluation metrics]

Highlights

  • PyTorch 1.0: Supports PyTorch 1.0 or higher.
  • Multi-GPU training and inference: We use DistributedDataParallel; you can train or test with an arbitrary number of GPUs, and the training schedule adapts accordingly.
  • Modular: Add your own modules without pain. We abstract the backbone, Detector, BoxHead, BoxPredictor, etc., so you can replace any component with your own code without touching the rest of the code base. For example, to add EfficientNet as a backbone, just add efficient_net.py (ALREADY ADDED), register it, and specify it in the config file. Done! (See the sketch after this list.)
  • CPU support for inference: inference also runs on the CPU.
  • Smooth and enjoyable training procedure: we save the state of the model, optimizer, scheduler, and training iteration, so you can stop training and resume from exactly the saved point without changing your training command.
  • Batched inference: can perform inference using multiple images per batch per GPU.
  • Evaluating during training: evaluate your model every eval_step to check whether performance is improving.
  • Metrics visualization: visualize detailed metrics in TensorBoard, such as AP, APl, APm and APs for the COCO dataset, or mAP and the 20 per-category APs for the VOC dataset.
  • Auto download: pre-trained weights are downloaded from a URL and cached automatically.
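
To make the Modular point concrete, here is a minimal sketch of adding and registering a custom backbone. The registry decorator, module path, and config key below are assumptions modeled on maskrcnn-benchmark-style codebases, not the verified API; see DEVELOP_GUIDE.md for the exact workflow.

import torch.nn as nn
from ssd.modeling import registry  # assumed module path

class TinyBackbone(nn.Module):
    """Toy backbone; SSD expects a sequence of multi-scale feature maps."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return [self.stem(x)]  # one feature map per detection head

@registry.BACKBONES.register('tiny_backbone')  # assumed registry name
def tiny_backbone(cfg, pretrained=False):
    return TinyBackbone()

# Then select it in your config file, e.g.:
# MODEL:
#   BACKBONE:
#     NAME: 'tiny_backbone'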

Installation

Requirements

  1. Python3
  2. PyTorch 1.0 or higher
  3. yacs
  4. Vizer
  5. GCC >= 4.9
  6. OpenCV

Step-by-step installation

git clone https://github.com/lufficc/SSD.git
cd SSD
# Required packages: torch torchvision yacs tqdm opencv-python vizer
pip install -r requirements.txt

# Done! That's ALL! No BUILD! No bothersome SETUP!

# It's recommended to install the latest release of torch and torchvision.

Train

Setting Up Datasets

Pascal VOC

For Pascal VOC dataset, make the folder structure like this:

VOC_ROOT
|__ VOC2007
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ VOC2012
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ ...

VOC_ROOT defaults to the datasets folder in the project root; you can either create symlinks to your datasets there or export VOC_ROOT="/path/to/voc_root".

COCO

For COCO dataset, make the folder structure like this:

COCO_ROOT
|__ annotations
    |_ instances_valminusminival2014.json
    |_ instances_minival2014.json
    |_ instances_train2014.json
    |_ instances_val2014.json
    |_ ...
|__ train2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ val2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ ...

COCO_ROOT defaults to the datasets folder in the project root; you can either create symlinks to your datasets there or export COCO_ROOT="/path/to/coco_root".

Single GPU training

# for example, train SSD300:
python train.py --config-file configs/vgg_ssd300_voc0712.yaml

Multi-GPU training

# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/vgg_ssd300_voc0712.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000

The configuration files provided assume training on a single GPU. When you change the number of GPUs, the hyper-parameters (lr, max_iter, ...) also need to change accordingly, following the linear scaling rule of this paper: Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour.
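
As a purely illustrative example of that scaling rule (the numbers below are made up and are not taken from the shipped configs):

# Linear scaling rule sketch: the effective batch size grows with the number
# of GPUs, so the learning rate scales up and the iteration count scales down.
num_gpus = 4
base_lr, base_max_iter = 1e-3, 120_000   # illustrative single-GPU values
lr = base_lr * num_gpus                  # lr scales with the effective batch size
max_iter = base_max_iter // num_gpus     # same number of epochs in fewer steps
# A gentler, longer warmup (cf. WARMUP_FACTOR 0.03333, WARMUP_ITERS 1000 in the
# command above) cushions the larger learning rate at the start of training.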

Evaluate

Single GPU evaluating

# for example, evaluate SSD300:
python test.py --config-file configs/vgg_ssd300_voc0712.yaml

Multi-GPU evaluating

# for example, evaluate SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS test.py --config-file configs/vgg_ssd300_voc0712.yaml

Demo

Predicting images in a folder is simple:

python demo.py --config-file configs/vgg_ssd300_voc0712.yaml --images_dir demo --ckpt https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth

Then it will download and cache vgg_ssd300_voc0712.pth automatically, and the predicted images with boxes, scores and label names will be saved to the demo/result folder by default.

You will see output similar to this:

(0001/0005) 004101.jpg: objects 01 | load 010ms | inference 033ms | FPS 31
(0002/0005) 003123.jpg: objects 05 | load 009ms | inference 019ms | FPS 53
(0003/0005) 000342.jpg: objects 02 | load 009ms | inference 019ms | FPS 51
(0004/0005) 008591.jpg: objects 02 | load 008ms | inference 020ms | FPS 50
(0005/0005) 000542.jpg: objects 01 | load 011ms | inference 019ms | FPS 53
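
Under the hood, loading weights from a URL boils down to something like the following sketch using torch.utils.model_zoo (the repository's actual checkpoint handling may differ, e.g. the file may hold a full training checkpoint rather than a bare state dict):

import torch.utils.model_zoo as model_zoo

url = "https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth"
# Downloads on first use, then serves the file from the local cache afterwards.
state_dict = model_zoo.load_url(url, map_location='cpu')
# model.load_state_dict(state_dict)  # assuming `model` is the built SSD network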

MODEL ZOO

Original paper:

Model      VOC2007 test (mAP)    COCO test-dev2015 (AP)
SSD300*    77.2                  25.1
SSD512*    79.8                  28.8

COCO:

Backbone   Input Size   box AP   Model Size   Download
VGG16      300          25.2     262MB        model
VGG16      512          29.0     275MB        model

PASCAL VOC:

Backbone          Input Size   mAP    Model Size   Download
VGG16             300          77.7   201MB        model
VGG16             512          80.7   207MB        model
Mobilenet V2      320          68.9   25.5MB       model
Mobilenet V3      320          69.5   29.9MB       model
EfficientNet-B3   300          73.9   97.1MB       model

Develop Guide

If you want to add your custom components, please see DEVELOP_GUIDE.md for more details.

Troubleshooting

If you have issues running or compiling this code, we have compiled a list of common issues in TROUBLESHOOTING.md. If your issue is not present there, please feel free to open a new issue.

Citations

If you use this project in your research, please cite this project.

@misc{lufficc2018ssd,
    author = {Congcong Li},
    title = {{High quality, fast, modular reference implementation of SSD in PyTorch}},
    year = {2018},
    howpublished = {\url{https://github.com/lufficc/SSD}}
}

ssd's People

Contributors

alexey-gruzdev, banderlog, beibinli, chakkritte, huaizhengzhang, lufficc, priteshgohil, tkhe, zqpei


ssd's Issues

Lr:0.0000

I have checked ssd300_voc0712.yaml, but lr = 0.000 during the training process. Why?
Also, I followed the multi-GPU training setup and my GPU count is 2, but the training speed is half that of a single GPU. Why?

Runtime error: Not compiled with GPU support

Hello, when I use the code to train my own datasets and execute the commands in the README step by step, I run into this problem:
2019-01-17 20:52:48,754 SSD.trainer INFO: Iter: 004550, Lr: 0.00100, Cost: 223.93s, Eta: 6 days, 3:12:37, total_loss: 3.036, regression_loss: 0.752, classification_loss: 2.283
2019-01-17 20:56:37,353 SSD.trainer INFO: Iter: 004600, Lr: 0.00100, Cost: 225.57s, Eta: 6 days, 3:08:25, total_loss: 3.188, regression_loss: 1.202, classification_loss: 1.987
2019-01-17 21:00:23,136 SSD.trainer INFO: Iter: 004650, Lr: 0.00100, Cost: 222.75s, Eta: 6 days, 3:03:03, total_loss: 2.773, regression_loss: 0.871, classification_loss: 1.902
2019-01-17 21:04:11,656 SSD.trainer INFO: Iter: 004700, Lr: 0.00100, Cost: 225.49s, Eta: 6 days, 2:58:50, total_loss: 2.975, regression_loss: 0.941, classification_loss: 2.034
2019-01-17 21:08:00,204 SSD.trainer INFO: Iter: 004750, Lr: 0.00100, Cost: 225.52s, Eta: 6 days, 2:54:39, total_loss: 2.737, regression_loss: 0.806, classification_loss: 1.932
2019-01-17 21:11:48,767 SSD.trainer INFO: Iter: 004800, Lr: 0.00100, Cost: 225.54s, Eta: 6 days, 2:50:28, total_loss: 2.875, regression_loss: 1.041, classification_loss: 1.834
2019-01-17 21:15:37,322 SSD.trainer INFO: Iter: 004850, Lr: 0.00100, Cost: 225.53s, Eta: 6 days, 2:46:17, total_loss: 3.588, regression_loss: 1.296, classification_loss: 2.292
2019-01-17 21:19:25,896 SSD.trainer INFO: Iter: 004900, Lr: 0.00100, Cost: 225.55s, Eta: 6 days, 2:42:08, total_loss: 2.428, regression_loss: 0.619, classification_loss: 1.809
2019-01-17 21:23:11,709 SSD.trainer INFO: Iter: 004950, Lr: 0.00100, Cost: 222.78s, Eta: 6 days, 2:36:54, total_loss: 2.225, regression_loss: 0.690, classification_loss: 1.535
2019-01-17 21:27:00,831 SSD.trainer INFO: Iter: 005000, Lr: 0.00100, Cost: 226.09s, Eta: 6 days, 2:32:59, total_loss: 2.845, regression_loss: 1.028, classification_loss: 1.817
2019-01-17 21:27:00,913 SSD.trainer INFO: Saved checkpoint to output/ssd300_vgg_iteration_005000.pth
2019-01-17 21:27:00,914 SSD.inference INFO: Will evaluate 1 dataset(s):
2019-01-17 21:27:00,914 SSD.inference INFO: Evaluating voc_2007_test dataset(75 images):
2019-01-17 21:27:00,914 SSD.inference INFO: Progress on CUDA 0:
0%| | 0/75 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train_ssd.py", line 138, in
main()
File "train_ssd.py", line 129, in main
model = train(cfg, args)
File "train_ssd.py", line 71, in train
return do_train(cfg, model, train_loader, optimizer, scheduler, device, args)
File "/home/t/github/SSD/ssd/engine/trainer.py", line 113, in do_train
do_evaluation(cfg, model, cfg.OUTPUT_DIR, distributed=args.distributed)
File "/home/t/github/SSD/ssd/engine/inference.py", line 93, in do_evaluation
_evaluation(cfg, dataset_name, test_dataset, predictor, distributed, output_dir)
File "/home/t/github/SSD/ssd/engine/inference.py", line 62, in _evaluation
output = predictor.predict(image)
File "/home/t/github/SSD/ssd/modeling/predictor.py", line 27, in predict
results = self.post_processor(scores, boxes, width=width, height=height)
File "/home/t/github/SSD/ssd/modeling/post_processor.py", line 66, in call
keep = boxes_nms(boxes, probs, self.iou_threshold, self.max_per_class)
File "/home/t/github/SSD/ssd/utils/nms.py", line 18, in boxes_nms
keep = _nms(boxes, scores, nms_thresh)
RuntimeError: Not compiled with GPU support (nms at /home/t/github/SSD/ext/nms.h:22)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f88fb9dfcc5 in /home/t/anaconda3/envs/tf/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: nms(at::Tensor const&, at::Tensor const&, float) + 0xd4 (0x7f88f76ed274 in /home/t/github/SSD/ext/torch_extension.cpython-36m-x86_64-linux-gnu.so)
frame #2: + 0x13697 (0x7f88f76f8697 in /home/t/github/SSD/ext/torch_extension.cpython-36m-x86_64-linux-gnu.so)
frame #3: + 0x1380e (0x7f88f76f880e in /home/t/github/SSD/ext/torch_extension.cpython-36m-x86_64-linux-gnu.so)
frame #4: + 0x10a0a (0x7f88f76f5a0a in /home/t/github/SSD/ext/torch_extension.cpython-36m-x86_64-linux-gnu.so)

frame #50: __libc_start_main + 0xf0 (0x7f894e009830 in /lib/x86_64-linux-gnu/libc.so.6)


How to train on 1024×1024 images?

This project is the best SSD model!! But I have a task to detect small objects, and 512×512 is not suitable. How can I change it to take 1024×1024 input? Can somebody give me a configuration? Thanks so much! Urgent! Waiting online!

About the COCO annotations

I have downloaded the COCO2014 annotations, but they do not include "annotations/instances_minival2014.json" or "annotations/instances_valminusminival2014.json". Did you make these yourself?
Thanks a lot.

CPU/GPU usage influencing training speed

Hi,
I am trying to train on COCO. I used the Dockerfile below to build the image.

FROM nvcr.io/nvidia/pytorch:18.12.1-py3
# FROM pytorch/pytorch:nightly-devel-cuda10.0-cudnn7
RUN pip install tensorboardX yacs tqdm pillow 
RUN conda install -y opencv cython
RUN git clone https://github.com/cocodataset/cocoapi.git && cd cocoapi/PythonAPI && python setup.py build_ext install
RUN git clone https://github.com/pytorch/vision.git \
    && cd vision \
    && python setup.py install
COPY . /SSD
WORKDIR /SSD
RUN python /SSD/ext/build.py build_ext develop
CMD [ "bash" ]

But I experienced severe CPU usage (almost 100%) and low GPU usage on several machines (20C40T CPU with a V100 GPU; 4C8T CPU with an RTX 2080 GPU), and the training is extremely slow.
I tried the conda install of PyTorch and the same thing happens.
Meanwhile, another PyTorch 1.0 repo (maskrcnn-benchmark) was fine using its provided Dockerfile.
Is there anyone experiencing the same problem as I do?

What is the meaning of the warm-up strategy?

Why is the parameter last_epoch used? And why is alpha = self.last_epoch / self.warmup_iters?

class WarmupMultiStepLR(MultiStepLR):
    def __init__(self, optimizer, milestones, gamma=0.1, warmup_factor=1.0 / 3,
                 warmup_iters=500, last_epoch=-1):
        self.warmup_factor = warmup_factor
        self.warmup_iters = warmup_iters
        super().__init__(optimizer, milestones, gamma, last_epoch)

    def get_lr(self):
        lr = super().get_lr()
        # last_epoch counts scheduler steps (iterations here, not epochs).
        if self.last_epoch < self.warmup_iters:
            # alpha grows linearly from 0 to 1 over the warmup period,
            alpha = self.last_epoch / self.warmup_iters
            # so the multiplier ramps linearly from warmup_factor up to 1.
            warmup_factor = self.warmup_factor * (1 - alpha) + alpha
            return [l * warmup_factor for l in lr]
        return lr
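
For intuition, with the defaults above (warmup_factor = 1/3 and warmup_iters = 500), the multiplier is 1/3 at iteration 0, 1/3 · 0.5 + 0.5 = 2/3 at iteration 250, and 1.0 at iteration 500. In other words, last_epoch here counts iterations, and the learning rate ramps linearly from a third of its scheduled value up to the full MultiStepLR schedule, which stabilizes the earliest iterations of large-batch training.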

About the model initialization

Hi,
I find that before you resume or init from a pre-trained model, the SSD class resets the parameters.
But when I skip the resume process, it leads to errors like the one below. How can I initialize the weights without the pre-trained model (either vgg_reduced.pth or …):

File "train_ssd.py", line 139, in
main()
File "train_ssd.py", line 130, in main
model = train(cfg, args)
File "train_ssd.py", line 71, in train
return do_train(cfg, model, train_loader, optimizer, scheduler, device, args)
File "/home/fmming/test/SSD/SSD-master/ssd/engine/trainer.py", line 76, in do_train
loss_dict = model(images, targets=(boxes, labels))
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/fmming/test/SSD/SSD-master/ssd/modeling/ssd.py", line 86, in forward
regression_loss, classification_loss = self.criterion(confidences, locations, gt_labels, gt_boxes)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/fmming/test/SSD/SSD-master/ssd/modeling/multibox_loss.py", line 31, in forward
mask = box_utils.hard_negative_mining(loss, labels, self.neg_pos_ratio)
File "/home/fmming/test/SSD/SSD-master/ssd/utils/box_utils.py", line 123, in hard_negative_mining
_, indexes = loss.sort(dim=1, descending=True)
RuntimeError: merge_sort: failed to synchronize: an illegal memory access was encountered
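
For reference, a common way to initialize such a detector from scratch is Xavier initialization applied to every conv layer; a minimal sketch (a hypothetical helper, not the repository's exact init code):

import torch.nn as nn

def xavier_init(m):
    # Hypothetical helper: apply with model.apply(xavier_init) before training.
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)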

In the test phase, 81 categories of COCO did not match!

Hello, I trained on the COCO2014 dataset for 400,000 iterations, and the final AP is close to what is described at https://github.com/lufficc/SSD#details. But when I ran demo.py with the trained model, I found that only 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck' match correctly; many of the later categories are mismatched, e.g. 'dog' --> 'cat', 'zebra' --> 'bear', 'horse' --> 'dog', 'sheep' --> 'horse'. The pattern I found is that the wrong category index is generally one ahead of the correct index.

I don't know whether you ran into this problem after training on COCO. If needed, I can email you the mis-predicted test images. Apologies for writing in Chinese; I felt I could explain it more clearly that way.
Looking forward to your reply.

Inference Speed

It seems that inference speed is very slow in the PostProcessor part.

Training speed is slow

Hello, thanks for the amazing job!
I have a question about training speed.
I trained on the VOC dataset using two 1080Ti GPUs and the speed is about 0.75s/iteration; in my TensorFlow implementation, the training speed is about 0.45s/iteration. Also, I hear some other projects can achieve 0.3s/iteration in PyTorch.
Can you share your training speed?

COCO performance

@lufficc
First of all, thank you for the implementation. It's very helpful.
But have you trained SSD on COCO yourself? Could you please provide detailed performance results? Furthermore, it would be highly appreciated if you could share the pre-trained model.

Training hangs. Seems like a deadlock somewhere

My training always hangs after about 10k iterations, so I have never finished the training procedure. Has anyone else hit this?

Below is my screen output.
It doesn't print any error, it just hangs...

2019-01-15 17:46:13,572 SSD.trainer INFO: Iter: 016500, Lr: 0.00100, Cost: 30.35s, Eta: 18:05:08, total_loss: 2.746, classification_loss: 1.881, regression_loss: 0.864
2019-01-15 17:46:44,709 SSD.trainer INFO: Iter: 016550, Lr: 0.00100, Cost: 30.74s, Eta: 18:04:35, total_loss: 3.110, classification_loss: 2.112, regression_loss: 0.998
2019-01-15 17:47:15,735 SSD.trainer INFO: Iter: 016600, Lr: 0.00100, Cost: 30.60s, Eta: 18:04:01, total_loss: 2.336, classification_loss: 1.702, regression_loss: 0.634
2019-01-15 17:47:46,991 SSD.trainer INFO: Iter: 016650, Lr: 0.00100, Cost: 30.91s, Eta: 18:03:28, total_loss: 2.972, classification_loss: 2.040, regression_loss: 0.932
2019-01-15 17:48:18,479 SSD.trainer INFO: Iter: 016700, Lr: 0.00100, Cost: 31.07s, Eta: 18:02:57, total_loss: 2.584, classification_loss: 1.810, regression_loss: 0.774
2019-01-15 17:48:49,426 SSD.trainer INFO: Iter: 016750, Lr: 0.00100, Cost: 30.55s, Eta: 18:02:22, total_loss: 2.723, classification_loss: 1.915, regression_loss: 0.807

Error during NMS Build

Hi

Thank you very much for the repository. I'm using gcc 7.3.0 for building NMS. Should that be ok?

I get the following output on stderr:
stderr.log

Also, I had to install Cython before building cocotools; perhaps you could mention that in the documentation.

About the logger

Hello, I've been reading this code recently and I'm not familiar with the new DistributedDataParallel. In this part:
[screenshot]
inside the training loop, reduce_loss_dict(...) and save_to_disk = distributed_util.get_rank() == 0 (for saving the model) branch on rank 0 versus non-rank-0, but the logger that prints the loss and timing each time has no such check. When I run it:
[screenshot]
what is shown seems to be the reduced loss from rank 0, so how is the logging from non-rank-0 processes suppressed?

The code makes a rank-0 check in
[screenshot]
this function, but the logger there is named SSD, while the logger used later in the training loop is named SSD.trainer. Could you explain this? Thanks.

Also, from which point on does the program start executing on different GPUs simultaneously? Thanks.

NMS error

When I ran the eval step, I got a Segmentation fault (core dumped) in the NMS part. I think it is an error in the .so extension. My gcc version is gcc 4.8.5 20150623 (Red Hat 4.8.5-4) and my CUDA version is 8.0.
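
As a stopgap while the compiled extension is broken, a pure-PyTorch NMS (much slower, but dependency-free) can stand in for boxes_nms; a minimal sketch of the standard algorithm, not the repository's implementation:

import torch

def boxes_nms_fallback(boxes, scores, iou_threshold):
    # boxes: (N, 4) tensor of (x1, y1, x2, y2); returns indices of kept boxes.
    x1, y1, x2, y2 = boxes.unbind(dim=1)
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(i.item())
        if order.numel() == 1:
            break
        rest = order[1:]
        # Intersection of the top-scoring box with all remaining boxes.
        inter_w = (torch.min(x2[i], x2[rest]) - torch.max(x1[i], x1[rest])).clamp(min=0)
        inter_h = (torch.min(y2[i], y2[rest]) - torch.max(y1[i], y1[rest])).clamp(min=0)
        inter = inter_w * inter_h
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # drop boxes that overlap too much
    return torch.tensor(keep, dtype=torch.long)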

Error in NMS

hi, lufficc:
I have a problem: when I change the configuration for the GPU, e.g. change cuda to cuda:2 because I want to train on the third GPU:
[screenshot]
the following error happens:
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /home/xxx/lufficc_ssd_shifted_anchor/ext/cuda/nms.cu:103

And the command I use to start the training is:
python train_ssd.py --config-file configs/ssd300_voc0712.yaml --save_step 5000 --eval_step 1 --resume output/ssd513_vgg_iteration_005000.pth

coco-test-dev-2015 performance

I submitted results for the ssd300_coco_trainval35k_AP22.9.pth model to the COCO server.
Here are the results.

note: I use my own non-max suppression, which is slightly different from lufficc's version

How to submit to test-dev-2015:

  1. use the detection server: https://competitions.codalab.org/competitions/5181

  2. choose test-dev2018 (bbox)


COCO-test-dev-2015 server
overall performance
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.255
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.435
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.263
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.067
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.270
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.415
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.236
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.345
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.359
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.098
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.391
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.567 
Done (t=334.90s) 

To compare my non-max suppression with lufficc's, here are my local results for the models:

local: COCO-test-dev-2014 (instances_minival2014.json, num_images = 5k)
ssd300_coco_trainval35k_AP22.9.pth model

DONE (t=6.61s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.251
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.428
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.261
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.061
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.271
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.419
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.234
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.342
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.358
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.097
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.397
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.562


local: COCO-test-dev-2014 (instances_minival2014.json, num_images = 5k)
ssd300_voc0712_mAP77.83.pth

metric_type = voc07
           #name   ap
       aeroplane   0.825236
         bicycle   0.844450
            bird   0.759660
            boat   0.710224
          bottle   0.527462
             bus   0.864337
             car   0.865986
             cat   0.874129
           chair   0.617937
             cow   0.827866
     diningtable   0.786153
             dog   0.851901
           horse   0.863020
       motorbike   0.851469
          person   0.802394
     pottedplant   0.507871
           sheep   0.768501
            sofa   0.792603
           train   0.870370
       tvmonitor   0.755360
---------------------------------
             mAP   0.778346


Match Bug

Hello, your matching strategy is wrong; please check it again.

What does CENTER(SIZE)_VARIANCE mean in defaults.py?

I don't know what these settings in defaults.py mean:

# Hard negative mining
_C.MODEL.CENTER_VARIANCE = 0.1
_C.MODEL.SIZE_VARIANCE = 0.2

They're used in ssd/utils/box_utils.py when boxes are converted into locations and locations are converted back into boxes, but I don't know why.
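
For context, these variances typically rescale the SSD regression targets so the four offsets have comparable magnitudes; a sketch of the usual center-form encoding (mirroring standard box_utils logic, not necessarily this repository's exact code):

import torch

def encode(boxes, priors, center_variance=0.1, size_variance=0.2):
    # boxes, priors: (N, 4) tensors in center form (cx, cy, w, h)
    return torch.cat([
        (boxes[..., :2] - priors[..., :2]) / priors[..., 2:] / center_variance,
        torch.log(boxes[..., 2:] / priors[..., 2:]) / size_variance,
    ], dim=-1)

# Decoding inverts this, multiplying by the variances before applying the priors.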

Changing MAX_PER_CLASS to 400, as the official Caffe code does, will slightly increase mAP (0.8025 => 0.8063, 0.7783 => 0.7798).

_C.TEST.MAX_PER_CLASS = 200
_C.TEST.MAX_PER_IMAGE = -1

I don't know these either, and I can't find where they're used in the project.

Can anyone help? Thanks.

Training error on SSD-512

1.
When I try to train a model with input size 512, the loss always becomes NaN/inf. I tried reducing warmup_factor to 0.1 and changing the learning rate to 0.00001, but neither seems to work.
It only works well when I use a smaller batch size (<= 8).

I'm confused about that. Batch size should be decided by GPU memory (I run the program on 8x Tesla V100 with 32G memory, which I think is enough for training), so why does it cause such an error in the loss function?
2.
I think the batch size should be related to the iteration count, but the iterations are independent of the batch size, so a small batch size will lead to a shorter training time.

Do you have any suggestions about these 2 questions?
Thanks a lot

test speed

Hello, what detection speed do you get? I have run demo.py, but I can't reach the speed reported in the paper.

configuration problem

hi, a question about the configuration files:
e.g. the ssd300 VOC and ssd512 VOC files:
I found that in the 512 file, the anchor sizes are not the same as in the 300 file.
[screenshot]
I think the only difference between the 300 and 512 files is the input image size; the network model is the same.
So the anchor sizes for anchors in the same layer should not change, because the receptive field is the same. Right? (The input sizes differ but the network structure is the same, so for anchors in the same layer the configured sizes should not need to change; yet the anchor sizes in your 300 and 512 config files differ. I think the anchor size relates to the receptive field, so it should not grow the way you wrote it.)

Training error with batch size 64 (two gpus)

The loss is unstable, and the error comes after 430 iterations.

2018-12-17 13:55:45,716 SSD.trainer INFO: Train dataset size: 16551
2018-12-17 13:55:45,716 SSD.trainer INFO: Start training
2018-12-17 13:55:53,054 SSD.trainer INFO: Iter: 000010, Lr: 0.00069, Cost: 6.79s, Eta: 11:18:23, Loss: 16.110, Regression Loss 2.962, Classification Loss: 13.149
2018-12-17 13:55:59,009 SSD.trainer INFO: Iter: 000020, Lr: 0.00072, Cost: 5.54s, Eta: 10:35:03, Loss: 14.744, Regression Loss 2.703, Classification Loss: 12.041
2018-12-17 13:56:05,192 SSD.trainer INFO: Iter: 000030, Lr: 0.00074, Cost: 5.78s, Eta: 10:29:42, Loss: 13.971, Regression Loss 2.775, Classification Loss: 11.196
2018-12-17 13:56:11,117 SSD.trainer INFO: Iter: 000040, Lr: 0.00077, Cost: 5.54s, Eta: 10:20:49, Loss: 13.053, Regression Loss 2.877, Classification Loss: 10.176
2018-12-17 13:56:17,044 SSD.trainer INFO: Iter: 000050, Lr: 0.00080, Cost: 5.54s, Eta: 10:14:58, Loss: 11.377, Regression Loss 2.694, Classification Loss: 8.683
2018-12-17 13:56:22,996 SSD.trainer INFO: Iter: 000060, Lr: 0.00082, Cost: 5.57s, Eta: 10:11:33, Loss: 12.235, Regression Loss 2.856, Classification Loss: 9.379
2018-12-17 13:56:28,939 SSD.trainer INFO: Iter: 000070, Lr: 0.00085, Cost: 5.56s, Eta: 10:08:50, Loss: 9.304, Regression Loss 2.722, Classification Loss: 6.582
2018-12-17 13:56:34,890 SSD.trainer INFO: Iter: 000080, Lr: 0.00088, Cost: 5.57s, Eta: 10:06:57, Loss: 9.608, Regression Loss 2.600, Classification Loss: 7.008
2018-12-17 13:56:40,899 SSD.trainer INFO: Iter: 000090, Lr: 0.00090, Cost: 5.63s, Eta: 10:06:10, Loss: 9.044, Regression Loss 2.633, Classification Loss: 6.411
2018-12-17 13:56:46,872 SSD.trainer INFO: Iter: 000100, Lr: 0.00093, Cost: 5.59s, Eta: 10:05:02, Loss: 10.493, Regression Loss 2.597, Classification Loss: 7.896
2018-12-17 13:56:52,839 SSD.trainer INFO: Iter: 000110, Lr: 0.00096, Cost: 5.58s, Eta: 10:04:02, Loss: 9.837, Regression Loss 2.504, Classification Loss: 7.333
2018-12-17 13:56:58,813 SSD.trainer INFO: Iter: 000120, Lr: 0.00098, Cost: 5.59s, Eta: 10:03:18, Loss: 8.993, Regression Loss 2.577, Classification Loss: 6.416
2018-12-17 13:57:04,785 SSD.trainer INFO: Iter: 000130, Lr: 0.00101, Cost: 5.58s, Eta: 10:02:36, Loss: 9.234, Regression Loss 2.366, Classification Loss: 6.868
2018-12-17 13:57:10,782 SSD.trainer INFO: Iter: 000140, Lr: 0.00104, Cost: 5.61s, Eta: 10:02:12, Loss: 9.572, Regression Loss 2.397, Classification Loss: 7.175
2018-12-17 13:57:16,768 SSD.trainer INFO: Iter: 000150, Lr: 0.00106, Cost: 5.60s, Eta: 10:01:47, Loss: 10.361, Regression Loss 2.455, Classification Loss: 7.906
2018-12-17 13:57:22,772 SSD.trainer INFO: Iter: 000160, Lr: 0.00109, Cost: 5.62s, Eta: 10:01:32, Loss: 11.323, Regression Loss 2.497, Classification Loss: 8.826
2018-12-17 13:57:28,794 SSD.trainer INFO: Iter: 000170, Lr: 0.00112, Cost: 5.63s, Eta: 10:01:19, Loss: 11.311, Regression Loss 2.368, Classification Loss: 8.942
2018-12-17 13:57:34,801 SSD.trainer INFO: Iter: 000180, Lr: 0.00114, Cost: 5.62s, Eta: 10:01:07, Loss: 14.360, Regression Loss 2.493, Classification Loss: 11.866
2018-12-17 13:57:40,815 SSD.trainer INFO: Iter: 000190, Lr: 0.00117, Cost: 5.62s, Eta: 10:00:55, Loss: 9.740, Regression Loss 2.547, Classification Loss: 7.192
2018-12-17 13:57:46,815 SSD.trainer INFO: Iter: 000200, Lr: 0.00120, Cost: 5.61s, Eta: 10:00:42, Loss: 12.304, Regression Loss 2.444, Classification Loss: 9.860
2018-12-17 13:57:53,002 SSD.trainer INFO: Iter: 000210, Lr: 0.00122, Cost: 5.76s, Eta: 10:01:10, Loss: 9.891, Regression Loss 2.465, Classification Loss: 7.427
2018-12-17 13:57:59,044 SSD.trainer INFO: Iter: 000220, Lr: 0.00125, Cost: 5.66s, Eta: 10:01:18, Loss: 10.401, Regression Loss 2.495, Classification Loss: 7.905
2018-12-17 13:58:05,060 SSD.trainer INFO: Iter: 000230, Lr: 0.00128, Cost: 5.63s, Eta: 10:01:06, Loss: 9.791, Regression Loss 2.253, Classification Loss: 7.538
2018-12-17 13:58:11,072 SSD.trainer INFO: Iter: 000240, Lr: 0.00130, Cost: 5.63s, Eta: 10:00:55, Loss: 9.441, Regression Loss 2.396, Classification Loss: 7.045
2018-12-17 13:58:17,158 SSD.trainer INFO: Iter: 000250, Lr: 0.00133, Cost: 5.68s, Eta: 10:00:57, Loss: 8.072, Regression Loss 2.440, Classification Loss: 5.632
2018-12-17 13:58:23,013 SSD.trainer INFO: Iter: 000260, Lr: 0.00136, Cost: 5.47s, Eta: 10:00:13, Loss: 8.662, Regression Loss 2.442, Classification Loss: 6.221
2018-12-17 13:58:29,099 SSD.trainer INFO: Iter: 000270, Lr: 0.00138, Cost: 5.70s, Eta: 10:00:20, Loss: 8.421, Regression Loss 2.250, Classification Loss: 6.171
2018-12-17 13:58:35,208 SSD.trainer INFO: Iter: 000280, Lr: 0.00141, Cost: 5.72s, Eta: 10:00:30, Loss: 8.425, Regression Loss 2.143, Classification Loss: 6.281
2018-12-17 13:58:41,339 SSD.trainer INFO: Iter: 000290, Lr: 0.00144, Cost: 5.73s, Eta: 10:00:41, Loss: 9.016, Regression Loss 2.448, Classification Loss: 6.568
2018-12-17 13:58:47,479 SSD.trainer INFO: Iter: 000300, Lr: 0.00146, Cost: 5.74s, Eta: 10:00:57, Loss: 11.354, Regression Loss 2.215, Classification Loss: 9.139
2018-12-17 13:58:53,697 SSD.trainer INFO: Iter: 000310, Lr: 0.00149, Cost: 5.81s, Eta: 10:01:23, Loss: 12.369, Regression Loss 2.147, Classification Loss: 10.221
2018-12-17 13:58:59,810 SSD.trainer INFO: Iter: 000320, Lr: 0.00152, Cost: 5.72s, Eta: 10:01:33, Loss: 10.004, Regression Loss 2.278, Classification Loss: 7.726
2018-12-17 13:59:05,849 SSD.trainer INFO: Iter: 000330, Lr: 0.00154, Cost: 5.65s, Eta: 10:01:26, Loss: 7.794, Regression Loss 2.384, Classification Loss: 5.411
2018-12-17 13:59:11,847 SSD.trainer INFO: Iter: 000340, Lr: 0.00157, Cost: 5.61s, Eta: 10:01:11, Loss: 8.697, Regression Loss 2.366, Classification Loss: 6.331
2018-12-17 13:59:17,999 SSD.trainer INFO: Iter: 000350, Lr: 0.00160, Cost: 5.75s, Eta: 10:01:21, Loss: 12.521, Regression Loss 2.570, Classification Loss: 9.951
2018-12-17 13:59:24,357 SSD.trainer INFO: Iter: 000360, Lr: 0.00162, Cost: 5.98s, Eta: 10:02:10, Loss: 12.485, Regression Loss 2.474, Classification Loss: 10.012
2018-12-17 13:59:30,369 SSD.trainer INFO: Iter: 000370, Lr: 0.00165, Cost: 5.63s, Eta: 10:01:55, Loss: 12.791, Regression Loss 2.641, Classification Loss: 10.150
2018-12-17 13:59:36,477 SSD.trainer INFO: Iter: 000380, Lr: 0.00168, Cost: 5.73s, Eta: 10:01:59, Loss: 11.360, Regression Loss 2.661, Classification Loss: 8.699
2018-12-17 13:59:42,585 SSD.trainer INFO: Iter: 000390, Lr: 0.00170, Cost: 5.72s, Eta: 10:01:59, Loss: 11.183, Regression Loss 2.592, Classification Loss: 8.591
2018-12-17 13:59:48,701 SSD.trainer INFO: Iter: 000400, Lr: 0.00173, Cost: 5.72s, Eta: 10:01:59, Loss: 10.166, Regression Loss 2.575, Classification Loss: 7.590
2018-12-17 13:59:54,813 SSD.trainer INFO: Iter: 000410, Lr: 0.00176, Cost: 5.72s, Eta: 10:02:02, Loss: 17.562, Regression Loss 2.554, Classification Loss: 15.008
2018-12-17 14:00:00,942 SSD.trainer INFO: Iter: 000420, Lr: 0.00178, Cost: 5.74s, Eta: 10:02:05, Loss: 10.339, Regression Loss 2.592, Classification Loss: 7.747
2018-12-17 14:00:07,075 SSD.trainer INFO: Iter: 000430, Lr: 0.00181, Cost: 5.75s, Eta: 10:02:10, Loss: 28.599, Regression Loss 9.237, Classification Loss: 19.362
Traceback (most recent call last):
  File "train_ssd.py", line 139, in <module>
    main()
  File "train_ssd.py", line 130, in main
    model = train(cfg, args)
  File "train_ssd.py", line 76, in train
    return do_train(cfg, model, train_loader, optimizer, scheduler, criterion, device, args)
  File "/home/ycg/workspace/SSD/ssd/engine/trainer.py", line 78, in do_train
    regression_loss, classification_loss = criterion(confidence, locations, labels, boxes)
  File "/home/ycg/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ycg/workspace/SSD/ssd/modeling/multibox_loss.py", line 31, in forward
    mask = box_utils.hard_negative_mining(loss, labels, self.neg_pos_ratio)
  File "/home/ycg/workspace/SSD/ssd/utils/box_utils.py", line 123, in hard_negative_mining
    _, indexes = loss.sort(dim=1, descending=True)
RuntimeError: merge_sort: failed to synchronize: an illegal memory access was encountered
terminate called without an active exception

IsADirectoryError: [Errno 21] Is a directory: 'configs'

I got this error:

Traceback (most recent call last):
File "train_ssd.py", line 138, in <module>
main()
File "train_ssd.py", line 119, in main
cfg.merge_from_file(args.config_file)
File "/root/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 172, in merge_from_file
with open(cfg_filename, "r") as f:
IsADirectoryError: [Errno 21] Is a directory: 'configs'

torch/extension.h not found when building

python build.py build_ext develop
running build_ext
building 'torch_extension' extension
gcc -pthread -B /home/marco/anaconda2/envs/SSD/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/marco/Documenti/github/SSD-1.0.1/ext -I/home/marco/anaconda2/envs/SSD/lib/python3.6/site-packages/torch/lib/include -I/home/marco/anaconda2/envs/SSD/lib/python3.6/site-packages/torch/lib/include/TH -I/home/marco/anaconda2/envs/SSD/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/marco/anaconda2/envs/SSD/include/python3.6m -c /home/marco/Documenti/github/SSD-1.0.1/ext/vision.cpp -o build/temp.linux-x86_64-3.6/home/marco/Documenti/github/SSD-1.0.1/ext/vision.o -DTORCH_EXTENSION_NAME=torch_extension -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/marco/Documenti/github/SSD-1.0.1/ext/nms.h:3,
from /home/marco/Documenti/github/SSD-1.0.1/ext/vision.cpp:2:
/home/marco/Documenti/github/SSD-1.0.1/ext/cpu/vision.h:3:10: fatal error: torch/extension.h: File or directory does not exist
#include <torch/extension.h>
^~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1

Cannot load pre-trained SSD512 model

Hi,

I can run the demo with the provided SSD300 model, but when using the provided SSD512 config file and weights (configs/ssd512_voc0712.yaml, ssd512_voc0712_mAP80.25.pth) I get this error:

model.load(weights)
File "SSD/ssd/modeling/ssd.py", line 97, in load
self.load_state_dict(torch.load(model, map_location=lambda storage, loc: storage))
File "/anaconda/envs/maskRcnnB/lib/python3.5/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SSD:
Unexpected key(s) in state_dict: "extras.8.weight", "extras.8.bias", "extras.9.weight", "extras.9.bias", "classification_headers.6.weight", "classification_headers.6.bias", "regression_headers.6.weight", "regression_headers.6.bias".
size mismatch for classification_headers.4.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([84]).
size mismatch for classification_headers.4.weight: copying a param with shape torch.Size([126, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([84, 256, 3, 3]).
size mismatch for regression_headers.4.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for regression_headers.4.weight: copying a param with shape torch.Size([24, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 256, 3, 3]).

Inference speed for batch size 1

Hey there,

Thank you for your amazing work! But I was wondering, what is the inference performance for batch size 1? I trained SSD on my own datasets and I'm getting ~0.40s per image, which feels quite slow... I also trained a Faster R-CNN, and even with a ResNeXt-152 backbone I get similar or faster inference times.

Should batch size be scaled by the number of GPUs?

I noticed that the iteration count is scaled but the batch size is not:

batch_sampler = torch.utils.data.sampler.BatchSampler(sampler=sampler, batch_size=cfg.SOLVER.BATCH_SIZE, drop_last=False)

Should this be:

batch_sampler = torch.utils.data.sampler.BatchSampler(sampler=sampler, batch_size=cfg.SOLVER.BATCH_SIZE*args.num_gpus, drop_last=False)

How to make my script run multiple epochs?

Hi, I found that my training script always stops at the end of the first epoch.
My training script works in the manner of for epoch in range(MAX_EPOCH) instead of using a sampler. I just want to know how to make my training script keep running.
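
Trainers in this style of codebase are usually iteration-based rather than epoch-based: the batch sampler is wrapped so that it re-iterates the dataset until a target iteration count is reached. A minimal sketch of such a wrapper (an illustrative implementation, not the repository's exact code):

class IterationBasedBatchSampler:
    """Re-iterates an underlying batch sampler until num_iterations batches."""
    def __init__(self, batch_sampler, num_iterations):
        self.batch_sampler = batch_sampler
        self.num_iterations = num_iterations

    def __iter__(self):
        iteration = 0
        while iteration < self.num_iterations:
            # Restart the underlying sampler whenever it is exhausted,
            # so training spans as many epochs as the iteration budget needs.
            for batch in self.batch_sampler:
                yield batch
                iteration += 1
                if iteration >= self.num_iterations:
                    return

    def __len__(self):
        return self.num_iterations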

C1083: Cannot open include file: 'io.h': No such file or directory

(ssd) D:\ai\Anaconda3\envs\SSD\github\cocoapi\PythonAPI>python setup.py build_ext install
running build_ext
building 'pycocotools._mask' extension
D:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -ID:\ai\Anaconda3\envs\ssd\lib\site-packages\numpy\core\include -I../common -ID:\ai\Anaconda3\envs\ssd\include -ID:\ai\Anaconda3\envs\ssd\include /Tcpycocotools/_mask.c /Fobuild\temp.win-amd64-3.7\Release\pycocotools/_mask.obj
_mask.c
d:\ai\anaconda3\envs\ssd\include\pyconfig.h(59): fatal error C1083: Cannot open include file: 'io.h': No such file or directory
error: command 'D:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\amd64\cl.exe' failed with exit status 2


coco training speed problem

Hello, I trained COCO on 2 GPUs and found the speed getting slower during training (0.8s/iter for early iterations, and 1.9s/iter at the end of training). I wonder if you have encountered this problem.

Request for coco trained model

Can you provide a COCO-trained model (SSD300)? I want to use it to run the evaluation code and reproduce the results below:

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.229

Thanks a lot!

Train with Custom Dataset

Hi Li,

I've been trying to train a custom SSD but I'm running into some issues. I annotated some 1,200 images with only one class. I used RectLabel, where the output is one XML file per image. I then created the same dir structure as VOC2007 (Annotations, JPEGImages, ImageSets), saving the files trainval.txt, test.txt, val.txt and {class_name}_trainval.txt, ..., in ImageSets/Main. I then modified configs/ssd300_voc0712.yaml to take NUM_CLASSES: 2 and modified classes_name in voc_dataset.py. (I've also tried the steps you outline here.)

The dataset gets recognized, but when it goes through the DataLoader (each image, boxes, labels) I get the following error:

2019-01-15 09:58:13,478 SSD.trainer INFO: Init from base net vgg16_reducedfc.pth
2019-01-15 09:58:13,580 SSD.trainer INFO: Train dataset size: 752
2019-01-15 09:58:13,580 SSD.trainer INFO: Start training
Traceback (most recent call last):
  File "train_ssd.py", line 139, in <module>
    main()
  File "train_ssd.py", line 130, in main
    model = train(cfg, args)
  File "train_ssd.py", line 71, in train
    return do_train(cfg, model, train_loader, optimizer, scheduler, device, args)
  File "/home/ldap/mariano.metallo/03_SSD_Classifier/SSD/ssd/engine/trainer.py", line 68, in do_train
    for iteration, (images, boxes, labels) in enumerate(data_loader):
  File "/home/ldap/mariano.metallo/anaconda3/envs/SSD/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/ldap/mariano.metallo/anaconda3/envs/SSD/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
IndexError: Traceback (most recent call last):
  File "/home/ldap/mariano.metallo/anaconda3/envs/SSD/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ldap/mariano.metallo/anaconda3/envs/SSD/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ldap/mariano.metallo/03_SSD_Classifier/SSD/ssd/data/datasets/your_dataset.py", line 37, in __getitem__
    image, boxes, labels = self.transform(image, boxes, labels)
  File "/home/ldap/mariano.metallo/03_SSD_Classifier/SSD/ssd/modeling/data_preprocessing.py", line 33, in __call__
    return self.augment(img, boxes, labels)
  File "/home/ldap/mariano.metallo/03_SSD_Classifier/SSD/ssd/transforms/transforms.py", line 55, in __call__
    img, boxes, labels = t(img, boxes, labels)
  File "/home/ldap/mariano.metallo/03_SSD_Classifier/SSD/ssd/transforms/transforms.py", line 347, in __call__
    boxes[:, :2] += (int(left), int(top))
IndexError: too many indices for array

I'm running CUDA 10.

Is there any other step that I'm missing? Thank you very much!
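
For what it's worth, this particular IndexError usually means an image with zero annotated boxes reached the transforms, so boxes is a 1-D empty array and boxes[:, :2] fails. A hedged guard for a custom dataset's __getitem__ (a hypothetical helper, not the repository's code):

import numpy as np

def as_box_array(boxes):
    # Force shape (N, 4) even when the annotation list is empty,
    # so slicing like boxes[:, :2] stays valid downstream.
    return np.array(boxes, dtype=np.float32).reshape(-1, 4)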

The cfg is incomplete?

ssd300_voc0712.yaml, ssd512_voc0712.yaml, ssd300_coco_trainval35k.yaml: some of the settings referenced in this code do not exist in them:

train_transform = TrainAugmentation(cfg.INPUT.IMAGE_SIZE, cfg.INPUT.PIXEL_MEAN)
target_transform = MatchPrior(PriorBox(cfg)(), cfg.MODEL.CENTER_VARIANCE,
                              cfg.MODEL.SIZE_VARIANCE, cfg.MODEL.THRESHOLD)

No module named 'torch_extension'

When I run demo.py, I get the following error:

Traceback (most recent call last):
File "demo.py", line 9, in <module>
from ssd.modeling.predictor import Predictor
File "/home/guo/workspace/Object_Detection/SSD/Pytorch_SSD/ssd/modeling/predictor.py", line 3, in <module>
from ssd.modeling.post_processor import PostProcessor
File "/home/guo/workspace/Object_Detection/SSD/Pytorch_SSD/ssd/modeling/post_processor.py", line 3, in <module>
from ssd.utils.nms import boxes_nms
File "/home/guo/workspace/Object_Detection/SSD/Pytorch_SSD/ssd/utils/nms.py", line 1, in <module>
import torch_extension
ModuleNotFoundError: No module named 'torch_extension'
