da-faster-rcnn-pytorch's People

Contributors

tiancity-nju

da-faster-rcnn-pytorch's Issues

RuntimeError: size mismatch, m1: [128 x 2048], m2: [4096 x 1024] at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCTensorMathBlas.cu:249

Traceback (most recent call last):
  File "da_trainval_net.py", line 398, in <module>
    tgt_im_data, tgt_im_info, tgt_gt_boxes, tgt_num_boxes, tgt_need_backprop)
  File "/home/youlin/local/anaconda3/envs/da-faster-rcnn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/youlin/pytorch-proj/da-faster-rcnn-PyTorch/lib/model/da_faster_rcnn/faster_rcnn.py", line 187, in forward
    instance_sigmoid, same_size_label = self.RCNN_instanceDA(pooled_feat, need_backprop)
  File "/home/youlin/local/anaconda3/envs/da-faster-rcnn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/youlin/pytorch-proj/da-faster-rcnn-PyTorch/lib/model/da_faster_rcnn/DA.py", line 69, in forward
    x=self.dc_drop1(self.dc_relu1(self.dc_ip1(x)))
  File "/home/youlin/local/anaconda3/envs/da-faster-rcnn/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/youlin/local/anaconda3/envs/da-faster-rcnn/lib/python3.6/site-packages/torch/nn/functional.py", line 992, in linear
    return torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [128 x 2048], m2: [4096 x 1024] at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCTensorMathBlas.cu:249

Hi, I met this problem when using the cityscape dataset.
The command is:
CUDA_VISIBLE_DEVICES=0 python da_trainval_net.py --dataset cityscape --net vgg16 --bs 1 --lr 2e-3 --lr_decay_step 6 --cuda --net res101

Can anyone help me? Thanks!
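
A note on the likely cause, for anyone hitting the same trace: the command above passes --net twice (vgg16, then res101), and argparse keeps the last value, so a res101 backbone (2048-d pooled features) ends up feeding a domain classifier whose first FC layer was sized for vgg16's 4096-d features. A minimal sketch of the mismatch, with the layer size read off the trace (m2: [4096 x 1024] implies nn.Linear(4096, 1024)):

    import torch
    import torch.nn as nn

    dc_ip1 = nn.Linear(4096, 1024)        # first FC of the instance-level domain classifier, sized for vgg16
    pooled_feat = torch.randn(128, 2048)  # pooled feature width produced by a res101 backbone
    dc_ip1(pooled_feat)                   # RuntimeError: size mismatch, m1: [128 x 2048], m2: [4096 x 1024]

Dropping the stray "--net res101" (or consistently using one backbone) should make the shapes agree.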

cannot reproduce the results

I have downloaded the dataset you uploaded and trained both the baseline model and the transfer model; however, the model with domain adaptation only reached an AP of 0.2901, while the baseline model obtains an AP of 0.4622. This is quite different from the results in this repo ("Our model could arrive at mAP=30.71% in the target domain, which is higher than the baseline mAP=24.26%"). What might be the reasons?

problem about dataparallel

When I set batchsize=1 everything is fine, but when I switch to batchsize=2 I get this error:
Traceback (most recent call last):
  File "da_trainval_net.py", line 393, in <module>
    tgt_im_data, tgt_im_info, tgt_gt_boxes, tgt_num_boxes, tgt_need_backprop)
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 115, in forward
    return self.gather(outputs, self.output_device)
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 127, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    return gather_map(outputs)
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 54, in forward
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
  File "/home/wzy/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 54, in <lambda>
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
RuntimeError: dimension specified as 0 but tensor has no dimensions
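
For context, the failing frame is DataParallel's gather, which concatenates per-GPU outputs along dim 0 via tensor.size(ctx.dim); a 0-dim scalar loss, which is what reductions return in PyTorch 0.4, has no dim 0, hence the error with more than one GPU/batch element per replica. A sketch of the usual workaround, assuming the model returns scalar losses: give each loss a batch dimension before returning it.

    import torch

    loss = torch.tensor(0.5)   # 0-dim scalar, as losses come out of reductions in PyTorch 0.4
    # loss.size(0) here would raise: dimension specified as 0 but tensor has no dimensions
    loss = loss.unsqueeze(0)   # shape (1,): gather can now concatenate the copies from each GPU
    total = loss.mean()        # reduce back to a scalar on the output device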

Environment setting

Hello, can this project be run in a Windows environment? Our lab servers all run Windows, and I really want to run this project. If possible, could you tell me how to set up the environment? I would be very grateful for a reply.

Train on pascal_voc dataset

I would like to ask how I should train on my own Pascal VOC dataset; as far as I can see, the code only does DA processing on the Cityscapes dataset. Many thanks!

ModuleNotFoundError: No module named 'model.utils.cython_bbox'

Traceback (most recent call last):
  File "E:\domain_pro\da-faster-rcnn-PyTorch-master\trainval_net.py", line 27, in <module>
    from roi_data_layer.roidb import combined_roidb
  File "E:\domain_pro\da-faster-rcnn-PyTorch-master\lib\roi_data_layer\roidb.py", line 9, in <module>
    from datasets.factory import get_imdb
  File "E:\domain_pro\da-faster-rcnn-PyTorch-master\lib\datasets\factory.py", line 14, in <module>
    from datasets.pascal_voc import pascal_voc
  File "E:\domain_pro\da-faster-rcnn-PyTorch-master\lib\datasets\pascal_voc.py", line 23, in <module>
    from .imdb import imdb
  File "E:\domain_pro\da-faster-rcnn-PyTorch-master\lib\datasets\imdb.py", line 14, in <module>
    from model.utils.cython_bbox import bbox_overlaps
ModuleNotFoundError: No module named 'model.utils.cython_bbox'

Questions about experimental results

Hi, I have some questions regarding the performance of your code.
The performance reported in the original paper (using the Cityscapes training set as the source domain and the Foggy Cityscapes validation set as the target domain) is lower than what your code achieves.

original:
[image: results table from the original paper]

yours:
final mAP=30.71% against a baseline mAP=24.26%

I wonder why. Did you add some additional tricks, or is the implementation of the original paper buggy? Also, can your result be reproduced stably?
Wonderful work, by the way. Thanks a lot!

number of test images

Hi, the provided dataset has 347 images for testing, while the paper says that there are 500 images for testing. I wonder why there is such a difference. Thanks.

What is the difference between proposals and da_proposals in the forward method of box_head.py?

Hello,

I was wondering what the difference is between proposals and da_proposals in the forward method of box_head.py.

When debugging the code base, they appear to be identical:

[screenshot: debugger view of proposals and da_proposals]

Is there any difference between the two?

In the box_head.py forward method, proposals are passed through the feature extractor and then the predictor to generate the class_logits and box_regression needed to get loss_classifier and loss_box_reg from the loss_evaluator(); then da_proposals are used in the same way, running the loss_evaluator once again to obtain da_ins_labels.

        if self.training:
            # Faster R-CNN subsamples during training the proposals with a fixed
            # positive / negative ratio
            with torch.no_grad():
                proposals = self.loss_evaluator.subsample(proposals, targets)

        # extract features that will be fed to the final classifier. The
        # feature_extractor generally corresponds to the pooler + heads
        x = self.feature_extractor(features, proposals)
        # final classifier that converts the features into predictions
        class_logits, box_regression = self.predictor(x)

        if not self.training:
            result = self.post_processor((class_logits, box_regression), proposals)
            return x, result, {}, x, None

        loss_classifier, loss_box_reg, _ = self.loss_evaluator(
            [class_logits], [box_regression]
        )

        if self.training:
            with torch.no_grad():
                da_proposals = self.loss_evaluator.subsample_for_da(proposals, targets)

        da_ins_feas = self.feature_extractor(features, da_proposals)
        class_logits, box_regression = self.predictor(da_ins_feas)
        lc2, lbr2, da_ins_labels = self.loss_evaluator(
            [class_logits], [box_regression]
        )

Unless I'm missing something, could you not call the loss evaluator once and then return the da_ins_labels like this, given they are identical anyway?

        loss_classifier, loss_box_reg, da_ins_labels = self.loss_evaluator(
            [class_logits], [box_regression]
        )

I feel I could be missing something here; it would be great if you could advise. Thanks! 😄
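
One quick way to test the "identical" observation empirically is the debugging sketch below (proposals and da_proposals are the lists of maskrcnn-benchmark BoxList objects already in scope inside forward); if the assertion ever fires, subsample and subsample_for_da really do draw different samples, which would answer the question:

    import torch

    # paste just after da_proposals is computed in box_head.py's forward()
    for p, dp in zip(proposals, da_proposals):
        assert torch.equal(p.bbox, dp.bbox), "proposal boxes differ"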

A question about details

Hello, looking at your code, it seems the target domain makes use of ground-truth information, while the original paper says no ground truth is used for the target. Is the implementation wrong, or is my understanding wrong? Please advise!

Dataset Structure

I downloaded the dataset from https://www.cityscapes-dataset.com/. The image zip is leftImg8bit_trainvaltest.zip and the annotation file is gtFine_trainvaltest.zip. However, the annotations are in .json format; how can I convert the JSON annotations to XML annotations before using the code?
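
For reference, a minimal conversion sketch, assuming the standard gtFine polygon layout ({"imgHeight", "imgWidth", "objects": [{"label", "polygon": [[x, y], ...]}]}): each polygon is collapsed to its bounding box and written out as VOC-style XML. The label names would still need mapping to whatever class list the detection code expects.

    import json
    import xml.etree.ElementTree as ET

    def gtfine_json_to_voc_xml(json_path, xml_path):
        """Collapse each gtFine polygon to its bounding box and emit VOC XML."""
        with open(json_path) as f:
            ann = json.load(f)
        root = ET.Element("annotation")
        size = ET.SubElement(root, "size")
        ET.SubElement(size, "width").text = str(ann["imgWidth"])
        ET.SubElement(size, "height").text = str(ann["imgHeight"])
        for obj in ann["objects"]:
            xs = [p[0] for p in obj["polygon"]]
            ys = [p[1] for p in obj["polygon"]]
            node = ET.SubElement(root, "object")
            ET.SubElement(node, "name").text = obj["label"]
            box = ET.SubElement(node, "bndbox")
            ET.SubElement(box, "xmin").text = str(int(min(xs)))
            ET.SubElement(box, "ymin").text = str(int(min(ys)))
            ET.SubElement(box, "xmax").text = str(int(max(xs)))
            ET.SubElement(box, "ymax").text = str(int(max(ys)))
        ET.ElementTree(root).write(xml_path)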

RuntimeError: reduce failed to synchronize: device-side assert triggered

I am trying to perform training from the WIDERFACE dataset to the FDDB dataset.

Traceback (most recent call last):
  File "da_trainval_net.py", line 433, in <module>
    tgt_im_data, tgt_im_info, tgt_gt_boxes, tgt_num_boxes, tgt_need_backprop)
  File "/export/livia/home/vision/bflorance/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/export/livia/home/vision/bflorance/da-faster-rcnn-PyTorch-wider/lib/model/da_faster_rcnn/faster_rcnn.py", line 190, in forward
    DA_ins_loss_cls = instance_loss(instance_sigmoid, same_size_label)
  File "/export/livia/home/vision/bflorance/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/export/livia/home/vision/bflorance/anaconda3/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 433, in forward
    reduce=self.reduce)
  File "/export/livia/home/vision/bflorance/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1483, in binary_cross_entropy
    return torch._C._nn.binary_cross_entropy(input, target, weight, size_average, reduce)
RuntimeError: reduce failed to synchronize: device-side assert triggered

Upon debugging, I found that the input tensor for BCE loss should contain values between 0 and 1; however, the instance_sigmoid variable in faster_rcnn.py contains NaN for some data. (https://github.com/tiancity-NJU/da-faster-rcnn-PyTorch/blob/master/lib/model/da_faster_rcnn/faster_rcnn.py#L189)

The reason for the issue is that the base_feat tensor has all of its values equal to zero. (https://github.com/tiancity-NJU/da-faster-rcnn-PyTorch/blob/master/lib/model/da_faster_rcnn/faster_rcnn.py#L58)

Any idea on why this might have happened and where it might have gone wrong?
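
A debugging sketch that can help narrow this down (the assumption is that the checks are pasted into faster_rcnn.py's forward; torch.isnan is available from PyTorch 0.4): fail fast at the first NaN-producing tensor so the offending batch and layer are visible before BCE hits the device-side assert. NaNs appearing right after the backbone often point to divergence, e.g. a learning rate tuned for cityscape being too high for a new dataset pair, so lowering --lr or clipping gradients is worth trying.

    import torch

    def assert_finite(name, t):
        # stop at the first tensor containing NaN so its producer is identified
        if torch.isnan(t).any():
            raise RuntimeError("NaN detected in %s" % name)

    # assert_finite("base_feat", base_feat)
    # assert_finite("instance_sigmoid", instance_sigmoid)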

I intend to reproduce SIM10k to cityscape, but the car AP gets lower and lower as the number of training iterations increases.

I only trained the car category on the SIM10k dataset and resized the SIM10k images to the same size as the target domain, then tested only the car AP in the target domain. The results over the several epochs I trained are as follows:
epoch1: 0.4708
epoch2: 0.4444
epoch3: 0.3973
epoch4: 0.3398

how to run batch_demo.py?

I have successfully run test.py and da_trainval_net.py, but failed to run batch_demo.py, and I don't know the reason. I met this error:

RuntimeError: Error(s) in loading state_dict for vgg16: Unexpected key(s) in state_dict: "RCNN_imageDA.Conv1.weight", "RCNN_imageDA.Conv2.weight", "RCNN_instanceDA.dc_ip1.weight", "RCNN_instanceDA.dc_ip1.bias", "RCNN_instanceDA.dc_ip2.weight", "RCNN_instanceDA.dc_ip2.bias", "RCNN_instanceDA.clssifer.weight", "RCNN_instanceDA.clssifer.bias". While copying the parameter named "RCNN_cls_score.weight", whose dimensions in the model are torch.Size([21, 4096]) and whose dimensions in the checkpoint are torch.Size([9, 4096]). While copying the parameter named "RCNN_cls_score.bias", whose dimensions in the model are torch.Size([21]) and whose dimensions in the checkpoint are torch.Size([9]). While copying the parameter named "RCNN_bbox_pred.weight", whose dimensions in the model are torch.Size([84, 4096]) and whose dimensions in the checkpoint are torch.Size([36, 4096]). While copying the parameter named "RCNN_bbox_pred.bias", whose dimensions in the model are torch.Size([84]) and whose dimensions in the checkpoint are torch.Size([36]).

How can I observe the transfer phenomenon? Can someone tell me?
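
What the message implies: the checkpoint comes from the DA model trained on cityscape (8 foreground classes + background, hence the 9 and 36 output units, plus the unexpected RCNN_imageDA/RCNN_instanceDA keys), while batch_demo.py builds a plain vgg16 for pascal_voc (21 and 84 output units). A sketch of a fix under that reading; the import path is hypothetical (mirroring lib/model/da_faster_rcnn/ from the tracebacks), the class list follows the DA-Faster paper's cityscape setup, and the construction calls follow the upstream faster-rcnn.pytorch API:

    import torch
    # hypothetical DA variant; the plain model.faster_rcnn.vgg16 lacks the DA modules
    from model.da_faster_rcnn.vgg16 import vgg16

    # 8 cityscape foreground classes + background = 9 outputs, matching the checkpoint
    cityscape_classes = ('__background__', 'person', 'rider', 'car', 'truck',
                         'bus', 'train', 'motorcycle', 'bicycle')

    checkpoint = torch.load("path/to/da_checkpoint.pth")   # hypothetical path
    fasterRCNN = vgg16(cityscape_classes, pretrained=False, class_agnostic=False)
    fasterRCNN.create_architecture()
    fasterRCNN.load_state_dict(checkpoint['model'])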

cudaCheckError() failed :

cudaCheckError() failed : no kernel image is available for execution on the device
Why did I get this problem? Please help, thank you!
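
This message usually means the compiled CUDA kernels under lib/ target a different compute capability than the installed GPU. A quick check with the standard PyTorch API shows which sm_XX the build should target; matching the -arch flag in the build script before recompiling is then the usual fix (where exactly that flag lives in this repo's scripts is an assumption):

    import torch

    # prints e.g. (6, 1), meaning the CUDA extensions should be built with -arch=sm_61
    print(torch.cuda.get_device_capability(0))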

Questions about RPN for target

I found that the source data uses the same dataloader as the target data, so is the gt bbox of the target domain involved in ROI generation when proposals are generated during the RPN phase? In other words, what is the value of tgt_gt_boxes? Is it None?

hello friend

When I try training, there is a PermissionError:

new: /data/ztc/adaptation/Experiment/da_model/vgg16/cityscape
Traceback (most recent call last):
  File "da_trainval_net.py", line 229, in <module>
    os.makedirs(output_dir)
  File "/home/lk/anaconda3/envs/envp/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/home/lk/anaconda3/envs/envp/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/home/lk/anaconda3/envs/envp/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  [Previous line repeated 3 more times]
  File "/home/lk/anaconda3/envs/envp/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data'

I have created the directory "cityscape" in da-faster-rcnn-PyTorch-master/data and placed vgg16_caffe.pth in da-faster-rcnn-PyTorch-master.
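
For what it's worth, the trace shows the output path being created is the absolute '/data/ztc/adaptation/Experiment/da_model/vgg16/cityscape', i.e. the save directory points at another user's path rather than the repo's data/ folder, so makedirs fails at '/data'. A sketch of the repo-local alternative; whether the script exposes an upstream-style --save_dir flag or the default has to be edited in da_trainval_net.py is an assumption:

    import os

    # build the model output directory inside the repository instead of /data
    output_dir = os.path.join(os.getcwd(), "data", "models", "vgg16", "cityscape")
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)   # created under a writable, repo-local path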

Running Question

Hello,
When I use PyTorch 0.4.0, there is an error:

AttributeError: module 'torch.nn' has no attribute 'ModuleDict'
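
For context, nn.ModuleDict only appeared in PyTorch 0.4.1, so it does not exist in 0.4.0; upgrading to 0.4.1 or later is the simplest fix. If staying on 0.4.0 is necessary, a minimal stand-in can be sketched with add_module so submodule parameters are still registered (the class name here is illustrative):

    import torch.nn as nn

    class ModuleDictCompat(nn.Module):
        """Dict-style container for submodules on PyTorch < 0.4.1."""
        def __init__(self, modules):
            super(ModuleDictCompat, self).__init__()
            for name, module in modules.items():
                self.add_module(name, module)   # registers parameters properly

        def __getitem__(self, name):
            return getattr(self, name)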
