faster-r-cnn-with-model-pretrained-on-visual-genome's People

Contributors

shilrley6

faster-r-cnn-with-model-pretrained-on-visual-genome's Issues

Problems decoding the predicted bboxes in TSV

Hello @shilrley6,
Congrats on your repo, it is quite useful.
However, I seem to have found an error when decoding the boxes from the output TSV file. The image features are decoded correctly, but the bounding boxes are not.
I was testing generate_tsv.py, and even though the predicted bboxes are correct, the data is changed at the point where it is encoded and stored.

Have you encountered this issue? Any suggestions?
In my case, I can recompute another TSV without storing the bboxes in this encoding, but recomputing it will take a lot of time.
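
One thing that may be worth checking, offered as an assumption rather than a diagnosis of this repo: if the TSV stores arrays as base64-encoded raw bytes, the dtype and shape are not saved with them, so the reader has to use the same dtype as the writer. The helper names below are hypothetical and only illustrate the round trip.

```python
import base64
import numpy as np

def encode_array(arr):
    """Serialize an array to base64 text (raw bytes only; dtype and shape are not stored)."""
    return base64.b64encode(arr.tobytes()).decode("ascii")

def decode_array(text, dtype, shape):
    """Rebuild an array from base64 text; dtype and shape must match the writer's."""
    return np.frombuffer(base64.b64decode(text), dtype=dtype).reshape(shape)

# Two example boxes in (x1, y1, x2, y2) form, stored as float32.
boxes = np.array([[10.0, 20.0, 110.0, 220.0],
                  [5.5, 6.5, 50.5, 60.5]], dtype=np.float32)
encoded = encode_array(boxes)

# Decoding with the writer's dtype recovers the boxes exactly.
same_dtype = decode_array(encoded, np.float32, (-1, 4))

# Decoding the same bytes as float64 halves the element count and
# yields meaningless values, which can look like "changed" data.
wrong_dtype = decode_array(encoded, np.float64, (-1, 4))
```

If the features decode correctly and the boxes do not, comparing the dtype of the boxes array just before encoding with the dtype used when reading is a cheap first test.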

Question about convert_data.py

Thanks for your contribution. I'm testing convert_data.py, but I don't know what the --imgid_list parameter means. I created a .txt file containing 0 1 2. The code runs successfully, but I'm still confused about what the parameter means. Also, is there a way to point generate_tsv.py at an entire image folder? At the moment I have to set image_ids manually, like [['image1', 0], ['image2', 1]].
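
A minimal sketch of building that list from a folder instead of typing it by hand. The function name, the extension filter, and the choice to number images in sorted filename order are all assumptions; whether generate_tsv.py expects full paths or bare names in the first slot would need to be checked against the script.

```python
import os

# Assumed set of image extensions; adjust to the dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def build_image_ids(image_dir):
    """Return [[image_path, image_id], ...] for every image in image_dir, sorted by name."""
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    # Ids are assigned by position in the sorted listing (an assumption).
    return [[os.path.join(image_dir, name), idx] for idx, name in enumerate(names)]
```

The result has the same [[name, id], ...] shape as the hand-written list, so it could be assigned to image_ids directly.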

undefined symbol in nms.py

Traceback (most recent call last):
  File "generate_tsv.py", line 42, in <module>
    from model.roi_layers import nms
  File "/home/caiwenjie/code/Faster-R-CNN-with-model-pretrained-on-Visual-Genome-master/lib/model/roi_layers/__init__.py", line 3, in <module>
    from .nms import nms
  File "/home/caiwenjie/code/Faster-R-CNN-with-model-pretrained-on-Visual-Genome-master/lib/model/roi_layers/nms.py", line 3, in <module>
    from model import _C
ImportError: /code/Faster-R-CNN-with-model-pretrained-on-Visual-Genome-master/lib/model/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs

High CPU Utilization

In generate_tsv.py (lines 227-229):

    im_data_pt = torch.from_numpy(im_blob)
    im_data_pt = im_data_pt.permute(0, 3, 1, 2)
    im_info_pt = torch.from_numpy(im_info_np)

should be:

    im_data_pt = torch.from_numpy(im_blob).cuda()
    im_data_pt = im_data_pt.permute(0, 3, 1, 2)
    im_info_pt = torch.from_numpy(im_info_np).cuda()

About 'classes_dir' and 'class_agnostic'

Hi there, I am trying to use your code to extract features for Flickr30k, and I noticed two arguments named 'class_agnostic' and 'classes_dir'. Do they matter when extracting features?
If so, how should I set them for Flickr30k?

KeyError during loading the model

Hi,

Thanks for your great work.
I hit the following problem when loading the model downloaded from this repo.

load checkpoint data/pretrained_model/faster_rcnn_res101_vg.pth
Traceback (most recent call last):
  File "generate_tsv.py", line 455, in <module>
    generate_tsv(args.outfile, image_ids, args)
  File "generate_tsv.py", line 428, in generate_tsv
    classes, fasterRCNN = load_model(args)
  File "generate_tsv.py", line 402, in load_model
    fasterRCNN.load_state_dict(checkpoint['model'])
KeyError: 'model'

Could you please offer some advice? Thanks in advance.
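
The KeyError says the loaded object has no 'model' key. One possible explanation, stated as an assumption and not confirmed for this checkpoint, is that the file holds the weights mapping directly instead of wrapping it in a dict. A hypothetical helper that tolerates both layouts could look like this:

```python
def extract_state_dict(checkpoint):
    """Return the weights mapping whether or not it is wrapped under a 'model' key."""
    if isinstance(checkpoint, dict) and "model" in checkpoint:
        return checkpoint["model"]
    return checkpoint

# Stand-ins for what torch.load(...) might return in each case.
wrapped = {"model": {"layer.weight": [0.1, 0.2]}, "epoch": 20}
bare = {"layer.weight": [0.1, 0.2]}

wrapped_weights = extract_state_dict(wrapped)
bare_weights = extract_state_dict(bare)
```

In load_model this would be used as fasterRCNN.load_state_dict(extract_state_dict(checkpoint)). Printing checkpoint.keys() first shows which layout the downloaded file actually has.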

the value of the generated numpy file is 0

Thank you so much for porting this Caffe-based model to PyTorch.
I am trying to generate a numpy file for my own data; however, the values in the generated file are all 0.
Could you explain why this is happening?

Many thanks!

Segmentation fault(core dumped)

When I run CUDA_VISIBLE_DEVICES=1 python generate_tsv.py --net res101 --dataset vg --out test.tsv --cuda
it prints:

  load checkpoint load_dir/faster_rcnn_res101_vg.pth
  load model successfully!
  load model load_dir/faster_rcnn_res101_vg.pth
  Segmentation fault (core dumped)

Why does this happen, and how can I solve it?

CUDA 10.0
Python 3.6
PyTorch 1.0
gcc 5.4

Adding an OpenSource License

Hi @shilrley6! I was hoping to make use of your code in another project; however, this repository does not have a license file to tell us how we may use it. Would it be possible to add an open source license, for example Apache 2.0? Alternatively, do parts of this code derive from an existing OSS project whose license would be inherited?

attributes label

Thanks for your excellent work and for sharing it. I see that the demo only uses the labels in object_vocat.txt. How can I obtain the attribute labels? I changed object_vocat.txt to attributes_vocat.txt, but it did not work. Do I need a model pretrained on the attributes? If you have one, could you share it?

Extract model weights for other tasks

Hello!

Great work! Was this model trained for classification? I'm not sure, but if it was trained for a specific task, it should contain linear and pooling layers that can be removed if I want to apply the model to another task.

So, could you please provide some information about using this model's weights for another task? I would like to use it for image captioning, so it would be great to load only the weights, without any task-specific layers.

Which layer of output is the 2048-dim embedding

Hi authors, thank you for the great work! I am trying to extract region features using Faster R-CNN (ResNet-101) trained on the VG dataset. I was initially using another repo that required a Caffe installation, but couldn't set it up after many days. So I am really glad to have come across your repo.

The only difference is that the Caffe repo extracts features at the "pool5_flat" layer. Can I check which layer of the ResNet-101 the 2048-dim embeddings come from?

Thanks!

TypeError: len() of a 0-d tensor

Traceback (most recent call last):
  File "generate_tsv.py", line 476, in <module>
    generate_tsv(args.outfile, image_ids, args)
  File "generate_tsv.py", line 459, in generate_tsv
    writer.writerow(get_detections_from_im(fasterRCNN, classes, im_file, image_id, args))
  File "generate_tsv.py", line 347, in get_detections_from_im
    if len(keep_boxes) < MIN_BOXES:
  File "anaconda2/envs/py3.6pytorch1.0/lib/python3.6/site-packages/torch/tensor.py", line 411, in __len__
    raise TypeError("len() of a 0-d tensor")
TypeError: len() of a 0-d tensor

When keep_boxes has only one value that is > 0, torch.squeeze(torch.nonzero(keep_boxes)) produces a tensor with dim() == 0, so len() fails.

Code to reproduce:

    import torch
    keep_boxes = torch.tensor([0, 1, 0, 0])
    len(torch.squeeze(torch.nonzero(keep_boxes)))
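
One possible fix, sketched here as a suggestion and not as the repo's actual patch, is to flatten the index tensor instead of squeezing every dimension, so a single match still gives a 1-D tensor:

```python
import torch

keep_boxes = torch.tensor([0, 1, 0, 0])

# torch.nonzero returns shape (N, 1) for a 1-D input; squeezing every
# dimension turns the N == 1 case into a 0-d tensor.
squeezed_all = torch.squeeze(torch.nonzero(keep_boxes))

# Flattening keeps the result 1-D, so len() works when N == 1.
flat = torch.nonzero(keep_boxes).reshape(-1)
```

With flat in place of the squeezed tensor, the len(keep_boxes) < MIN_BOXES comparison from the traceback can run without raising.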

How to use multiple gpus?

I set the '--mGPUs' argument, but the model still used only one GPU.
I checked generate_tsv.py and found that '--mGPUs' is parsed but never used anywhere else.
Is anyone else facing the same problem?
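
A workaround that does not depend on '--mGPUs', offered as an assumption about how the work could be divided: split the image list into one shard per GPU and run a separate generate_tsv.py process for each shard, pinned with CUDA_VISIBLE_DEVICES and writing its own output file. The shard helper below is hypothetical, and the script would still need some way to accept a subset of image_ids.

```python
def shard(items, num_shards):
    """Split items into num_shards round-robin shards of near-equal size."""
    return [items[i::num_shards] for i in range(num_shards)]

# Example image list in the [[name, id], ...] form used by generate_tsv.py.
image_ids = [["image%d" % i, i] for i in range(5)]

# One shard per GPU; here two GPUs are assumed.
shards = shard(image_ids, 2)
```

Each process would then handle shards[k] on GPU k, and the per-GPU TSV files can be concatenated afterwards.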
