henghuiding / rela

[CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation

Home Page: https://henghuiding.github.io/GRES/

License: MIT License

Python 74.84% Shell 0.26% C++ 2.48% Cuda 22.42%
cvpr2023 multimodal-learning referring-expression-comprehension referring-expression-segmentation referring-image-segmentation vision-language-transformer

rela's Introduction

Hi there 👋

  • 🔭 Researcher working on Computer Vision and Artificial Intelligence
  • 🌎 Shanghai, China

Links:

Website https://henghuiding.github.io/

Google Scholar https://henghuiding.github.io/

rela's Issues

Question regarding Relationship Modeling

[Image: Screenshot from 2023-10-26 11-39-31]

According to this paragraph, the region-based queries are supervised by a minimap down-sampled from the ground truth. If I understand correctly, all the queries then receive the same supervision. If so, how can these queries learn to correspond to different regions? Wouldn't they all learn the same thing and correspond to the same region?

Can you kindly explain more about this?
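
To make the question concrete, here is a minimal sketch of one possible reading: the ground-truth mask is down-sampled to a P×P minimap by block averaging, and query i is paired with minimap cell i. That pairing is an assumption on my side, not something confirmed in this thread, and the name `make_minimap` and the grid size are hypothetical. Under this reading the targets would differ from query to query.

```python
import numpy as np

def make_minimap(gt_mask: np.ndarray, p: int) -> np.ndarray:
    """Down-sample a binary H x W mask to a p x p minimap by block averaging."""
    h, w = gt_mask.shape
    assert h % p == 0 and w % p == 0, "sketch assumes H and W are divisible by p"
    # Split the mask into p x p blocks and average inside each block.
    blocks = gt_mask.reshape(p, h // p, p, w // p).astype(np.float32)
    return blocks.mean(axis=(1, 3))

gt = np.zeros((8, 8), dtype=np.uint8)
gt[0:4, 0:4] = 1  # the target occupies the top-left quadrant
minimap = make_minimap(gt, p=2)

# One scalar target per query, IF query i is paired with cell i (assumption).
per_query_targets = minimap.reshape(-1)
```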

Can you provide the code for Visualizing Model Results?

Thanks a lot for your response to my previous question about the evaluation results.
I was wondering if you have any code available for the visualization.
If so, could you kindly point me in the right direction or provide a link to the repository?
It would be helpful for my research.
Thanks in advance.
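
For reference, the kind of visualization I have in mind is a simple mask overlay. The sketch below is not the authors' code. It only assumes an RGB image as a uint8 array and a binary predicted mask of the same height and width; the function name `overlay_mask`, the color, and `alpha` are hypothetical choices.

```python
import numpy as np

def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend `color` into `image` (H x W x 3, uint8) wherever `mask` is true."""
    out = image.astype(np.float32).copy()
    m = mask.astype(bool)
    # Alpha-blend only the masked pixels; the rest of the image is untouched.
    out[m] = (1 - alpha) * out[m] + alpha * np.asarray(color, dtype=np.float32)
    return out.round().astype(np.uint8)

# Tiny synthetic example: black image, mask covering the central 2 x 2 pixels.
image = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
vis = overlay_mask(image, mask)
```

The resulting array could then be saved with any image library, for example `PIL.Image.fromarray(vis).save(...)`.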

About the number of images in the validation set

Hi, I noticed that you updated the dataset on August 29th, but why are there more samples in the validation set in the new dataset?

And which version of the dataset was used to achieve the results in your paper?

Question on evaluation result

I ran the provided inference code but got evaluation results that differ from those in the paper:
My gIoU is 66.3407 and my cIoU is 63.0991, but in the paper they are 63.60 and 62.42, respectively.

Below are my command, file directory, and output. Is there anything wrong? Thank you.

!python train_net.py \
--config-file configs/referring_swin_base.yaml \
--num-gpus 1 --dist-url auto --eval-only \
MODEL.WEIGHTS "/content/ReLA/gres_swin_base.pth" \
OUTPUT_DIR "/content/ReLA"
[image]
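
For context, this is how I understand the two metrics; it is a minimal sketch, not the repository's evaluation code, and details such as how no-target samples enter cIoU may differ there. gIoU averages the per-sample IoU, counting a no-target sample as 1 when the prediction is also empty and 0 otherwise. cIoU divides the accumulated intersection by the accumulated union. The function name `giou_ciou` is hypothetical.

```python
import numpy as np

def giou_ciou(preds, gts):
    """Return (gIoU, cIoU) for two equally long lists of binary masks."""
    per_sample, inter_sum, union_sum = [], 0, 0
    for pred, gt in zip(preds, gts):
        pred, gt = pred.astype(bool), gt.astype(bool)
        inter = int(np.logical_and(pred, gt).sum())
        union = int(np.logical_or(pred, gt).sum())
        if union == 0:
            # No-target sample predicted as empty: count its IoU as 1.
            per_sample.append(1.0)
        else:
            # Covers normal samples and wrong no-target predictions (IoU 0).
            per_sample.append(inter / union)
        inter_sum += inter
        union_sum += union
    giou = float(np.mean(per_sample))
    ciou = inter_sum / union_sum if union_sum > 0 else 0.0
    return giou, ciou

# Two toy samples: one partial overlap, one correctly predicted no-target case.
preds = [np.array([[1, 1], [0, 0]]), np.zeros((2, 2), dtype=int)]
gts = [np.array([[1, 0], [0, 0]]), np.zeros((2, 2), dtype=int)]
g, c = giou_ciou(preds, gts)
```

Because gIoU is a per-sample mean and cIoU is pooled over pixels, the two numbers can move in different directions when the set of evaluated samples changes.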

Problems about RIA and RLA

I looked at your code and the code is clear, but I didn't find the RIA and RLA parts. Haven't you released this part of the code yet?

Question about Lang attn in RLA module?

Hi,
I have a question related to RLA module.

```python
lang_feat_att = self.lang_proj(lang_feat_att)
lang_feat_att = self.RLA_lang_att(output, lang_feat_att.permute(1,0,2)) * F.sigmoid(self.lang_weight)
output = output + lang_feat_att * self.rla_weight
```

It seems that RLA_lang_att does not contribute much. I tried removing these lines of code and the results stayed the same.
Moreover, with self.rla_weight=0.1 and the module used only in the first layer, lang_feat_att may not affect the output. However, in the paper, I saw that it improves performance by ~1%. Is there a mistake, or have I misunderstood something?
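
To quantify the concern: in the quoted lines, the language-attention term is multiplied by `sigmoid(lang_weight) * rla_weight`. The sketch below evaluates that factor for a few values, assuming `lang_weight` is a single scalar parameter (I have not verified this) and using hypothetical values for it. Since the sigmoid is below 1, the factor stays below 0.1 when `rla_weight` is 0.1.

```python
import math

def effective_scale(lang_weight: float, rla_weight: float = 0.1) -> float:
    """Factor multiplying the language-attention term in the quoted lines."""
    sigmoid = 1.0 / (1.0 + math.exp(-lang_weight))
    return sigmoid * rla_weight

# Hypothetical values for the learned scalar; the real value is unknown to me.
scales = {w: effective_scale(w) for w in (-2.0, 0.0, 2.0)}
```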

Question about finetune on custom dataset

Thanks for your great work! I wonder how to fine-tune on a custom dataset. Section 3.2 says, "We developed an online annotation tool to find images, select instances, write expressions, and verify the results". Will the tool be released?

Can you provide the configuration mentioned in the article?

Hello, sir.
We have observed that the current configuration does not match the article, although the performance of the provided .pth model matches the numbers reported in the article. Can you provide the original configuration, or report the full performance for the current configuration?

About the training time

According to your supplementary file, the model is trained for 150,000 iterations with a batch size of 24 on four 32G V100 GPUs.
How long does one full training run take?

Questions about training logs

Hello, I am very interested in your work, but I encountered some problems when reproducing it. Could you provide the training log of the model that uses ResNet-50 as the image encoder?

model

Hi~
What should "MODEL.WEIGHTS" be set to in the training process?

Questions about train dataset and train metric?

Hello, thanks for your contribution. The training dataset set in the config YAML file is grefcoco_unc_train, and the article says gRefCOCO inherits some single-target expressions from RefCOCO. I notice that the dataset code also registers a dataset called grefcoco_unc_train_full. So my first question is whether I should change the training dataset in the config file.

In addition, I trained the code and the best gIoU is about 53, which is lower than the number in the article. I trained on four 2080 Ti GPUs with the batch size set to 8, max_iter set to 900000, and the learning-rate steps multiplied by 6. I would be grateful for any solution or advice. Thank you.
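
For comparison, here is what the commonly used linear scaling rule would give when shrinking the batch size. This is a general heuristic, not something the authors have confirmed for this repository. The 150,000 iterations and batch size of 24 come from the supplementary file; the base learning rate and decay step below are hypothetical placeholders, as is the function name `rescale_schedule`.

```python
def rescale_schedule(base_lr, base_max_iter, base_steps, base_batch, new_batch):
    """Linear scaling rule: a batch k times smaller gets lr / k and k times more iterations."""
    k = base_batch / new_batch
    return {
        "BASE_LR": base_lr / k,
        "MAX_ITER": int(round(base_max_iter * k)),
        "STEPS": tuple(int(round(s * k)) for s in base_steps),
    }

schedule = rescale_schedule(
    base_lr=1e-4,           # hypothetical placeholder, not taken from the config
    base_max_iter=150_000,  # from the supplementary file
    base_steps=(100_000,),  # hypothetical placeholder
    base_batch=24,          # from the supplementary file
    new_batch=8,
)
```

Under this rule a batch size of 8 corresponds to a factor of 3, not 6, on the iteration count and the decay steps, with the learning rate divided by the same factor.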

About Batch_Size.

According to your supplementary file, the model is trained for 150,000 iterations with a batch size of 24 on four 32G V100 GPUs.
Do you mean that IMS_PER_BATCH is 24, or that IMS_PER_BATCH is 6 and the total batch size is 24?

Thanks.
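
For what it's worth, in stock detectron2 the `SOLVER.IMS_PER_BATCH` key is the total number of images per iteration across all GPUs. If this repository follows that convention (an assumption on my part), the arithmetic would be as sketched below; the helper name `images_per_gpu` is hypothetical.

```python
def images_per_gpu(ims_per_batch: int, num_gpus: int) -> int:
    """Per-GPU batch size when IMS_PER_BATCH is the total across all GPUs."""
    if ims_per_batch % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the number of GPUs")
    return ims_per_batch // num_gpus

# Total batch size 24 on four GPUs, as stated in the supplementary file.
per_gpu = images_per_gpu(24, 4)
```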

Question on data split

I notice that grefs(unc).json contains only 18495 images, which differs from the 19994 reported in the paper. Are testA/B perhaps not included in this JSON file? Thank you.
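
This is roughly how I counted, shown here on a tiny in-memory stand-in rather than the real file. The sketch assumes the JSON is a list of records that each carry `image_id` and `split` keys, as in RefCOCO-style annotation files; that layout is an assumption, and the records below are made up.

```python
import json
from collections import defaultdict

def images_per_split(refs):
    """Map each split name to the number of distinct image ids it contains."""
    ids = defaultdict(set)
    for ref in refs:
        ids[ref["split"]].add(ref["image_id"])
    return {split: len(image_ids) for split, image_ids in ids.items()}

# Hypothetical records standing in for the parsed contents of the JSON file.
refs = json.loads(
    '[{"image_id": 1, "split": "train"},'
    ' {"image_id": 1, "split": "train"},'
    ' {"image_id": 2, "split": "train"},'
    ' {"image_id": 3, "split": "val"}]'
)
counts = images_per_split(refs)
total_images = len({ref["image_id"] for ref in refs})
```

With the real file, `refs` would come from `json.load` on an open file handle instead of the inline string.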
