Comments (6)
@lxtGH Yes, I agree with you. However, methods utilizing CLIP also encounter those novel classes during pre-training. Why can they be categorized as open vocabulary object detection? (I am not trying to stir controversy. I'm just genuinely curious and seeking clarification.) Thanks in advance!
Yes, CLIP itself is trained on many concepts. However, it is used only as pretrained weights for classification or as initialization.
The difference lies in fine-tuning: whether the novel labels and boxes are available to the detector.
Objects365 contains boxes and labels for the novel classes (as defined in COCO), so using it for pre-training is data leakage. Thus, this survey mainly focuses on the setting proposed by OVR-CNN [1] and ViLD [2].
In our experience, any detector pre-trained on this dataset can achieve SOTA results on COCO, surpassing any open-vocabulary or zero-shot detector. So the comparison is unfair.
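The leakage criterion above (novel classes appearing as labeled categories in the detection pre-training data) can be sketched as a simple set check. This is only an illustration: `find_leaked_classes` is a hypothetical helper, and the label lists are made-up placeholders, not the real Objects365 or OV-COCO category lists.

```python
def find_leaked_classes(pretrain_labels, novel_classes):
    """Return novel classes that also appear as labeled categories
    in the detection pre-training data (hypothetical helper)."""
    def normalize(name):
        # Compare names case-insensitively and ignore stray whitespace.
        return name.strip().lower()

    pretrain = {normalize(name) for name in pretrain_labels}
    novel = {normalize(name) for name in novel_classes}
    return sorted(novel & pretrain)


# Placeholder label lists, for illustration only (assumed, not real splits).
detection_pretrain_labels = ["person", "car", "dog", "umbrella", "cup"]
novel_classes = ["dog", "umbrella", "scissors"]

leaked = find_leaked_classes(detection_pretrain_labels, novel_classes)
print(leaked)  # ['dog', 'umbrella']
```

A non-empty result means the detector has seen boxes and labels for classes that are supposed to be novel. The check deliberately looks only at box-level labels, since, as noted above, CLIP-style image-text pre-training is treated differently.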
Hope it helps!!
References:
[1] OVR-CNN: Open-Vocabulary Object Detection Using Captions, CVPR 2021
[2] ViLD: Open-Vocabulary Object Detection via Vision and Language Knowledge Distillation, ICLR 2022
from awesome-open-vocabulary.
@JacobYuan7 Hi, great questions! We have updated GLIP and GLIP-v2 in the next draft of our paper. Personally, I do not think GLIP is strictly an open-vocabulary object detection paper. GLIP uses Objects365 for pre-training, and Objects365 contains the OV-COCO novel classes.
@lxtGH Yes, I agree with you. However, methods utilizing CLIP also encounter those novel classes during pre-training. Why can they be categorized as open vocabulary object detection? (I am not trying to stir controversy. I'm just genuinely curious and seeking clarification.) Thanks in advance!
@lxtGH I've also been considering the potential of adding a section titled 'Open Vocabulary Relation Detection'. This is an area gaining growing research interest and could add valuable insights to this work. I've even submitted a simple pull request. However, I want to disclose that my perspective might be biased since I have worked on this topic. I'd greatly appreciate your thoughts on this.
Yes, thanks for the reminder. We have added it to our internal version of this survey. This direction is very diverse, and we missed these lines of work.
@lxtGH Many thanks for the clarification and the inclusion of papers in this field.