rohit901 / cooperative-foundational-models

Stars: 45 · Watchers: 6 · Forks: 2 · Size: 6.46 MB

Official code for our paper "Enhancing Novel Object Detection via Cooperative Foundational Models"

Home Page: https://rohit901.github.io/coop-foundation-models/

License: MIT License

Python 100.00%
novel-objects object-detection open-set-object-detection open-vocabulary-detection zero-shot-object-detection computer-vision deep-learning pytorch

cooperative-foundational-models's People

Contributors

rohit901

Forkers

jie311 lyndonchan

cooperative-foundational-models's Issues

Question for open vocab classes?

In this particular section of the paper, "Synonym Averaged Embedding Generator (SAEG)", the text reads:

"To obtain the text embeddings corresponding to each class c ∈ C, we define T as set of all prompt templates, S_i as set of all synonyms for the i-th class C_i in the dataset and C as set of all classes, {C_1, C_2, ..., C_|C|}."

I would like to understand more about the set of classes C. Which set are you considering here to label the unknown boxes?
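For readers following the quoted definition, here is a minimal sketch of how a synonym-averaged text embedding could be computed from T (prompt templates) and S_i (synonyms of class C_i). It is an illustration under stated assumptions, not the authors' implementation: the quoted passage does not give the formula, so plain averaging over every (template, synonym) pair followed by L2 normalization is assumed, and `toy_text_encoder`, `l2_normalize`, and `synonym_averaged_embeddings` are hypothetical names. The toy encoder only stands in for a real text encoder and produces meaningless vectors.

```python
import hashlib
import math
from typing import Callable, Dict, List

Vector = List[float]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit L2 norm (returned unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def toy_text_encoder(text: str, dim: int = 8) -> Vector:
    """Hypothetical stand-in for a real text encoder.

    Deterministic, but the vectors carry no semantic meaning.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return l2_normalize([b / 255.0 - 0.5 for b in digest[:dim]])


def synonym_averaged_embeddings(
    synonyms: Dict[str, List[str]],   # class name -> S_i (synonyms of class C_i)
    templates: List[str],             # T, each template has one "{}" slot
    encode: Callable[[str], Vector],  # text encoder (assumed interface)
) -> Dict[str, Vector]:
    """Average the embeddings of all (template, synonym) prompts per class."""
    result: Dict[str, Vector] = {}
    for class_name, class_synonyms in synonyms.items():
        prompts = [t.format(s) for s in class_synonyms for t in templates]
        embeddings = [encode(p) for p in prompts]
        dim = len(embeddings[0])
        mean = [sum(e[k] for e in embeddings) / len(embeddings) for k in range(dim)]
        result[class_name] = l2_normalize(mean)
    return result


if __name__ == "__main__":
    classes = {"sofa": ["sofa", "couch"], "dog": ["dog", "puppy"]}
    prompt_templates = ["a photo of a {}.", "a {} in the scene."]
    for name, emb in synonym_averaged_embeddings(
        classes, prompt_templates, toy_text_encoder
    ).items():
        print(name, [round(x, 3) for x in emb])
```

Under this reading, C is simply the set of keys of the `synonyms` dictionary. Which vocabulary the paper actually uses for C when labelling unknown boxes is exactly what the question asks, and the sketch does not answer it.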

running inference on single image overloads gpu memory

Hi,

When I try to run inference on a test image using your script, I get a "CUDA out of memory" error. My image size is 640x480, and I have a GPU with 24 GB of memory. I'd appreciate it if you could help me resolve this. Thanks!

how to train

Hello, I am wondering how I can train the model on my own datasets.

Inference on Custom Class

Thank you for your interesting project.

Could you please help us understand how we can test your code on novel classes that are not in LVIS or COCO? For example, I would like to test your method on a set of classes related to lane markers. Could you please provide an example that I can follow, so that I can test your work with custom classes on any image?

Kind Regards

(inference_single_image) RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

Thank you so much for updating "inference_single_image".
However, the following error occurred when I used the script:

[02/21 14:08:53 fvcore.common.checkpoint]: [Checkpointer] Loading from weights/maskrcnn_v2/model_final.pth ...
Downloading (…)ip_pytorch_model.bin: 100%|██████████| 3.51G/3.51G [04:00<00:00, 14.6MB/s]
Downloading tokenizer_config.json: 100%|██████████| 20.6k/20.6k [00:00<00:00, 19.3MB/s]
Downloading tokenizer.json: 100%|██████████| 2.42M/2.42M [00:01<00:00, 2.18MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 2.20k/2.20k [00:00<00:00, 2.71MB/s]
Traceback (most recent call last):
  File "inference_single_image.py", line 103, in <module>
    inference_single_image(model, image_path, text_prompt_list, param_dict)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/app/altp/models/cooperative-foundational-models/evaluation.py", line 49, in inference_single_image
    _ = inference_gdino(model, inputs, text_prompt_list, param_dict)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/app/altp/models/cooperative-foundational-models/ground_dino_utils.py", line 77, in inference_gdino
    outputs = rcnn_model(inputs)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 146, in forward
    return self.inference(batched_inputs)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 200, in inference
    features = self.backbone(images.tensor)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2/modeling/backbone/fpn.py", line 126, in forward
    bottom_up_features = self.bottom_up(x)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2/modeling/backbone/resnet.py", line 445, in forward
    x = self.stem(x)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2/modeling/backbone/resnet.py", line 356, in forward
    x = self.conv1(x)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2/layers/wrappers.py", line 106, in forward
    x = F.conv2d(
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

My environment info is here:
`[02/21 14:08:52 detectron2]: Environment info:


sys.platform linux
Python 3.8.15 (default, Nov 24 2022, 15:19:38) [GCC 11.2.0]
numpy 1.23.5
detectron2 0.6 @/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2
Compiler GCC 7.3
CUDA compiler CUDA 11.3
detectron2 arch flags /home/user/miniconda/envs/cfm/lib/python3.8/site-packages/detectron2/_C.cpython-38-x86_64-linux-gnu.so; cannot find cuobjdump
DETECTRON2_ENV_MODULE
PyTorch 1.10.1 @/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torch
PyTorch debug build False
GPU available Yes
GPU 0 NVIDIA H100 80GB HBM3 (arch=9.0)
Driver version 535.129.03
CUDA_HOME /usr/local/cuda
Pillow 8.3.2
torchvision 0.11.2 @/home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torchvision
torchvision arch flags /home/user/miniconda/envs/cfm/lib/python3.8/site-packages/torchvision/_C.so; cannot find cuobjdump
fvcore 0.1.5.post20221221
iopath 0.1.9
cv2 4.7.0


PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 11.3
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.2
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,`
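
One detail in the dump above may be worth checking, offered as an observation rather than a confirmed diagnosis: the GPU is reported as `arch=9.0`, while the NVCC architecture flags of this PyTorch build list binaries only up to `sm_86`. The sketch below parses such a flag string and applies the CUDA rule that a binary built for capability X.y runs only on devices of capability X.z with z ≥ y. The helper names are hypothetical, and PTX entries (`code=compute_*`), which can be JIT-compiled for newer devices, are deliberately ignored.

```python
import re
from typing import List, Tuple

Capability = Tuple[int, int]


def parse_sm_archs(nvcc_flags: str) -> List[Capability]:
    """Extract (major, minor) pairs from 'code=sm_XY' entries of an NVCC flag string."""
    archs = set()
    for match in re.finditer(r"code=sm_(\d+)", nvcc_flags):
        digits = match.group(1)
        # The last digit is the minor version, everything before it is the major.
        archs.add((int(digits[:-1]), int(digits[-1])))
    return sorted(archs)


def has_binary_compatible_kernels(gpu: Capability, archs: List[Capability]) -> bool:
    """True if some compiled arch shares the GPU's major version with minor <= the GPU's."""
    return any(major == gpu[0] and minor <= gpu[1] for major, minor in archs)


if __name__ == "__main__":
    # Flag string copied from the environment info above.
    flags = (
        "-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;"
        "-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;"
        "-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;"
        "-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;"
        "-gencode;arch=compute_37,code=compute_37"
    )
    compiled = parse_sm_archs(flags)
    print("compiled archs:", compiled)
    print("H100 (9, 0) covered:", has_binary_compatible_kernels((9, 0), compiled))
```

For this flag string the check reports that no compiled binary covers capability 9.0. Whether that mismatch is what triggers the cuDNN error is not established by the issue itself. On a live system, `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()` should provide the same two inputs without parsing the log.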
