gpt4roi's People

Contributors

jshilong, mattmazzola, peizesun

gpt4roi's Issues

Demo error when attempting to send message: PredictBody validation error, missing required field

In the video of #41 I demonstrate an error when running the demo.

A required property seems to be missing from one of the events sent through Gradio.
Given that the error occurs inside the gradio-dev runtime, I am unsure whether app.py is sending the wrong data or whether there is an issue inside the gradio-dev package itself.

Running on local URL:  http://0.0.0.0:20012

To create a public link, set `share=True` in `launch()`.
Task exception was never retrieved
future: <Task finished name='6976h8jtnyr_7' coro=<Queue.process_events() done, defined at /workspaces/GPT4RoI/gradio-dev/gradio/queueing.py:342> exception=1 validation error for PredictBody
event_id
  Field required [type=missing, input_value={'data': [], 'event_data'...on_hash': '6976h8jtnyr'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.5/v/missing>
Traceback (most recent call last):
  File "/workspaces/GPT4RoI/gradio-dev/gradio/queueing.py", line 346, in process_events
    client_awake = await self.gather_event_data(event)
  File "/workspaces/GPT4RoI/gradio-dev/gradio/queueing.py", line 219, in gather_event_data
    data, client_awake = await self.get_message(event, timeout=receive_timeout)
  File "/workspaces/GPT4RoI/gradio-dev/gradio/queueing.py", line 448, in get_message
    return PredictBody(**data), True
  File "/home/vscode/miniconda3/envs/gpt4roi/lib/python3.9/site-packages/pydantic/main.py", line 164, in __init__
    __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 1 validation error for PredictBody
event_id
  Field required [type=missing, input_value={'data': [], 'event_data'...on_hash': '6976h8jtnyr'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.5/v/missing
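
Note: a possible local workaround while the root cause is unclear (a minimal sketch, assuming you can patch the PredictBody model inside your gradio-dev checkout; the field names come only from the traceback above, not from gradio's actual definition):

    # Hypothetical patch: tolerate a missing event_id instead of failing
    # validation. The real PredictBody in gradio has more fields; only the
    # ones visible in the traceback are reproduced here.
    from typing import Optional
    from pydantic import BaseModel

    class PredictBody(BaseModel):
        event_id: Optional[str] = None   # was required; now optional
        data: list = []
        session_hash: Optional[str] = None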

How to evaluate in the terminal instead of the web app

Excuse me, I want to evaluate the model on a substantial amount of data automatically.
I tested it on several examples like the one below; however, the results are not good: one region is OK, but two regions are bad.
[screenshot of the test example]
Is the way I send the box & text wrong? I want to get this working, thanks!
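
Note: for batch evaluation outside the web UI, the usual pattern is to loop over a dataset file and call the model directly; a minimal sketch, where load_model and run_inference are hypothetical stand-ins for this repository's actual inference entry points (e.g. the logic inside inference.py):

    import json

    from my_eval_utils import load_model, run_inference  # hypothetical helpers

    model = load_model('path/to/gpt4roi-weights')  # hypothetical path

    results = []
    with open('eval_samples.json') as f:  # each sample: image path, boxes, prompt
        for sample in json.load(f):
            answer = run_inference(model, sample['image'],
                                   boxes=sample['boxes'],   # normalized xyxy assumed
                                   prompt=sample['prompt'])
            results.append({'id': sample.get('id'), 'answer': answer})

    with open('predictions.json', 'w') as f:
        json.dump(results, f, indent=2)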

Weight question

Hello, can the LLaMA 2 weights be merged with the delta weights?

Where is the bbox token used?

Hello, in the inference.py you offered in #14, I see the multi-modal input tokens for the LLM, including a bbox token, but I can't find where you replace the bbox token, or where you use the image features obtained from CLIP and interpolated. Can you explain this to me? Thank you.
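
Note: in LLaVA-style models, placeholder tokens are usually not kept as literal tokens; their embeddings are overwritten with visual features before the LLM forward pass. A minimal sketch of that pattern (all names and shapes are assumptions for illustration, not this repository's actual code):

    import torch

    def splice_region_features(input_embeds, input_ids, bbox_token_id, region_feats):
        # input_embeds: (seq_len, hidden) token embeddings from the LLM
        # input_ids:    (seq_len,) token ids
        # region_feats: (num_boxes, hidden) RoI-pooled features, one per box
        # Overwrite the embedding at each bbox-token position with its feature.
        positions = (input_ids == bbox_token_id).nonzero(as_tuple=True)[0]
        assert len(positions) == region_feats.shape[0]
        out = input_embeds.clone()
        out[positions] = region_feats.to(out.dtype)
        return out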

GPU memory

How much GPU memory is required for inference?
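
Note: one way to answer this empirically on your own hardware is to measure the peak allocation around a single generate call:

    import torch

    torch.cuda.reset_peak_memory_stats()
    # ... run one inference call here, e.g. model.generate(...) ...
    peak_gib = torch.cuda.max_memory_allocated() / 1024**3
    print(f'peak GPU memory: {peak_gib:.1f} GiB')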

weight file

[screenshot]
The LLaMA 7B weights are not available on the provided webpage or on Hugging Face.

VG Region Captioning Evaluation

Hi,

Thanks for open-sourcing this great work! We are developing some region captioning models and would like to perform a fair comparison with GPT4ROI. Is it possible to release the VG validation data you used for calculating the scores in Table 4? Thanks in advance!

Issue with demo: no response after drawing a bounding box and entering text in the chatbox

Hi Authors,

Thank you for your great work.
While running the demo, I encountered an issue where, after loading an image and subsequently drawing a bounding box, there is no response upon entering text in the chatbox. This appears similar to the problem described in closed issue #9. I have ensured that the gradio_box is correctly set up and followed all provided instructions. The same error is experienced when executing app_box.py in gradio_box. I would really appreciate some help.

Thank you.

ValueError: The following `model_kwargs` are not used by the model: ['images']

Hi @jshilong, great work on bringing RoIs to language models.

I am getting the error "ValueError: The following model_kwargs are not used by the model: ['images']" while trying the inference code, probably because 'images' is not accepted as a parameter by the model.generate function.

    with torch.amp.autocast(device_type='cuda'):
        output_ids = self.model.generate(
            input_ids,
            images=image.unsqueeze(0).half().cuda(),
            do_sample=True,
            temperature=0.2,
            max_new_tokens=1024,
            stopping_criteria=[stopping_criteria])

Could you please confirm whether you are using specific versions of the 'torch', 'llava', or 'transformers' libraries?
Thank you!
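
Note: newer transformers releases validate generate() kwargs against the model's prepare_inputs_for_generation/forward signatures, so a custom kwarg like images must be threaded through explicitly. A minimal sketch of the usual override (the subclass name is hypothetical, not this repository's actual class):

    from transformers import LlamaForCausalLM  # assuming a LLaMA-based model

    class RoiLlamaForCausalLM(LlamaForCausalLM):  # hypothetical subclass
        def prepare_inputs_for_generation(self, input_ids, images=None, **kwargs):
            # Build the usual inputs, then thread `images` through so that
            # generate() no longer rejects it as an unused model kwarg.
            inputs = super().prepare_inputs_for_generation(input_ids, **kwargs)
            inputs['images'] = images
            return inputs

If the repository's model class already does this, pinning transformers to the version the authors used may be the simpler route.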

bug with gradio_box

Hello. Thank you for your excellent work.
I encountered some issues while using gradio_box, which I installed successfully following the instructions. The first uploaded image works well.
[screenshot 1]
But when I upload a second image after clicking the clear button, the image is not displayed correctly.
[screenshot 2]
The browser console shows the following error message:
[screenshot 3]
Could you please look into this?

Maximum Number of Regions

Thank you for your excellent work! Could you tell me the maximum number of regions supported by a single inference process?

ValueError: The following `model_kwargs` are not used by the model: ['images'] (note: typos in the generate arguments will also show up in this list)

[screenshot]
Professor, it took me a few days to figure out my previous mistakes, but this error cannot be solved. Can you do me a favor? I have aligned all the environment versions, but it still doesn't work.

When I input the prompt, the model reports this error, and the following error appears in the code.

Happy New Year's Day, Professor. I am indeed quite clumsy; could you please give me some guidance?
[screenshot]

Question about Table 4

Hi, @jshilong @PeizeSun @ShoufaChen
I would like to ask some questions about "Table 4: Comparison of region caption ability on the validation dataset of Visual Genome".

  1. Did you divide the validation dataset for the VG region caption task yourselves?
    In the original VG dataset, there seems to be no validation split.
    Could you please provide a link or a README for the validation dataset?

  2. Did you reproduce the result of GRiT?
    GRiT's paper also seems to contain no related experimental result (e.g., CIDEr on a validation set for VG region captioning).
    Could you provide more details about this experiment?

Thank you in advance.

weight release

What an amazing job, and thanks for your contributions to the open-source community. I'd like to try out some new ideas using the model weights, so do you have any plans to release the weights anytime soon?

Retraining issue

Hi,

I appreciate the effort you put into your framework, but I encountered some confusion while attempting to retrain it. The guidance suggests using the original LLaMA weights for training, but I noticed in your script that the model name input is set to vicuna-7b: /mnt/petrelfs/share_data/zhangshilong/vicuna-7b/.

I attempted to use both the original LLaMA and the LLaVA Hugging Face format (I haven't applied your delta since it hasn't been released yet), but it always resulted in this error:

  File "/gpt4roi/gpt4roi/train/train_mem.py", line 16, in <module>
    train()
  File "/gpt4roi/gpt4roi/train/train.py", line 641, in train
    model.initialize_vision_tokenizer(mm_use_im_start_end=model_args.mm_use_im_start_end,
  File "/gpt4roi/gpt4roi/models/spi_llava.py", line 295, in initialize_vision_tokenizer
    raise ValueError(
ValueError: Unexpected embed_tokens_weight shape. Pretrained: torch.Size([2, 4096]). Current: torch.Size([32006, 4096]). Numer of new tokens: 0.

I would appreciate your guidance in resolving the error and making the code runnable. Could you please provide me with the necessary steps or adjustments to address the issue?

n-round chat fails on the demo

The chatbot responds well according to the given region when I ask the first question, but on the second question it keeps processing and never generates a sentence. This may be an issue to be fixed.

About `solve_hallucination` setting in `det_llava.py`

Hi, although the hallucination questions and answers are added into `sources` at line 442, `sources` is overwritten at line 449. Therefore, the conversations for solving hallucination are not actually included in the variable `sources`.

(Pdb) list
450         # print(copy_source)
451         sources = preprocess_multimodal(
452             copy.deepcopy([e['conversations'] for e in copy_source]),
453             self.multimodal_cfg, cur_token_len)
454
455  ->     data_dict = preprocess(
456             sources,
457             self.tokenizer)
458         if isinstance(i, int):
459             data_dict = dict(input_ids=data_dict['input_ids'][0],
460                              labels=data_dict['labels'][0])
(Pdb) sources
[[{'from': 'human', 'value': 'Can you describe the main features of this image for me?\nThe <im_start><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_patch><im_end> provides an overview of the picture. \n'}, {'from': 'gpt', 'value': "The image shows a baseball field with a group of people gathered, possibly during a small-town baseball game. There is a batter on deck, most likely a female softball player, stretching and practicing her batting stance while holding a baseball bat. She's positioned closer to the left side of the field. \n\nSeveral people stand around the field, some closer to the batter and others farther away, possibly teammates, coaches, or spectators. There are a few sports balls scattered around the field, with two located near the center and another one toward the right side.\n\nOverall, the scene depicts an active sports event, with people of various ages and roles involved in the game."}]]
(Pdb)
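
Note: if the diagnosis above is right, the fix would be to fold the hallucination turns into the data that preprocess_multimodal consumes, instead of attaching them to a variable that is then rebuilt. A generic sketch of the pattern, with hallucination_turns_for as a hypothetical helper:

    # Buggy pattern: turns appended to `sources`, which is then rebuilt
    # from `copy_source`, silently discarding the appended turns.
    # Fix: extend the conversations inside copy_source before the rebuild.
    for item in copy_source:
        item['conversations'].extend(hallucination_turns_for(item))  # hypothetical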

This project is using ViT-L or ViT-H?

Hi,

Great work! I have a question w.r.t. the vision backbone used in the paper.
The paper says ViT-H, while both the code and the checkpoint show ViT-L.
Thanks!

Inquiry on evaluation scripts

Hi @jshilong, thanks for your great project!

I would like to reproduce your experimental results. Do you have a plan to release your evaluation scripts (e.g., Visual7W and VCR)? Thank you.

train_stage1.sh and train_stage2.sh

Hi, currently the two bash scripts look similar. Can you please confirm the commands for the 1st and 2nd stages of training? I noticed that data loading is controlled from the config. How exactly is the model frozen in the two separate stages?
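
Note: staged training is commonly implemented by toggling requires_grad from the config rather than by changing the launch command, which would explain the near-identical scripts. A generic sketch of that pattern (attribute names assumed, not necessarily how this repository wires it):

    def apply_stage_freeze(model, stage):
        # Freeze everything, then selectively unfreeze per stage.
        for p in model.parameters():
            p.requires_grad = False
        if stage == 1:
            # stage 1: train only the newly added multimodal modules
            for p in model.mm_projector.parameters():  # hypothetical attribute
                p.requires_grad = True
        else:
            # stage 2: unfreeze the full model
            for p in model.parameters():
                p.requires_grad = True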

Issue in 2nd Stage of Pre-training

I faced the following error when I launched the 2nd stage of pre-training:
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group
This error is likely because the number of trainable parameters differs between the 2nd stage and the 1st stage. How did you resolve this?
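
Note: a common way around this, assuming the mismatch really is the changed set of trainable parameters, is to load only the stage-1 model weights and let stage 2 create a fresh optimizer instead of resuming optimizer state:

    import torch

    def load_stage1_weights(model, path='stage1/pytorch_model.bin'):  # hypothetical path
        state = torch.load(path, map_location='cpu')
        missing, unexpected = model.load_state_dict(state, strict=False)
        print('missing:', missing, 'unexpected:', unexpected)

    # Then start the stage-2 trainer WITHOUT resume_from_checkpoint, so a new
    # optimizer is built for the stage-2 trainable parameters.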

Demo is not usable

I want to use the demo provided by the authors to verify the results, but the demo provided on the project webpage cannot be used. How can I solve this problem?

Dataset for stage 2

Hello, thank you very much for your excellent work. However, I have some doubts about the dataset, and I would appreciate it if you could clarify them for me. Where can I download the train.json file for visual_genome? Do I need to run EVA-02-DET myself to obtain the llava_150k_bbox_pred_results.pkl file?

MMCV-FULL

[screenshots]
Whether I use the official mmcv-full build matching my Python version or the one you provided, this error occurs. Could you please clarify?

Question for app.py

Hello, thank you very much for your excellent contribution. I encountered some issues while using the app.py code. My gradio_box setup is all correct. However, after entering text and pressing Enter, the run function is not triggered, so the demo has no response. After debugging, I still haven't found the problem. Could you please help?

Pre-trained weights

Hello, the download link you posted for the pre-trained weights on Hugging Face is not available. Can you update it, or provide another download channel? Thank you very much!

Load weight error

Hi, Thanks for your excellent work.
Now I ran into an issue when I tried to load the GPT4RoI weights to perform stage-2 training, and there was an error:
"Error(s) in loading state_dict for SPILlavaMPTForCausalLM:
size mismatch for lm_head.weight: copying a param with shape torch.Size([32006, 4096]) from checkpoint, the shape in current model is torch.Size([32005, 4096])."
How can I solve this problem?
Looking forward to your reply!
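
Note: the off-by-one (32006 vs. 32005) suggests a different number of special tokens was added before the checkpoint was saved than your current setup adds. One way to probe and work around this, assuming a Hugging Face-style model and tokenizer:

    def align_vocab(model, tokenizer):
        # Compare the live vocab size with the embedding table, then resize
        # the embeddings (and tied lm_head) to the checkpoint's row count.
        print(len(tokenizer), model.get_input_embeddings().weight.shape[0])
        model.resize_token_embeddings(32006)  # match the checkpoint's lm_head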

mmcv-1.4.7 error

During installation, when running MMCV_WITH_OPS=1 pip install -e ., I got the error below:

    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:285:6: note:   template argument deduction/substitution failed:
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h: In substitution of ‘template<class T> decltype ((T::hash(o), size_t())) c10::_hash_detail::dispatch_hash(const T&) [with T = std::shared_ptr<torch::autograd::Node>]’:
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:295:39:   required from ‘size_t c10::hash<T>::operator()(const T&) const [with T = std::shared_ptr<torch::autograd::Node>; size_t = long unsigned int]’
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:354:24:   required from ‘size_t c10::_hash_detail::simple_get_hash(const T&) [with T = std::shared_ptr<torch::autograd::Node>; size_t = long unsigned int]’
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:314:43:   required from ‘size_t c10::hash<std::tuple<_Tps ...> >::tuple_hash<0, Ts ...>::operator()(const std::tuple<_Args1 ...>&) const [with Ts = {const std::shared_ptr<torch::autograd::Node>&, const unsigned int&}; Types = {const std::shared_ptr<torch::autograd::Node>&, const unsigned int&}; size_t = long unsigned int]’
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:307:39:   required from ‘size_t c10::hash<std::tuple<_Tps ...> >::tuple_hash<idx, Ts>::operator()(const std::tuple<_Args2 ...>&) const [with long unsigned int idx = 1; Ts = {const std::shared_ptr<torch::autograd::Node>&, const unsigned int&}; Types = {const std::shared_ptr<torch::autograd::Node>&, const unsigned int&}; size_t = long unsigned int]’
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:319:56:   required from ‘size_t c10::hash<std::tuple<_Tps ...> >::operator()(const std::tuple<_Tps ...>&) const [with Types = {const std::shared_ptr<torch::autograd::Node>&, const unsigned int&}; size_t = long unsigned int]’
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:368:50:   required from ‘size_t c10::get_hash(const Types& ...) [with Types = {std::shared_ptr<torch::autograd::Node>, unsigned int}; size_t = long unsigned int]’
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/edge.h:53:54:   required from here
    /home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/include/c10/util/hash.h:285:51: error: ‘hash’ is not a member of ‘std::shared_ptr<torch::autograd::Node>’
      285 | auto dispatch_hash(const T& o) -> decltype(T::hash(o), size_t()) {
          |                                            ~~~~~~~^~~
    ninja: build stopped: subcommand failed.
    Traceback (most recent call last):
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2107, in _run_ninja_build
        subprocess.run(
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/subprocess.py", line 526, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['ninja', '-v', '-j', '4']' returned non-zero exit status 1.
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "/home/user/gpt4roi/mmcv-1.4.7/setup.py", line 391, in <module>
        setup(
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup
        return distutils.core.setup(**attrs)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
        return run_commands(dist)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
        dist.run_commands()
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
        self.run_command(cmd)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
        super().run_command(command)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
        cmd_obj.run()
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/command/develop.py", line 34, in run
        self.install_for_development()
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/command/develop.py", line 111, in install_for_development
        self.run_command('build_ext')
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
        self.distribution.run_command(command)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
        super().run_command(command)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
        cmd_obj.run()
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 91, in run
        _build_ext.run(self)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
        self.build_extensions()
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 870, in build_extensions
        build_ext.build_extensions(self)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 479, in build_extensions
        self._build_extensions_serial()
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 505, in _build_extensions_serial
        self.build_extension(ext)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 252, in build_extension
        _build_ext.build_extension(self, ext)
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 560, in build_extension
        objects = self.compiler.compile(
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 683, in unix_wrap_ninja_compile
        _write_ninja_file_and_compile_objects(
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1783, in _write_ninja_file_and_compile_objects
        _run_ninja_build(
      File "/home/user/anaconda3/envs/gpt4roi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2123, in _run_ninja_build
        raise RuntimeError(message) from e
    RuntimeError: Error compiling objects for extension
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.

Does anyone have the same problem, and how should this be fixed?

(gpt4roi) user@mdeep:~/gpt4roi/mmcv-1.4.7$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

I am using an A5000 GPU, and the above is my CUDA version.
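
Note: the failing template lives in PyTorch's own headers, which usually points at a PyTorch version too new for mmcv 1.4.7's C++ extensions rather than at CUDA itself. A quick sanity check before rebuilding (the version guidance is an assumption to verify, not a confirmed requirement):

    import torch
    from torch.utils.cpp_extension import CUDA_HOME

    print(torch.__version__, torch.version.cuda)  # torch the build compiles against
    print(CUDA_HOME)                              # should point at your CUDA 11.7
    # mmcv 1.4.7 predates PyTorch 2.x; if a 2.x torch prints here, trying an
    # older 1.x torch, or a prebuilt mmcv-full wheel matching your torch/CUDA
    # combination, is a common way out.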

Finetuning stage 2 from a checkpoint

Hi @jshilong, thanks again for releasing the code and the models!
I am trying to finetune the model from stage 2. Could you please share a stage-2 checkpoint?
I am getting 'ValueError: Can't find a valid checkpoint at ./exp/stage2/checkpoint-0' when trying to start from the current weight directory as the starting point.

Appreciate your help!
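
Note: the Hugging Face Trainer's resume logic looks for files it wrote itself, so a directory holding only model weights triggers exactly this error. A quick check (the file names are the usual Trainer outputs, assumed here):

    import os

    ckpt = './exp/stage2/checkpoint-0'
    for name in ['pytorch_model.bin', 'trainer_state.json', 'optimizer.pt', 'scheduler.pt']:
        print(name, os.path.exists(os.path.join(ckpt, name)))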

gradio bug

I have been debugging for almost three days, and sadly gradio_box still does not install correctly; the required version is not specified.

evaluation

The paper shows that this code has a demo, but did you develop an evaluation script on a dev set or on some existing datasets?

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Hello @jshilong, have you encountered this problem?

I have trained the model through both stages, then merged the trained model with LLaMA as you described.
When I load the merged model for testing, the error below occurs.

Traceback (most recent call last):
  File "/hy/code/gpt4roi/train_net.py", line 326, in <module>
    launch(
  File "/hy/code/gpt4roi/detectron2/detectron2/engine/launch.py", line 84, in launch
    main_func(*args)
  File "/hy/code/gpt4roi/train_net.py", line 311, in main
    res = Trainer.test(cfg, model)
  File "/hy/code/gpt4roi/detectron2/detectron2/engine/defaults.py", line 617, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "/hy/code/gpt4roi/detectron2/detectron2/evaluation/evaluator.py", line 158, in inference_on_dataset
    outputs = model(inputs)
  File "/workspace/conda_env/gpt4roi/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/hy/code/gpt4roi/gpt4roi.py", line 153, in get_output
    output_ids = self.model.generate(
  File "/workspace/conda_env/gpt4roi/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/conda_env/gpt4roi/lib/python3.10/site-packages/transformers/generation/utils.py", line 1485, in generate
    return self.sample(
  File "/workspace/conda_env/gpt4roi/lib/python3.10/site-packages/transformers/generation/utils.py", line 2562, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
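
Note: a common debugging step, assuming the NaNs come from sampling over an unstable half-precision distribution, is to disable sampling so the failing torch.multinomial call is never reached:

    # Same generate call, with sampling turned off (a diagnostic, not a fix):
    output_ids = model.generate(
        input_ids,
        do_sample=False,        # greedy decoding bypasses torch.multinomial
        max_new_tokens=1024,
    )
    # If greedy decoding works, run a float32 test pass and check the logits
    # for NaN/inf to locate where the probabilities go bad.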

Python recursion limit exceeded when merging weights

Hello, when I merge the llama-base weights with your roi-delta weights, I get an error. Can you tell me what the cause is and how to fix it? RecursionError: maximum recursion depth exceeded while calling a Python object
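
Note: if the merge script recurses over nested modules, raising the interpreter's recursion ceiling before running it is a common workaround (an assumption about the cause, not a confirmed fix):

    import sys

    sys.setrecursionlimit(10000)  # default is ~1000
    # ... then run the delta-merge script in the same process ...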

What is the structure of this vision_tower?

Hello, thank you for your contribution. I have a question about line 66 of the file models/spi_llava.py:
    image_forward_outs = vision_tower(images, output_hidden_states=True)
What is the structure of this vision_tower?
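
Note: in LLaVA-style code the vision tower is typically a CLIP ViT image encoder from transformers; a sketch of what that call usually corresponds to (the checkpoint name and layer choice are common defaults, assumed here):

    import torch
    from transformers import CLIPVisionModel, CLIPImageProcessor

    name = 'openai/clip-vit-large-patch14'  # assumed checkpoint (ViT-L/14)
    vision_tower = CLIPVisionModel.from_pretrained(name)
    processor = CLIPImageProcessor.from_pretrained(name)

    images = torch.zeros(1, 3, 224, 224)  # (B, 3, H, W) after preprocessing
    out = vision_tower(images, output_hidden_states=True)
    patch_feats = out.hidden_states[-2][:, 1:]  # penultimate layer, CLS dropped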

Training time

Hi @jshilong, in the documentation, it's mentioned that GPT4RoI was trained on 8 A100 GPUs. Could you please provide insights into how much time it took for both stage-1 and stage-2? Having this information would be extremely helpful.

Thank you in advance.
