opendrivelab / drivelm
DriveLM: Driving with Graph Visual Question Answering
Home Page: https://opendrivelab.com/DriveLM/
License: Apache License 2.0
When running evaluation.py, I encountered a TypeError related to the multiprocessing process.
evaluation start!
Exception in thread Thread-3:
Traceback (most recent call last):
File "/Users/unizhuan/anaconda3/envs/llama_adapter_v2/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/Users/unizhuan/anaconda3/envs/llama_adapter_v2/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/Users/unizhuan/anaconda3/envs/llama_adapter_v2/lib/python3.8/multiprocessing/pool.py", line 576, in _handle_results
task = get()
File "/Users/unizhuan/anaconda3/envs/llama_adapter_v2/lib/python3.8/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() takes 1 positional argument but 2 were given
Process SpawnPoolWorker-24:
Process SpawnPoolWorker-20:
Process SpawnPoolWorker-21:
Process SpawnPoolWorker-30:
Process SpawnPoolWorker-8:
Process SpawnPoolWorker-22:
....
I am wondering why this happens and how to solve it. 🤔
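The traceback fails inside `_ForkingPickler.loads`, that is, while the parent process unpickles something sent back by a pool worker. One known way to get this kind of `TypeError` is a worker raising an exception whose constructor does not accept the positional arguments stored in `exc.args`. The sketch below reproduces that with a made-up exception class (`KeywordOnlyError` is hypothetical and not taken from the repo or from any client library) and shows a small wrapper, `safe_call` (also hypothetical), that returns errors as plain strings so the underlying failure becomes visible. Whether this is what actually happens in evaluation.py is an assumption.

```python
import pickle


class KeywordOnlyError(Exception):
    """Hypothetical exception whose constructor takes keyword arguments only."""

    def __init__(self, *, message="service unavailable"):
        super().__init__(message)  # stores (message,) in self.args


def survives_pickle(exc):
    """Return True if the exception can make the round trip between processes."""
    try:
        pickle.loads(pickle.dumps(exc))
        return True
    except TypeError:
        # Unpickling calls type(exc)(*exc.args); a keyword-only
        # constructor rejects that positional argument.
        return False


def safe_call(fn, *args):
    """Run fn in the worker and turn any exception into picklable data."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:
        return ("error", f"{type(exc).__name__}: {exc}")


def flaky_worker(_):
    """Stand-in for a worker task that fails with the awkward exception."""
    raise KeywordOnlyError()
```

Wrapping the function handed to the pool this way means the parent receives `("error", "...")` tuples instead of crashing in the result-handler thread, which should reveal the original error raised inside the worker.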
Please check the following image. Maybe it should be in the folder CAM_BACK_RIGHT?
challenge/llama_adapter_v2_multimodal7b/data/nuscenes/samples/CAM_FRONT_RIGHT/n008-2018-09-18-13-10-39-0400__CAM_BACK_RIGHT__1537291002278113.jpg
Hello,
Thank you for this work, it is very interesting and helpful.
I have carefully read the paper you put on arXiv, and there is no detailed explanation of the type of rules you used to build the dataset (particularly for nuScenes). Can you please provide some details or implementations?
Best,
Nassim.
Running demo.py for inference gets stuck at the end. The command is:
python demo.py --llama_dir ./weights --checkpoint ./finetune_output/checkpoint-0.pth --data ../test_llama.json --output ../output.json --batch_size 8 --num_processes 8
Both tqdm bars (2x RTX 4090) end with 100%|████████| 461/461 [1:28:23<00:00, 11.50s/it], but the code then hangs for a long time and does not generate output.json. I added a print in demo.py after the processes are joined (p.join()), but it never shows.
I have tried many times with different --batch_size and --num_processes values, but it still gets stuck there.
This test_llama.json comes from the DriveLM dataset. The small sample test_llama.json from the repo does not cause the hang.
Hope you can help!
finetune_data_config.yaml references 'v1_0_train_nus_llama.json', but how do I get this file?
The following is my training script:
bash exps/finetune.sh /home/junho/workspace/LLaMa2-weight ./1bcbffc43484332672092e0024a8699a6eb5f558161aebf98a7c6b1db67224d1_LORA-BIAS-7B.pth ./finetune_data_config.yaml output/path
Output log:
[18:40:41.330088] Load checkpoint ./1bcbffc43484332672092e0024a8699a6eb5f558161aebf98a7c6b1db67224d1_LORA-BIAS-7B.pth
[18:40:41.330920] read dataset config from ./finetune_data_config.yaml
[18:40:41.331437] DATASET CONFIG:
[18:40:41.331459] {'META': ['v1_0_train_nus_llama.json']}
Traceback (most recent call last):
File "main_finetune.py", line 205, in <module>
main(args)
File "main_finetune.py", line 141, in main
dataset_train = FinetuneDataset(args.data_config, transform=transform_train,
File "/home/junho/workspace/DriveLM/challenge/llama_adapter_v2_multimodal7b/data/dataset.py", line 52, in __init__
meta_l = json.load(open(meta_path))
FileNotFoundError: [Errno 2] No such file or directory: 'v1_0_train_nus_llama.json'
Traceback (most recent call last):
File "main_finetune.py", line 205, in <module>
main(args)
File "main_finetune.py", line 141, in main
dataset_train = FinetuneDataset(args.data_config, transform=transform_train,
File "/home/junho/workspace/DriveLM/challenge/llama_adapter_v2_multimodal7b/data/dataset.py", line 52, in __init__
meta_l = json.load(open(meta_path))
FileNotFoundError: [Errno 2] No such file or directory: 'v1_0_train_nus_llama.json'
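The log shows that the config is parsed as `{'META': ['v1_0_train_nus_llama.json']}` and that the bare file name is then opened relative to the current working directory. A minimal sketch for checking the `META` entries before launching training is given below; `missing_meta_files` is a hypothetical helper, not part of the repo, and it does not answer where the file itself comes from.

```python
import os


def missing_meta_files(config, base_dir="."):
    """Return the META entries that do not resolve to an existing file."""
    missing = []
    for meta_path in config.get("META", []):
        # Relative entries are resolved against base_dir, mirroring how
        # open() resolves them against the working directory.
        if os.path.isabs(meta_path):
            full_path = meta_path
        else:
            full_path = os.path.join(base_dir, meta_path)
        if not os.path.isfile(full_path):
            missing.append(meta_path)
    return missing
```

Calling `missing_meta_files({'META': ['v1_0_train_nus_llama.json']})` from the directory where the training script runs would list the file if it is absent, which is a cheaper failure than one traceback per process.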
Could you tell us which dataset was used to train the parameters of the visual block, and how large it is?
I am also confused about the following:
What is the difference between the datasets mentioned in the baseline results?
Could you give detailed information about these datasets?
Thanks!
Following the steps in README.md, part of the final inference output is as follows:
{
"id": "f0f120e4d4b0441da90ec53b16ee169d_d9075c2a5f864a2b8abf41e703f4cf1c_3",
"question": "<image>\nIs <c1,CAM_FRONT_LEFT,231.5,472.1> a traffic sign or a road barrier?",
"gt_answer": "No.",
"answer": "Response\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n"
},
{
"id": "f0f120e4d4b0441da90ec53b16ee169d_d9075c2a5f864a2b8abf41e703f4cf1c_4",
"question": "<image>\nWhat actions could the ego vehicle take based on <c1,CAM_FRONT_LEFT,231.5,472.1>? Why take this action and what's the probability?",
"gt_answer": "The action is to keep going at the same speed, the reason is that there is no safety issue. The probability of this action is high.",
"answer": "///\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n Hinweis\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n"
},
{
"id": "f0f120e4d4b0441da90ec53b16ee169d_d9075c2a5f864a2b8abf41e703f4cf1c_5",
"question": "<image>\nWhat actions taken by the ego vehicle can lead to a collision with <c1,CAM_FRONT_LEFT,231.5,472.1>?",
"gt_answer": "No such action will lead to a collision.",
"answer": "Response\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n"
},
The answers I get are meaningless. Has anyone run into this before? What could be the cause?
Hi there, will the test/validation set include important object IDs (the "key_object_infos" dict)? If not, how will questions of subsequent steps (prediction/planning, etc.) be composed with the ID?
Hi organizers, thank you for your recent timely replies. We run into an issue when we run evaluation.py. Specifically, at line 27, "scores = self.chatgpt_eval.forward(answer, GT)", we sometimes receive an error from the OpenAI server: "openai.error.ServiceUnavailableError: The server is overloaded or not ready yet". I guess this issue is mainly due to the OpenAI server. However, if participants often run into this problem, using the OpenAI API will cost extra time and money. We have tried to slow down the request rate (for example, sleeping for 1 second after each request), but the issue still exists. Are there any tips?
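A fixed one-second sleep does not back off when the server stays overloaded. A common alternative is to retry with exponentially growing delays. The sketch below is a generic helper under that assumption; `call_with_backoff` and its parameters are hypothetical names and are not part of evaluation.py or of the OpenAI client.

```python
import random
import time


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a retryable error wait, double the delay, and try again."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # A little random jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
```

In the setting described above, the call at line 27 could be wrapped as `call_with_backoff(lambda: self.chatgpt_eval.forward(answer, GT), retry_on=(openai.error.ServiceUnavailableError,))`, so that only the overload error triggers a retry.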
Regarding the value of 2.463 for the CIDEr metric in Table 10 of the paper: has it been rescaled? Is it 2.463 or 0.02463?
When I run pip install -r requirements.txt, I get the following error:
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
torch==2.0.0+cu117 from https://download.pytorch.org/whl/cu117/torch-2.0.0%2Bcu117-cp38-cp38-linux_x86_64.whl#sha256=c4dbc3f7f3eff6576473c3711d5d99adaaef733490b39de4970980d6edf4f0c2 (from -r requirements.txt (line 2)):
Expected sha256 c4dbc3f7f3eff6576473c3711d5d99adaaef733490b39de4970980d6edf4f0c2
Got 24f2904c7d84dc64995c74a9130dee6aa83486212c5a09b703e82a083ef67278
I have also tried pip cache purge and then pip install --no-cache-dir -r requirements.txt, but I get the same error. How can I fix the problem?
Hi,
I really appreciate this impressive work.
I was wondering when the DriveLM-CARLA dataset will be released. I would be grateful to know the timeline.
Many Thanks.
Bests,
Yi
Hello, I checked your contest docs, which show that the VRAM required for inference and fine-tuning is 34/35 GB, but I don't have a graphics card with that much memory. Can I reduce the batch_size to reduce the amount of video memory required?
Nice job! This work really sheds light on the future of autonomous driving using the power of LLMs or LMMs!
May I ask how you built this dataset? Also, when can we get the full dataset?
Hi!
I would like to extend my gratitude to your team for publishing this valuable demo dataset. May I inquire if there are plans to release a complete or more extensive version of the dataset in the near future? Do you have any plans to release this dataset before the upcoming CVPR conference? It would be extremely helpful for preparing related research work.
best
I am running
srun python -u -m torch.distributed.launch --master_port=1112 --nproc_per_node=2 --nodes=1 --use_env \
main_finetune.py --data_config "$CONFIG" --batch_size 4 \
--epochs 4 --warmup_epochs 1 --blr 10e-4 --weight_decay 0.02 \
--llama_path "$LLAMA_PATH" \
--output_dir "$OUTPUT_DIR" \
--pretrained_path "$PRETRAINED_PATH" \
&>> "$OUTPUT_DIR"/output.log &
and my output is
[W socket.cpp:426] [c10d] The server socket has failed to listen on [::]:38429 (errno: 98 - Address already in use).
[W socket.cpp:426] [c10d] The server socket has failed to bind to [::]:38429 (errno: 98 - Address already in use).
[W socket.cpp:426] [c10d] The server socket has failed to bind to 0.0.0.0:38429 (errno: 98 - Address already in use).
[E socket.cpp:462] [c10d] The server socket has failed to listen on any local network address.
Traceback (most recent call last):
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 241, in launch_agent
result = agent.run()
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/metrics/api.py", line 129, in wrapper
result = f(*args, **kwargs)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/agent/server/api.py", line 723, in run
result = self._invoke_run(role)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/agent/server/api.py", line 858, in _invoke_run
self._initialize_workers(self._worker_group)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/metrics/api.py", line 129, in wrapper
result = f(*args, **kwargs)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/agent/server/api.py", line 692, in _initialize_workers
self._rendezvous(worker_group)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/metrics/api.py", line 129, in wrapper
result = f(*args, **kwargs)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/agent/server/api.py", line 546, in _rendezvous
store, group_rank, group_world_size = spec.rdzv_handler.next_rendezvous()
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/rendezvous/static_tcp_rendezvous.py", line 55, in next_rendezvous
self._store = TCPStore( # type: ignore[call-arg]
RuntimeError: The server socket has failed to listen on any local network address. The server socket has failed to bind to [::]:38429 (errno: 98 - Address already in use). The server socket has failed to bind to 0.0.0.0:38429 (errno: 98 - Address already in use).
srun: error: gpu06: tasks 0-4,6-11: Exited with exit code 1
Traceback (most recent call last):
File "main_finetune.py", line 206, in <module>
main(args)
File "main_finetune.py", line 89, in main
misc.init_distributed_mode(args)
File "/mnt/data1/users/tianle/DriveLM/challenge/llama_adapter_v2_multimodal7b/util/misc.py", line 251, in init_distributed_mode
torch.distributed.barrier()
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 3313, in barrier
work = default_pg.barrier(opts=opts)
torch.distributed.DistBackendError: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1207, internal error, NCCL version 2.14.3
ncclInternalError: Internal check failed.
Last error:
Bootstrap : no socket interface found
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 25585) of binary: /users/tianle/volatile/miniconda3/envs/llama_adapter_v2/bin/python
Traceback (most recent call last):
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/users/tianle/volatile/miniconda3/envs/llama_adapter_v2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
main_finetune.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2024-03-16_21:19:21
host : gpu06.pri.barkla.alces.network
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 25585)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
srun: error: gpu06: task 5: Exited with exit code 1
Any idea what's happening here? I confirmed the port was not occupied by other jobs.
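Reading the log, srun reports failures for tasks 0-4, 6-11 and 5, so it appears to have started twelve copies of the launcher on gpu06, each trying to bind the same rendezvous port. In that case only one copy can succeed even if no other job holds the port. This is an interpretation of the log, not something confirmed in the thread. A minimal Python sketch for checking a port, or for asking the OS for a free one, is shown below (both helper names are hypothetical):

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True
        except OSError:
            return False
```

A port found this way could be passed to `--master_port`, with the caveat that another process may grab it between the check and the launch, so the check narrows the problem rather than ruling it out.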
Is there any relation between this competition and the GVQA methodology proposed in the DriveLM paper?
In this competition, can we replace llama_adapter_v2 with another version of llama_adapter or with another LLM? I have the same question about the visual encoder. Or do we just need to consider how to use fine-tuning techniques and other tricks to improve the performance of the model?
This may be a very basic question, sorry; it's my first time participating in this kind of competition.
How to get answers to the questions in the full graph of DriveLM-CARLA?
Description:
For the nuScenes dataset, all camera images (front, back, left, or right) have a fixed resolution of (1600, 900). However, some labels in the annotations exceed these bounds.
Examples:
Key frame with token 1537296390262404: a <moving> <car> positioned to the <front right> of the ego car. The value 2343.0 is clearly out of the 1600x900 bounds.
Key frame with token 1531883988362460:
Concern:
There might be a misalignment or inconsistency in how the coordinates are represented and annotated.
Thank you very much for your team's outstanding contribution to the field. However, I tried to fill in the Google form three times to apply for all LM annotations, but I didn't get any reply, and I could only get demo annotations. Can you help me? Thanks again.
Originally posted by piqiuni March 22, 2024
In the Table in Finetune, it says when Batch size = 4, the required VRAM is 34GB.
We are using RTX4090*2 for fine-tuning, but we can only run with batch size = 1, and the VRAM usage is about 20GB per card. Why does it take so much VRAM?
Do you have any suggestions on the GPU usage? Should we try to get more GPUs for training? Maybe give me some detailed suggestions on GPU types and numbers?
Thanks for your suggestions! Forgive me for being a beginner~
It seems that C_loss and M_loss are the same variable.
what exactly do C_loss and M_loss mean? and why do we distinguish between them?
Thank you.
Hi, thanks for your great work! What is the purpose of the definition of key objects? It seems that the key objects appear in both the questions and the answers. Do you expect the definition of key objects to appear in a pre-prompt, or is post-processing needed to replace it with a standardized format?
Thanks for the excellent work and the autonomous driving challenge. I hit a bug when I run convert_data.py under "./challenge":
ValueError: 'Back up.' is not in list
I checked the function and found that in line 7 of "challenge/convert_data.py" there is no "Back up." option. What should I do about it?
def rule_based1(question, answer):
    rule = ["Going ahead.", "Turn right.", "Turn left.", "Stopped."]
    question += f" Please select the correct answer from the following options: A. {rule[0]} B. {rule[1]} C. {rule[2]} D. {rule[3]}"
    idx = rule.index(answer)
    mapping = {0: "A", 1: "B", 2: "C", 3: "D"}
    return {"Q": question, "A": mapping[idx]}
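The `ValueError` comes from `rule.index(answer)` when the answer is not one of the four listed options. One possible workaround, sketched below, is a variant that returns `None` for unlisted answers so the caller can skip or log the sample. `rule_based_choice` is a hypothetical name, and whether such samples should be skipped or given a fifth option is a decision for the organizers.

```python
def rule_based_choice(question, answer,
                      options=("Going ahead.", "Turn right.", "Turn left.", "Stopped.")):
    """Build a multiple-choice QA pair, or return None if answer is not an option."""
    letters = "ABCD"
    if answer not in options:
        return None  # e.g. 'Back up.'; the caller decides how to handle it
    listed = " ".join(f"{letter}. {text}" for letter, text in zip(letters, options))
    prompt = f"{question} Please select the correct answer from the following options: {listed}"
    return {"Q": prompt, "A": letters[options.index(answer)]}
```

For answers that are in the list, this produces the same prompt text and letter as the original function.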
Hi, I am new to this area. Can you provide an example .ipynb for a DriveLM dataloader? I don't know how to build the mapping from a frame name in the .json, like "4a0798f849ca477ab18009c3a20b7df2", to a filename like "n008-2018-05-21-11-06-59-0400__CAM_FRONT__1526915244512465".
Thank you
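Assuming the frame name in the .json is a nuScenes sample token, one way to obtain the image name is to go through the nuScenes tables: the `sample` record lists one `sample_data` token per sensor channel, and the `sample_data` record carries the relative `filename`. The sketch below takes any object with a `get(table_name, token)` method, such as a `NuScenes` instance from nuscenes-devkit; the helper name `camera_image_stem` is hypothetical.

```python
import os


def camera_image_stem(nusc, sample_token, channel="CAM_FRONT"):
    """Look up the image file stem for one camera of a nuScenes sample."""
    sample = nusc.get("sample", sample_token)
    sample_data = nusc.get("sample_data", sample["data"][channel])
    # sample_data["filename"] is a path relative to the dataset root,
    # for example samples/CAM_FRONT/<name>.jpg
    return os.path.splitext(os.path.basename(sample_data["filename"]))[0]
```

With the devkit installed, `nusc` would typically be created with `NuScenes(version=..., dataroot=...)`, where the version and data root depend on the local download.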
Thank you for the great work! I would like to ask a few questions about the baseline model (finetuned LLaMA Adapter V2).
Thank you!
Hi! For the challenge, when we run the evaluation using the example output.json and test_eval.json, the ChatGPT evaluation part encounters an error:
Exception in thread Thread-3:
Traceback (most recent call last):
File "/root/anaconda3/envs/llama_adapter_v2/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/root/anaconda3/envs/llama_adapter_v2/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/root/anaconda3/envs/llama_adapter_v2/lib/python3.8/multiprocessing/pool.py", line 576, in _handle_results
task = get()
File "/root/anaconda3/envs/llama_adapter_v2/lib/python3.8/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() takes 1 positional argument but 2 were given
Could you help us solve this problem?
Thanks for your great work! May I ask when you plan to release the validation set? Or can you provide the tokens of the validation samples?
My model outputs a malformed instance such as <c3,CAM_FRONT,1100.0,500.0,50.0> (3 coordinates).
import re, numpy as np
answer="<AI answer>"
answer_nums = re.findall(r'\d+\.\d+', answer)
print(answer_nums)
answer_nums = np.array([list(map(float, x.split()))[0] for x in answer_nums]).reshape(-1, 2)
print(answer_nums)
The example script above mirrors evaluation.py (DriveLM/challenge/evaluation.py, lines 62 to 66 in f3a760b), which seems to extract coordinates.
If the answer has the correct format (2 coordinates per object), the coordinates are parsed correctly.
answer = "There is a white truck to the front of the ego vehicle, a white sedan to the back of the ego vehicle, and a white sedan to the front of the ego vehicle. The IDs of these objects are <c1,CAM_FRONT,1000.0,500.0>, <c2,CAM_BACK,850.0,500.0>, and <c3,CAM_FRONT,1000.0,500.0>."
>> printed
['1000.0', '500.0', '850.0', '500.0', '1000.0', '500.0']
[[1000. 500.]
[ 850. 500.]
[1000. 500.]]
but if some object has 3 coords, the code fails.
answer = "There is a black sedan to the back of the ego vehicle, a black sedan to the front of the ego vehicle, a black sedan to the front of the ego vehicle, and a black sedan to the front of the ego vehicle. The IDs of these objects are <c1,CAM_BACK,1000.0,500.0>, <c2,CAM_FRONT,1000.0,500.0>, and <c3,CAM_FRONT,1100.0,500.0,50.0>."
>>
Traceback (most recent call last):
File "evaluation.py", line 134, in <module>
evaluation.set_graph(predict, GT)
File "evaluation.py", line 82, in set_graph
self.graph = self.match_result(answer, GT)
File "evaluation.py", line 68, in match_result
answer_nums = np.array([list(map(float, x.split()))[0] for x in answer_nums]).reshape(-1, 2)
ValueError: cannot reshape array of size 7 into shape (2)
I would fix the evaluation to extract only two numbers per <object> rather than applying a regexp to the whole answer.
My question is: am I allowed to modify evaluation.py to correct the model output format?
If so, how far am I generally allowed to go in correcting the AI output?
What would count as a violation of the challenge rules?
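For illustration only, a per-tag parser along those lines could look like the sketch below. It assumes tags of the form `<cN,CAM_*,x,y,...>` as in the examples above and keeps the first two numbers of each tag. `extract_centers` is a hypothetical helper, and whether participants may change the parsing in evaluation.py is for the organizers to decide.

```python
import re

# One match per tag: object id, camera name, and the raw coordinate text.
TAG_PATTERN = re.compile(r"<(c\d+),(CAM_[A-Z_]+),([^<>]*)>")


def extract_centers(answer):
    """Return one [x, y] pair per tag, ignoring any extra numbers in the tag."""
    centers = []
    for _obj_id, _camera, coords in TAG_PATTERN.findall(answer):
        values = []
        for part in coords.split(","):
            try:
                values.append(float(part))
            except ValueError:
                break  # stop at the first non-numeric field
        if len(values) >= 2:
            centers.append(values[:2])
    return centers
```

Tags with fewer than two numbers are dropped rather than padded, so the result always has two columns and can be converted with `np.array(...)` without a reshape.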
Is the PretrainDataset class in dataset.py correct? It cannot be used to load data from test_llama.json. Reading the code, I found that the data format PretrainDataset expects is different from the one FinetuneDataset expects.
Related to #69.
Line 316 in llama_adapter.py sets the default value to 32, which leads to an OutOfMemoryError in the inference baseline.
With the output.json and test_eval.json in this repo, we got the wrong reply from ChatGPT
[{'role': 'system', 'content': 'an evaluator who rates my answer based on the correct answer'}, {'role': 'user', 'content': 'Rate my answer based on the correct answer out of 100, with higher scores indicating that the answer is closer to the correct answer, and you should be accurate to single digits like 62, 78, 41,etc. Output the number only
This is the correct answer: Firstly, notice <c3,CAM_FRONT_LEFT,1075.5,382.8>. The object is a traffic sign, so the ego vehicle should continue at the same speed. Secondly, notice <c2,CAM_FRONT,836.3,398.3>. The object is a traffic sign, so the ego vehicle should accelerate and continue ahead. Thirdly, notice <c1,CAM_BACK,991.7,603.0>. The object is stationary, so the ego vehicle should continue ahead at the same speed.
This is my answer: 1. The ego vehicle should notice the traffic light first, as it is the first object in the image. The traffic light is in the process of changing from red to green, indicating that the vehicle should proceed with caution.2. The ego vehicle should notice the truck next, as it is the second object in the image. The truck is stopped at the intersection, and the ego vehicle should be cautious when approaching the intersection to ensure it does not collide with the truck.3. The ego vehicle should notice the bus next, as it is the third object in the image. The bus is stopped at the intersection, and the ego vehicle should be cautious when approaching the intersection to ensure it does not collide with the bus.4. The ego vehicle should notice the traffic light next, as it is the fourth object in the image. The traffic light is in the process of changing from green to red, indicating that the vehicle should stop at the intersection.5. The ego vehicle should notice the truck next, as it is the fifth object in the image. The truck is stopped at the intersection, and the ego vehicle should be cautious when approaching the intersection to ensure it'}]
Sorry, but I cannot evaluate your answer based on the correct answer as the two responses are completely different.
Please help us figure out this problem.
Should we add an instruction such as "if the answers are completely different, reply with '0'" to the prompt content?
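Independently of how the prompt is worded, the evaluator's reply can be parsed defensively so that a non-numeric reply does not break the scoring loop. The sketch below is one way to do that; `parse_score` and its `default` are hypothetical, and which fallback is appropriate (0, skipping the sample, or retrying the request) is for the organizers to decide. It takes the first number it finds, so a reply that mentions another number before the score would be misread.

```python
import re


def parse_score(reply, default=0.0):
    """Extract a 0-100 score from an evaluator reply, else return default."""
    match = re.search(r"\b(\d{1,3})(?:\.\d+)?\b", reply)
    if match is None:
        return default
    score = float(match.group(0))
    # Reject values outside the range the prompt asked for.
    return score if 0.0 <= score <= 100.0 else default
```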
Is it OK to use the nuScenes mini dataset, or does DriveLM have to work with the full nuScenes dataset?
The full nuScenes data is about 500G, and the download from www.nuscenes.org/download often crashes.
I just want to explore some mini-set data in DriveLM.
Or can you tell me how to get the nuScenes dataset quickly?
I got a torch.cuda.OutOfMemoryError in demo.py, even though the --batch_size and --num_processes parameters are the same as for fine-tuning (--batch_size 1 --num_processes 2 with 4090*2).
Is that because the total VRAM usage (weights 13GB + checkpoint-3.pth 14GB + other) exceeds the 24GB of a 4090?
How can I run inference using the VRAM of two GPUs?
Or should we quantize the model to reduce VRAM usage so that it runs on a single 4090?
Can you give us advice on dealing with this problem?
Thanks a lot!
Thanks for the great work on LLMs for autonomous driving. The test server has opened, but it seems no new test split has been provided. So which split is used for the test server? Is it test_llama.json?
When I submit a Hugging Face model (submission file) to driving-with-language-2024, an error with the message "Invalid token" is returned.
Hi, appreciate the great work!
I noticed that the demo.py used for inference only supports a single GPU, which makes it really slow to get an output.json file. How can I run inference on multiple GPUs? Do you have scripts for multi-GPU inference?
Running evaluation.py fails with this error:
File "evaluation.py", line 93, in match_result
answer_nums = np.array([list(map(float, x.split()))[0] for x in answer_nums]).reshape(-1, 2)
ValueError: cannot reshape array of size 13 into shape (2)
A correctly formatted answer should yield [880.0, 500.0, 1000.0, 500.0, 1000.0, 500.0], which reshapes to [[ 880. 500.], [1000. 500.], [1000. 500.]].
But my answer yielded [1055.5, 510.0, 1055.5, 510.0, 1055.5, 510.0, 1055.5, 510.0, 1055.5, 510.0, 1055.5, 510.0, 1055.5], with 13 numbers.
I have no idea why this happens.
Please help.
As the title says.
Hi DriveLM organizers, thank you for this hard and meaningful work! I'm wondering what the recommended experimental settings are for reimplementing this baseline (e.g., would V100 (32G) * 4 be enough?).
https://github.com/OpenDriveLab/DriveLM/tree/main/challenge#setup-1
Please help me, Thank you.
Hi, this is great work for the development of autonomous driving with LLMs.
When I generated test.json by running extract_data.py, I found that it does not follow the GVQA structure mentioned in the paper, and the training data doesn't contain the Context (C) QA. So I would like to know whether this is the final version of the training data or whether it will be updated.
Thanks!
Hi,
Great work. I'd like to know which part of nuScenes you used, and which images are in your demo data. Thanks.
Use following script to generate perfect output
import json

with open('./output.json', 'r') as f:
    output = json.load(f)

perfect_output = []
for sample in output:
    new_sample = {
        'id': sample['id'],
        'question': sample['question'],
        'gt_answer': sample['gt_answer'],
        'answer': sample['gt_answer'],
    }
    perfect_output.append(new_sample)

with open("./perfect_output.json", 'w') as f:
    json.dump(perfect_output, f, indent=4)
Then run,
python evaluation.py --root_path1 ./perfect_output.json --root_path2 ./test_eval.json
And it gives me
accuracy: 1.0
chatgpt: 100.0
match: 100.0
language score: {'val/Bleu_1': 0.9999999999920699, 'val/Bleu_2': 0.0009999999999940523, 'val/Bleu_3': 9.999999999947137e-05, 'val/Bleu_4': 3.162277660152707e-05, 'val/ROUGE_L': 1.0, 'val/CIDEr': 1.9156274954912038}
final score: 0.8961230436827525
Please note that I am not familiar with language score metrics, but there seems to be something wrong with Bleu_2, Bleu_3, and Bleu_4: they seem far too low for a perfect output.
In the code, LLaMA has to be run once for every generated word. Is this normal? Also, is the final assessment done on a validation set or a test set? Would it be possible to run the final evaluation on a selected subset of the test set? I worry that evaluating too many samples will bring a heavy computational burden.
Hi there, we're wondering whether the agent could benefit from more than just keyframe images to generate the right answers. Is it possible to use extra information from nuScenes in this task, such as continuous frame images or radar points, which can be used to obtain more precise velocity information?