
fundamentalvision / bevformer


[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Home Page: https://arxiv.org/abs/2203.17270

License: Apache License 2.0

Python 99.93% Shell 0.07%
deep-learning autonomous-driving computer-vision object-detection

bevformer's Introduction

BEVFormer: a Cutting-edge Baseline for Camera-based Detection

BEVFormer.mp4

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers, ECCV 2022

News

  • [2022/6/16]: We added two BEVFormer configurations, which require less GPU memory than the base version. Please pull this repo to obtain the latest code.
  • [2022/6/13]: We release an initial version of BEVFormer. It achieves a baseline result of 51.7% NDS on nuScenes.
  • [2022/5/23]: 🚀🚀Built on top of BEVFormer, BEVFormer++, which gathers the best practices from recent SOTA methods together with our own modifications, ranks 1st on the Waymo Open Dataset 3D Camera-Only Detection Challenge. We will present BEVFormer++ at the CVPR 2022 Autonomous Driving Workshop.
  • [2022/3/10]: 🚀BEVFormer achieves SOTA on the nuScenes detection task with 56.9% NDS (camera-only)!

Abstract

In this work, the authors present a new framework termed BEVFormer, which learns unified BEV representations with spatiotemporal transformers to support multiple autonomous driving perception tasks. In a nutshell, BEVFormer exploits both spatial and temporal information by interacting with spatial and temporal space through predefined grid-shaped BEV queries. To aggregate spatial information, the authors design a spatial cross-attention in which each BEV query extracts spatial features from its regions of interest across camera views. For temporal information, the authors propose a temporal self-attention to recurrently fuse the history BEV information. The proposed approach achieves a new state of the art of 56.9% in terms of the NDS metric on the nuScenes test set, which is 9.0 points higher than the previous best art and on par with the performance of LiDAR-based baselines.
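A minimal illustration of the "grid-shaped BEV queries" mentioned above (the 200 x 200 grid size follows the paper; the 256-dim embedding is an assumption about the default config, not confirmed here):

import torch.nn as nn

# One learnable query per cell of an H x W bird's-eye-view grid; the encoder then
# refines these queries with temporal self-attention and spatial cross-attention.
H, W, C = 200, 200, 256
bev_queries = nn.Embedding(H * W, C).weight   # shape (40000, 256)
print(bev_queries.shape)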

Methods

method

Getting Started

Model Zoo

Backbone Method Lr Schd NDS mAP Memory Config Download
R50 BEVFormer-tiny_fp16 24ep 35.9 25.7 - config model/log
R50 BEVFormer-tiny 24ep 35.4 25.2 6500M config model/log
R101-DCN BEVFormer-small 24ep 47.9 37.0 10500M config model/log
R101-DCN BEVFormer-base 24ep 51.7 41.6 28500M config model/log
R50 BEVformerV2-t1-base 24ep 42.6 35.1 23952M config model/log
R50 BEVformerV2-t1-base 48ep 43.9 35.9 23952M config model/log
R50 BEVformerV2-t1 24ep 45.3 38.1 37579M config model/log
R50 BEVformerV2-t1 48ep 46.5 39.5 37579M config model/log
R50 BEVformerV2-t2 24ep 51.8 42.0 38954M config model/log
R50 BEVformerV2-t2 48ep 52.6 43.1 38954M config model/log
R50 BEVformerV2-t8 24ep 55.3 46.0 40392M config model/log

The Baidu Drive link for the BEVFormerV2 models and logs is here: https://pan.baidu.com/s/1ynzlAt1DQbH8NkqmisatTw?pwd=fdcv

Catalog

  • BEVFormerV2 HyperQuery
  • BEVFormerV2 Optimization, including memory, speed, inference.
  • BEVFormerV2 Release
  • BEV Segmentation checkpoints
  • BEV Segmentation code
  • 3D Detection checkpoints
  • 3D Detection code
  • Initialization

Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entries.

@article{li2022bevformer,
  title={BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers},
  author={Li, Zhiqi and Wang, Wenhai and Li, Hongyang and Xie, Enze and Sima, Chonghao and Lu, Tong and Qiao, Yu and Dai, Jifeng},
  journal={arXiv preprint arXiv:2203.17270},
  year={2022}
}
@article{Yang2022BEVFormerVA,
  title={BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision},
  author={Chenyu Yang and Yuntao Chen and Haofei Tian and Chenxin Tao and Xizhou Zhu and Zhaoxiang Zhang and Gao Huang and Hongyang Li and Y. Qiao and Lewei Lu and Jie Zhou and Jifeng Dai},
  journal={ArXiv},
  year={2022},
}

Acknowledgement

Many thanks to these excellent open source projects:


bevformer's People

Contributors

chonghaosima · hli2020 · htian01 · yangyanggirl · zhiqi-li


bevformer's Issues

1 GPU training

Hello, I used only 1 GPU to train and faced this error:
TypeError: can't pickle dict_keys objects
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2943) of binary: /usr/bin/python3

dist_train error : TypeError: cannot pickle 'dict_keys' object

When I use the command below to start single-GPU training, it runs successfully.

PYTHONPATH=. python ./tools/train.py ./projects/configs/bevformer/bevformer_base.py --deterministic

But when I start 8-GPU training, it reports an error.

I don't know why.

./tools/dist_train.sh ./projects/configs/bevformer/bevformer_base.py 8

Traceback (most recent call last):
File "./tools/train.py", line 259, in
main()
File "./tools/train.py", line 248, in main
custom_train_model(
File "/mnt/cfs/jikai/projects/BEVFormer/projects/mmdet3d_plugin/bevformer/apis/train.py", line 26, in custom_train_model
custom_train_detector(
File "/mnt/cfs/jikai/projects/BEVFormer/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 179, in custom_train_detector
runner.run(data_loaders, cfg.workflow)
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
for i, data_batch in enumerate(self.data_loader):
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 359, in iter
return self._get_iterator()
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 305, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/root/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 918, in init
w.start()
File "/root/anaconda3/envs/py38/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/root/anaconda3/envs/py38/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/root/anaconda3/envs/py38/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/root/anaconda3/envs/py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in init
super().init(process_obj)
File "/root/anaconda3/envs/py38/lib/python3.8/multiprocessing/popen_fork.py", line 19, in init
self._launch(process_obj)
File "/root/anaconda3/envs/py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/root/anaconda3/envs/py38/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'dict_keys' object
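The spawned DataLoader worker processes have to pickle the dataset, and Python's dict_keys view objects cannot be pickled, which is what the last line reports; the single-GPU run likely succeeds because its workers are forked rather than spawned, so nothing has to be pickled. The usual workaround (hedged; the exact attribute holding the dict_keys in this repo has to be found by searching the dataset/config code) is to materialize such views as lists before training starts:

import pickle

keys = {"a": 1, "b": 2}.keys()
# pickle.dumps(keys)              # raises TypeError: cannot pickle 'dict_keys' object
data = pickle.dumps(list(keys))   # converting the view to a list makes it picklable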

Cannot reproduce NDS

I ran the test following projects/BEVFormer/docs/getting_started.md and got a result like this:

Saving metrics to: test/bevformer_base/Wed_Jun_15_23_27_40_2022/pts_bbox
mAP: 0.4166
mATE: 0.6725
mASE: 0.7089
mAOE: 1.5625
mAVE: 0.3936
mAAE: 0.1975
NDS: 0.4110
Eval time: 218.7s

The mAP looks good, but the NDS is extremely low.
Any idea why?
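For reference, NDS can be recomputed from the printed numbers with the nuScenes formula (half weight on mAP, half on the complements of the five true-positive errors clipped at 1). Plugging the values above in reproduces 0.411, so the low NDS comes from the unusually large mASE and mAOE rather than from the detection mAP itself:

# Recompute NDS from the metrics printed above (nuScenes detection score formula).
mAP = 0.4166
tp_errors = [0.6725, 0.7089, 1.5625, 0.3936, 0.1975]   # mATE, mASE, mAOE, mAVE, mAAE
nds = (5 * mAP + sum(1 - min(1.0, err) for err in tp_errors)) / 10
print(round(nds, 3))   # 0.411, matching the reported NDS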

Questions about sampling_offsets in spatial cross attention

Hi, Zhiqi.

To save GPU memory, you have divided the BEV queries (e.g. 500*500) into num_camera (e.g. 6) groups, which have the shape, for example, (6, 610, 256), and then fed them into the function sampling_offsets. https://github.com/zhiqi-li/BEVFormer/blob/5d42632256c64742f74d8b1c68a3407dd2f81305/projects/mmdet3d_plugin/bevformer/modules/spatial_cross_attention.py#L249 That means each BEV query gets num_heads * num_levels * num_points offset xy pairs (e.g. 8 heads, 3 levels, 8 offset points). For points that are projected onto only one camera, that is fine. However, in the overlap area, some BEV queries (or their reference points) hit 2 views, and for these queries the offsets produced by the linear layer (i.e. sampling_offsets) are the same for both views. I think the attention weights are also the same for them, because we do not predict offsets or weights per camera for each BEV query. Does that make sense? Or is there a special meaning to doing it this way?

In fact, in my earlier reproduction, I predicted the offsets on each camera for each BEV query. To save GPU memory, I also only selected the mapped BEV queries for attention, but the offsets and weights were predicted for all views, and I just used the values (offsets and weights) that are mapped onto the corresponding views for the attention calculation. My final performance is not good. I am not sure if there are other bugs, but I am curious whether the above is the influencing factor.

Thanks.
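For context on the question above, this is roughly how deformable-attention sampling offsets are typically produced (an illustrative sketch, not the exact code of spatial_cross_attention.py; the shapes follow the example numbers in the question):

import torch
import torch.nn as nn

embed_dims, num_heads, num_levels, num_points = 256, 8, 3, 8
sampling_offsets = nn.Linear(embed_dims, num_heads * num_levels * num_points * 2)

queries = torch.randn(6, 610, embed_dims)   # per-camera groups of BEV queries
offsets = sampling_offsets(queries).view(6, 610, num_heads, num_levels, num_points, 2)
# The offsets depend only on the query content, not on which camera group the query was
# routed to, so a BEV query that falls into two views receives identical offsets (and
# attention weights) in both, which is the behaviour the question describes.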

Possible to run it on AGX Xavier?

Hello,
Since my computer is too cheap to run this great model, I would like to try it on my toy Jetson AGX.

So I'm wondering if the Jetson AGX's GPU is OK for it?

Thank you very much.

Training Collapse for BEVFormer-tiny and -small

Thanks for your wonderful work!
For BEVFormer-base, I can correctly reproduce the results, but when I ran your newly-updated BEVFormer-tiny and -small, the NDS and mAP on the val set are quite low, nearly 0.0. The losses of -tiny also converge much more slowly than in your provided log, as shown below (epoch 1). I carefully checked my config file against yours in the log and they are the same, so I wonder what causes the training collapse. I would highly appreciate it if you could help me with this. Thanks in advance!

2022-06-19 13:38:05,211 - mmdet - INFO - workflow: [('train', 1)], max: 24 epochs
2022-06-19 13:38:05,212 - mmdet - INFO - Checkpoints will be saved to /data2/renrui/BEVFormer/work_dirs/bevformer_tiny by HardDiskBackend.
2022-06-19 13:40:00,011 - mmdet - INFO - Epoch [1][50/3517] lr: 7.973e-05, eta: 2 days, 5:46:49, time: 2.295, data_time: 1.499, memory: 8310, loss_cls: 1.4230, loss_bbox: 1.7745, d0.loss_cls: 1.5225, d0.loss_bbox: 1.7623, d1.loss_cls: 1.4561, d1.loss_bbox: 1.7451, d2.loss_cls: 1.4174, d2.loss_bbox: 1.7405, d3.loss_cls: 1.4219, d3.loss_bbox: 1.7527, d4.loss_cls: 1.4089, d4.loss_bbox: 1.7604, loss: 19.1854, grad_norm: 15.3972
2022-06-19 13:41:43,502 - mmdet - INFO - Epoch [1][100/3517] lr: 9.307e-05, eta: 2 days, 3:06:37, time: 2.070, data_time: 0.293, memory: 8310, loss_cls: 1.2347, loss_bbox: 1.6543, d0.loss_cls: 1.2324, d0.loss_bbox: 1.6314, d1.loss_cls: 1.2355, d1.loss_bbox: 1.6281, d2.loss_cls: 1.2333, d2.loss_bbox: 1.6461, d3.loss_cls: 1.2335, d3.loss_bbox: 1.6427, d4.loss_cls: 1.2315, d4.loss_bbox: 1.6508, loss: 17.2543, grad_norm: 14.0373
2022-06-19 13:43:30,490 - mmdet - INFO - Epoch [1][150/3517] lr: 1.064e-04, eta: 2 days, 2:44:49, time: 2.140, data_time: 0.057, memory: 8310, loss_cls: 1.2398, loss_bbox: 1.6987, d0.loss_cls: 1.2304, d0.loss_bbox: 1.6259, d1.loss_cls: 1.2353, d1.loss_bbox: 1.6227, d2.loss_cls: 1.2368, d2.loss_bbox: 1.6444, d3.loss_cls: 1.2382, d3.loss_bbox: 1.6612, d4.loss_cls: 1.2385, d4.loss_bbox: 1.6872, loss: 17.3591, grad_norm: 13.9613
2022-06-19 13:45:15,930 - mmdet - INFO - Epoch [1][200/3517] lr: 1.197e-04, eta: 2 days, 2:22:09, time: 2.109, data_time: 0.053, memory: 8310, loss_cls: 1.2173, loss_bbox: 1.6676, d0.loss_cls: 1.2033, d0.loss_bbox: 1.5747, d1.loss_cls: 1.2141, d1.loss_bbox: 1.5884, d2.loss_cls: 1.2155, d2.loss_bbox: 1.6084, d3.loss_cls: 1.2166, d3.loss_bbox: 1.6097, d4.loss_cls: 1.2149, d4.loss_bbox: 1.6229, loss: 16.9535, grad_norm: 13.8414
2022-06-19 13:46:56,368 - mmdet - INFO - Epoch [1][250/3517] lr: 1.331e-04, eta: 2 days, 1:39:47, time: 2.009, data_time: 0.053, memory: 8310, loss_cls: 1.2310, loss_bbox: 1.6154, d0.loss_cls: 1.2150, d0.loss_bbox: 1.5539, d1.loss_cls: 1.2295, d1.loss_bbox: 1.5583, d2.loss_cls: 1.2294, d2.loss_bbox: 1.5683, d3.loss_cls: 1.2316, d3.loss_bbox: 1.5917, d4.loss_cls: 1.2320, d4.loss_bbox: 1.5929, loss: 16.8490, grad_norm: 14.8824
2022-06-19 13:48:38,073 - mmdet - INFO - Epoch [1][300/3517] lr: 1.464e-04, eta: 2 days, 1:16:55, time: 2.034, data_time: 0.052, memory: 8310, loss_cls: 1.1999, loss_bbox: 1.6191, d0.loss_cls: 1.1781, d0.loss_bbox: 1.5335, d1.loss_cls: 1.1964, d1.loss_bbox: 1.5360, d2.loss_cls: 1.1969, d2.loss_bbox: 1.5506, d3.loss_cls: 1.2002, d3.loss_bbox: 1.5823, d4.loss_cls: 1.2045, d4.loss_bbox: 1.6101, loss: 16.6075, grad_norm: 14.2893
2022-06-19 13:50:17,373 - mmdet - INFO - Epoch [1][350/3517] lr: 1.597e-04, eta: 2 days, 0:50:27, time: 1.986, data_time: 0.519, memory: 8310, loss_cls: 1.1868, loss_bbox: 1.5881, d0.loss_cls: 1.1619, d0.loss_bbox: 1.5300, d1.loss_cls: 1.1862, d1.loss_bbox: 1.5380, d2.loss_cls: 1.1806, d2.loss_bbox: 1.5478, d3.loss_cls: 1.1843, d3.loss_bbox: 1.5531, d4.loss_cls: 1.1842, d4.loss_bbox: 1.5619, loss: 16.4029, grad_norm: 13.2489
2022-06-19 13:51:58,939 - mmdet - INFO - Epoch [1][400/3517] lr: 1.731e-04, eta: 2 days, 0:38:07, time: 2.031, data_time: 1.014, memory: 8310, loss_cls: 1.1719, loss_bbox: 1.5494, d0.loss_cls: 1.1455, d0.loss_bbox: 1.5124, d1.loss_cls: 1.1704, d1.loss_bbox: 1.5152, d2.loss_cls: 1.1729, d2.loss_bbox: 1.5152, d3.loss_cls: 1.1747, d3.loss_bbox: 1.5252, d4.loss_cls: 1.1700, d4.loss_bbox: 1.5356, loss: 16.1583, grad_norm: 12.9210
2022-06-19 13:53:38,008 - mmdet - INFO - Epoch [1][450/3517] lr: 1.864e-04, eta: 2 days, 0:20:25, time: 1.981, data_time: 0.985, memory: 8310, loss_cls: 1.1851, loss_bbox: 1.5371, d0.loss_cls: 1.1511, d0.loss_bbox: 1.5007, d1.loss_cls: 1.1808, d1.loss_bbox: 1.4966, d2.loss_cls: 1.1841, d2.loss_bbox: 1.5046, d3.loss_cls: 1.1818, d3.loss_bbox: 1.5162, d4.loss_cls: 1.1821, d4.loss_bbox: 1.5188, loss: 16.1391, grad_norm: 13.4826
2022-06-19 13:55:28,566 - mmdet - INFO - Epoch [1][500/3517] lr: 1.997e-04, eta: 2 days, 0:38:02, time: 2.211, data_time: 1.408, memory: 8310, loss_cls: 1.1820, loss_bbox: 1.5330, d0.loss_cls: 1.1463, d0.loss_bbox: 1.4974, d1.loss_cls: 1.1791, d1.loss_bbox: 1.5001, d2.loss_cls: 1.1808, d2.loss_bbox: 1.5112, d3.loss_cls: 1.1769, d3.loss_bbox: 1.5175, d4.loss_cls: 1.1783, d4.loss_bbox: 1.5421, loss: 16.1447, grad_norm: 13.4236
2022-06-19 13:57:22,188 - mmdet - INFO - Epoch [1][550/3517] lr: 2.000e-04, eta: 2 days, 0:59:52, time: 2.272, data_time: 1.513, memory: 8310, loss_cls: 1.1716, loss_bbox: 1.5005, d0.loss_cls: 1.1249, d0.loss_bbox: 1.4670, d1.loss_cls: 1.1627, d1.loss_bbox: 1.4679, d2.loss_cls: 1.1645, d2.loss_bbox: 1.4859, d3.loss_cls: 1.1652, d3.loss_bbox: 1.4860, d4.loss_cls: 1.1694, d4.loss_bbox: 1.5007, loss: 15.8663, grad_norm: 13.3508
2022-06-19 13:59:41,404 - mmdet - INFO - Epoch [1][600/3517] lr: 2.000e-04, eta: 2 days, 2:17:24, time: 2.785, data_time: 2.050, memory: 8310, loss_cls: 1.1812, loss_bbox: 1.5001, d0.loss_cls: 1.1188, d0.loss_bbox: 1.4543, d1.loss_cls: 1.1635, d1.loss_bbox: 1.4438, d2.loss_cls: 1.1676, d2.loss_bbox: 1.4504, d3.loss_cls: 1.1719, d3.loss_bbox: 1.4655, d4.loss_cls: 1.1724, d4.loss_bbox: 1.4807, loss: 15.7703, grad_norm: 12.9131
2022-06-19 14:01:46,532 - mmdet - INFO - Epoch [1][650/3517] lr: 2.000e-04, eta: 2 days, 2:52:19, time: 2.502, data_time: 1.847, memory: 8310, loss_cls: 1.1848, loss_bbox: 1.4480, d0.loss_cls: 1.1171, d0.loss_bbox: 1.4418, d1.loss_cls: 1.1536, d1.loss_bbox: 1.4413, d2.loss_cls: 1.1741, d2.loss_bbox: 1.4411, d3.loss_cls: 1.1831, d3.loss_bbox: 1.4391, d4.loss_cls: 1.1824, d4.loss_bbox: 1.4406, loss: 15.6469, grad_norm: 13.0838
2022-06-19 14:03:27,945 - mmdet - INFO - Epoch [1][700/3517] lr: 2.000e-04, eta: 2 days, 2:34:46, time: 2.029, data_time: 1.001, memory: 8310, loss_cls: 1.1616, loss_bbox: 1.4824, d0.loss_cls: 1.0830, d0.loss_bbox: 1.4695, d1.loss_cls: 1.1005, d1.loss_bbox: 1.4734, d2.loss_cls: 1.1306, d2.loss_bbox: 1.4631, d3.loss_cls: 1.1523, d3.loss_bbox: 1.4703, d4.loss_cls: 1.1605, d4.loss_bbox: 1.4697, loss: 15.6169, grad_norm: 12.2658
2022-06-19 14:05:13,775 - mmdet - INFO - Epoch [1][750/3517] lr: 2.000e-04, eta: 2 days, 2:27:30, time: 2.117, data_time: 0.058, memory: 8310, loss_cls: 1.1641, loss_bbox: 1.4767, d0.loss_cls: 1.0847, d0.loss_bbox: 1.4694, d1.loss_cls: 1.0999, d1.loss_bbox: 1.4646, d2.loss_cls: 1.1211, d2.loss_bbox: 1.4630, d3.loss_cls: 1.1414, d3.loss_bbox: 1.4614, d4.loss_cls: 1.1595, d4.loss_bbox: 1.4788, loss: 15.5847, grad_norm: 12.2048
2022-06-19 14:06:48,141 - mmdet - INFO - Epoch [1][800/3517] lr: 2.000e-04, eta: 2 days, 2:00:57, time: 1.887, data_time: 0.056, memory: 8310, loss_cls: 1.1438, loss_bbox: 1.4454, d0.loss_cls: 1.0647, d0.loss_bbox: 1.4276, d1.loss_cls: 1.0770, d1.loss_bbox: 1.4408, d2.loss_cls: 1.0852, d2.loss_bbox: 1.4431, d3.loss_cls: 1.0907, d3.loss_bbox: 1.4434, d4.loss_cls: 1.1135, d4.loss_bbox: 1.4418, loss: 15.2169, grad_norm: 12.2401
2022-06-19 14:08:35,720 - mmdet - INFO - Epoch [1][850/3517] lr: 2.000e-04, eta: 2 days, 1:58:59, time: 2.152, data_time: 0.091, memory: 8310, loss_cls: 1.1310, loss_bbox: 1.4912, d0.loss_cls: 1.0874, d0.loss_bbox: 1.4561, d1.loss_cls: 1.1042, d1.loss_bbox: 1.4623, d2.loss_cls: 1.1068, d2.loss_bbox: 1.4693, d3.loss_cls: 1.1150, d3.loss_bbox: 1.4926, d4.loss_cls: 1.1156, d4.loss_bbox: 1.4942, loss: 15.5257, grad_norm: 12.6945
2022-06-19 14:10:13,225 - mmdet - INFO - Epoch [1][900/3517] lr: 2.000e-04, eta: 2 days, 1:41:28, time: 1.950, data_time: 0.840, memory: 8310, loss_cls: 1.0950, loss_bbox: 1.4788, d0.loss_cls: 1.0653, d0.loss_bbox: 1.4261, d1.loss_cls: 1.0787, d1.loss_bbox: 1.4383, d2.loss_cls: 1.0797, d2.loss_bbox: 1.4491, d3.loss_cls: 1.0851, d3.loss_bbox: 1.4598, d4.loss_cls: 1.0939, d4.loss_bbox: 1.4673, loss: 15.2171, grad_norm: 12.4145
2022-06-19 14:11:51,079 - mmdet - INFO - Epoch [1][950/3517] lr: 2.000e-04, eta: 2 days, 1:26:08, time: 1.957, data_time: 1.216, memory: 8310, loss_cls: 1.1024, loss_bbox: 1.4917, d0.loss_cls: 1.0769, d0.loss_bbox: 1.4127, d1.loss_cls: 1.0907, d1.loss_bbox: 1.4298, d2.loss_cls: 1.0902, d2.loss_bbox: 1.4403, d3.loss_cls: 1.0959, d3.loss_bbox: 1.4649, d4.loss_cls: 1.0963, d4.loss_bbox: 1.4749, loss: 15.2668, grad_norm: 13.8252
2022-06-19 14:13:33,357 - mmdet - INFO - Exp name: bevformer_tiny.py
2022-06-19 14:13:33,357 - mmdet - INFO - Epoch [1][1000/3517] lr: 2.000e-04, eta: 2 days, 1:18:19, time: 2.046, data_time: 1.422, memory: 8310, loss_cls: 1.1023, loss_bbox: 1.4921, d0.loss_cls: 1.0668, d0.loss_bbox: 1.3972, d1.loss_cls: 1.0799, d1.loss_bbox: 1.4024, d2.loss_cls: 1.0790, d2.loss_bbox: 1.4339, d3.loss_cls: 1.0845, d3.loss_bbox: 1.4307, d4.loss_cls: 1.0771, d4.loss_bbox: 1.4376, loss: 15.0836, grad_norm: 21.4151
2022-06-19 14:15:11,365 - mmdet - INFO - Epoch [1][1050/3517] lr: 2.000e-04, eta: 2 days, 1:05:26, time: 1.960, data_time: 1.335, memory: 8310, loss_cls: 1.0963, loss_bbox: 1.4981, d0.loss_cls: 1.0647, d0.loss_bbox: 1.4209, d1.loss_cls: 1.0724, d1.loss_bbox: 1.4372, d2.loss_cls: 1.0790, d2.loss_bbox: 1.4585, d3.loss_cls: 1.0842, d3.loss_bbox: 1.4556, d4.loss_cls: 1.0784, d4.loss_bbox: 1.4559, loss: 15.2012, grad_norm: 20.1478
2022-06-19 14:16:54,996 - mmdet - INFO - Epoch [1][1100/3517] lr: 2.000e-04, eta: 2 days, 1:00:40, time: 2.073, data_time: 1.447, memory: 8310, loss_cls: 1.0856, loss_bbox: 1.4316, d0.loss_cls: 1.0518, d0.loss_bbox: 1.3981, d1.loss_cls: 1.0685, d1.loss_bbox: 1.4015, d2.loss_cls: 1.0755, d2.loss_bbox: 1.3973, d3.loss_cls: 1.0783, d3.loss_bbox: 1.3915, d4.loss_cls: 1.0706, d4.loss_bbox: 1.4170, loss: 14.8674, grad_norm: 22.9469
2022-06-19 14:18:31,712 - mmdet - INFO - Epoch [1][1150/3517] lr: 2.000e-04, eta: 2 days, 0:47:49, time: 1.934, data_time: 1.305, memory: 8310, loss_cls: 1.0848, loss_bbox: 1.4048, d0.loss_cls: 1.0560, d0.loss_bbox: 1.4015, d1.loss_cls: 1.0714, d1.loss_bbox: 1.3927, d2.loss_cls: 1.0811, d2.loss_bbox: 1.4003, d3.loss_cls: 1.0777, d3.loss_bbox: 1.4026, d4.loss_cls: 1.0751, d4.loss_bbox: 1.4173, loss: 14.8652, grad_norm: 16.7234
2022-06-19 14:20:19,457 - mmdet - INFO - Epoch [1][1200/3517] lr: 2.000e-04, eta: 2 days, 0:48:40, time: 2.155, data_time: 1.534, memory: 8310, loss_cls: 1.0794, loss_bbox: 1.4052, d0.loss_cls: 1.0595, d0.loss_bbox: 1.3996, d1.loss_cls: 1.0774, d1.loss_bbox: 1.4051, d2.loss_cls: 1.0769, d2.loss_bbox: 1.4055, d3.loss_cls: 1.0742, d3.loss_bbox: 1.4029, d4.loss_cls: 1.0757, d4.loss_bbox: 1.4081, loss: 14.8693, grad_norm: 19.3724
2022-06-19 14:22:03,734 - mmdet - INFO - Epoch [1][1250/3517] lr: 2.000e-04, eta: 2 days, 0:45:27, time: 2.086, data_time: 1.476, memory: 8310, loss_cls: 1.0754, loss_bbox: 1.4114, d0.loss_cls: 1.0548, d0.loss_bbox: 1.3911, d1.loss_cls: 1.0670, d1.loss_bbox: 1.3868, d2.loss_cls: 1.0684, d2.loss_bbox: 1.3918, d3.loss_cls: 1.0729, d3.loss_bbox: 1.3905, d4.loss_cls: 1.0707, d4.loss_bbox: 1.4191, loss: 14.7999, grad_norm: 19.2040
2022-06-19 14:23:47,580 - mmdet - INFO - Epoch [1][1300/3517] lr: 2.000e-04, eta: 2 days, 0:41:53, time: 2.077, data_time: 1.448, memory: 8310, loss_cls: 1.0452, loss_bbox: 1.3716, d0.loss_cls: 1.0239, d0.loss_bbox: 1.3544, d1.loss_cls: 1.0383, d1.loss_bbox: 1.3520, d2.loss_cls: 1.0392, d2.loss_bbox: 1.3491, d3.loss_cls: 1.0415, d3.loss_bbox: 1.3562, d4.loss_cls: 1.0398, d4.loss_bbox: 1.3583, loss: 14.3696, grad_norm: 12.8295
2022-06-19 14:25:26,672 - mmdet - INFO - Epoch [1][1350/3517] lr: 2.000e-04, eta: 2 days, 0:33:35, time: 1.982, data_time: 1.358, memory: 8310, loss_cls: 1.0446, loss_bbox: 1.3737, d0.loss_cls: 1.0285, d0.loss_bbox: 1.3569, d1.loss_cls: 1.0363, d1.loss_bbox: 1.3572, d2.loss_cls: 1.0360, d2.loss_bbox: 1.3592, d3.loss_cls: 1.0397, d3.loss_bbox: 1.3565, d4.loss_cls: 1.0384, d4.loss_bbox: 1.3681, loss: 14.3950, grad_norm: 16.9966
2022-06-19 14:27:06,266 - mmdet - INFO - Epoch [1][1400/3517] lr: 2.000e-04, eta: 2 days, 0:26:15, time: 1.992, data_time: 1.358, memory: 8310, loss_cls: 1.0587, loss_bbox: 1.3815, d0.loss_cls: 1.0459, d0.loss_bbox: 1.3628, d1.loss_cls: 1.0530, d1.loss_bbox: 1.3772, d2.loss_cls: 1.0525, d2.loss_bbox: 1.3765, d3.loss_cls: 1.0539, d3.loss_bbox: 1.3808, d4.loss_cls: 1.0562, d4.loss_bbox: 1.3783, loss: 14.5772, grad_norm: 16.5173
2022-06-19 14:28:48,379 - mmdet - INFO - Epoch [1][1450/3517] lr: 2.000e-04, eta: 2 days, 0:21:42, time: 2.042, data_time: 0.907, memory: 8310, loss_cls: 1.0380, loss_bbox: 1.3651, d0.loss_cls: 1.0309, d0.loss_bbox: 1.3462, d1.loss_cls: 1.0323, d1.loss_bbox: 1.3492, d2.loss_cls: 1.0336, d2.loss_bbox: 1.3465, d3.loss_cls: 1.0343, d3.loss_bbox: 1.3531, d4.loss_cls: 1.0347, d4.loss_bbox: 1.3676, loss: 14.3316, grad_norm: 13.8195
2022-06-19 14:30:30,500 - mmdet - INFO - Epoch [1][1500/3517] lr: 2.000e-04, eta: 2 days, 0:17:22, time: 2.043, data_time: 0.409, memory: 8310, loss_cls: 1.0068, loss_bbox: 1.3606, d0.loss_cls: 0.9975, d0.loss_bbox: 1.3361, d1.loss_cls: 0.9998, d1.loss_bbox: 1.3410, d2.loss_cls: 1.0024, d2.loss_bbox: 1.3396, d3.loss_cls: 1.0025, d3.loss_bbox: 1.3388, d4.loss_cls: 1.0027, d4.loss_bbox: 1.3455, loss: 14.0735, grad_norm: 13.8750
2022-06-19 14:32:16,979 - mmdet - INFO - Epoch [1][1550/3517] lr: 2.000e-04, eta: 2 days, 0:17:05, time: 2.130, data_time: 0.058, memory: 8310, loss_cls: 1.0300, loss_bbox: 1.3558, d0.loss_cls: 1.0192, d0.loss_bbox: 1.3435, d1.loss_cls: 1.0206, d1.loss_bbox: 1.3413, d2.loss_cls: 1.0246, d2.loss_bbox: 1.3443, d3.loss_cls: 1.0254, d3.loss_bbox: 1.3476, d4.loss_cls: 1.0241, d4.loss_bbox: 1.3511, loss: 14.2276, grad_norm: 12.5753
2022-06-19 14:33:57,933 - mmdet - INFO - Epoch [1][1600/3517] lr: 2.000e-04, eta: 2 days, 0:11:56, time: 2.019, data_time: 0.863, memory: 8310, loss_cls: 1.0251, loss_bbox: 1.3649, d0.loss_cls: 1.0151, d0.loss_bbox: 1.3422, d1.loss_cls: 1.0160, d1.loss_bbox: 1.3483, d2.loss_cls: 1.0194, d2.loss_bbox: 1.3518, d3.loss_cls: 1.0217, d3.loss_bbox: 1.3564, d4.loss_cls: 1.0201, d4.loss_bbox: 1.3634, loss: 14.2443, grad_norm: 17.3118
2022-06-19 14:35:46,271 - mmdet - INFO - Epoch [1][1650/3517] lr: 2.000e-04, eta: 2 days, 0:13:10, time: 2.167, data_time: 1.269, memory: 8310, loss_cls: 1.0202, loss_bbox: 1.3679, d0.loss_cls: 1.0110, d0.loss_bbox: 1.3397, d1.loss_cls: 1.0151, d1.loss_bbox: 1.3649, d2.loss_cls: 1.0162, d2.loss_bbox: 1.3613, d3.loss_cls: 1.0183, d3.loss_bbox: 1.3645, d4.loss_cls: 1.0178, d4.loss_bbox: 1.3685, loss: 14.2654, grad_norm: 12.8501
2022-06-19 14:37:25,865 - mmdet - INFO - Epoch [1][1700/3517] lr: 2.000e-04, eta: 2 days, 0:07:08, time: 1.992, data_time: 0.865, memory: 8310, loss_cls: 1.0283, loss_bbox: 1.3505, d0.loss_cls: 1.0150, d0.loss_bbox: 1.3479, d1.loss_cls: 1.0157, d1.loss_bbox: 1.3489, d2.loss_cls: 1.0197, d2.loss_bbox: 1.3496, d3.loss_cls: 1.0215, d3.loss_bbox: 1.3468, d4.loss_cls: 1.0239, d4.loss_bbox: 1.3501, loss: 14.2179, grad_norm: 12.0612
2022-06-19 14:39:11,900 - mmdet - INFO - Epoch [1][1750/3517] lr: 2.000e-04, eta: 2 days, 0:06:25, time: 2.121, data_time: 1.390, memory: 8310, loss_cls: 1.0116, loss_bbox: 1.3436, d0.loss_cls: 1.0005, d0.loss_bbox: 1.3282, d1.loss_cls: 1.0022, d1.loss_bbox: 1.3255, d2.loss_cls: 1.0072, d2.loss_bbox: 1.3262, d3.loss_cls: 1.0084, d3.loss_bbox: 1.3348, d4.loss_cls: 1.0074, d4.loss_bbox: 1.3410, loss: 14.0368, grad_norm: 14.5245
2022-06-19 14:40:50,018 - mmdet - INFO - Epoch [1][1800/3517] lr: 2.000e-04, eta: 1 day, 23:59:36, time: 1.962, data_time: 1.196, memory: 8310, loss_cls: 1.0294, loss_bbox: 1.3550, d0.loss_cls: 1.0205, d0.loss_bbox: 1.3301, d1.loss_cls: 1.0221, d1.loss_bbox: 1.3381, d2.loss_cls: 1.0256, d2.loss_bbox: 1.3352, d3.loss_cls: 1.0268, d3.loss_bbox: 1.3375, d4.loss_cls: 1.0258, d4.loss_bbox: 1.3454, loss: 14.1915, grad_norm: 12.3590
2022-06-19 14:42:33,277 - mmdet - INFO - Epoch [1][1850/3517] lr: 2.000e-04, eta: 1 day, 23:56:52, time: 2.065, data_time: 1.247, memory: 8310, loss_cls: 1.0069, loss_bbox: 1.3243, d0.loss_cls: 0.9958, d0.loss_bbox: 1.3053, d1.loss_cls: 0.9974, d1.loss_bbox: 1.3069, d2.loss_cls: 0.9973, d2.loss_bbox: 1.3138, d3.loss_cls: 0.9977, d3.loss_bbox: 1.3169, d4.loss_cls: 1.0018, d4.loss_bbox: 1.3248, loss: 13.8888, grad_norm: 12.2397
2022-06-19 14:44:10,961 - mmdet - INFO - Epoch [1][1900/3517] lr: 2.000e-04, eta: 1 day, 23:50:10, time: 1.954, data_time: 1.180, memory: 8310, loss_cls: 1.0325, loss_bbox: 1.3307, d0.loss_cls: 1.0254, d0.loss_bbox: 1.3146, d1.loss_cls: 1.0262, d1.loss_bbox: 1.3122, d2.loss_cls: 1.0268, d2.loss_bbox: 1.3148, d3.loss_cls: 1.0274, d3.loss_bbox: 1.3189, d4.loss_cls: 1.0305, d4.loss_bbox: 1.3252, loss: 14.0852, grad_norm: 13.0078
2022-06-19 14:45:57,403 - mmdet - INFO - Epoch [1][1950/3517] lr: 2.000e-04, eta: 1 day, 23:49:54, time: 2.129, data_time: 0.118, memory: 8310, loss_cls: 1.0186, loss_bbox: 1.3295, d0.loss_cls: 1.0118, d0.loss_bbox: 1.3014, d1.loss_cls: 1.0129, d1.loss_bbox: 1.3034, d2.loss_cls: 1.0151, d2.loss_bbox: 1.3055, d3.loss_cls: 1.0148, d3.loss_bbox: 1.3127, d4.loss_cls: 1.0157, d4.loss_bbox: 1.3147, loss: 13.9562, grad_norm: 15.7093
2022-06-19 14:47:36,333 - mmdet - INFO - Exp name: bevformer_tiny.py
2022-06-19 14:47:36,334 - mmdet - INFO - Epoch [1][2000/3517] lr: 2.000e-04, eta: 1 day, 23:44:23, time: 1.979, data_time: 0.060, memory: 8310, loss_cls: 1.0370, loss_bbox: 1.3332, d0.loss_cls: 1.0253, d0.loss_bbox: 1.3227, d1.loss_cls: 1.0275, d1.loss_bbox: 1.3295, d2.loss_cls: 1.0297, d2.loss_bbox: 1.3294, d3.loss_cls: 1.0319, d3.loss_bbox: 1.3286, d4.loss_cls: 1.0328, d4.loss_bbox: 1.3349, loss: 14.1625, grad_norm: 12.9693
2022-06-19 14:49:15,844 - mmdet - INFO - Epoch [1][2050/3517] lr: 2.000e-04, eta: 1 day, 23:39:28, time: 1.990, data_time: 0.059, memory: 8310, loss_cls: 1.0175, loss_bbox: 1.3304, d0.loss_cls: 1.0096, d0.loss_bbox: 1.3026, d1.loss_cls: 1.0118, d1.loss_bbox: 1.3081, d2.loss_cls: 1.0127, d2.loss_bbox: 1.3094, d3.loss_cls: 1.0126, d3.loss_bbox: 1.3223, d4.loss_cls: 1.0133, d4.loss_bbox: 1.3224, loss: 13.9727, grad_norm: 13.4800
2022-06-19 14:50:53,968 - mmdet - INFO - Epoch [1][2100/3517] lr: 2.000e-04, eta: 1 day, 23:33:47, time: 1.962, data_time: 0.062, memory: 8310, loss_cls: 0.9996, loss_bbox: 1.3232, d0.loss_cls: 0.9956, d0.loss_bbox: 1.3052, d1.loss_cls: 0.9943, d1.loss_bbox: 1.3008, d2.loss_cls: 0.9974, d2.loss_bbox: 1.3040, d3.loss_cls: 0.9977, d3.loss_bbox: 1.3143, d4.loss_cls: 0.9988, d4.loss_bbox: 1.3190, loss: 13.8500, grad_norm: 22.6644
2022-06-19 14:52:37,506 - mmdet - INFO - Epoch [1][2150/3517] lr: 2.000e-04, eta: 1 day, 23:31:44, time: 2.071, data_time: 0.077, memory: 8310, loss_cls: 1.0206, loss_bbox: 1.3067, d0.loss_cls: 1.0109, d0.loss_bbox: 1.2877, d1.loss_cls: 1.0123, d1.loss_bbox: 1.2885, d2.loss_cls: 1.0165, d2.loss_bbox: 1.2888, d3.loss_cls: 1.0162, d3.loss_bbox: 1.2893, d4.loss_cls: 1.0188, d4.loss_bbox: 1.2972, loss: 13.8536, grad_norm: 11.9265
2022-06-19 14:54:17,472 - mmdet - INFO - Epoch [1][2200/3517] lr: 2.000e-04, eta: 1 day, 23:27:30, time: 1.999, data_time: 0.557, memory: 8310, loss_cls: 1.0134, loss_bbox: 1.3006, d0.loss_cls: 1.0019, d0.loss_bbox: 1.2849, d1.loss_cls: 1.0023, d1.loss_bbox: 1.2828, d2.loss_cls: 1.0069, d2.loss_bbox: 1.2791, d3.loss_cls: 1.0090, d3.loss_bbox: 1.2892, d4.loss_cls: 1.0089, d4.loss_bbox: 1.2925, loss: 13.7714, grad_norm: 13.6617
2022-06-19 14:56:02,126 - mmdet - INFO - Epoch [1][2250/3517] lr: 2.000e-04, eta: 1 day, 23:26:13, time: 2.093, data_time: 1.333, memory: 8310, loss_cls: 1.0234, loss_bbox: 1.2974, d0.loss_cls: 1.0086, d0.loss_bbox: 1.2714, d1.loss_cls: 1.0079, d1.loss_bbox: 1.2725, d2.loss_cls: 1.0132, d2.loss_bbox: 1.2707, d3.loss_cls: 1.0133, d3.loss_bbox: 1.2772, d4.loss_cls: 1.0149, d4.loss_bbox: 1.2770, loss: 13.7475, grad_norm: 12.4950
2022-06-19 14:57:46,736 - mmdet - INFO - Epoch [1][2300/3517] lr: 2.000e-04, eta: 1 day, 23:24:53, time: 2.092, data_time: 1.468, memory: 8310, loss_cls: 1.0197, loss_bbox: 1.3174, d0.loss_cls: 1.0086, d0.loss_bbox: 1.2988, d1.loss_cls: 1.0119, d1.loss_bbox: 1.2995, d2.loss_cls: 1.0132, d2.loss_bbox: 1.3003, d3.loss_cls: 1.0159, d3.loss_bbox: 1.3078, d4.loss_cls: 1.0158, d4.loss_bbox: 1.3148, loss: 13.9236, grad_norm: 11.8553
2022-06-19 14:59:25,087 - mmdet - INFO - Epoch [1][2350/3517] lr: 2.000e-04, eta: 1 day, 23:19:54, time: 1.967, data_time: 1.340, memory: 8310, loss_cls: 1.0158, loss_bbox: 1.2873, d0.loss_cls: 1.0048, d0.loss_bbox: 1.2689, d1.loss_cls: 1.0072, d1.loss_bbox: 1.2732, d2.loss_cls: 1.0097, d2.loss_bbox: 1.2817, d3.loss_cls: 1.0108, d3.loss_bbox: 1.2833, d4.loss_cls: 1.0137, d4.loss_bbox: 1.2863, loss: 13.7426, grad_norm: 11.8566
2022-06-19 15:01:06,957 - mmdet - INFO - Epoch [1][2400/3517] lr: 2.000e-04, eta: 1 day, 23:17:03, time: 2.037, data_time: 1.417, memory: 8310, loss_cls: 1.0027, loss_bbox: 1.2903, d0.loss_cls: 0.9913, d0.loss_bbox: 1.2685, d1.loss_cls: 0.9925, d1.loss_bbox: 1.2668, d2.loss_cls: 0.9973, d2.loss_bbox: 1.2704, d3.loss_cls: 0.9991, d3.loss_bbox: 1.2721, d4.loss_cls: 1.0002, d4.loss_bbox: 1.2798, loss: 13.6309, grad_norm: 11.3735
2022-06-19 15:02:43,564 - mmdet - INFO - Epoch [1][2450/3517] lr: 2.000e-04, eta: 1 day, 23:11:19, time: 1.932, data_time: 1.312, memory: 8310, loss_cls: 1.0221, loss_bbox: 1.2982, d0.loss_cls: 1.0130, d0.loss_bbox: 1.2779, d1.loss_cls: 1.0110, d1.loss_bbox: 1.2840, d2.loss_cls: 1.0142, d2.loss_bbox: 1.2853, d3.loss_cls: 1.0162, d3.loss_bbox: 1.2931, d4.loss_cls: 1.0183, d4.loss_bbox: 1.2983, loss: 13.8317, grad_norm: 11.7601
2022-06-19 15:04:35,786 - mmdet - INFO - Epoch [1][2500/3517] lr: 2.000e-04, eta: 1 day, 23:14:17, time: 2.244, data_time: 1.626, memory: 8310, loss_cls: 1.0181, loss_bbox: 1.2741, d0.loss_cls: 1.0094, d0.loss_bbox: 1.2538, d1.loss_cls: 1.0091, d1.loss_bbox: 1.2521, d2.loss_cls: 1.0119, d2.loss_bbox: 1.2553, d3.loss_cls: 1.0137, d3.loss_bbox: 1.2621, d4.loss_cls: 1.0146, d4.loss_bbox: 1.2659, loss: 13.6403, grad_norm: 12.7352
2022-06-19 15:06:16,745 - mmdet - INFO - Epoch [1][2550/3517] lr: 2.000e-04, eta: 1 day, 23:11:01, time: 2.019, data_time: 1.394, memory: 8310, loss_cls: 1.0285, loss_bbox: 1.2988, d0.loss_cls: 1.0227, d0.loss_bbox: 1.2843, d1.loss_cls: 1.0219, d1.loss_bbox: 1.2847, d2.loss_cls: 1.0247, d2.loss_bbox: 1.2827, d3.loss_cls: 1.0273, d3.loss_bbox: 1.2887, d4.loss_cls: 1.0291, d4.loss_bbox: 1.2947, loss: 13.8880, grad_norm: 11.8988
2022-06-19 15:08:01,627 - mmdet - INFO - Epoch [1][2600/3517] lr: 2.000e-04, eta: 1 day, 23:09:53, time: 2.098, data_time: 1.475, memory: 8310, loss_cls: 1.0148, loss_bbox: 1.2815, d0.loss_cls: 1.0081, d0.loss_bbox: 1.2652, d1.loss_cls: 1.0066, d1.loss_bbox: 1.2669, d2.loss_cls: 1.0089, d2.loss_bbox: 1.2699, d3.loss_cls: 1.0115, d3.loss_bbox: 1.2711, d4.loss_cls: 1.0133, d4.loss_bbox: 1.2709, loss: 13.6888, grad_norm: 11.6102
2022-06-19 15:09:41,557 - mmdet - INFO - Epoch [1][2650/3517] lr: 2.000e-04, eta: 1 day, 23:06:11, time: 1.999, data_time: 1.383, memory: 8310, loss_cls: 1.0148, loss_bbox: 1.2937, d0.loss_cls: 1.0104, d0.loss_bbox: 1.2727, d1.loss_cls: 1.0110, d1.loss_bbox: 1.2745, d2.loss_cls: 1.0108, d2.loss_bbox: 1.2772, d3.loss_cls: 1.0124, d3.loss_bbox: 1.2811, d4.loss_cls: 1.0123, d4.loss_bbox: 1.2902, loss: 13.7611, grad_norm: 13.3964
2022-06-19 15:11:26,932 - mmdet - INFO - Epoch [1][2700/3517] lr: 2.000e-04, eta: 1 day, 23:05:17, time: 2.108, data_time: 1.490, memory: 8310, loss_cls: 0.9928, loss_bbox: 1.2706, d0.loss_cls: 0.9883, d0.loss_bbox: 1.2423, d1.loss_cls: 0.9885, d1.loss_bbox: 1.2423, d2.loss_cls: 0.9896, d2.loss_bbox: 1.2512, d3.loss_cls: 0.9904, d3.loss_bbox: 1.2560, d4.loss_cls: 0.9918, d4.loss_bbox: 1.2568, loss: 13.4605, grad_norm: 11.8606
2022-06-19 15:13:04,952 - mmdet - INFO - Epoch [1][2750/3517] lr: 2.000e-04, eta: 1 day, 23:00:44, time: 1.960, data_time: 1.341, memory: 8310, loss_cls: 1.0126, loss_bbox: 1.2792, d0.loss_cls: 1.0065, d0.loss_bbox: 1.2514, d1.loss_cls: 1.0064, d1.loss_bbox: 1.2622, d2.loss_cls: 1.0075, d2.loss_bbox: 1.2623, d3.loss_cls: 1.0103, d3.loss_bbox: 1.2652, d4.loss_cls: 1.0110, d4.loss_bbox: 1.2670, loss: 13.6416, grad_norm: 11.8591
2022-06-19 15:14:48,985 - mmdet - INFO - Epoch [1][2800/3517] lr: 2.000e-04, eta: 1 day, 22:59:12, time: 2.081, data_time: 1.460, memory: 8310, loss_cls: 1.0117, loss_bbox: 1.3044, d0.loss_cls: 1.0100, d0.loss_bbox: 1.2736, d1.loss_cls: 1.0078, d1.loss_bbox: 1.2788, d2.loss_cls: 1.0081, d2.loss_bbox: 1.2854, d3.loss_cls: 1.0118, d3.loss_bbox: 1.2968, d4.loss_cls: 1.0109, d4.loss_bbox: 1.2967, loss: 13.7957, grad_norm: 12.2295
2022-06-19 15:16:28,193 - mmdet - INFO - Epoch [1][2850/3517] lr: 2.000e-04, eta: 1 day, 22:55:22, time: 1.984, data_time: 1.368, memory: 8310, loss_cls: 1.0222, loss_bbox: 1.2781, d0.loss_cls: 1.0194, d0.loss_bbox: 1.2660, d1.loss_cls: 1.0165, d1.loss_bbox: 1.2674, d2.loss_cls: 1.0203, d2.loss_bbox: 1.2671, d3.loss_cls: 1.0231, d3.loss_bbox: 1.2779, d4.loss_cls: 1.0208, d4.loss_bbox: 1.2820, loss: 13.7608, grad_norm: 11.8544
2022-06-19 15:18:11,500 - mmdet - INFO - Epoch [1][2900/3517] lr: 2.000e-04, eta: 1 day, 22:53:31, time: 2.066, data_time: 1.447, memory: 8310, loss_cls: 1.0065, loss_bbox: 1.2756, d0.loss_cls: 1.0007, d0.loss_bbox: 1.2645, d1.loss_cls: 1.0003, d1.loss_bbox: 1.2653, d2.loss_cls: 1.0035, d2.loss_bbox: 1.2626, d3.loss_cls: 1.0069, d3.loss_bbox: 1.2737, d4.loss_cls: 1.0060, d4.loss_bbox: 1.2759, loss: 13.6414, grad_norm: 11.6093
2022-06-19 15:19:49,061 - mmdet - INFO - Epoch [1][2950/3517] lr: 2.000e-04, eta: 1 day, 22:49:02, time: 1.951, data_time: 1.331, memory: 8310, loss_cls: 1.0115, loss_bbox: 1.2634, d0.loss_cls: 1.0053, d0.loss_bbox: 1.2419, d1.loss_cls: 1.0042, d1.loss_bbox: 1.2413, d2.loss_cls: 1.0063, d2.loss_bbox: 1.2585, d3.loss_cls: 1.0101, d3.loss_bbox: 1.2536, d4.loss_cls: 1.0117, d4.loss_bbox: 1.2574, loss: 13.5651, grad_norm: 12.2377
2022-06-19 15:21:30,658 - mmdet - INFO - Exp name: bevformer_tiny.py
2022-06-19 15:21:30,658 - mmdet - INFO - Epoch [1][3000/3517] lr: 2.000e-04, eta: 1 day, 22:46:28, time: 2.032, data_time: 1.407, memory: 8310, loss_cls: 0.9942, loss_bbox: 1.2657, d0.loss_cls: 0.9846, d0.loss_bbox: 1.2573, d1.loss_cls: 0.9865, d1.loss_bbox: 1.2494, d2.loss_cls: 0.9876, d2.loss_bbox: 1.2545, d3.loss_cls: 0.9911, d3.loss_bbox: 1.2612, d4.loss_cls: 0.9924, d4.loss_bbox: 1.2661, loss: 13.4905, grad_norm: 11.0886
2022-06-19 15:23:13,142 - mmdet - INFO - Epoch [1][3050/3517] lr: 2.000e-04, eta: 1 day, 22:44:20, time: 2.050, data_time: 1.435, memory: 8310, loss_cls: 1.0017, loss_bbox: 1.2824, d0.loss_cls: 0.9990, d0.loss_bbox: 1.2610, d1.loss_cls: 1.0002, d1.loss_bbox: 1.2577, d2.loss_cls: 0.9988, d2.loss_bbox: 1.2663, d3.loss_cls: 1.0010, d3.loss_bbox: 1.2726, d4.loss_cls: 1.0015, d4.loss_bbox: 1.2770, loss: 13.6192, grad_norm: 11.0692
2022-06-19 15:24:57,821 - mmdet - INFO - Epoch [1][3100/3517] lr: 2.000e-04, eta: 1 day, 22:43:10, time: 2.094, data_time: 1.481, memory: 8310, loss_cls: 1.0149, loss_bbox: 1.2740, d0.loss_cls: 1.0108, d0.loss_bbox: 1.2555, d1.loss_cls: 1.0108, d1.loss_bbox: 1.2549, d2.loss_cls: 1.0092, d2.loss_bbox: 1.2604, d3.loss_cls: 1.0115, d3.loss_bbox: 1.2656, d4.loss_cls: 1.0129, d4.loss_bbox: 1.2648, loss: 13.6453, grad_norm: 11.6640
2022-06-19 15:26:33,298 - mmdet - INFO - Epoch [1][3150/3517] lr: 2.000e-04, eta: 1 day, 22:38:01, time: 1.909, data_time: 1.287, memory: 8310, loss_cls: 1.0198, loss_bbox: 1.2617, d0.loss_cls: 1.0159, d0.loss_bbox: 1.2424, d1.loss_cls: 1.0159, d1.loss_bbox: 1.2406, d2.loss_cls: 1.0141, d2.loss_bbox: 1.2515, d3.loss_cls: 1.0167, d3.loss_bbox: 1.2524, d4.loss_cls: 1.0181, d4.loss_bbox: 1.2519, loss: 13.6009, grad_norm: 13.3343
2022-06-19 15:28:14,777 - mmdet - INFO - Epoch [1][3200/3517] lr: 2.000e-04, eta: 1 day, 22:35:31, time: 2.030, data_time: 1.405, memory: 8310, loss_cls: 1.0100, loss_bbox: 1.2712, d0.loss_cls: 1.0053, d0.loss_bbox: 1.2570, d1.loss_cls: 1.0025, d1.loss_bbox: 1.2616, d2.loss_cls: 1.0031, d2.loss_bbox: 1.2619, d3.loss_cls: 1.0065, d3.loss_bbox: 1.2661, d4.loss_cls: 1.0080, d4.loss_bbox: 1.2698, loss: 13.6230, grad_norm: 11.6313
2022-06-19 15:29:57,034 - mmdet - INFO - Epoch [1][3250/3517] lr: 2.000e-04, eta: 1 day, 22:33:23, time: 2.045, data_time: 1.418, memory: 8310, loss_cls: 1.0147, loss_bbox: 1.2882, d0.loss_cls: 1.0131, d0.loss_bbox: 1.2560, d1.loss_cls: 1.0126, d1.loss_bbox: 1.2558, d2.loss_cls: 1.0108, d2.loss_bbox: 1.2586, d3.loss_cls: 1.0129, d3.loss_bbox: 1.2631, d4.loss_cls: 1.0134, d4.loss_bbox: 1.2640, loss: 13.6633, grad_norm: 12.2173
2022-06-19 15:31:34,791 - mmdet - INFO - Epoch [1][3300/3517] lr: 2.000e-04, eta: 1 day, 22:29:24, time: 1.955, data_time: 1.331, memory: 8310, loss_cls: 1.0279, loss_bbox: 1.2656, d0.loss_cls: 1.0241, d0.loss_bbox: 1.2463, d1.loss_cls: 1.0229, d1.loss_bbox: 1.2421, d2.loss_cls: 1.0219, d2.loss_bbox: 1.2501, d3.loss_cls: 1.0253, d3.loss_bbox: 1.2568, d4.loss_cls: 1.0264, d4.loss_bbox: 1.2636, loss: 13.6731, grad_norm: 12.3598
2022-06-19 15:33:15,742 - mmdet - INFO - Epoch [1][3350/3517] lr: 2.000e-04, eta: 1 day, 22:26:47, time: 2.019, data_time: 1.389, memory: 8310, loss_cls: 1.0004, loss_bbox: 1.2727, d0.loss_cls: 0.9953, d0.loss_bbox: 1.2480, d1.loss_cls: 0.9965, d1.loss_bbox: 1.2462, d2.loss_cls: 0.9961, d2.loss_bbox: 1.2494, d3.loss_cls: 0.9984, d3.loss_bbox: 1.2605, d4.loss_cls: 0.9991, d4.loss_bbox: 1.2656, loss: 13.5282, grad_norm: 11.8227
2022-06-19 15:34:53,290 - mmdet - INFO - Epoch [1][3400/3517] lr: 2.000e-04, eta: 1 day, 22:22:51, time: 1.951, data_time: 1.326, memory: 8310, loss_cls: 1.0186, loss_bbox: 1.2884, d0.loss_cls: 1.0169, d0.loss_bbox: 1.2603, d1.loss_cls: 1.0149, d1.loss_bbox: 1.2617, d2.loss_cls: 1.0143, d2.loss_bbox: 1.2620, d3.loss_cls: 1.0167, d3.loss_bbox: 1.2707, d4.loss_cls: 1.0187, d4.loss_bbox: 1.2738, loss: 13.7172, grad_norm: 12.4651
2022-06-19 15:36:36,010 - mmdet - INFO - Epoch [1][3450/3517] lr: 2.000e-04, eta: 1 day, 22:21:00, time: 2.054, data_time: 1.428, memory: 8310, loss_cls: 1.0122, loss_bbox: 1.2736, d0.loss_cls: 1.0079, d0.loss_bbox: 1.2588, d1.loss_cls: 1.0062, d1.loss_bbox: 1.2593, d2.loss_cls: 1.0068, d2.loss_bbox: 1.2673, d3.loss_cls: 1.0101, d3.loss_bbox: 1.2706, d4.loss_cls: 1.0117, d4.loss_bbox: 1.2715, loss: 13.6559, grad_norm: 11.6159
2022-06-19 15:38:15,900 - mmdet - INFO - Epoch [1][3500/3517] lr: 2.000e-04, eta: 1 day, 22:18:04, time: 1.998, data_time: 1.377, memory: 8310, loss_cls: 1.0109, loss_bbox: 1.2567, d0.loss_cls: 1.0057, d0.loss_bbox: 1.2514, d1.loss_cls: 1.0041, d1.loss_bbox: 1.2477, d2.loss_cls: 1.0050, d2.loss_bbox: 1.2505, d3.loss_cls: 1.0067, d3.loss_bbox: 1.2544, d4.loss_cls: 1.0083, d4.loss_bbox: 1.2577, loss: 13.5591, grad_norm: 11.9800
2022-06-19 15:38:47,159 - mmdet - INFO - Saving checkpoint at 1 epochs
2022-06-19 16:00:23,571 - mmdet - INFO - Exp name: bevformer_tiny.py
2022-06-19 16:00:23,572 - mmdet - INFO - Epoch(val) [1][3010] pts_bbox_NuScenes/NDS: 0.0330, pts_bbox_NuScenes/mAP: 0.0000

How to set fp16

Dear authors, sorry for bothering you again. When I set fp16 = dict(loss_scale=512.), I found I also have to set find_unused_parameters=True, or there will be an error. I wonder whether this is the right way to use fp16 in your code, because setting find_unused_parameters=True may significantly slow down training.
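For reference, the combination described in the question looks like this in an mmdet-style config (an illustrative fragment; the values are the ones from the question, not a recommendation from the repo):

# Illustrative mmdet-style config fragment combining the two settings from the question.
fp16 = dict(loss_scale=512.)
# DistributedDataParallel errors out when some parameters receive no gradient in a step
# unless unused parameters are tracked; enabling this avoids the error at the cost of
# extra per-iteration overhead, which is the slowdown mentioned above.
find_unused_parameters = True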

undefined symbol: _ZNK2at6Tensor6deviceEv

I followed exactly the same installation steps as described here, but when I run python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes --version v1.0 --canbus ./data, I get the following error: undefined symbol: _ZNK2at6Tensor6deviceEv. My CUDA version is 11.4 and my driver version is 470.129.06.

waymo dataset evaluation

Hello, thank you for sharing the code! I read your paper and checked the code. It is really amazing that it achieves such great results on both the nuScenes and Waymo datasets.
Currently there is only code for the nuScenes dataset, and I wonder if you can provide configs and plugins for the Waymo dataset so that I can reproduce the results on it?

prepare_dataset

Hello,
When I run "python tools/create_data.py nuscenes --root-path ./data/nuScenes --out-dir ./data/nuScenes --extra-tag nuScenes --version v1.0 --canbus ./data",

I hit this problem:
v1.0-trainval ./data/nuScenes

Loading NuScenes tables for version v1.0-trainval...
Killed

Has anyone met this before and can help?
Thank you very much.

'nuscenes_infos_train.pkl' not found

I met the 'nuscenes_infos_train.pkl not found' problem when creating the data.

I worked around it by copying the following 4 files from somewhere else.
(nuscenes_infos_train.pkl
nuscenes_infos_train_mono3d.coco.json
nuscenes_infos_val.pkl
nuscenes_infos_val_mono3d.coco.json)

Is it a bug, or did I miss something?

About BEVFormer-S

Thanks for your great job!
I want to train BEVFormer without temporal information. Can you tell me how to adjust the config and code to reproduce the BEVFormer-S results in your paper?
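As far as the paper describes it, BEVFormer-S is the static variant that never uses the history BEV, in which case temporal self-attention degenerates to plain self-attention over the current BEV queries. A hedged, illustrative sketch of that degenerate case (how to wire it into the released configs depends on the code and is not shown here):

import torch
import torch.nn as nn

# With no history BEV, the missing B_{t-1} is substituted by the current BEV queries,
# so the "temporal" attention is just self-attention. Standard multi-head attention is
# used here only to keep the sketch runnable; the repo uses deformable attention.
attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
bev_q = torch.randn(1, 400, 256)     # a small slice of BEV queries, for speed
prev_bev = None                      # BEVFormer-S: never carry BEV features across frames
key = value = bev_q if prev_bev is None else prev_bev
out, _ = attn(bev_q, key, value)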

CustomNuScenesDataset register fail

When running the test in a Windows environment, I got this error:

test.py", line 195, in main
dataset = build_dataset(cfg.data.test)

KeyError: 'CustomNuScenesDataset is not in the dataset registry'

It looks like this module fails to register.
Any idea how to debug this?
(On Linux there is no such problem and the inference runs successfully.)
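This KeyError usually means the plugin package was never imported, so its @DATASETS.register_module() decorators never ran (the configs normally trigger the import through the plugin / plugin_dir settings). A quick, illustrative check from the repo root:

# Illustrative check: registration only happens once the plugin package is imported.
import importlib
importlib.import_module('projects.mmdet3d_plugin')        # runs the register_module() decorators

from mmdet.datasets import DATASETS
print('CustomNuScenesDataset' in DATASETS.module_dict)    # should print True afterwards

If the import itself fails on Windows (for example because the plugin_dir path is converted to a module name with the wrong separators), that would explain why the same config works on Linux; this is an assumption to verify rather than a confirmed cause.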

Visualization

Hello,

After installation and data preparation, if I just want to use the pre-trained .pth to see the model in action, is it possible to do the visualization with nuScenes v1.0-mini? If yes, how?

Thank you very much.

what's the meaning of each can_bus item?

There are 18 items in can_bus. Some of them are used in the code, but I can't find the exact meaning of each of them, even after visiting the official nuScenes website.
Can you give me some guidance on how to understand them?
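One reading of the layout, taken from the data-preparation and dataset code rather than from official documentation (treat every entry below as an assumption to verify against tools/data_converter and the dataset class):

# Assumed 18-dim can_bus layout used by this repo (verify before relying on it).
CAN_BUS_LAYOUT = {
    (0, 3):   'ego translation (overwritten with the ego2global translation by the dataset)',
    (3, 7):   'ego rotation quaternion',
    (7, 10):  'acceleration',
    (10, 13): 'rotation rate',
    (13, 16): 'velocity',
    (16, 17): 'ego yaw / patch angle in radians (filled in by the dataset)',
    (17, 18): 'ego yaw / patch angle in degrees (filled in by the dataset)',
}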

The memory of BEVFormer-base

Dear authors, thanks for your great work! You show in the table that BEVFormer-base's memory is 28000M, but the log shows 18000M. Is this a typo? And when I run the code, the memory is 24400M.

Stacking encoder layers

Hello Zhiqi Li,
Thanks for sharing your work!

I have a question: in the paper you say "After the feed-forward network, the encoder layer outputs the refined BEV features, which are the input of the next encoder layer. After 6 stacking encoder layers, unified BEV features Bt at the current timestamp t are generated." I am confused about how these refined BEV features are used as input in the next iteration (as there are 6 encoder layers).

Are these refined features given as input to the temporal self-attention block from the next encoder? if yes, are they used as BEV queries?

Or are they given to the Spatial Cross-attention block from the next encoder?
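For reference, the paper's wording corresponds to a plain feed-forward stack within one timestamp: the refined BEV features from layer i become the BEV queries of layer i+1 (so they do enter the next layer's temporal self-attention as queries, while the history BEV B_{t-1} stays the same across all 6 layers), and only the output of the last layer is Bt. A runnable sketch with a stand-in layer (the stub names are illustrative, not the repo's API):

import torch
import torch.nn as nn

# Stand-in layer: real BEVFormer layers contain temporal self-attention, spatial
# cross-attention and an FFN; a single Linear keeps this sketch runnable.
class EncoderLayerStub(nn.Module):
    def __init__(self, dims=256):
        super().__init__()
        self.ffn = nn.Linear(dims, dims)
    def forward(self, bev, prev_bev=None, img_feats=None):
        return self.ffn(bev)

layers = nn.ModuleList(EncoderLayerStub() for _ in range(6))
bev = torch.zeros(200 * 200, 256)   # grid-shaped BEV queries at timestamp t
prev_bev = None                     # history BEV from t-1 (None at the first frame)
for layer in layers:                # output of layer i is the query of layer i+1
    bev = layer(bev, prev_bev=prev_bev, img_feats=None)
B_t = bev                           # unified BEV features at timestamp t
prev_bev = B_t.detach()             # carried to the next frame as the history BEV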

about data augmentation

Hi, I notice that BEVFormer does not employ the data augmentation (flip, resize) used in PETR.

I tried to follow PETR and BEVDet and employ these data augmentations, and the performance was not good.
Did you try these operations?

Tesla V100, 16G memory, CUDA out of memory?

Tesla V100, 16G memory, 8 GPUs, training error as follows:

File "/home/admin/ys/BEVFormer/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 75, in custom_train_detector
model.cuda(),
File "/home/yangsu/.virtualenvs/test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 637, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/yangsu/.virtualenvs/test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in _apply
module._apply(fn)
File "/home/yangsu/.virtualenvs/test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in _apply
module._apply(fn)
File "/home/yangsu/.virtualenvs/test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in _apply
module._apply(fn)
File "/home/yangsu/.virtualenvs/test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 552, in _apply
param_applied = fn(param)
File "/home/yangsu/.virtualenvs/test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 637, in
return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 3 (pid: 2766037) of binary: /home/yangsu/.virtualenvs/test/bin/python

I look forward to your reply, thank you.

torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

Hi Zhiqi, I wish you all the best. When I use the pre-trained model on the v1.0-mini dataset, I get this error:
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
and I do not know how to fix it. Could you please explain a little bit? I use 1 GPU, and the command is
./tools/dist_test.sh ./projects/configs/bevformer/bevformer_base.py ckpts/r101_dcn_fcos3d_pretrain.pth 1 : )

Are reference points predefined?

Hello, I'm wondering if the reference points presented in Spatial Cross-Attention (Section 3.3) are predefined, and not adapted to different images?
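For reference, this is roughly what "predefined" means here (an illustrative sketch, not the repo's exact code): the 3D reference points for spatial cross-attention form a fixed grid over the BEV plane with a few heights per pillar, and only their projection into each image, via the per-sample lidar2img matrices, depends on the input images.

import torch

H, W, Z = 200, 200, 4                                     # BEV grid size, points per pillar
xs = (torch.arange(W, dtype=torch.float32) + 0.5) / W     # normalized x in (0, 1)
ys = (torch.arange(H, dtype=torch.float32) + 0.5) / H     # normalized y in (0, 1)
zs = torch.linspace(0.5, 3.5, Z)                          # placeholder height samples
ref_points = torch.stack([
    xs.view(1, 1, W).expand(Z, H, W),
    ys.view(1, H, 1).expand(Z, H, W),
    zs.view(Z, 1, 1).expand(Z, H, W),
], dim=-1)                                                # (Z, H, W, 3), image-independent
print(ref_points.shape)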

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 32743) of binary

Hello! I replaced "--eval bbox" with "--out ../test.pkl" in dist_test.sh, and I use just 1 GPU to run test.py.

writing results to ../test.pkl
Traceback (most recent call last):
File "./tools/test.py", line 262, in
main()
File "./tools/test.py", line 240, in main
assert False
AssertionError
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 32743) of binary: /home/lxx/anaconda3/envs/bevformer/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 3/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:

Question about with_box_refine in BEVFormerHead

According to my experiment result, the static prediction result of with_box_refine=False is better than that of with_box_refine=True. mAP of with_box_refine=False is 0.3832, and mAP of with_box_refine=True is 0.3758. I didn't see any related discussion on this comparison in the paper. Is there any explanation for this? Thank you!

Robustness on Camera Extrinsics

Hello, thank you for sharing the code! I read your paper and checked the code. In your paper, you evaluate the model subjected to different levels of camera extrinsics noise. I want to know how you apply the different levels of camera extrinsics noise in the code.

About the training time with V100 and A100

Hi dear authors, I use 8 32G V100 GPUs to train bevformer_base, but the training seems too slow; it would need 60 days. Have you trained bevformer_base on 8 V100s? How long does it take on 8 V100s or 8 A100s? Thanks!

Why use self.train() in the for loop in obtain_history_bev

In the obtain_history_bev, self.train() is called in the for loop. Is that correct?

def obtain_history_bev(self, imgs_queue, img_metas_list):
        """Obtain history BEV features iteratively. To save GPU memory, gradients are not calculated.
        """
        self.eval()
        with torch.no_grad():
            len_queue = imgs_queue.size(1)
            prev_bev = None
            for i in range(len_queue):
                img = imgs_queue[:, i, ...]
                img_metas = [each[i] for each in img_metas_list]
                img_feats = self.extract_feat(img=img, img_metas=img_metas)
                prev_bev = self.pts_bbox_head(
                    img_feats, img_metas, prev_bev, only_bev=True)
                self.train()
            return prev_bev

Some confusions in reproducing BEVFormer-S

Hi Zhiqi,
I am very interested in this excellent work (BEVFormer). As a first step I want to reproduce BEVFormer-static (i.e. only the BEV encoder, no temporal information) from your paper, but I ran into some confusion along the way. My model config is: backbone: ResNet-101 (pretrained with FCOS3D); neck: FPN (with strides 1/16, 1/32, 1/64); BEV encoder: 6 transformer layers (deformable self-attention on the BEV queries and deformable cross-attention with the 6 camera features); detection head: 6 transformer layers (Deformable DETR with 900 object queries). All attention embedding dims are 256 and the FFN channel size is 512 by default.
Firstly, in your paper I found that the memory is around 20G (BEV anchor size: 200 * 200 * 4), but in my implementation the memory is 32G (with BEV anchor size 100 * 100 * 4). Is there some problem with my configuration?
Second, in my implementation BEVFormer-static (BEV anchor size: 100 * 100 * 4) converges more slowly than DETR3D with the same losses (L1 and focal loss), and the final mAP on the nuScenes val set is only around 31.1.
Meanwhile, I have checked the valid BEV anchors and the BEV anchor projection points in the multi-view images; the pink points in the BEV figure are the front view, and we only visualize points that are valid in more than one image to check the BEV anchor transformation. There seem to be no obvious problems with the BEV anchor transformation. Can you help me reproduce the result in your paper?
Thank you so much!

segmentation code

Thank you for your great work! I want to know if you have a plan to release the segmentation part of the code?
Looking forward to your reply~

Questions about BEV query and object query.

Thanks for sharing this great work!

I have questions about BEV query and object query (for detection).
As mentioned in the paper, the number of BEV queries is 200 x 200, and for object queries it's 900.
Also, 1x spatial cross-attention and 6x temporal self-attention are the attention layers working with BEV queries to generate the current frame's BEV queries.

Where are the cross-attention layers between BEV queries and object queries?
Are they in the detection head, and are there still 6 cross-attention layers for the interaction between BEV queries and object queries?
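For reference, in the paper the interaction happens in the detection head: a Deformable-DETR-style decoder whose object queries cross-attend to the BEV features Bt in each of its layers. An illustrative, runnable sketch (standard multi-head attention stands in for deformable attention, and the BEV map is shrunk for speed):

import torch
import torch.nn as nn

dims = 256
obj_q = torch.randn(1, 900, dims)             # 900 object queries
bev_feats = torch.randn(1, 50 * 50, dims)     # flattened BEV features (downsized here)
self_attn = nn.MultiheadAttention(dims, 8, batch_first=True)
cross_attn = nn.MultiheadAttention(dims, 8, batch_first=True)
for _ in range(6):                            # decoder layers
    obj_q = self_attn(obj_q, obj_q, obj_q)[0]
    obj_q = cross_attn(obj_q, bev_feats, bev_feats)[0]   # BEV features as key and value
# obj_q then feeds the classification and box-regression branches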

Some questions about the Spatial Cross-Attention

Hi, thanks for your great work!

In your paper, you mention that for one BEV query, the projected 2D points can only fall on some views, and the other views are not hit.
How do you determine which views the reference points fall on?
And I am confused about equation (4). What does z_ij · [x_ij, y_ij, 1]^T mean? Is z_ij a typo?

Hopefully, you can offer some explanations. Thanks~
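For context, z_ij is not a typo: projecting a homogeneous 3D point with a 3x4 camera matrix yields homogeneous image coordinates, and z_ij is the depth of the point in camera i that you divide by to obtain pixel coordinates; views where the depth is non-positive or where the resulting pixel falls outside the image are the "not hit" views. A tiny numeric illustration (the matrix values are made up):

import numpy as np

T_i = np.eye(3, 4)              # stand-in 3x4 lidar2img projection matrix
T_i[0, 0] = T_i[1, 1] = 1000.0  # fake focal lengths, purely illustrative
point_3d = np.array([2.0, 1.0, 10.0, 1.0])   # homogeneous 3D reference point

proj = T_i @ point_3d           # = z_ij * [x_ij, y_ij, 1]
z_ij = proj[2]                  # depth of the point in this camera (10.0 here)
x_ij, y_ij = proj[:2] / z_ij    # pixel coordinates after dividing by the depth
print(z_ij, x_ij, y_ij)         # 10.0 200.0 100.0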

segmentation code release

First of all, thanks for sharing a great baseline with the community. Are you also planning to release the code for segmentation soon?

pdb single-step debugging does not stop at breakpoints

I added pdb in the tools/dist_train.sh script:
python -m pdb -m torch.distributed.launch ...

Then I run:
./tools/dist_train.sh ./projects/configs/bevformer/bevformer_base.py 1

Set a breakpoint and run:

(Pdb) b ./tools/train.py:145
Breakpoint 1 at /home/xuan/code/pytorch/BEVFormer/tools/train.py:145
(Pdb) r

It does not stop at the breakpoint and goes straight into training. How can I single-step debug this?
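One likely explanation (an assumption, not a confirmed answer from the authors): torch.distributed.launch only launches train.py in child processes, so a breakpoint set in the parent pdb session never applies to the process that actually runs the training code. A common workaround for single-process debugging is to put the breakpoint directly in the code, or to skip the launcher and run tools/train.py directly (as in the single-GPU command shown in the "dist_train error" issue above).

# Workaround sketch: a hard-coded breakpoint inside the training code itself
# (only practical with a single process). training_step is a hypothetical stand-in
# for the code near the line you want to stop at.
import pdb

def training_step():
    pdb.set_trace()      # execution stops here in the process that runs this code
    ...

# training_step()        # left commented out so this sketch does not block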

Training error - Function not implemented: 'epoch_1.pth'

Hello,

When I ran training I got the error below. I am running on Ubuntu 18.04 with 8 RTX A6000 GPUs and followed the instructions. This error happened after training for some time. Is there a way to work around this?

./tools/dist_train.sh ./projects/configs/bevformer/bevformer_base.py 8
.....

7020, d0.loss_cls: 0.3383, d0.loss_bbox: 0.7975, d1.loss_cls: 0.3330, d1.loss_bbox: 0.7195, d2.loss_cls: 0.3324, d2.loss_bbox: 0.7033, d3.loss_cls: 0.3333, d3.loss_bbox: 0.7003, d4.loss_cls: 0.3371, d4.loss_bbox: 0.7010, loss: 6.3380, grad_norm: 33.4391
2022-06-29 21:27:48,563 - mmdet - INFO - Saving checkpoint at 1 epochs
Traceback (most recent call last):
File "./tools/train.py", line 259, in
main()
File "./tools/train.py", line 248, in main
custom_train_model(
File "/media/dataset4/BEVFormer/projects/mmdet3d_plugin/bevformer/apis/train.py", line 26, in custom_train_model
custom_train_detector(
File "/media/dataset4/BEVFormer/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 179, in custom_train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
self.call_hook('after_train_epoch')
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
getattr(hook, fn_name)(self)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/hooks/checkpoint.py", line 116, in after_train_epoch
self._save_checkpoint(runner)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/dist_utils.py", line 94, in wrapper
return func(*args, **kwargs)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/hooks/checkpoint.py", line 121, in _save_checkpoint
runner.save_checkpoint(
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 175, in save_checkpoint
mmcv.symlink(filename, dst_file)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmcv/utils/path.py", line 36, in symlink
os.symlink(src, dst, **kwargs)
OSError: [Errno 38] Function not implemented: 'epoch_1.pth' -> '/media/dataset4/BEVFormer/work_dirs/bevformer_base/latest.pth'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1872990) of binary: /home/user01/anaconda3/envs/open-mmlab/bin/python
Traceback (most recent call last):
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
elastic_launch(
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/user01/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:


    ./tools/train.py FAILED        
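The OSError typically means the filesystem holding work_dirs (here a /media mount) does not support symlinks; MMCV's checkpoint hook tries to symlink latest.pth to the newest epoch checkpoint. A possible workaround (a sketch; create_symlink is forwarded by MMCV's CheckpointHook to runner.save_checkpoint, but verify against your installed mmcv version) is to disable the symlink in the config:

# in the config you train with, e.g. projects/configs/bevformer/bevformer_base.py
checkpoint_config = dict(interval=1, create_symlink=False)  # skip creating the latest.pth symlink

Alternatively, pointing work_dirs at a local filesystem that supports symlinks avoids the issue without changing the config.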

How to visualize the attention map of BEVFormer?

Hello! Thank you for your great work!
If I want to visualize the attention maps of the pretrained model for the multi-view images, how could I do that? Could you share the relevant code implementation?
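No official visualization script is referenced in this thread, but one generic approach (a sketch that assumes you can build and load the model yourself and identify the attention modules by name; the name filter below is illustrative) is to register PyTorch forward hooks on the attention layers and dump what they return:

import torch

captured = {}

def make_hook(name):
    def hook(module, inputs, output):
        # many attention implementations return attended features rather than raw attention
        # weights, so you may need to hook an inner module or modify it to also return the
        # softmax weights
        captured[name] = output.detach().cpu() if torch.is_tensor(output) else output
    return hook

# model = ...  # build/load BEVFormer with your usual test pipeline (omitted here)
for name, module in model.named_modules():
    if 'attentions' in name:
        module.register_forward_hook(make_hook(name))

# run one forward pass on a validation sample, then reshape and plot the tensors in `captured`,
# e.g. as per-camera heatmaps overlaid on the multi-view images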

about projection of Qp in BEV onto a small feature map

Thank you for your work.

What method do you use to project a BEV query Qp onto the small feature maps obtained from the FPN?

Intuitively, I don't think it can be projected with a 3x4 projection matrix, because the small feature map does not preserve the pixel locations of the original image.
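A plausible reading (a sketch of the usual multi-scale deformable-attention convention, not a quote of this repo's code): the query is projected into the original image with the camera projection matrix, and the pixel coordinates are then normalized by the original image size, so the reference point lives in [0, 1] x [0, 1] and applies to every FPN level regardless of its resolution.

import numpy as np

def to_normalized_reference(point_3d, lidar2img, img_h, img_w):
    # point_3d:  (3,) lifted BEV query position in the ego frame (illustrative name)
    # lidar2img: (4, 4) projection matrix for one camera
    p = lidar2img @ np.append(point_3d, 1.0)
    u, v = p[0] / p[2], p[1] / p[2]      # pixel coordinates in the original image
    return u / img_w, v / img_h          # normalized reference point, valid for any feature-map size

A feature level with stride s then simply scales the normalized coordinates by its own spatial shape (H/s, W/s) when sampling, so no per-level projection matrix is needed.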

batch size setting

Hello, I want to know how to change the batch size. I run into CUDA out-of-memory errors when training with 1 GPU.
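In MMDetection-style configs the per-GPU batch size is set by samples_per_gpu in the data section; a sketch of the override (field names follow the mmdet convention, so check them against the BEVFormer config you are using):

# in the config file, e.g. projects/configs/bevformer/bevformer_base.py
data = dict(
    samples_per_gpu=1,   # per-GPU batch size; raising it increases memory roughly linearly
    workers_per_gpu=4,
)

If memory is still tight at a batch size of 1, switching to one of the smaller BEVFormer configs in this repo is usually the more practical fix.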

Evaluation fails: Exception('Error: Invalid box type: %s' % box)

I'm trying to run the evaluation based on the provided checkpoint file
(./tools/dist_test.sh ./projects/configs/bevformer/bevformer_base.py ./ckpts/r101_dcn_fcos3d_pretrain.pth 1).
The bbox formatting and annotation loading seem to work fine, but the evaluation process stops due to a problem in the prediction-filtering step:

Formating bboxes of pts_bbox
Start to convert detection format...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 6019/6019, 14298.1 task/s, elapsed: 0s, ETA: 0s
Results writes to test/bevformer_base/Mon_Jun_27_22_31_13_2022/pts_bbox/results_nusc.json
Evaluating bboxes of pts_bbox

Loading NuScenes tables for version v1.0-trainval...
23 category,
8 attribute,
4 visibility,
64386 instance,
12 sensor,
10200 calibrated_sensor,
2631083 ego_pose,
68 log,
850 scene,
34149 sample,
2631083 sample_data,
1166187 sample_annotation,
4 map,
Done loading in 30.495 seconds.

Reverse indexing ...
Done reverse indexing in 7.1 seconds.

Initializing nuScenes detection evaluation
Loaded results from test/bevformer_base/Mon_Jun_27_22_31_13_2022/pts_bbox/results_nusc.json. Found detections for 6019 samples.
Loading annotations for val split from nuScenes version: v1.0-trainval
100%|████████████████████████████████████████████████████████████████████████████████████| 6019/6019 [00:07<00:00, 835.78it/s]
Loaded ground truth annotations for 6019 samples.
Filtering predictions
Traceback (most recent call last):
File "./tools/test.py", line 262, in
main()
File "./tools/test.py", line 258, in main
print(dataset.evaluate(outputs, **eval_kwargs))
File "/home/eeproj4/Alex/mmdetection3d/mmdet3d/datasets/nuscenes_dataset.py", line 505, in evaluate
ret_dict = self._evaluate_single(result_files[name])
File "/home/eeproj4/Alex/mmdetection3d/BEVFormer/projects/mmdet3d_plugin/datasets/nuscenes_dataset.py", line 230, in _evaluate_single
self.nusc_eval = NuScenesEval_custom(
File "/home/eeproj4/Alex/mmdetection3d/BEVFormer/projects/mmdet3d_plugin/datasets/nuscnes_eval.py", line 570, in init
self.pred_boxes = filter_eval_boxes(nusc, self.pred_boxes, self.cfg.class_range, verbose=verbose)
File "/home/eeproj4/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/nuscenes/eval/common/loaders.py", line 219, in filter_eval_boxes
class_field = _get_box_class_field(eval_boxes)
File "/home/eeproj4/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/nuscenes/eval/common/loaders.py", line 283, in _get_box_class_field
raise Exception('Error: Invalid box type: %s' % box)
Exception: Error: Invalid box type: None
Killing subprocess 490781
Traceback (most recent call last):
File "/home/eeproj4/miniconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/eeproj4/miniconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/eeproj4/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launch.py", line 340, in
main()
File "/home/eeproj4/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launch.py", line 326, in main
sigkill_handler(signal.SIGTERM, None) # not coming back
File "/home/eeproj4/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/eeproj4/miniconda3/envs/open-mmlab/bin/python', '-u', './tools/test.py', '--local_rank=0', './projects/configs/bevformer/bevformer_base.py', './ckpts/r101_dcn_fcos3d_pretrain.pth', '--launcher', 'pytorch', '--eval', 'bbox']' returned non-zero exit status 1

Did anyone here encounter this problem and manage to solve it?
Many thanks!
