Collaborative Perception
This repository is a paper digest of recent advances in collaborative / cooperative / multi-agent perception for V2I / V2V / V2X autonomous driving scenarios. Papers are listed in alphabetical order of the first character.
(Talk) Robust Collaborative Perception against Communication Interruption [video], Uncertainty Quantification of Collaborative Detection for Self-Driving [video], Collaborative and Adversarial 3D Perception for Autonomous Driving [video], Vehicle-to-Vehicle Communication for Self-Driving [video], Adversarial Robustness for Self-Driving [video], 2022 1st Cooperative Perception Workshop Playback [video], Beyond-Line-of-Sight Situational Awareness Based on Swarm Collaboration [video], Collaborative Autonomous Driving: Simulation and Perception [video], Next-Generation Collaborative Perception with Where2comm: Greatly Reducing Communication Bandwidth [video], A Discussion of V2X-Based Multi-Source Collaborative Perception [video], Swarm-Intelligence Robot Networks for Vehicle-Infrastructure Cooperation [video], IACS 2023 Collaborative Perception PhD Sharing [video], CICV 2022 Data-Driven Vehicle-Infrastructure Cooperation Session [video]
(Survey) Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges [paper], A Survey and Framework of Cooperative Perception: From Heterogeneous Singleton to Hierarchical Cooperation [paper]
(Library) OpenCOOD: Open Cooperative Detection Framework for Autonomous Driving [code] [doc], CoPerception: SDK for Collaborative Perception [code] [doc]
(People) Runsheng Xu@UCLA [web], Yiming Li@NYU [web], Hang Qiu@Waymo [web]
(Workshop) ICRA 2023 [web], MFI 2022 [web], ITSC 2020 [web]
(Background) Current Approaches and Future Directions for Point Cloud Object Detection in Intelligent Agents [video], 3D Object Detection for Autonomous Driving: A Review and New Outlooks [paper], DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning [video], A Survey of Multi-Agent Reinforcement Learning with Communication [paper]
The results above are borrowed directly from publicly accessible papers. Since some of the results are reported by the following papers rather than the original ones, the most reliable data source links are also given. Best effort has been made to ensure that all collected benchmark results share the same training and testing settings (when provided).
In Joint Set evaluation, the OPV2V test split (16 scenes), OPV2V test culver city split (4 scenes), OPV2V validation split (9 scenes), V2XSet test split (19 scenes) and V2XSet validation split (6 scenes) are combined into one much larger evaluation dataset (54 different scenes in total) to allow more stable ranking. The evaluated models are trained on the union of the OPV2V train split and the V2XSet train split, with ego vehicle shuffling for data augmentation.
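The joint-set construction described above amounts to concatenating the five splits into one flat evaluation list. A minimal sketch, using placeholder scene identifiers rather than the datasets' real scene folder names (only the split names and scene counts come from the text):

```python
# Split names and scene counts as stated above; scene ids are placeholders.
SPLIT_SCENE_COUNTS = {
    "OPV2V test": 16,
    "OPV2V test culver city": 4,
    "OPV2V validation": 9,
    "V2XSet test": 19,
    "V2XSet validation": 6,
}

def build_joint_set(split_scenes):
    """Merge per-split scene lists into a single flat evaluation set."""
    joint = []
    for split, scenes in split_scenes.items():
        joint.extend((split, scene) for scene in scenes)
    return joint

# Stand-in scene ids; the real datasets use their own scene folder names.
splits = {name: [f"scene_{i:02d}" for i in range(count)]
          for name, count in SPLIT_SCENE_COUNTS.items()}
joint_set = build_joint_set(splits)
assert len(joint_set) == 54  # 16 + 4 + 9 + 19 + 6
```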
By default, messages are broadcast to all agents, forming a fully connected communication graph. To account for collaboration efficiency and bandwidth constraints, Who2com, When2com and Where2comm apply different strategies to prune the fully connected communication graph into a partially connected one during inference. Both the fully connected mode and the partially connected mode are evaluated here; the latter is marked in italics.
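The pruning step can be illustrated with a simple threshold on per-link scores. This is only a hedged sketch: the score matrix and threshold below are hypothetical stand-ins for the learned handshake or confidence mechanisms that Who2com, When2com and Where2comm actually use.

```python
import numpy as np

def prune_communication_graph(scores, threshold=0.5):
    """Turn a fully connected communication graph into a partially
    connected one by dropping links whose score falls below a threshold.

    scores: (N, N) matrix where scores[i, j] rates how useful agent j's
    message is expected to be for agent i (hypothetical scoring; the
    cited methods learn such measures from features).
    Returns a boolean adjacency matrix; self-links are always kept.
    """
    adjacency = scores >= threshold
    np.fill_diagonal(adjacency, True)  # each agent keeps its own observation
    return adjacency

n_agents = 4
# Fully connected baseline: every agent broadcasts to everyone.
fully_connected = np.ones((n_agents, n_agents), dtype=bool)

# Partially connected: keep only links whose score passes the threshold.
scores = np.array([[1.0, 0.9, 0.2, 0.6],
                   [0.3, 1.0, 0.8, 0.1],
                   [0.7, 0.4, 1.0, 0.5],
                   [0.2, 0.6, 0.9, 1.0]])
partial = prune_communication_graph(scores, threshold=0.5)
```

With the example scores, the pruned graph keeps 11 of the 16 possible links, trading a little connectivity for bandwidth.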
For fair comparison, all methods adopt identical one-stage training settings in the ideal scenario (i.e., no pose error or time delay), without weight fine-tuning or message compression. Extra fusion modules (e.g., down-sampling convolution layers) of the intermediate collaboration mode are simplified when unnecessary, to mitigate concerns about where the actual performance gain comes from. PointPillar is adopted as the backbone for all reproduced methods.
Although the reproduction process is simple and quick (a full round takes less than two days on just two 3090 GPUs), multiple advanced training strategies are applied, which may boost performance and leave the ranking misaligned with the original reports. The reproduction is intended only as a straightforward, fair evaluation of representative collaborative perception methods. To see how the official results were obtained, please refer to the papers and code collected below for details.
Dataset and Simulator
CVPR 2023:tada::tada::tada:
V2V4Real (V2V4Real: A Large-Scale Real-World Dataset for Vehicle-to-Vehicle Cooperative Perception) [paper] [code] [project]
V2X-Seq (V2X-Seq: The Large-Scale Sequential Dataset for the Vehicle-Infrastructure Cooperative Perception and Forecasting) [paper] [code] [project]
ICRA 2023
DAIR-V2X-C (Robust Collaborative 3D Object Detection in Presence of Pose Errors) [paper] [code] [project]