

Awesome-occupancy-perception

This repository is a paper digest of recent advances in occupancy perception.

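This repository is compiled by the team behind the WeChat official account 【自动驾驶之心】. Follow us for the latest technical sharing!

You are also welcome to study the full-stack tutorial "Occupancy: From Beginner to Mastery" produced by 自动驾驶之心.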

Paper List

2024

InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction

[paper] [code]

[NeurIPS 2023] POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images

[paper] [code]

UniVision: A Unified Framework for Vision-Centric 3D Perception

[paper] [code]

2023

Fully Sparse 3D Panoptic Occupancy Prediction

[paper]

[AAAI 2024] RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation

[paper]

[AAAI 2024] Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving

[paper] [code]

OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields

[paper] [code]

Camera-based 3D Semantic Scene Completion with Sparse Guidance Network

[paper] [code]

OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries

[paper]

COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction

[paper]

Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications

[paper] [code]

DepthSSC: Depth-Spatial Alignment and Dynamic Voxel Resolution for Monocular 3D Semantic Scene Completion

[paper]

OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving

[paper] [code]

Technical Report for Argoverse Challenges on 4D Occupancy Forecasting

[paper]

SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction

[paper] [code]

FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin

[paper]

SOccDPT: Semi-Supervised 3D Semantic Occupancy from Dense Prediction Transformers trained under memory constraints

[paper]

EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision

[paper] [code]

Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection

[paper] [code]

LiDAR-based 4D Occupancy Completion and Forecasting

[paper] [code]

S4C: Self-Supervised Semantic Scene Completion with Neural Fields

[paper] [code]

Predicting Future Spatiotemporal Occupancy Grids with Semantics for Autonomous Driving

[paper]

Scene Informer: Anchor-based Occlusion Inference and Trajectory Prediction in Partially Observable Environments

[paper] [code]

[ICRA 2024] PointSSC: A Cooperative Vehicle-Infrastructure Point Cloud Benchmark for Semantic Scene Completion

[paper]

OCC-VO: Dense Mapping via 3D Occupancy-Based Visual Odometry for Autonomous Driving

[paper] [code]

SPOT: Scalable 3D Pre-training via Occupancy Prediction for Autonomous Driving

[paper]

RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision

[paper] [code]

OccupancyDETR: Making Semantic Scene Completion as Straightforward as Object Detection

[paper] [code]

[ITSC 2023] Connected Autonomous Vehicle Motion Planning with Video Predictions from Smart, Self-Supervised Infrastructure

[paper]

PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction

[paper] [code]

[AAAI 2024] SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection

[paper] [code]

[ICCV 2023] MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection

[paper] [code]

UniWorld: Autonomous Driving Pre-training via World Models

[paper] [code]

[IROS 2023] Vehicle Motion Forecasting using Prior Information and Semantic-assisted Occupancy Grid Maps

[paper]

Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving

[paper] [code]

FusionAD: Multi-modality Fusion for Prediction and Planning Tasks of Autonomous Driving

[paper]

OCTraN: 3D Occupancy Convolutional Transformer Network in Unstructured Traffic Scenarios

[paper] [code]

CVSformer: Cross-View Synthesis Transformer for Semantic Scene Completion

[paper]

Parametric Depth Based Feature Representation Learning for Object Detection and Segmentation in Bird's Eye View

[paper]

[CVPR 2023 Challenge] FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation

[paper] [code]

[TIV] LXL: LiDAR Excluded Lean 3D Object Detection with 4D Imaging Radar and Camera Fusion

[paper]

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

[paper] [code]

[CVPR 2023 Challenge] Multi-Scale Occ: 4th Place Solution for CVPR 2023 3D Occupancy Prediction Challenge

[paper]

[CVPR 2023 Challenge] UniOcc: Unifying Vision-Centric 3D Occupancy Prediction with Geometric and Semantic Rendering

[paper]

UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction

[paper] [code]

Learning Occupancy for Monocular 3D Object Detection

[paper] [code]

[CVPR 2023] GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training

[paper] [code]

[ITSC 2023] Occupancy Prediction-Guided Neural Planner for Autonomous Driving

[paper] [code]

A Simple Framework for 3D Occupancy Estimation in Autonomous Driving

[paper] [code]

[ICCV 2023] OccNet: Scene as Occupancy

[paper] [code]

[IROS 2023] SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation Separation and BEV Fusion

[paper] [code]

SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving

[paper] [code]

OVO: Open-Vocabulary Occupancy

[paper] [code]

PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation

[paper] [code]

BEVDet-Occ

[code]

Occ-BEV: Multi-Camera Unified Pre-training via 3D Scene Reconstruction

[paper] [code]

BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy

[paper]

Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving

[paper] [code]

OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction

[paper] [code]

StereoScene: BEV-Assisted Stereo Matching Empowers 3D Semantic Scene Completion

[paper] [code]

SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving

[paper] [code]

OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception

[paper] [code]

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

[paper] [code]

[CVPR 2023] Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

[paper] [code]

[CVPR 2023] Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting

[paper] [code]

OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion

[paper] [code]

Diffusion Probabilistic Models for Scene-Scale 3D Categorical Data

[paper]

[ICRA 2023] StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks

[paper] [code]

2022

[CVPR 2022] MonoScene: Monocular 3D Semantic Scene Completion

[paper] [code]

[ECCV 2022] Differentiable Raycasting for Self-supervised Occupancy Forecasting

[paper] [code]

LOPR: Latent Occupancy PRediction using Generative Models

[paper] [code]

[IROS 2022] Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments

[paper]

STrajNet: Multi-modal Hierarchical Transformer for Occupancy Flow Field Prediction in Autonomous Driving

[paper] [code]

HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction

[paper]

Dynamic Semantic Occupancy Mapping using 3D Scene Flow and Closed-Form Bayesian Inference

[paper]

[IEEE Robotics and Automation Letters] Occupancy Flow Fields for Motion Forecasting in Autonomous Driving

[paper]

2021

[AAAI 2021] Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion

[paper] [code]

2020

[CVPR 2020] 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior

[paper] [code]

[CVPR 2020] Anisotropic Convolutional Networks for 3D Semantic Scene Completion

[paper] [code]

[3DV 2020] LMSCNet: Lightweight Multiscale 3D Semantic Completion

[paper] [code]

2016

Semantic Scene Completion from a Single Depth Image

[paper]

Survey

2023

Grid-Centric Traffic Scenario Perception for Autonomous Driving: A Comprehensive Review

[paper]

Open source projects

TorchSSC

[projects]

OpenOcc

[projects]

The-Eyes-Have-It

[projects]

Challenge

Autonomous Driving Challenge Track 3: 3D Occupancy Prediction

[link]

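Paper Interpretations by 自动驾驶之心

Latest from Tsinghua University & NVIDIA | Occ3D: A General and Comprehensive Large-Scale 3D Occupancy Prediction Benchmark

[link]

Occupancy Network Survey! Grid-Centric Perception Methods (BEV, Multi-Task, Trajectory Prediction, etc.)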

[link]

Postscript

This repository was mainly written by Rujia Wang.

If you have any questions about the paper list, please do not hesitate to email me or open an issue on GitHub.

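Autonomous Driving Learning Community

The 自动驾驶之心 Knowledge Planet (知识星球) is the first community in China for discussion and learning organized around the autonomous driving technology stack, and also the largest in China. It is a place for publishing and learning cutting-edge technology. We have compiled learning roadmaps for almost every subfield: autonomous driving perception (BEV, multi-modal perception, Occupancy, millimeter-wave radar and vision perception, lane detection, 3D perception, object tracking, multi-sensor fusion, Transformer, etc.), localization and mapping (online HD maps, HD maps, SLAM), multi-sensor calibration (nearly 20 schemes covering Camera/Lidar/Radar/IMU, etc.), NeRF, vision-language models, world models, planning and control, trajectory prediction, domain technical solutions, and AI model deployment.

In addition, we have set up internal referral channels with dozens of autonomous driving companies, so resumes reach them directly. You can ask questions and discuss freely here; many algorithm engineers, master's students, and PhD students are active daily and help solve problems. Our original intention is to gather the wisdom of industry experts and help everyone with learning and employment. The community's weekly activity ranks within the top 50, and we put great emphasis on encouraging participation and discussion. You are welcome to join and grow with us!

Join link: 自动驾驶之心 Knowledge Planet | China's first full-stack autonomous driving learning community, with 30+ learning roadmaps for perception, fusion, planning, calibration, prediction, and more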

Contributors

shenxiaowrj, autodriving-heart
