Git Product home page Git Product logo

parse_graph's Introduction

A Bottom-up Framework for Construction of Structured Semantic 3D Scene Graph

This work is based on our paper (IROS 2020, accepted). We proposed a new framework to construct 3D Structured Semantic Scene Graph. Our work is based on YOLO (V3), Apriltag and ORB-SLAM2, implemented in ROS.

Author: Bangguo Yu, Chongyu Chen, Fengyu Zhou, Fang Wan, Wenmi Zhuang and Yang Zhao

Affiliation: Shandong University and DarkMatter AI Research

3D Structured Semantic Scene Graph Framework

The input of the architecture includes the sequence of RGB-D images and scene priors of an improved S-AOG structure. In the first step, we use yolov-v3 and apriltag to recognize object information. Limited by the recognition algorithm for big objects like desk or room, we adopted the apriltag to detect them. After object recognition, the 3D bounding box is estimated by the height and width of 2D bounding box and depth information roughly. Specifically, we use ORB-SLAM2 to estimate the position. The relation extraction module is designed to obtain the relations between objects using 3D position and 3D bounding box. The attribute extraction module is designed to focus on the 3D position, size and color. Then using scene priors, a parse graph pg can be calculated for each image by maximizing a posteriori estimation (MAP), which is the best explanation for extracted objects, attributes and relations. The output of MAP is added to keyframe group filter (KGF). KGF module is designed to remove repeated and fake detections. After KGF optimization, parse graph is added to local scene graphs and displayed in real-time. Globe scene graph can be obtained by updating and merging the local scene graphs

image-20200706200822807

Requirements

Installation

The code has been tested only with Python 2.7, CUDA 9.0 on Ubuntu 16.04.

  1. Build a catkin_workspace

    mkdir -p ~/catkin_ws/src
    cd ~/catkin_ws/src/
  2. Install YOLO (V3), Apriltag and ORB-SLAM2 repository in this workspace.

  3. Download and build 3D Structured Semantic Scene Graph repository:

    git clone https://github.com/ybgdgh/parse_graph.git
    catkin_init_workspace
    cd ..
    catkin_make -DCMAKE_BUILD_TYPE=Release
    echo "source ~/catkin_ws/devel/setup.bash" >> ~/.bashrc
    source ~/.bashrc

Download dataset

The scene dataset:

rosbag

The scene prior:

scene priors

Result

image-20200706211403321!image-20200706211403321!

scene_graph_update

Demo Video

video

parse_graph's People

Contributors

ybgdgh avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.