TextSLAM: Visual SLAM with Semantic Planar Text Features

Authors: Boying Li, Danping Zou, Yuan Huang, Xinghan Niu, Ling Pei and Wenxian Yu.

🏠 [Project]   📝 [Paper]   ➡️ [Dataset]   🔧 [Extra Evaluation Tool]

Motivation:

⭐ TextSLAM is a novel visual Simultaneous Localization and Mapping (SLAM) system tightly coupled with semantic text objects.

💡 Humans can read text and navigate complex environments using scene text, such as road markings and room names. Why not robots?

⭐ TextSLAM explores scene texts as the basic feature both geometrically and semantically. It achieves superior performance even under challenging environments, such as image blurring, large viewpoint changes, and significant illumination variations (day and night).

This repository provides a C++ implementation of the TextSLAM system.

Overview of TextSLAM

Our accompanying videos are available on YouTube and on Bilibili (1-outdoor, 2-night, 3-rapid).

⭐ Please consider citing the following papers in your publications if the project helps your work.

@article{li2023textslam,
  title={TextSLAM: Visual SLAM with Semantic Planar Text Features},
  author={Li, Boying and Zou, Danping and Huang, Yuan and Niu, Xinghan and Pei, Ling and Yu, Wenxian},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year={2023}
}

@inproceedings{li2020textslam,
  title={TextSLAM: Visual SLAM with Planar Text Features},
  author={Li, Boying and Zou, Danping and Sartori, Daniele and Pei, Ling and Yu, Wenxian},
  booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
  year={2020}
}

Getting Started

Dataset Download

Download the dataset from TextSLAM Dataset.

1. Prerequisites

TextSLAM runs on Ubuntu 16.04. It should be easy to compile on other Linux versions.

1.1 Ceres & Eigen3

Refer to Ceres for instructions on installing it on Linux.

Eigen3 is installed as part of that process.

1.2 OpenCV

We use OpenCV 3.3.1 for image processing.

You can use the OpenCV library provided by ROS. Remember to set OpenCV_DIR in CMakeLists.txt using set(OpenCV_DIR [ros_direction]/share/OpenCV-3.3.1-dev).

You can also refer to OpenCV to download and install the library.

1.3 EVO (Evaluation)

EVO is used for SLAM results evaluation. Refer to EVO to install this evaluation tool.

2. Build and Run

2.1 Clone the repository and build the project:

git clone https://github.com/SJTU-ViSYS/TextSLAM.git
cd TextSLAM
mkdir build
cd build
cmake ..
make -j

The above procedure creates an executable named TextSLAM.

2.2 Run TextSLAM with:

./TextSLAM [yaml_path]/[yaml_name].yaml

We provide yaml files (GeneralMotion.yaml, AIndoorLoop.yaml, LIndoorLoop.yaml, Outdoor.yaml) for our 4 kinds of experiments. Set the path where your sequence is saved in the 'Exp read path:' field of the yaml file.

Refer to TextSLAM Dataset for the detailed yaml file structure.

2.3 Output: keyframe_latest.txt is written during the run and records the pose estimate of each keyframe at the current state. keyframe.txt is written when a sequence finishes. Both files are in TUM format, one pose per line: timestamp tx ty tz qx qy qz qw.
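To inspect these files outside EVO, the TUM layout described above can be read with a few lines of Python. This is a minimal sketch, not part of TextSLAM: the function name read_tum and the sample values are hypothetical, and it assumes whitespace-separated columns with optional blank lines and '#' comment lines.

```python
import math

def read_tum(lines):
    """Parse TUM-format lines: timestamp tx ty tz qx qy qz qw."""
    poses = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 8:
            raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
        t, tx, ty, tz, qx, qy, qz, qw = map(float, fields)
        norm = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
        if norm == 0.0:
            raise ValueError(f"zero-norm quaternion at timestamp {t}")
        # Normalise so later code can rely on a unit quaternion.
        quat = (qx / norm, qy / norm, qz / norm, qw / norm)
        poses.append((t, (tx, ty, tz), quat))
    return poses

# Made-up sample data, for illustration only.
sample = [
    "# timestamp tx ty tz qx qy qz qw",
    "0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0",
    "0.5 1.0 0.0 0.0 0.0 0.0 0.0 2.0",
]
poses = read_tum(sample)
print(len(poses), poses[1][1], poses[1][2])
```

For a real run, pass an open file instead of the sample list, for example read_tum(open("keyframe.txt")).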

3. Evaluation

We use EVO to evaluate the SLAM performance.

For APE evaluation:

evo_ape tum gt.txt text.txt -va -s

For RPE evaluation at a unit of 1.0 m:

evo_rpe tum gt.txt text.txt -va -s --pose_relation trans_part -d 1.0 -u m

For the loop tests in a large indoor scene, add --n_to_align XX to align only the first XX poses of the trajectory. Ground truth for this sequence is available only at the beginning and the end, so aligning on the first poses gives more accurate results.

evo_ape tum gt.txt text.txt -va -s --n_to_align XX
evo_rpe tum gt.txt text.txt -va -s --pose_relation trans_part -d 1.0 -u m --n_to_align XX

ATTENTION for RPE evaluation:

EVO does not automatically rectify the misalignment between the SLAM body frame and the ground-truth body frame, which influences RPE results.

To solve this problem, we provide an extra evaluation tool for the TextSLAM dataset, which also serves as a supplement to EVO.

Follow the instructions of the extra evaluation tool to first obtain the updated ground-truth pose file, then use the updated GT file to evaluate the RPE results.

This step is necessary for all data except the outdoor sequences. The ground truth of the outdoor sequences is generated with COLMAP, so its frame is the same as the body frame estimated by SLAM.
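The effect of a body-frame misalignment on RPE can be illustrated with a small numeric sketch. It assumes the usual relative pose error definition (the translation part of rel_gt^-1 * rel_est) and a hypothetical fixed 90-degree rotation between the two body frames. The helper names and values are made up for illustration and are taken neither from EVO nor from the extra evaluation tool.

```python
import math

def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """3x3 matrix times 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def relative(pose_i, pose_j):
    """Relative motion T_i^-1 * T_j for poses given as (R, t)."""
    (Ri, ti), (Rj, tj) = pose_i, pose_j
    Rit = transpose(Ri)
    return matmul(Rit, Rj), matvec(Rit, [tj[k] - ti[k] for k in range(3)])

def rpe_translation(gt_i, gt_j, est_i, est_j):
    """Norm of the translation part of rel_gt^-1 * rel_est."""
    Rg, tg = relative(gt_i, gt_j)
    _, te = relative(est_i, est_j)
    err = matvec(transpose(Rg), [te[k] - tg[k] for k in range(3)])
    return math.sqrt(sum(e * e for e in err))

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical fixed offset: estimated body frame rotated 90 degrees about z.
R_off = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

# Ground truth: a 1 m move along x with no rotation.
gt = [(I3, [0.0, 0.0, 0.0]), (I3, [1.0, 0.0, 0.0])]
# A "perfect" estimate: same positions, orientation expressed in the offset frame.
est = [(matmul(R, R_off), t) for R, t in gt]

same_frame = rpe_translation(gt[0], gt[1], gt[0], gt[1])
offset_frame = rpe_translation(gt[0], gt[1], est[0], est[1])
print(same_frame, offset_frame)
```

The positions are identical in both trajectories, yet the relative translation is expressed in a rotated local frame, so the RPE is non-zero (about 1.41 m for a 1 m move in this example). This is why the ground truth has to be expressed in the same body frame before RPE is computed.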

Acknowledgement

The authors thank ORB-SLAM, DSO, and AttentionOCR for their excellent work. The authors thank EVO for providing a convenient evaluation tool, and Ceres for providing a powerful optimization library.

textslam's Issues

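Pangolin visualization interface cannot be enabled

I have installed the Pangolin visualization interface, but Pangolin does not start automatically when I run TextSLAM. Do I need to do anything else? Sorry for the beginner question, and thank you.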

Process FPS

What frame rate should the project reach? On my computer it runs at only 0.2 FPS, which is very slow.

I am trying to find out how to speed it up.

Version of Ceres

In the README.md of the project, the version of Ceres isn't specified, and the download link for Ceres will guide you to download Ceres 2.2.0. However, there may be some errors when using 'make' with Ceres 2.2.0 (part of the error is shown at the end of this issue, there may be some errors with 'new ceres::QuaternionParameterization()' in Ceres 2.2.0). I found that such errors will not occur if I use Ceres 2.0.0.

So I suggest specifying that the version of Ceres should be 2.0.0 in the README.md.

/textslam/TextSLAM/src/optimizer.cc:653:48: error: expected type-specifier
653 | problem.AddParameterBlock(Sim3Pose, 4, new ceres::QuaternionParameterization());
| ^~~~~

Guidance may be unclear in the Evaluation section of README.md

In the Evaluation section of README.md, the example code may be unclear. For instance, evo_ape tum gt.txt text.txt -va -s is provided without an explanation of text.txt. While gt.txt can be found in each Seq, text.txt cannot be located anywhere within the project.

I speculate that text.txt refers to the output file keyframe.txt. Therefore, I recommend clarifying this information and including an explanation of text.txt in README.md for user-friendliness.
