# MTCNN_TensorRT

A C++ implementation of the MTCNN face detection algorithm, accelerated with the NVIDIA TensorRT inference SDK.

This repository is based on https://github.com/AlphaQi/MTCNN-light.git

## Notes

2018/11/14: I have ported most of the computation to the GPU using the OpenCV CUDA wrapper and CUDA kernels I wrote myself. See the all_gpu branch for details; note that you need OpenCV 3.0+ built with CUDA support to run the projects. On my GTX 1080 GPU this is about 5-10 times faster than the master branch.

2018/10/2: Good news! Now you can run the whole MTCNN using TensorRT 3.0 or 4.0!

I adopt the original models from the official project https://github.com/kpzhang93/MTCNN_face_detection_alignment and make the following modification. TensorRT does not support the PReLU layer, which is widely used in MTCNN. One solution is to add a plugin (custom) layer, but experiments show that this breaks TensorRT's CBR fusion and is very slow. Instead, I replace each PReLU with a ReLU layer, Scale layers, and an ElementWise addition layer (as illustrated below); this adds only a little computation and does not affect CBR fusion. The weights of the Scale layers are derived from the original PReLU weights.

*Figure: replacing PReLU with ReLU, Scale, and ElementWise addition layers.*
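For intuition, the replacement rests on a one-line identity: for a per-channel slope a, PReLU(x) = a\*x + (1-a)\*ReLU(x) (check both signs of x). The sketch below verifies this numerically; it is a hedged illustration of the algebra, not the repository's exact layer wiring, which may arrange the Scale and ElementWise layers differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdio>

// PReLU as used in the original MTCNN Caffe models.
float prelu(float x, float a) { return x >= 0.f ? x : a * x; }

// The same function built from TensorRT-friendly primitives:
// Scale (weight a), ReLU, Scale (weight 1-a), ElementWise SUM.
float prelu_replacement(float x, float a) {
    float scaled_input = a * x;                 // Scale layer, weight a
    float relu_out     = std::max(0.f, x);      // ReLU layer
    float scaled_relu  = (1.f - a) * relu_out;  // Scale layer, weight 1-a
    return scaled_input + scaled_relu;          // ElementWise addition layer
}

int main() {
    const float a = 0.25f;  // slope taken from the original PReLU weights
    for (float x = -5.f; x <= 5.f; x += 0.25f)
        assert(std::fabs(prelu(x, a) - prelu_replacement(x, a)) < 1e-6f);
    std::puts("PReLU and the ReLU/Scale/ElementWise replacement agree.");
    return 0;
}
```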

## Required environments

  1. OpenCV (on Ubuntu, `sudo apt-get install libopencv-dev` installs it)
  2. CUDA 9.0
  3. TensorRT 3.04 or TensorRT 4.16 (I have only tested these two versions)
  4. CMake >= 3.5
  5. A camera, to run the camera test.

## Build

  1. Replace the TensorRT and CUDA paths in CMakeLists.txt.
  2. Configure the detection parameters in mtcnn.cpp (minimum face size, the NMS thresholds, etc.; see the sketch after this list).
  3. Choose the running mode (camera test or single-image test).
  4. `cmake .`
  5. `make -j`
  6. `./main`
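For reference, the tunables in step 2 typically look like the sketch below. The variable names here are hypothetical; check mtcnn.cpp for the real ones.

```cpp
// Hypothetical parameter block -- see mtcnn.cpp for the actual names.
int   min_face_size      = 60;                  // smallest face (pixels) the cascade reports
float score_threshold[3] = {0.8f, 0.8f, 0.8f};  // per-stage (PNet/RNet/ONet) confidence cut-offs
float nms_threshold[3]   = {0.7f, 0.7f, 0.7f};  // per-stage non-maximum-suppression IoU limits
```

Smaller minimum face sizes enlarge the image pyramid and cost more time; looser NMS thresholds keep more overlapping candidates.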

## Results

In single-image test mode the result looks like this:

*Figure: detection result on the sample image.*

## Speed

On my computer with an NVIDIA GT 730 graphics card (a very low-end GPU) and an Intel i5-6500 CPU, the image above takes 20 to 30 ms with the minimum face size set to 60 pixels.

## TODO

Implement the whole pipeline with GPU computing.


## Issues

### A few questions on the GPU version

First of all, good job on the MTCNN GPU version!

Why does the image size need to be configured up front in the constructor?
Is there a reason why you commented out mtcnn::~mtcnn()?

### Output information

Hello, I have a question: besides the bounding boxes and the 5 facial landmarks, is there any other output information available?

### Intuition of replacing PReLU

Can you explain how scaling, applying ReLU, then scaling again and adding element-wise is equivalent to PReLU?

### Bug: the single-image sample is not working

I have a fresh JetPack 4.5 install on a Jetson Nano; it is basically Ubuntu 18.04 aarch64 with the full NVIDIA stack (CUDA, TensorRT, etc.).

```
Start generating TenosrRT runtime models
terminate called after throwing an instance of 'std::out_of_range'
  what(): basic_string::replace: __pos (which is 15) > this->size() (which is 0)
The program has unexpectedly finished.
```

When debugging, the problem turns out to be at line 44 in mtcnn.cpp:

```cpp
pnet_engine = new Pnet_engine[scales_.size()];
simpleFace_ = (Pnet**)malloc(sizeof(Pnet*) * scales_.size());
for (size_t i = 0; i < scales_.size(); i++) {
    int changedH = (int)ceil(row * scales_.at(i));
    int changedW = (int)ceil(col * scales_.at(i));
    pnet_engine[i].init(changedH, changedW);  // <-- crashes here when changedH/changedW are negative
```

I was just calling it with the attached photo:

```cpp
image_test("/home/jetson/git/MTCNN_FaceDetection_TensorRT/4.jpg");
```
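For what it's worth, a defensive sketch (hypothetical, not the repository's code) that would surface this failure earlier: check that the image actually loaded and that every pyramid level has positive dimensions before initializing the engines.

```cpp
cv::Mat img = cv::imread(path);  // returns an empty Mat on a bad path
if (img.empty()) {
    std::cerr << "could not read image: " << path << std::endl;
    return;
}
for (size_t i = 0; i < scales_.size(); i++) {
    int changedH = (int)ceil(row * scales_.at(i));
    int changedW = (int)ceil(col * scales_.at(i));
    if (changedH <= 0 || changedW <= 0)
        continue;  // skip degenerate pyramid levels instead of crashing
    pnet_engine[i].init(changedH, changedW);
}
```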

### Problem about model training

Thanks for sharing.
I have a question about what data augmentation strategy you used when training this MTCNN model. Also, if you trained it on a public dataset, could you tell me which one?

### Core dumped

```
terminate called after throwing an instance of 'std::out_of_range'
  what(): basic_string::replace: __pos (which is 15) > this->size() (which is 0)
Aborted (core dumped)
```

### Int8Calibrator

I am using TensorRT 5 and trying to add code for INT8 quantization. I tried adding the following lines in baseEngine.cpp, but I get an error:

```cpp
builder->setInt8Mode(true);
IInt8Calibrator* calibrator;
builder->setInt8Calibrator(calibrator);
```

```
WARNING: Int8 mode specified but no calibrator specified. Please ensure that you supply Int8 scales for the network layers manually.
ERROR: Calibration failure occured with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
```
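For reference, TensorRT's INT8 path needs a live calibrator object; in the snippet above `calibrator` is an uninitialized pointer, so no calibration ever runs. Below is a minimal sketch assuming the TensorRT 5.x C++ API (IInt8EntropyCalibrator2; older builds expose IInt8EntropyCalibrator with the same method signatures). `loadNextCalibrationBatch` is a hypothetical helper you would implement to feed preprocessed calibration images.

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <vector>

class EntropyCalibrator : public nvinfer1::IInt8EntropyCalibrator2 {
public:
    EntropyCalibrator(int batchSize, size_t inputBytes)
        : mBatchSize(batchSize), mInputBytes(inputBytes) {
        cudaMalloc(&mDeviceInput, mInputBytes);
    }
    ~EntropyCalibrator() override { cudaFree(mDeviceInput); }

    int getBatchSize() const override { return mBatchSize; }

    bool getBatch(void* bindings[], const char* /*names*/[], int /*nbBindings*/) override {
        std::vector<float> hostBatch;
        if (!loadNextCalibrationBatch(hostBatch))  // hypothetical data feeder
            return false;                          // no more batches: calibration is done
        cudaMemcpy(mDeviceInput, hostBatch.data(), mInputBytes, cudaMemcpyHostToDevice);
        bindings[0] = mDeviceInput;                // one input binding assumed
        return true;
    }

    // Returning nullptr forces a fresh calibration instead of reading a cache.
    const void* readCalibrationCache(size_t& length) override { length = 0; return nullptr; }
    void writeCalibrationCache(const void* /*cache*/, size_t /*length*/) override {}

private:
    // Stub: fill `out` with mInputBytes worth of preprocessed image data and
    // return true, or return false once the calibration set is exhausted.
    bool loadNextCalibrationBatch(std::vector<float>& out) { (void)out; return false; }

    int mBatchSize;
    size_t mInputBytes;
    void* mDeviceInput{nullptr};
};

// Usage sketch: pass a live object, not a dangling pointer.
//   EntropyCalibrator calib(1, 1 * 3 * 12 * 12 * sizeof(float));
//   builder->setInt8Mode(true);
//   builder->setInt8Calibrator(&calib);
```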

### Core dump from detect function

I get an OpenCV error:

```
OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in GpuMat, file cuda_gpu_mat.cpp, line 152
```

I see that the assertion can come from all the places where we create a new image with `Rect temp((*it).y1, (*it).x1, (*it).y2-(*it).y1, (*it).x2-(*it).x1)`, but OpenCV's rectangle constructor is `Rect_(_Tp _x, _Tp _y, _Tp _width, _Tp _height)`.
Why are x and y flipped? How can I fix this crash?
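Whatever coordinate convention the boxes use, the assertion fires because the ROI extends outside the image. A defensive sketch (hypothetical, not the repository's code) is to intersect the rectangle with the image bounds before cropping:

```cpp
#include <opencv2/core.hpp>

// Clamp a candidate ROI to the image bounds; cv::Rect's & operator
// computes the intersection, so the result can never leave the image.
cv::Rect clampToImage(const cv::Rect& roi, const cv::Size& imgSize) {
    return roi & cv::Rect(0, 0, imgSize.width, imgSize.height);
}
```

The clamped rectangle can come back empty, so callers should still check `rect.area() > 0` before constructing the GpuMat.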

### Stack smashing detected

Thanks in advance.
While running the main file, the program throws a stack smashing error while generating the TensorRT runtime models.

The error is thrown in the caffeToGIEModel function in pnet_rt.cpp (line 46), and the process terminates with signal 6 (SIGABRT).

Also, can you explain, or provide a link for, gLogger, which is used as a parameter to createInferBuilder() [baseEngine.cpp, line 32]?
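For reference, gLogger is not part of TensorRT itself; it is the ILogger object that every TensorRT sample defines and passes to createInferBuilder() so the library can report messages back. A minimal sketch, assuming the pre-8.x signature (TensorRT 8 adds noexcept and AsciiChar):

```cpp
#include <NvInfer.h>
#include <iostream>

// Minimal TensorRT logger: createInferBuilder() requires an ILogger
// so the library can hand warnings and errors back to the application.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING)  // drop INFO/VERBOSE chatter
            std::cerr << msg << std::endl;
    }
} gLogger;

// Usage: nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
```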

### Error when running multiple processes

Thanks for sharing! I have the demo running, but when I use multiple processes I hit the following errors:

```
[TensorRT] ERROR: ../rtSafe/cuda/reformat.cu (925) - Cuda Error in NCHWToNCHHW2: 400 (invalid resource handle)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
```

Any guidance would be appreciated, thanks!

### Could not parse layer type PReLU

Thanks for your code.
After cmake and make, I ran it and the result is as follows:

```
Beging parsing Pnet model...
could not parse layer type PReLU
End parsing Pnet model
Segmentation fault (core dumped)
```

I use TensorRT 2.1, cuDNN 7.0, CUDA 8.0, and OpenCV 2.4.13.
Could you tell me how to solve this problem?

### PReLU replacement: "Weights for scale layer" doesn't exist

Hi,
Was the model trained with the ReLU+Scale combination, or was it trained with PReLU and you replaced PReLU with the equivalent operations only in the prototxt file?
I have a model that was trained with PReLU, but replacing it with the ReLU+Scale combination in the prototxt gives me a "Weights for scale layer" doesn't exist error in TensorRT. Any idea how to solve the issue?

### Inference speed too slow

I ran your demo in a TensorRT 5.0 Docker image and found that inference on your 4.jpg was too slow.
My environment: Ubuntu 16.04 + CUDA 9.0 + cuDNN 7.3.1 + TensorRT 5.0.
Here is the log:

```
Start generating TenosrRT runtime models
End generating TensorRT runtime models
first model inference time is 0.842
first model inference time is 0.511
first model inference time is 0.396
first model inference time is 0.313
first model inference time is 0.296
first model inference time is 0.266
first model inference time is 0.254
first time is 3.134
second time is 13.168
third time is 7.437
first model inference time is 0.612
first model inference time is 0.431
first model inference time is 0.344
first model inference time is 0.282
first model inference time is 0.266
first model inference time is 0.251
first model inference time is 0.269
first time is 2.672
second time is 15.089
third time is 7.409
time is 25.31
```

Do you have any idea about this?
