anson0910 / cnn_face_detection

Implementation based on the paper Li et al., “A Convolutional Neural Network Cascade for Face Detection,” CVPR 2015

Python 100.00%

caffe cascade convolutional-neural-networks deep-learning face-detection machine-learning neural-networks


cnn_face_detection's Issues

can't find the .prototxt files

Hi!

Thanks for your implementation of the cascade CNN. You are doing a great job, and I have learned a lot from it.
However, I can't find some of the files you mention in the README, such as CNN_face_detection_models/create_lmdb_scripts/ models/face_12c/solver.prototxt.

Can you please tell me what happened? Can I get those files?

Thanks a lot!

Best regards
Xing Wang

Question about calibration data

Hi,
In your code, when you prepare the calibration data, you just use the formula from the paper. For example, when the label is 0, s=0.83, x=-0.17, y=-0.17, and the window moves right and down.
Now suppose that at detection time the current window is slightly to the right of and below the correct window, and the calibration net outputs label 0. Obviously you cannot apply the same parameters 0.83, -0.17, -0.17, because the window would move even further right and down.
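For readers following this question, here is a minimal sketch of the calibration adjustment as described in the paper, where a window (x, y, w, h) with pattern (s_n, x_n, y_n) becomes (x - x_n*w/s_n, y - y_n*h/s_n, w/s_n, h/s_n). The label ordering below is an assumption chosen so that label 0 matches the values quoted above, and `perturb_ground_truth` is a hypothetical helper showing which training crop would be consistent with each label; neither is taken from the repository.

```python
from itertools import product

# The 45 calibration patterns from the paper: 5 scales x 3 x-offsets x 3 y-offsets.
# The ordering (and therefore the label numbering) is an assumption.
SCALES = (0.83, 0.91, 1.0, 1.10, 1.21)
OFFSETS = (-0.17, 0.0, 0.17)
PATTERNS = list(product(SCALES, OFFSETS, OFFSETS))  # label n -> (s_n, x_n, y_n)


def apply_calibration(box, label):
    """Adjust a detection window (x, y, w, h) using calibration pattern `label`."""
    x, y, w, h = box
    s, dx, dy = PATTERNS[label]
    return (x - dx * w / s, y - dy * h / s, w / s, h / s)


def perturb_ground_truth(box, label):
    """Hypothetical helper: the crop that `apply_calibration` maps back onto `box`.

    A training patch cut this way is consistent with being labelled `label`.
    """
    x, y, w, h = box
    s, dx, dy = PATTERNS[label]
    return (x + dx * w, y + dy * h, s * w, s * h)


if __name__ == "__main__":
    truth = (100.0, 80.0, 50.0, 50.0)
    crop = perturb_ground_truth(truth, 0)   # shifted up-left and shrunk
    print(crop, apply_calibration(crop, 0))
```

Under these assumptions a label-0 patch lies up and to the left of the face, so the adjustment moves it right and down; a window that already sits right and down of the face would be expected to receive a label with positive offsets instead.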

About the quantizeBitNum and stochasticRoundedParams

You have helped me so much. If you come to Shanghai, remember to call me and I will treat you to dinner.
1. In your code I see these two things (I do not know what to call them). What are they used for? Is there any difference between them?
2. What is the difference between the following two choices?
    if loadNet:
        net_12_cal = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)
    else:
        net_12_cal = caffe.Classifier(MODEL_FILE, PRETRAINED,
                                      mean=np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1),
                                      channel_swap=(2,1,0),
                                      raw_scale=255,
                                      image_dims=(15, 15))
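On the second question: caffe.Net is the bare network, while caffe.Classifier wraps it and applies the configured preprocessing to each input image. The NumPy sketch below approximates that preprocessing (transpose, channel swap, raw scaling, mean subtraction) so the difference is visible without Caffe installed. The function name is hypothetical, and the resize to image_dims is deliberately left out.

```python
import numpy as np


def classifier_style_preprocess(image, mean, channel_swap=(2, 1, 0), raw_scale=255.0):
    """Approximate the per-image preprocessing configured in the snippet above.

    `image` is an H x W x 3 float array in [0, 1] with RGB channel order.
    Resizing to image_dims is omitted to keep this sketch dependency-free.
    """
    blob = image.astype(np.float32).transpose(2, 0, 1)   # H x W x C -> C x H x W
    blob = blob[list(channel_swap), :, :]                 # reorder channels, RGB -> BGR
    blob = blob * raw_scale                               # [0, 1] -> [0, 255]
    blob = blob - np.asarray(mean, dtype=np.float32).reshape(3, 1, 1)  # per-channel mean
    return blob


if __name__ == "__main__":
    red = np.zeros((15, 15, 3))
    red[..., 0] = 1.0                                     # pure red image
    out = classifier_style_preprocess(red, mean=[10.0, 20.0, 30.0])
    print(out.shape, out[:, 0, 0])
```

With caffe.Net you would have to do an equivalent step yourself before filling the input blob. Note also that Classifier.predict oversamples crops by default, which a bare forward pass does not.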

How to use Python Caffe to test the model

I use happynear's caffe flow and built it with VS2015, CUDA 8.0, and Anaconda2-4.3.1, without cuDNN. I have succeeded in building Caffe with the Python wrapper, but I don't know how to use this code and model to run a face detection test, since this is my first time using Python code. Can anyone give me some pointers or tips to continue the experiment? Thanks a lot.

Bug in face_preprocess_10kUS/create_negative_24c.py?

Lines 60-62:
After calling detect_face_12c_net, the boxes are not sorted by confidence, but LocalNMS is then called, which requires its input bounding boxes to be sorted by confidence.

Is that a bug?

Also, could you please share the ROC on the benchmark?
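To illustrate why the ordering matters, here is a minimal greedy NMS sketch that sorts by confidence before suppressing. It is not the repository's LocalNMS; the box layout [x1, y1, x2, y2, confidence] and the 0.3 threshold are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2, ...]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / float(union) if union > 0 else 0.0


def nms(boxes, iou_threshold=0.3):
    """Greedy NMS: visit boxes from most to least confident, keep non-overlapping ones."""
    ordered = sorted(boxes, key=lambda box: box[4], reverse=True)  # the sort in question
    kept = []
    for box in ordered:
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    detections = [[0, 0, 10, 10, 0.6], [1, 1, 11, 11, 0.9], [50, 50, 60, 60, 0.5]]
    print(nms(detections))
```

Without the sort, the 0.6 box would be visited first and would suppress the more confident 0.9 box, which is the kind of behaviour the question is worried about.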

Expected result of running face_cascade_fullconv_single_crop_single_image.py script

Hi,

I ran the face_cascade_fullconv_single_crop_single_image.py script as described in the README, without any modifications, and got this result:

Can you say whether this is the expected result? Or is there any setup that should be done before running this script?

Thanks.

A few questions, sorry to trouble you

Sorry to trouble you again. I have some questions about the code and the paper.
1. In this code:
    current_rectangle = [int(2*current_x*current_scale), int(2*current_y*current_scale),
                         int(2*current_x*current_scale + net_kind*current_scale),
                         int(2*current_y*current_scale + net_kind*current_scale),
                         confidence, current_scale]
what is the meaning of the 2 (in 2*current_x*current_scale)?

2. The paper says an image pyramid is built to cover faces at different scales. In your code the image is only shrunk, never enlarged. Is that right?

3. The paper says that densely scanning an image of size 800 × 600 for 40 × 40 faces with 4-pixel spacing generates 2,494 detection windows, and that the time reduces to 10 ms on a GPU card, most of which is overhead in data preparation. Can you tell me how the 2,494 is calculated?

4. For the 12-net, the paper speaks of 12 × 12 detection windows. Is the window 12 because the net input is 12*12?

5. Does the 4-pixel spacing correspond to something in train_val.prototxt, and how is the 4 calculated?

Thank you for helping me so much. Are you Chinese?
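The 2,494 figure can be reproduced with a short calculation, assuming the image is first resized by 12/40 so that 40 × 40 faces become 12 × 12, and that the window count per axis is floor((size - 12) / spacing) + 1. The helper names below are hypothetical. The second function mirrors the quoted snippet; reading its factor 2 as the stride of the network's output map is an inference, not something confirmed in this thread.

```python
def count_windows(width, height, face_size, net_input=12, spacing=4):
    """Number of net_input x net_input windows after resizing by net_input / face_size."""
    resized_w = width * net_input // face_size    # integer arithmetic avoids float drift
    resized_h = height * net_input // face_size
    cols = (resized_w - net_input) // spacing + 1
    rows = (resized_h - net_input) // spacing + 1
    return cols, rows, cols * rows


def window_in_original(col, row, stride, upscale, net_input=12):
    """Map an output-map cell back to a box in the original image (same form as the snippet)."""
    x1 = int(stride * col * upscale)
    y1 = int(stride * row * upscale)
    x2 = int(stride * col * upscale + net_input * upscale)
    y2 = int(stride * row * upscale + net_input * upscale)
    return x1, y1, x2, y2


if __name__ == "__main__":
    print(count_windows(800, 600, 40))   # (58, 43, 2494)
```

So 800 × 600 becomes 240 × 180, which gives 58 × 43 = 2,494 windows at 4-pixel spacing.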

about 48net and its deploy

Hi, why are there differences between the 48-net's train_val.prototxt and its deploy.prototxt?
For example, there is no "norm" layer in deploy, and deploy has a "dropout" layer below the "relu3" layer that is not in train_val.prototxt.

calibration_AFLW.py new code

I need the new code for calibration_AFLW.py.

3000 images without any faces (negative images)

Dear anson0910:
I really appreciate your work. I have a question about the 3000 images. Is there a trick to selecting these negative images? For example: 1. keep them varied; 2. use the remaining part of an image after removing the faces (AFLW dataset); 3. just use some scenery pictures?
I look forward to your reply.

approximate Threshold T1 and T2

Hi, I'm trying to reproduce the results in the paper. Can you share the approximate thresholds T1 and T2 used in the training steps, and the approximate number of negative samples used when training the cali24 and cali48 nets? Also, have you tested on AFW? I tried a lot of methods, but my best AP is only about 90%. Thanks!

About the training step

Dear anson:
I have run your code before, and now I'd like to train my own Caffe model.
But I don't understand the training steps. For example, I don't know what training set I should use. As I understand it, I should use 12*12 pixel images when training the 12-net. But in your code create_negative.py I find that you crop images to different sizes. I don't understand why you do so.
I hope you can answer my question if convenient. Thank you very much!

About the test result

Sorry to bother you. I have run your test code, but the output image has many useless boxes, even though I have increased the threshold and confidence. Please help me. Thanks a lot!

How to implement the Multi-resolution net structure

Dear anson0910:
Could you please show me an example of how to implement the multi-resolution net structure?
I really don't know how to modify the .prototxt file to implement it. Thank you very much!

How to train calibration nets?

Dear anson,
I've trained a detection net using my own data. Now I want to train the calibration nets, but I don't know how to do so. What training set should I use? And how should I assign the labels?
I hope you will answer my question if convenient. Thank you very much!

about the negative data

Sorry to trouble you. Why should the 24 and 48 detection nets be trained on negative data taken from the previous detection net (the rectangles in its output), rather than directly on the same negative data as the 12-net?
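As background for this question: the paper describes scanning face-free background images with the earlier stages and keeping every window whose confidence exceeds the stage threshold as a negative for the next net. A later net only ever sees windows that survived the earlier ones, so those survivors are the negatives it actually has to reject. The sketch below shows just the selection step; the function name and the toy scores are made up for illustration.

```python
def mine_hard_negatives(windows, scores, threshold):
    """Keep windows from face-free images that the previous stage wrongly accepted.

    `windows` and `scores` are parallel sequences; any window scoring at or above
    `threshold` on a background image is a false positive of the previous stage.
    """
    if len(windows) != len(scores):
        raise ValueError("windows and scores must have the same length")
    return [w for w, s in zip(windows, scores) if s >= threshold]


if __name__ == "__main__":
    # Toy example: four windows (x, y, size) from a background image, scored by stage one.
    windows = [(0, 0, 12), (4, 0, 12), (8, 0, 12), (12, 0, 12)]
    scores = [0.02, 0.71, 0.10, 0.93]
    print(mine_hard_negatives(windows, scores, threshold=0.5))
```

Random background crops would mostly be easy cases that the 12-net already rejects, so they would teach the later nets comparatively little.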

create negative_py

Hi!
When I try to run create negative_py, I get this error:
read_img_name = data_base_dir + '/' + file_list[current_image].strip()
IndexError: list index out of range
Can you help me with this, please?
I have only put 340 scenery images in the directory.
By the way, I am not a pro, just a beginner.
Thanks a lot!
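An IndexError on file_list[current_image] usually means the loop runs past the end of the list, for example when a script expects more images than the 340 available. That cause is a guess, since the surrounding loop is not shown. A defensive sketch that iterates over whatever the list actually contains could look like this; `image_paths` and `max_images` are hypothetical names.

```python
def image_paths(file_list, data_base_dir, max_images=None):
    """Build full paths for the listed images without indexing past the end.

    `file_list` is the list of lines read from the image list file. Blank lines
    are skipped, and `max_images` is only an upper bound, never a required count.
    """
    names = [line.strip() for line in file_list if line.strip()]
    if max_images is not None:
        names = names[:max_images]        # slicing is safe even if the list is shorter
    return [data_base_dir + '/' + name for name in names]


if __name__ == "__main__":
    lines = ["scene_001.jpg\n", "scene_002.jpg\n", "\n"]
    for path in image_paths(lines, "/data/negatives", max_images=3000):
        print(path)
```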

About the face size in create_face_12c.sh

Dear anson,
I've read your code in create_face_12c.sh. I found that you resize the image to 60 * 60, and I don't understand why you do so. By the way, why do you set RESIZE_HEIGHT and RESIZE_WIDTH both to 15?
I hope you can answer. Thank you very much!

train_val.prototxt about FCN

@anson0910
Hi, can you provide the train_val.prototxt file for training face12c-FCN?
I only found CNN_face_detection_models / face_12c / face12c_full_conv.prototxt, which is used for testing.
And I don't know how to write the train_val.prototxt for the FCN. Forgive me.

Thank you very much!

The evaluation result

Could you show the evaluation results on FDDB (the recall rate or the discontinuous ROC curve)?

24-net and 48-net

In the 24-net, did you not include the 12-net's fully connected layer? And likewise in the 48-net, did you not include the 24-net's fully connected layer?

About training

Hi,
I'm training the 6 nets using Caffe, and I find that my nets converge very fast. The 12-net, for example, starts at loss 0.69 and accuracy 0.68, but after 1000 iterations the loss reaches 0.01 and the accuracy reaches 0.994, and after that they change only within a very small range.
Does this mean the net is overfitting?
When I test my net, I find that it doesn't filter out as many non-face regions as expected.
Why does this happen?
Thanks

Question about nms algorithm

Hi, sorry to trouble you again.
I have a question about the NMS algorithm. In your face_detection_functions.py, the "globalNMS" function has a condition at line 126, "result_rectangles[cur_rect_to_compare][5] < 0.85", which means the scale should be less than 0.85. However, if you detect faces with a minimum size of 32, the scale cannot be less than 32/12, so this condition can never be true.
So what is this condition for in this function?

How to get the file face12c_full_conv.caffemodel

Dear Anson,
I have used face_cascade_fullconv_single_crop_single_image.py with the files face12c_full_conv.prototxt and face12c_full_conv.caffemodel to detect faces in an image, and it works OK.
Then I used face_12c/solver.prototxt and face_12c/train_val.prototxt to train the model file face_12c_train_iter_400000.caffemodel (accuracy=0.882).
But with face_12c_train_iter_400000.caffemodel, faces are detected everywhere in the image. Should I derive face12c_full_conv.caffemodel from face_12c_train_iter_400000.caffemodel, and if so, how?
The file size of face12c_full_conv.caffemodel is 28038 bytes, but the file I trained, face_12c_train_iter_400000.caffemodel, is 28248 bytes.

Thanks
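For context, the usual way to turn a trained fully connected layer into a fully convolutional one is to reshape its weight matrix into convolution kernels. The NumPy sketch below checks that idea numerically. The 16 × 5 × 5 layer shape is an assumption made for the example, and the naive convolution is there only for verification; it is not how Caffe or the repository does it.

```python
import numpy as np


def fc_to_conv_weights(fc_weights, channels, kernel_h, kernel_w):
    """Reshape (num_output, C*H*W) inner-product weights into (num_output, C, H, W) kernels."""
    num_output = fc_weights.shape[0]
    return fc_weights.reshape(num_output, channels, kernel_h, kernel_w)


def conv_valid(feature_map, kernels, bias):
    """Naive stride-1 'valid' cross-correlation over a C x H x W feature map."""
    _, h, w = feature_map.shape
    n, _, kh, kw = kernels.shape
    out = np.empty((n, h - kh + 1, w - kw + 1), dtype=feature_map.dtype)
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            patch = feature_map[:, i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(kernels, patch, axes=3) + bias
    return out


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    fc_w, fc_b = rng.randn(2, 16 * 5 * 5), rng.randn(2)   # assumed layer shape
    features = rng.randn(16, 5, 5)
    kernels = fc_to_conv_weights(fc_w, 16, 5, 5)
    # On an input of exactly the training size, both forms give the same scores.
    print(np.allclose(conv_valid(features, kernels, fc_b)[:, 0, 0],
                      fc_w.dot(features.ravel()) + fc_b))
```

In Caffe itself this is typically done by loading both prototxt files and copying the flattened parameters from the inner-product layer into the convolution layer, as in Caffe's net surgery example.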

A question about the cascade cnn

Dear anson,
I'm sorry to trouble you again. I find that in the face detection part of your code, your 24-net just uses the result of the 12-net directly. But in the paper, the 24-net and 12-net are connected only at the fully connected layer, and the same goes for the 48-net and 24-net.
So please tell me, if convenient, whether I have misunderstood.
Thank you for your help!

How to train multi-resolution 24-net and 48-net?

Hi @anson0910,
You said "Multi-resolution is not used for simplicity, you can add them in the .prototxt files under CNN_face_detection_models to do so" in the README. It seems that this feature is very easy to add. Would you please explain it more clearly?

About the training step

Dear anson:
I have run your code before, and now I'd like to train my own Caffe model.
But I don't understand the training steps. For example, I do not know whether the six networks can be trained at the same time. If they are trained separately, where do I get the input for net_12_cal and net_24c? What is the difference between the models in the face_12c and face_12c2 folders? What do the *_SRquantize_*.caffemodel and *_quantize_*.caffemodel files store? When training the 12-net alone, accuracy = 0.5 and loss = 0.64 are almost unchanged after 400000 iterations; what results should I expect?
I hope you can answer my question if convenient. Thank you very much!

calibrate-12 cannot converge when training, while net-12 converged very easily

Hi @anson0910,
I collected about 20000 faces from AFLW and 60000 background patches for net-12 training. It converged quickly, after about 10000 iterations.
However, calibrate-12 does not converge after 100000 iterations using about 2000 faces. Each face generates 45 training patches, following the original paper. As a result, there are about 2000 * 45 = 90000 (roughly 100000) training samples for calibrate-12.
I have no clue what the problem is. Can you give me some advice?

ROC report:

see ROC
It was tested on FDDB, and trained with 20K positive samples from AFLW and 60K background images.

concerning lmdb

Hello,
I have read the paper. The input image size of the 12-net is 12 × 12.
However, in your code create_lmdb_scripts->face_12c->create_face_12c.sh, you resize the image to size 15. On the other hand, in the face_12c->deploy.prototxt file, the input dim is 3 × 12 × 12. Could you tell me the reason? Thank you very much.

Failed to test using deploy.prototxt of net-12

Hi @anson0910,
I trained a net-12 model using CNN_face_detection_models/face_12c/train_val.prototxt,
and I load this model with CNN_face_detection_models/face_12c/deploy.prototxt.
After that, I call detect_face_12c_net in CNN_face_detection/face_detection/face_detection_functions.py. It throws an error like "inner_product_layer.cpp:64] Check failed: K_ == new_K (400 vs. 605472) Input size incompatible with inner product parameters."
I think it is caused by the input image size.
In face_detection_functions.py, the original test image is resized to multiple scales, all of which are larger than 3*12*12, but deploy.prototxt requires a 3*12*12 image as input. It seems that Caffe doesn't do any sliding-window work automatically.
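The error message fits the diagnosis: an inner-product layer expects a fixed-size input, so a net defined with one cannot take a larger image directly. One option is the fully convolutional prototxt mentioned elsewhere in these issues; the other is to crop the windows explicitly, as in this minimal NumPy sketch. The function name and the 4-pixel stride are assumptions, not the repository's implementation.

```python
import numpy as np


def sliding_windows(image, window=12, stride=4):
    """Yield (x, y, patch) for every window x window crop of an H x W x C image."""
    h, w = image.shape[:2]
    for y in range(0, h - window + 1, stride):
        for x in range(0, w - window + 1, stride):
            yield x, y, image[y:y + window, x:x + window]


if __name__ == "__main__":
    resized = np.zeros((180, 240, 3), dtype=np.float32)   # e.g. 800 x 600 scaled by 12/40
    patches = list(sliding_windows(resized))
    print(len(patches), patches[0][2].shape)
    # Each 12 x 12 patch could then be fed to a net that expects a 3*12*12 input.
```

Feeding thousands of small patches one by one is much slower than a single fully convolutional pass over the resized image, which is presumably why the detection scripts use the full-conv model.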

Resize images when creating LMDB file

Hi anson.
When I created the LMDB files required to train all the nets, I set the image resize to 256 for both width and height. Will that change the accuracy when I test the model? If so, what changes do I need to make to get better results?

Thanks&Regards
N G K Sai

Speed Problem

I am running your method using face_cascade_fullconv_fddb.py on a K80 GPU. The speed is quite slow.

Processing file 1 ...
Processing image : 0
Processing image : 10
Processing image : 20
Processing image : 30
Processing image : 40
Processing image : 50
Processing image : 60
Processing image : 70
Processing image : 80
Processing image : 90
Processing image : 100
Processing image : 110
Processing image : 120
Processing image : 130
Processing image : 140
Processing image : 150
Processing image : 160
Processing image : 170
Processing image : 180
Processing image : 190
Processing image : 200
Processing image : 210
Processing image : 220
Processing image : 230
Processing image : 240
Processing image : 250
Processing image : 260
Processing image : 270
Processing image : 280
Average time spent on one image : 11.2854675207 s

Do you know the reason?

many false faces

Hello, dear anson. I used the code from this repository to detect faces in my test images, but it produces many false face rectangles. Can you help me deal with this? Thanks.

AFLW new website can't find AFLW_Faces.txt

So, I emailed the AFLW guys, and they sent me the password, and I downloaded the stuff. However, in your aflw.py file where we are supposed to train with the AFLW dataset, at the top of the file we have: read_file_name_faces = "/home/anson/face_pictures/AFLW/AFLW_Faces.txt" I looked through the downloads and couldn't find this AFLW_Faces.txt file anywhere. The same holds for AFLW_Rect.txt and AFLW_sex.txt. I was having a lot of trouble installing and using sqliteman and the other things that AFLW suggested, and was hoping to forgo that stuff and just go straight to taking the files I downloaded, putting them in the correct directory, and training the CNN. So my question is basically: is it vital that I get sqliteman working and do all that stuff, or should the files I named above be somewhere in the tar folders I downloaded from their website?

Hello, a few questions.

It's really nice of you to share your code, but I do not know where to start. Can you show me how to approach your code, or which folder to read first?

Hi, about the train_cal accuracy and loss!

I trained the 12_cal net myself, but I find the loss is about 2.3 to 2.7 and the accuracy is about 0.3 after 100,000 iterations. Is that right?
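One reference point for judging these numbers is chance level. Assuming the calibration net is a 45-way softmax classifier (the 45 patterns from the paper) trained with a natural-log loss, a net that has learned nothing would sit near the values computed below. The function name is hypothetical.

```python
import math


def chance_level(num_classes):
    """Softmax loss and accuracy of a uniform guess over `num_classes` classes."""
    return math.log(num_classes), 1.0 / num_classes


if __name__ == "__main__":
    loss, accuracy = chance_level(45)
    print("chance loss %.3f, chance accuracy %.3f" % (loss, accuracy))
```

By that yardstick a loss of 2.3 to 2.7 and an accuracy of 0.3 are clearly better than chance (about 3.81 and 0.022). Whether they match what the author obtained is not stated anywhere in this thread.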

About the result after running

Hello, I downloaded your project and ran it on my computer. Your work helps me a lot.
One problem is that my result looks as follows:

Of course, it's not good enough. I wonder whether this is because I did something incorrectly, or because there are bugs in your code or model. What result do you get when you run it in your own environment? I did not change your code or your Caffe model.
I hope you can answer my question. Thanks a lot.

ROC report on supplied models

Hello, Anson.

First of all thank you for sharing your code. It was very helpful to me.

I ran the benchmark on FDDB and got results that are not very impressive:

Does this mean that the supplied models are only for demo purposes and one should train them from scratch? Can I hope to obtain results similar to #18 that way?

Thanks in advance.

AFLW files

Hi,
Do you have any script to extract the AFLW data from the SQL database?
I mean the following files:
AFLW_Faces.txt
AFLW_Rect.txt
AFLW_sex.txt

Second question: do you need "Sex" for face detection, or is it for some other purpose?