anson0910 / cnn_face_detection
Implementation based on the paper Li et al., “A Convolutional Neural Network Cascade for Face Detection,” CVPR 2015
Hi!
Thanks for your implementation of the cascade CNN. You are doing a great job, and I have learned a lot from it.
However, I can't find some files you mentioned in the README, such as CNN_face_detection_models/create_lmdb_scripts/ models/face_12c/solver.prototxt.
Can you please tell me what happened? Can I get those files?
Thanks a lot!
Best regards,
Xing Wang
Hi~
In your code, when you prepare the calibration data, you just use the formula in the paper. For example, when the label is 0 (s=0.83, x=-0.17, y=-0.17), the window moves right and down.
When you detect a face and the current window is a little to the right of and below the correct window, you apply the calibration net and the label is 0. Obviously you cannot use the same parameters (0.83, -0.17, -0.17), because the window would then move even further right and down.
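The direction question above can be checked with a small sketch. It assumes the adjustment rule as I recall it from the paper, (x - xn*w/sn, y - yn*h/sn, w/sn, h/sn), and it assumes that label 0 is the first of the 45 patterns with the scale varying slowest. The helper names adjust_window and perturb_annotation are hypothetical and are not taken from the repository; perturb_annotation is simply the algebraic inverse of the adjustment, which is one self-consistent way to generate training windows.

```python
from itertools import product

# Calibration values listed in the paper (Li et al., CVPR 2015).
SCALES = (0.83, 0.91, 1.0, 1.10, 1.21)
X_OFFSETS = (-0.17, 0.0, 0.17)
Y_OFFSETS = (-0.17, 0.0, 0.17)

# Assumed label order: scale varies slowest, then x offset, then y offset.
PATTERNS = list(product(SCALES, X_OFFSETS, Y_OFFSETS))


def adjust_window(window, pattern):
    """Test-time correction of a detection window (x, y, w, h)."""
    x, y, w, h = window
    s, xn, yn = pattern
    return (x - xn * w / s, y - yn * h / s, w / s, h / s)


def perturb_annotation(window, pattern):
    """Inverse of adjust_window: build the training window for a pattern.

    Assumption: training windows are produced so that applying the same
    pattern at test time maps them back onto the annotated face.
    """
    x, y, w, h = window
    s, xn, yn = pattern
    return (x + xn * w, y + yn * h, s * w, s * h)


if __name__ == "__main__":
    face = (100.0, 100.0, 40.0, 40.0)
    label = 0
    shifted = perturb_annotation(face, PATTERNS[label])
    print("pattern for label 0:", PATTERNS[label])
    print("training window:", shifted)
    print("after calibration:", adjust_window(shifted, PATTERNS[label]))
```

If the training windows are generated with the same formula that is used for correction (rather than its inverse), the correction at test time would push the window the wrong way, which is the situation the question describes.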
You have helped me so much. If you come to Shanghai, remember to call me and I will treat you to dinner.
1. In your code, I see these two things (I do not know what to call them). What are they used for? Is there any difference between them?
2. What is the difference between the following choices?
if loadNet:
    net_12_cal = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)
else:
    net_12_cal = caffe.Classifier(MODEL_FILE, PRETRAINED,
                                  mean=np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1),
                                  channel_swap=(2, 1, 0),
                                  raw_scale=255,
                                  image_dims=(15, 15))
I use happynear's caffe flow and built it with VS2015, CUDA 8.0, and Anaconda2-4.3.1, without cuDNN. I have succeeded in building Caffe with the Python wrapper, but I don't know how to use the code and models to run a face detection test, because this is my first time using Python code. Can anyone give me some pointers or tips to continue the experiment? Thanks a lot.
Lines 60-62:
After calling detect_face_12c_net, the results are not sorted by confidence, but LocalNMS is then called, and it requires the input bboxes to be sorted by confidence.
Is that a bug?
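A minimal sketch of the ordering concern, assuming rectangles use the [x1, y1, x2, y2, confidence, scale] layout that appears elsewhere in this thread. The function sorted_nms and its iou_threshold default are hypothetical; this is a generic greedy NMS, not the repository's LocalNMS.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2, ...] rectangles."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / float(union) if union > 0 else 0.0


def sorted_nms(rectangles, iou_threshold=0.3):
    """Greedy NMS that sorts by confidence itself before suppressing."""
    # Sorting here means callers cannot forget it; index 4 is the confidence.
    ordered = sorted(rectangles, key=lambda r: r[4], reverse=True)
    kept = []
    for rect in ordered:
        # Keep a rectangle only if it does not overlap a stronger kept one.
        if all(iou(rect, k) <= iou_threshold for k in kept):
            kept.append(rect)
    return kept


if __name__ == "__main__":
    boxes = [
        [0, 0, 10, 10, 0.6, 1.0],    # weaker duplicate of the next box
        [1, 1, 11, 11, 0.9, 1.0],
        [50, 50, 60, 60, 0.7, 1.0],  # separate detection
    ]
    for box in sorted_nms(boxes):
        print(box)
```

Without the sort, a greedy pass would keep the first rectangle it sees, so a weak detection listed early could suppress a stronger overlapping one.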
And would you please share the ROC on the benchmark?
Sorry to trouble you again. I have some questions about the code and the paper.
1. current_rectangle = [int(2*current_x*current_scale), int(2*current_y*current_scale),
                        int(2*current_x*current_scale + net_kind*current_scale),
                        int(2*current_y*current_scale + net_kind*current_scale),
                        confidence, current_scale]
What is the meaning of the 2 in the code (2*current_x*current_scale)?
The paper says the image is built into an image pyramid to cover faces at different scales. In your code the image is only shrunk, never enlarged. Is that right?
The paper says that densely scanning an image of size 800 × 600 for 40 × 40 faces with 4-pixel spacing generates 2,494 detection windows, and that the time reduces to 10 ms on a GPU card, most of which is overhead in data preparation. Can you tell me how the 2,494 is calculated?
For the 12-net the paper says 12 × 12 detection windows. Is that because the net input is 12*12, so the window is 12?
Does the 4-pixel spacing correspond to train_val.prototxt, and how is the 4 calculated?
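The window count can be reproduced with simple arithmetic. The sketch below assumes the image is shrunk so that a 40 × 40 face becomes 12 × 12, and that a 12 × 12 window then slides with the 4-pixel spacing stated in the paper. The second function restates the rectangle expression quoted above with the factor 2 exposed as a stride parameter; reading that 2 as the effective stride of the fully convolutional 12-net is an assumption, and both function names are hypothetical.

```python
def count_windows(image_w, image_h, face_size, net_input=12, spacing=4):
    """Number of sliding windows after resizing faces down to net_input."""
    # Integer arithmetic avoids floating point rounding: 800 * 12 // 40 = 240.
    scaled_w = image_w * net_input // face_size
    scaled_h = image_h * net_input // face_size
    across = (scaled_w - net_input) // spacing + 1
    down = (scaled_h - net_input) // spacing + 1
    return across * down


def cell_to_rectangle(current_x, current_y, current_scale, confidence,
                      net_kind=12, stride=2):
    """Map an output-map cell back to a rectangle in the original image.

    stride=2 is an assumed interpretation of the literal 2 in the quoted code.
    """
    left = stride * current_x * current_scale
    top = stride * current_y * current_scale
    side = net_kind * current_scale
    return [int(left), int(top), int(left + side), int(top + side),
            confidence, current_scale]


if __name__ == "__main__":
    # 800x600 shrinks to 240x180, giving 58 x 43 window positions.
    print(count_windows(800, 600, 40))
    print(cell_to_rectangle(3, 5, 2.0, 0.9))
```

With these assumptions the 800 × 600 image becomes 240 × 180, which gives 58 positions across and 43 down, and 58 × 43 = 2,494.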
Thank you for helping me so much. Are you Chinese?
Hi, why are there differences between the 48net's train_val.prototxt and its deploy.prototxt?
For example, there is no "norm layer" in deploy, and deploy has a "dropout layer" below the "relu3 layer" that is not in train_val.prototxt.
I need the new code for calibration_AFLW.py.
Dear anson0910:
I really appreciate your work. I have a question about the 3000 images. Is there a trick to selecting these negative images? For example: 1. keep them varied; 2. use the remaining part of an image after removing the faces (AFLW dataset); 3. just use some scenery pictures?
Looking forward to your reply.
Hi, I'm trying to reproduce the results in the paper. Can you share the approximate thresholds t1 and t2 used in the training steps, and the approximate number of negative samples used to train the cali24 and cali48 nets? Also, have you tested on AFW? I tried a lot of methods, but the best AP is only about 90%. Thanks!
Dear anson:
I have run your code before, and now I'd like to train my own Caffe model.
But I don't understand the training steps. For example, I don't know what training set I should use. As I understand it, I should use 12*12 pixel images when training the 12net. But in your code create_negative.py I find that you crop images to different sizes. I don't understand why you do so.
I hope you can answer my question if convenient. Thank you very much!
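One plausible reading of cropping at different sizes is that the detector later sees background at many pyramid scales, so negative patches of many sizes, each resized to the net input afterwards, cover that variety. This is an interpretation and is not confirmed by the author. The sketch below only generates square crop boxes; sample_negative_boxes and its parameters are hypothetical, and the actual cropping and resizing to 12 × 12 would be done with whatever image library is in use.

```python
import random


def sample_negative_boxes(image_w, image_h, count, min_size=12, seed=0):
    """Return `count` random square boxes (x1, y1, x2, y2) inside the image."""
    max_size = min(image_w, image_h)
    if max_size < min_size:
        raise ValueError("image is smaller than the minimum crop size")
    rng = random.Random(seed)  # seeded so runs are repeatable
    boxes = []
    for _ in range(count):
        # Vary the side length so patches cover many apparent scales.
        size = rng.randint(min_size, max_size)
        x = rng.randint(0, image_w - size)
        y = rng.randint(0, image_h - size)
        boxes.append((x, y, x + size, y + size))
    return boxes


if __name__ == "__main__":
    for box in sample_negative_boxes(640, 480, 5):
        print(box)
```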
Sorry to bother you. I have run your test code, but the output image has many useless boxes. I have increased the threshold and the confidence. Please help me~ Thanks a lot!
Dear anson0910:
Could you please show me an example of how to implement the multi-resolution net structure?
I really don't know how to modify the .prototxt file to implement it. Thank you very much!
Dear anson,
I've trained a detection net using my own data. Now I want to train the calibration nets, but I don't know how to do so. What training set should I use? And how should I assign the labels?
I hope you will answer my question if convenient. Thank you very much!
Sorry to trouble you. Why should training the 24 and 48 detection nets use negative data from the previous detection net (the rectangles from the previous result), rather than the same negative data as the 12 net directly?
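The idea behind the question is usually called hard negative mining: the later nets only ever see windows that the earlier net let through, so their negatives are taken from the earlier net's false positives. A minimal sketch follows, assuming rectangles in [x1, y1, x2, y2, ...] form. The function select_hard_negatives and the max_iou value are hypothetical and are not taken from the repository or the paper.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2, ...] rectangles."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / float(union) if union > 0 else 0.0


def select_hard_negatives(detections, face_boxes, max_iou=0.3):
    """Keep detections from the previous net that overlap no annotated face.

    On images with no faces at all (face_boxes empty), every detection that
    survived the previous net is a false positive and is kept.
    """
    return [d for d in detections
            if all(iou(d, f) < max_iou for f in face_boxes)]


if __name__ == "__main__":
    previous_net_output = [
        [0, 0, 10, 10, 0.9],        # overlaps the annotated face
        [100, 100, 120, 120, 0.8],  # background the previous net accepted
    ]
    faces = [[1, 1, 11, 11]]
    print(select_hard_negatives(previous_net_output, faces))
```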
Hi!
When I try to run create_negative.py, I get the error:
read_img_name = data_base_dir + '/' + file_list[current_image].strip()
IndexError: list index out of range
Can you help me with this, please?
I have just put 340 scenery images in the directory.
By the way, I am not a pro, just a beginner.
thanks a lot!
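An IndexError on file_list[current_image] usually means the loop counter runs past the end of the list. One guess, not confirmed by the source, is that the script iterates over a fixed number of images that is larger than the 340 files available. The sketch below bounds the loop by the list length; iter_image_paths and the wanted parameter are hypothetical names.

```python
def iter_image_paths(data_base_dir, file_list, wanted):
    """Yield at most `wanted` image paths without indexing past the list."""
    # Never iterate further than the number of names actually read.
    available = min(wanted, len(file_list))
    for current_image in range(available):
        yield data_base_dir + '/' + file_list[current_image].strip()


if __name__ == "__main__":
    names = ['a.jpg\n', 'b.jpg\n']
    # Asking for more images than exist no longer raises IndexError.
    for path in iter_image_paths('/data/negatives', names, 3000):
        print(path)
```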
Dear anson,
I've read your code in create_face_12c.sh. I found that you resize the image to 60 * 60. I don't understand why you do so. And by the way, why do you set RESIZE_HEIGHT and RESIZE_WIDTH both to 15?
I hope for your answer. Thank you very much!
@anson0910
Hi, can you provide the train_val.prototxt file for training face12c-FCN?
I only found CNN_face_detection_models / face_12c / face12c_full_conv.prototxt, which is used for testing.
And I don't know how to write the train_val.prototxt for the FCN. Forgive me.
Thank you very much!
Could you show the evaluation results on FDDB (the recall rate or the discontinuous ROC curve)?
In the 24-net, did you not include the 12-net's fully connected layer? And likewise in the 48-net, did you not include the 24-net's fully connected layer?
Hi,
I'm training the 6 nets using Caffe. I find that my nets converge very fast. The 12-net, for example, starts at loss 0.69 and accuracy 0.68; however, after 1000 iterations the loss reaches 0.01 and the accuracy reaches 0.994, and then they only change within a very small range.
Does this mean that the net is overfitting?
When I test my net, I find that it doesn't filter out non-face regions as much as expected.
Why does this happen?
Thanks
Hi~ sorry to trouble you again~
I have a question about the NMS algorithm. In your face_detection_functions.py, the "globalNMS" function has a condition at line 126, "result_rectangles[cur_rect_to_compare][5] < 0.85", which means the scale should be less than 0.85. However, if you detect faces with a minimum size of 32, the scale cannot be less than 32/12, so this condition can never be true.
So what is this condition for in this function?
Dear Anson,
I have used face_cascade_fullconv_single_crop_single_image.py with the files face12c_full_conv.prototxt and face12c_full_conv.caffemodel to detect the faces in an image, and it works OK.
Then I used face_12c/solver.prototxt and face_12c/train_val.prototxt to train the model file face_12c_train_iter_400000.caffemodel (accuracy = 0.882).
But with the model file face_12c_train_iter_400000.caffemodel I detected many faces everywhere in the image. Should I derive face12c_full_conv.caffemodel from face_12c_train_iter_400000.caffemodel, and if so, how?
The file size of face12c_full_conv.caffemodel is 28038 bytes, but the file size of the face_12c_train_iter_400000.caffemodel I trained is 28248 bytes.
Thanks
Dear anson,
I'm sorry to trouble you again. I find that in the face detection part of your code, the 24 net just uses the result of the 12net directly. But in the paper the 24net and 12net are only connected at the fully-connected layer, and the same goes for the 48net and 24net.
So please tell me whether I have misunderstood, if convenient.
Thank you for your help!
hi @anson0910 ,
You said "Multi-resolution is not used for simplicity, you can add them in the .prototxt files under CNN_face_detection_models to do so" in the README. It seems that it is very easy to add this feature. Would you please explain it more clearly?
Dear anson:
I have run your code before, and now I'd like to train my own Caffe model.
But I don't understand the training steps. For example, I do not know whether the six networks can be trained at the same time. If they are trained separately, where do the inputs of net_12_cal and net_24c come from? What is the difference between the models in the face_12c and face_12c2 folders? What results do the *_SRquantize_*.caffemodel and *_quantize_*.caffemodel files preserve? After 400000 iterations of training the 12-net alone, accuracy = 0.5 and loss = 0.64 remain almost unchanged. What should the result be?
I hope you can answer my question if convenient. Thank you very much!
hi, @anson0910
I collected about 20000 faces from AFLW and 60000 background patches for net-12 training. It converged quickly, after about 10000 iterations.
However, calibrate-12 does not converge after 100000 iterations using about 2000 faces. Each face generates 45 training patches, as in the original paper. As a result, there are about 2000 * 45 = 90000 training samples to train calibrate-12.
I have no clue about the problem. Can you give me some advice?
See the ROC.
It was tested on FDDB and trained with 20K positive samples from AFLW and 60K background images.
Hello,
I have read the paper. The input image size of the net12 is 12 × 12.
However, in your code in create_lmdb_scripts->face_12c->create_face_12c.sh, you resize the image to size 15. On the other hand, in the face_12c->deploy.prototxt file, the input dim is 3 × 12 × 12. Could you tell me the reason? Thank you very much.
Hi, @anson0910
I trained the net-12 model using CNN_face_detection_models/face_12c/train_val.prototxt.
Then I loaded this model with CNN_face_detection_models/face_12c/deploy.prototxt.
After that, I called detect_face_12c_net in CNN_face_detection/face_detection/face_detection_functions.py. It threw an error like "inner_product_layer.cpp:64] Check failed: K_ == new_K (400 vs. 605472) Input size incompatible with inner product parameters."
I think it was caused by the input image size.
In face_detection_functions.py, the original test images are resized to multiple scales, which are larger than 3*12*12. But deploy.prototxt requires a 3*12*12 image as input. It seems that Caffe doesn't do any sliding-window work automatically.
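The two numbers in the error message can be reproduced from layer arithmetic. The sketch assumes the 12-net layout described in the paper (16 filters of 3 × 3 with stride 1, then 3 × 3 max pooling with stride 2) and Caffe's rule of rounding the pooling output size up. The function names are hypothetical, and the 320 × 478 input is just one size that yields the second number, not a value taken from the source.

```python
import math


def conv_out(size, kernel, stride=1):
    """Output length of an unpadded convolution (rounds down)."""
    return (size - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output length of an unpadded Caffe pooling layer (rounds up)."""
    return int(math.ceil((size - kernel) / float(stride))) + 1


def fc_input_dim(height, width, channels=16):
    """Flattened size that reaches the first inner product layer."""
    h = pool_out(conv_out(height, 3), 3, 2)
    w = pool_out(conv_out(width, 3), 3, 2)
    return channels * h * w


if __name__ == "__main__":
    # A 12x12 crop gives 16 * 5 * 5 values, the size the layer was trained on.
    print(fc_input_dim(12, 12))
    # A whole resized image gives far more values, hence the failed check.
    print(fc_input_dim(320, 478))
```

Under these assumptions a 12 × 12 input flattens to 400 values, while a full-size image flattens to a much larger vector, which an inner product layer with fixed weights cannot accept. That would be consistent with the thread's mention of a separate face12c_full_conv.prototxt for whole-image testing.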
Hi anson.
When I was creating the LMDB files required to train all the nets, I set the image resize to 256 for both width and height. Will that change the accuracy when I test the model? If so, what changes do I need to make to get better results?
Thanks & Regards
N G K Sai
I am running your method using face_cascade_fullconv_fddb.py on a K80 GPU. The speed is quite slow.
Processing file 1 ...
Processing image : 0
Processing image : 10
Processing image : 20
Processing image : 30
Processing image : 40
Processing image : 50
Processing image : 60
Processing image : 70
Processing image : 80
Processing image : 90
Processing image : 100
Processing image : 110
Processing image : 120
Processing image : 130
Processing image : 140
Processing image : 150
Processing image : 160
Processing image : 170
Processing image : 180
Processing image : 190
Processing image : 200
Processing image : 210
Processing image : 220
Processing image : 230
Processing image : 240
Processing image : 250
Processing image : 260
Processing image : 270
Processing image : 280
Average time spent on one image : 11.2854675207 s
Do you know the reason?
Hello, dear anson. I used the repository to detect faces in my test images, but there are many false face rectangles. Can you help me deal with it? Thanks.
So, I emailed the AFLW guys, and they sent me the password, and I downloaded the stuff. However, in your aflw.py file where we are supposed to train with the AFLW dataset, at the top of the file we have: read_file_name_faces = "/home/anson/face_pictures/AFLW/AFLW_Faces.txt" I looked through the downloads and couldn't find this AFLW_Faces.txt file anywhere. The same holds for AFLW_Rect.txt and AFLW_sex.txt. I was having a lot of trouble installing and using sqliteman and the other things that AFLW suggested, and was hoping to forgo that stuff and just go straight to taking the files I downloaded, putting them in the correct directory, and training the CNN. So my question is basically: is it vital that I get sqliteman working and do all that stuff, or should the files I named above be somewhere in the tar folders I downloaded from their website?
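If the download includes an SQLite database file, as the mention of sqliteman suggests, Python's built-in sqlite3 module can read it without any extra tools. The sketch below only lists the tables and dumps one of them to a text file. The function names are hypothetical, no AFLW table or column names are assumed, and the layout expected by AFLW_Faces.txt, AFLW_Rect.txt and AFLW_sex.txt is not known from this thread, so the output format here is a placeholder.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in an SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


def dump_table(db_path, table, out_path):
    """Write every row of `table` to out_path, one space-separated line each."""
    # Table names cannot be bound as SQL parameters, so validate them first.
    if table not in list_tables(db_path):
        raise ValueError("no such table: %s" % table)
    conn = sqlite3.connect(db_path)
    try:
        with open(out_path, 'w') as out:
            for row in conn.execute('SELECT * FROM "%s"' % table):
                out.write(' '.join(str(value) for value in row) + '\n')
    finally:
        conn.close()
```

Calling list_tables on the database path first shows which tables exist; the ones holding file names and face rectangles could then be dumped and reshaped into whatever layout aflw.py reads.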
It's really nice of you to share your code, but I do not know where to start. Can you show me how to learn your code, or which folder to read first?
I trained the 12_cal net myself, but I find the loss is about 2.3 to 2.7 and the accuracy is about 0.3 after 100,000 iterations. Is that right?
Hello, I downloaded your project and ran it on my computer. Your work helps me a lot.
One problem is that my result is as follows:
Of course it's not good enough. I wonder whether it is because of an incorrect operation on my part, or whether there are some bugs in your code/model. What result do you get in your own environment? I did not change your code or your Caffe model.
I hope you can answer my question. Thanks a lot.
Hello, Anson.
First of all thank you for sharing your code. It was very helpful to me.
I ran the benchmark on FDDB and got not very impressive results:
Does it mean that the supplied models are only for demo and one should train them from scratch? Can I hope to obtain results similar to #18 that way?
Thanks in advance.
Hi
Do you have any script to extract AFLW from the SQL database?
I mean the following files:
AFLW_Faces.txt
AFLW_Rect.txt
AFLW_sex.txt
Second question: do you need "Sex" for face detection, or is it for another purpose?
Thanks
Hi
I have tested your network on LFW. There is usually more than one rectangle per detection.
Is there any way to reduce the false positives?
Thank you very much.
hi @anson0910 ,
I noticed that the positive samples are generated by aflw.py, right? However, this script only outputs positive_16 ~ positive_17, rather than 1-15.
So I wonder which script positive 1-15 are generated from?
I managed to run the face_cascade_fullconv_single_crop_single_image.py and got 6 faces detected in 2002/07/19/big/img_352.jpg. Is this normal? Thanks.