
sfd.pytorch's Introduction

sfd.pytorch

An SFD implementation for face detection in PyTorch. Paper: SFD: Single Shot Scale-invariant Face Detector

Need help 💦

This repo is still under development; issues and pull requests are very welcome. The weights given below currently achieve 0.7 mAP on the full WIDER FACE validation set. I'm still training it on my poor 1080 Ti. If you have more computing resources and want to train this, please open an issue to get in touch.

Requirements

  • Python 3.6
  • PyTorch 0.4
  • TensorBoard (optional)

TODOs

  • Training on WIDER FACE.
  • Inference tools and API.
  • Non-maximum suppression at inference.
  • TensorBoard support.
  • Evaluation.
  • Image augmentation.
  • Multi-class detection.

Detection

detector.py can be run as a script or imported as a module; see inference.ipynb for a quick look at how to use the detector API. To use it directly from the command line, run:

python3 detector.py --image ./image/test.jpg --model ./epoch_204.pth.tar

The trained model epoch_204.pth.tar can be downloaded from Baidu Yun or Google Drive.

The detector draws a bounding box around each detected face.

Train

To train on the WIDER FACE dataset, download and extract everything into one directory named wider_face. The directory tree should then look like this:

└── wider_face
    ├── Submission_example.zip
    ├── wider_face_split
    │   ├── readme.txt
    │   ├── wider_face_test_filelist.txt
    │   ├── wider_face_test.mat
    │   ├── wider_face_train_bbx_gt.txt
    │   ├── wider_face_train.mat
    │   ├── wider_face_val_bbx_gt.txt
    │   └── wider_face_val.mat
    ├── wider_face_split.zip
    ├── WIDER_test
    │   └── images
    ├── WIDER_test.zip
    ├── WIDER_train
    │   └── images
    ├── WIDER_train.zip
    ├── WIDER_val
    │   └── images
    └── WIDER_val.zip
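
Before starting a long training run, it can help to confirm that the extracted files are where the tree above says they should be. The following is a minimal sketch, not part of the repo: the helper name missing_entries and the choice of which entries to check are assumptions (the zip archives are skipped on the assumption that training only reads the extracted files).

```python
from pathlib import Path

# Entries copied from the directory tree above. Which of them training
# actually reads is an assumption; adjust the list to your setup.
EXPECTED = [
    "wider_face_split/wider_face_train_bbx_gt.txt",
    "wider_face_split/wider_face_val_bbx_gt.txt",
    "WIDER_train/images",
    "WIDER_val/images",
]


def missing_entries(dataset_dir):
    """Return the expected relative paths that do not exist under dataset_dir."""
    root = Path(dataset_dir)
    return [entry for entry in EXPECTED if not (root / entry).exists()]


# Example: report anything missing under ./wider_face
for entry in missing_entries("wider_face"):
    print("missing:", entry)
```

An empty result means every listed entry was found.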

In config.py, set DATASET_DIR to the path of wider_face and set LOG_DIR to any existing directory. Training can then be started with the following command:

python3 main.py # there is no stdout

The training log is written to LOG_DIR/log.txt, and models are saved to LOG_DIR/models/epoch_xx.pth. config.py has many options (including learning rate and resumption) that you can tweak to get a better model.

TensorBoard

If you have TensorBoard installed, set TENSOR_BOARD_ENABLED to True in config.py. You can then start the TensorBoard server quickly with the following command:

./bin/tensorboard

This lets you visualize how the loss changes during training.

sfd.pytorch's Issues

Scoring mechanism.

The scores seem strange to me.
Some of them are 18-19 and others are 6-8,
and the threshold is set to 6.
What is the score range?

about loss implementation

@louis-she HI

In the paper, the loss computation is described as follows:

The two terms are normalized by Ncls and Nreg , and weighted by a balancing parameter λ . 
In our implementation, the cls term is normalized by the number of positive and negative anchors, and the reg term is normalized by the number of positive anchors.  Because of the imbalance between the number of positive and negative anchors, λ is used to balance these two loss terms.

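In other words, in theory the classification loss is divided by the number of (positive + negative) samples, the regression loss is divided by the number of positive samples, and the classification loss is then multiplied by a weight (4).
For example, if there are 100 positive samples, the classification loss is divided by 400 and then multiplied by 4, while the regression loss is divided by 100.

In the code, however, the two losses are computed as: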

loss_class = 4 * F.cross_entropy(total_effective_pred, total_targets)
loss_reg = F.smooth_l1_loss(total_t, total_gt)

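Here both the classification loss and the regression loss are effectively divided by the number of (positive + negative) samples, i.e. 400.

So isn't it a problem to multiply the classification loss by the weight 4 on top of that?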
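
To make the arithmetic in this issue concrete, here is a minimal pure-Python sketch of the normalization as the quoted passage describes it, with λ applied to the classification term as the issue reads it. The function name combined_loss and the per-anchor loss values are made up for illustration; this is not the repo's code.

```python
def combined_loss(cls_losses, reg_losses_pos, lam=4.0):
    """Combine per-anchor losses as described in the quoted passage.

    cls_losses:     classification loss of every selected anchor
                    (positives and negatives together).
    reg_losses_pos: regression loss of every positive anchor.
    lam:            balancing weight on the classification term
                    (4 in the issue above).
    """
    n_cls = max(len(cls_losses), 1)      # positives + negatives
    n_reg = max(len(reg_losses_pos), 1)  # positives only
    loss_cls = lam * sum(cls_losses) / n_cls
    loss_reg = sum(reg_losses_pos) / n_reg
    return loss_cls + loss_reg, loss_cls, loss_reg


# 100 positives and 300 negatives, each with a made-up loss of 1.0:
# the classification sum is divided by 400, the regression sum by 100.
total, loss_cls, loss_reg = combined_loss([1.0] * 400, [1.0] * 100)
print(total, loss_cls, loss_reg)  # 5.0 4.0 1.0
```

The point of the sketch is only that the two terms use different denominators; whether the repo's F.smooth_l1_loss call matches this depends on what total_t contains.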

about anchor

@louis-she HI

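The anchor setup looks a bit off to me. It seems you followed Fig. 1(3) of the paper exactly.
For example:
The image size is 640.
On the 5*5 feature map, the anchor scale is 512 and the stride is 128. According to your code, the first anchor is (256, 256, 512, 512), which is a strange center position. Shouldn't the anchor be something like (64, 64, 512, 512)?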
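
The layout this issue expects, with each anchor centred on its feature-map cell, can be sketched as follows. The helper anchor_grid is hypothetical and uses the common "(index + 0.5) * stride" convention; it is not the repo's anchor code.

```python
def anchor_grid(feature_size, stride, scale):
    """Return (cx, cy, w, h) anchors centred on each feature-map cell."""
    anchors = []
    for row in range(feature_size):
        for col in range(feature_size):
            cx = (col + 0.5) * stride  # centre of the cell along x
            cy = (row + 0.5) * stride  # centre of the cell along y
            anchors.append((cx, cy, scale, scale))
    return anchors


# 640x640 input, 5x5 feature map, stride 128, anchor scale 512
anchors = anchor_grid(feature_size=5, stride=128, scale=512)
print(anchors[0])   # (64.0, 64.0, 512, 512)
print(anchors[-1])  # (576.0, 576.0, 512, 512)
```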

about scale compensation anchor matching strategy

@louis-she HI

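Based on issue #12 of the Caffe source and the paper, my understanding is:

In the paper's scale compensation anchor matching strategy, after the matching threshold is lowered (from 0.5 to 0.35), the second matching stage is run only if a face in the image is matched to fewer than 6 anchors, and each face is matched to at most 6 anchors.
In your code, however, the count considered is the number of anchors matched by all faces in the image (capped at 100), not the number matched by each individual face.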
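
The per-face reading of the strategy described in this issue can be sketched in pure Python as below. This is an illustration under stated assumptions, not the repo's mark_anchors: the names iou and match_anchors are hypothetical, boxes are (x1, y1, x2, y2), the stage-two candidate threshold of 0.1 is an assumed parameter, and conflicts where one anchor matches several faces are ignored.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_anchors(faces, anchors, stage1_thresh=0.35, stage2_thresh=0.1, top_n=6):
    """Return {face_index: [anchor_index, ...]} with per-face compensation."""
    matches = {}
    for f, face in enumerate(faces):
        overlaps = [(iou(face, anchor), a) for a, anchor in enumerate(anchors)]
        # Stage one: every anchor at or above the lowered threshold matches.
        matched = [a for score, a in overlaps if score >= stage1_thresh]
        # Stage two: only a face with fewer than top_n matches is compensated,
        # by taking its top_n anchors among those above stage2_thresh.
        if len(matched) < top_n:
            candidates = sorted(
                (pair for pair in overlaps if pair[0] > stage2_thresh),
                reverse=True,
            )
            matched = [a for _, a in candidates[:top_n]]
        matches[f] = matched
    return matches
```

The key difference from a per-image cap is that the len(matched) < top_n test is evaluated separately for every face.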

train on multiple gpus

Can the code be trained on multiple GPUs?
Training on a single GPU works fine,
but training on multiple GPUs produces errors like the one below:
"RuntimeError: arguments are located on different GPUs at /opt/conda/conda-bld/pytorch_1533672544752/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:314"

thanks~

fps

Hello, may I ask how fast your S3FD reimplementation runs at test time? Thanks.

Change Anchor Boxes Aspect Ratio

Dear @louis-she,
Thank you for your nice work.
As Shifeng Zhang mentions in his paper ("Our anchors are 1:1 aspect ratio (i.e., square anchor)"), the aspect ratios of all anchor boxes are 1:1. If one wants to change the aspect ratio of the anchor boxes, for example to 1:1.4 (width:height), must the network structure be changed, or only the training code and decoding strategy (i.e., utils.py or anchor.py)? Would you please explain how to change the aspect ratios of the anchor boxes?
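
On the anchor-generation side only, a non-square anchor is often derived from a square scale by keeping the area fixed, as in SSD-style detectors. The sketch below assumes that convention; anchor_shape is a hypothetical helper, and it does not answer whether this repo's network or training code would also need changes.

```python
import math


def anchor_shape(scale, width_over_height=1.0):
    """Return (w, h) with area scale**2 and the requested width/height ratio."""
    root = math.sqrt(width_over_height)
    return scale * root, scale / root


# A 1:1.4 (width:height) anchor with the same area as a 16x16 square anchor
w, h = anchor_shape(16, 1 / 1.4)
print(round(w, 2), round(h, 2))
```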

some missing attributes in config.py

hi Louis,
I think some attributes are missing from config.py, such as RANDOM_CROP, RANDOM_FLIP, MIN_CROPPED_RATIO, etc.
Could you please share the config.py file you used for training?
Many thanks,
Yiming

about 'maxout'

Hi, I don't see 'maxout' implemented in the code. Is this different from the paper?

about bbox regression

@louis-she HI
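When the final box coordinates are computed from the anchor coordinates and the offsets predicted by the network, a variances parameter is usually introduced. I don't see this parameter in your code, though.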

x = (predictions[:, 0] * anchors[:, 2] + anchors[:, 0])
y = (predictions[:, 1] * anchors[:, 3] + anchors[:, 1])
w = (torch.exp(predictions[:, 2]) * anchors[:, 2])
h = (torch.exp(predictions[:, 3]) * anchors[:, 3])

bounding_boxes = torch.stack((x, y, w, h), dim=1).cpu().data.numpy()  #(200,4)
bounding_boxes = change_coordinate_inv(bounding_boxes)
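
For comparison, the SSD-style decoding with variances that this issue refers to can be sketched for a single box as below. The default values (0.1, 0.2) are the ones commonly used in SSD implementations and are an assumption here; decode is a hypothetical helper, and a model trained without variances must also be decoded without them.

```python
import math


def decode(pred, anchor, variances=(0.1, 0.2)):
    """Decode one (tx, ty, tw, th) prediction against a (cx, cy, w, h) anchor."""
    tx, ty, tw, th = pred
    ax, ay, aw, ah = anchor
    cx = tx * variances[0] * aw + ax   # centre offsets scaled by variances[0]
    cy = ty * variances[0] * ah + ay
    w = math.exp(tw * variances[1]) * aw  # log-size offsets scaled by variances[1]
    h = math.exp(th * variances[1]) * ah
    return cx, cy, w, h


# With variances=(1.0, 1.0) this reduces to the formulas quoted in the issue.
print(decode((1.0, 1.0, 0.0, 0.0), (0.0, 0.0, 10.0, 10.0), variances=(1.0, 1.0)))
# (10.0, 10.0, 10.0, 10.0)
```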

questions about mark_anchors in anchor.py

In the paper, Section 3.2 (Scale compensation anchor matching strategy),
stage two is used to compensate the faces with few anchors.
However, in anchor.py, mark_anchors seems to compensate all the faces.

import error:cannot import name 'cfg'

When I run the command:
python3 detector.py --image ./image/test.jpg --model ./epoch_204.pth.tar
I get this error:
Traceback (most recent call last):
  File "detector.py", line 11, in <module>
    from model import Net
  File "E:\code\S3FD\sfd.pytorch-master\model.py", line 5, in <module>
    from torchvision.models.vgg import VGG, cfg, make_layers, vgg16
ImportError: cannot import name 'cfg'
I still can't figure out what is going wrong.

about tiny face

@louis-she HI
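According to issue #7 of the Caffe source, tiny faces need to be filtered out because of how eval_tools works. The original author states: "When training, we do not use these extremely tiny faces (i.e., width or height < 6 pixels)."
Your code does not seem to do this, though.

Thanks!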
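
The filtering rule quoted in this issue can be sketched as a small helper. drop_tiny_faces is a hypothetical name, and the boxes are assumed to be in the (x1, y1, w, h) order of the WIDER FACE annotation files.

```python
def drop_tiny_faces(boxes, min_size=6):
    """Keep (x1, y1, w, h) boxes whose width and height are both >= min_size."""
    return [box for box in boxes if box[2] >= min_size and box[3] >= min_size]


boxes = [(0, 0, 5, 20), (0, 0, 6, 6), (1, 1, 30, 4)]
print(drop_tiny_faces(boxes))  # [(0, 0, 6, 6)]
```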

dataset adapter

Dear @louis-she & @pengbo0054,
In sfd.pytorch/dataset.py (lines 97 to 100 at commit 3f62b17),

x[0] * height_scale,
x[1] * width_scale,
x[2] * height_scale,
x[3] * width_scale

you multiply x[0] and x[2] by the height scale and x[1] and x[3] by the width scale.
However, the readme of the WIDER FACE dataset states:

The format of txt ground truth is as follows: 
File name
Number of bounding box
x1, y1, w, h, blur, expression, illumination, invalid, occlusion, pos

In other words, I think there is a mistake in this part of the code. Maybe dataset.py should be changed as follows:

        # scale coordinate
        height, width = image.shape[:2]
        width_scale, height_scale = 640.0 / width, 640.0 / height
        coordinates = np.array(list(map(lambda x: [
            x[0] * width_scale,  # Change this part
            x[1] * height_scale,  # Change this part
            x[2] * width_scale,  # Change this part
            x[3] * height_scale  # Change this part
        ], coordinates)))

Am I correct?
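
The scaling proposed above can be checked on a small example. This sketch assumes the coordinates are still in the (x1, y1, w, h) order of the annotation file at this point; if dataset.py reorders them earlier, the original code may already be correct. scale_boxes is a hypothetical standalone helper, not the repo's method.

```python
def scale_boxes(boxes, image_width, image_height, target=640.0):
    """Scale (x1, y1, w, h) boxes from the original image to target x target."""
    width_scale = target / image_width
    height_scale = target / image_height
    return [
        (x1 * width_scale, y1 * height_scale, w * width_scale, h * height_scale)
        for x1, y1, w, h in boxes
    ]


# A 1280x640 image: x and w are halved, y and h are unchanged.
print(scale_boxes([(100, 50, 200, 100)], image_width=1280, image_height=640))
# [(50.0, 50.0, 100.0, 100.0)]
```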

One more note: I suggest adding a general ListDataset class instead of dataset-specific classes (e.g., WIDER FACE, Pascal VOC, etc.). For example, the ListDataset class could read annotations as follows:

Load image/labels/boxes from a list file (e.g., *.txt file).
The list file is like:
a.jpg xmin ymin xmax ymax label xmin ymin xmax ymax label ...
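
Parsing one line of such a list file could look like the sketch below. parse_list_line is a hypothetical helper; it assumes whitespace-separated fields, file names without spaces, and integer labels, none of which the suggestion above specifies.

```python
def parse_list_line(line):
    """Parse 'name xmin ymin xmax ymax label ...' into (name, boxes, labels)."""
    fields = line.split()
    name, values = fields[0], fields[1:]
    if len(values) % 5 != 0:
        raise ValueError("expected 5 values per box in line: %r" % line)
    boxes, labels = [], []
    for i in range(0, len(values), 5):
        # Four coordinates followed by one class label per box
        boxes.append(tuple(float(v) for v in values[i:i + 4]))
        labels.append(int(values[i + 4]))
    return name, boxes, labels


print(parse_list_line("a.jpg 1 2 3 4 0 5 6 7 8 1"))
# ('a.jpg', [(1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)], [0, 1])
```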

Thank you

Error while testing the detector

Hi, thank you for the great work.
I was trying to test your code, but when I run:
python3 detector.py --image ./images/test.jpg --model ./epoch_41.pth.tar
I got the following error.

predicted bounding boxes of faces:
Traceback (most recent call last):
  File "detector.py", line 160, in <module>
    main(args)
  File "detector.py", line 141, in main
    bboxes = Detector(args.model).infer(args.image)
  File "detector.py", line 24, in __init__
    self.model.load_state_dict(checkpoint['state_dict'], strict=True)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Net:
	Missing key(s) in state_dict: "predict3_3_reg.weight", "predict3_3_reg.bias", "predict4_3_reg.weight", "predict4_3_reg.bias", "predict5_3_reg.weight", "predict5_3_reg.bias", "predict_fc7_reg.weight", "predict_fc7_reg.bias", "predict6_2_reg.weight", "predict6_2_reg.bias", "predict7_2_reg.weight", "predict7_2_reg.bias", "predict3_3_cls.weight", "predict3_3_cls.bias", "predict4_3_cls.weight", "predict4_3_cls.bias", "predict5_3_cls.weight", "predict5_3_cls.bias", "predict_fc7_cls.weight", "predict_fc7_cls.bias", "predict6_2_cls.weight", "predict6_2_cls.bias", "predict7_2_cls.weight", "predict7_2_cls.bias". 
	Unexpected key(s) in state_dict: "predict3_3.weight", "predict3_3.bias", "predict4_3.weight", "predict4_3.bias", "predict5_3.weight", "predict5_3.bias", "predict_fc7.weight", "predict_fc7.bias", "predict6_2.weight", "predict6_2.bias", "predict7_2.weight", "predict7_2.bias". 

There seems to be something wrong with the provided model or with the way the model is loaded, but I am new to PyTorch and couldn't figure out the problem. I have PyTorch 0.4 and Python 3.6 as required.

Thank you.
