ssr-net's Introduction

SSR-Net

[IJCAI18] SSR-Net: A Compact Soft Stagewise Regression Network for Age Estimation

Code Author: Tsun-Yi Yang

Last update: 2019/09/19 (Renew the morph2 dataset link)

Paper

PDF

https://github.com/shamangary/SSR-Net/blob/master/ijcai18_ssrnet_pdfa_2b.pdf

Paper authors

Tsun-Yi Yang, Yi-Hsuan Huang, Yen-Yu Lin, Pi-Cheng Hsiu, and Yung-Yu Chuang

Abstract

This paper presents a novel CNN model called Soft Stagewise Regression Network (SSR-Net) for age estimation from a single image with a compact model size. Inspired by DEX, we address age estimation by performing multi-class classification and then turning classification results into regression by calculating the expected values. SSR-Net takes a coarse-to-fine strategy and performs multi-class classification with multiple stages. Each stage is only responsible for refining the decision of the previous stage. Thus, each stage performs a task with few classes and requires few neurons, greatly reducing the model size. For addressing the quantization issue introduced by grouping ages into classes, SSR-Net assigns a dynamic range to each age class by allowing it to be shifted and scaled according to the input face image. Both the multi-stage strategy and the dynamic range are incorporated into the formulation of soft stagewise regression. A novel network architecture is proposed for carrying out soft stagewise regression. The resultant SSR-Net model is very compact and takes only 0.32 MB. Despite its compact size, SSR-Net’s performance approaches those of the state-of-the-art methods whose model sizes are more than 1500x larger.
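The step of "turning classification results into regression by calculating the expected values" can be sketched in a few lines. This is only an illustration of the general idea, not the SSR-Net code: the function names `softmax` and `expected_age`, the three age classes, and the logits are all made up for the example.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_age(logits, ages):
    """Regress an age as the probability-weighted mean of the class ages."""
    probs = softmax(logits)
    return sum(p * a for p, a in zip(probs, ages))

ages = [10, 30, 50]                                   # hypothetical class centres
uniform = expected_age([0.0, 0.0, 0.0], ages)         # equal confidence in all classes
peaked = expected_age([100.0, 0.0, 0.0], ages)        # almost all mass on the first class
print(uniform, peaked)
```

With equal scores the estimate falls in the middle of the class ages, and with one dominant class it moves towards that class, which is what lets a classifier output a continuous value.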

Platform

  • Keras
  • Tensorflow
  • GTX-1080Ti
  • Ubuntu

Dependencies

pip install mtcnn
conda install -c conda-forge moviepy
conda install -c cogsci pygame
conda install -c conda-forge requests
conda install -c conda-forge pytables

Codes

This project has three sections:

  1. Data pre-processing
  2. Training and testing
  3. Video demo section

The following sections go through each in detail.

This repository is for IMDB, WIKI, and Morph2 datasets.

1. Data pre-processing

cd ./data
python TYY_IMDBWIKI_create_db.py --db imdb --output imdb.npz
python TYY_IMDBWIKI_create_db.py --db wiki --output wiki.npz
python TYY_MORPH_create_db.py --output morph_db_align.npz

2. Training and testing

The experiments randomly assign 80% of each dataset to training and the remaining 20% to validation (or testing). The detailed settings for each dataset are given in the paper.
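A random 80/20 split of this kind can be sketched as follows. The helper name `split_indices`, the fixed seed, and the sample count are assumptions for illustration; the training scripts in this repository may implement the split differently.

```python
import random

def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and cut them into training and validation parts."""
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

train_idx, val_idx = split_indices(1000)
print(len(train_idx), len(val_idx))
```

Splitting indices rather than the image array itself keeps images and labels aligned, since both can be indexed with the same lists.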

For MobileNet and DenseNet:

cd ./training_and_testing
sh run_all.sh

For SSR-Net:

cd ./training_and_testing
sh run_ssrnet.sh
  • Note that this script runs several different hyper-parameter combinations. If you only want a single hyper-parameter set, edit the commands inside "run_ssrnet.sh".

Plot the results: For example, to plot the training curves and results after training on the IMDB dataset, copy "plot.sh", "ssrnet_plot.sh", and "plot_reg.py" into "./imdb_models". The following commands should then plot the results of the training process.

sh plot.sh
sh ssrnet_plot.sh

3. Video demo section

Pure CPU demo command:

cd ./demo
KERAS_BACKEND=tensorflow CUDA_VISIBLE_DEVICES='' python TYY_demo_mtcnn.py TGOP.mp4

# Or you can use

KERAS_BACKEND=tensorflow CUDA_VISIBLE_DEVICES='' python TYY_demo_mtcnn.py TGOP.mp4 '3'
  • Note: You may choose different pre-trained models. However, the Morph2 dataset was collected in a well-controlled environment and is much smaller than IMDB and WIKI, so models pre-trained on Morph2 may perform poorly on in-the-wild images. Therefore, IMDB or WIKI pre-trained models are recommended for in-the-wild images or the video demo.

  • We use dlib for face detection and alignment in the previous experimental section since the face data is well organized. However, dlib cannot provide satisfactory face detection for in-the-wild video. Therefore, we use MTCNN for face detection in the demo section.

Real-time webcam demo

Since the face detection process (MTCNN or Dlib) is not fast enough for a real-time demo, we provide a real-time webcam version that uses an LBP face detector.

cd ./demo
KERAS_BACKEND=tensorflow CUDA_VISIBLE_DEVICES='' python TYY_demo_ssrnet_lbp_webcam.py
  • Note that the region covered by face detection differs between MTCNN, Dlib, and LBP. The face crop used at inference should be similar in size to the one used in training.
  • Also, the pre-trained models are mainly for evaluation on the datasets, not for real-world images. You should always retrain the model on your own dataset. In the webcam demo, we found that the Morph2 pre-trained model actually performs better than the WIKI pre-trained model. The discussion will be included in our future work.
  • If you are Asian, you might want to use the megaage_asian pre-trained model.
  • The Morph2 pre-trained model is good for webcam use, but its gender model is overfitted and not practical.
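One common way to make crops from different detectors more comparable is to enlarge the detected box by a margin before cropping. The sketch below shows that idea only; the function name `expand_box`, the `(x, y, w, h)` box layout, and the margin value are assumptions, and the demo scripts may use a different scheme.

```python
def expand_box(x, y, w, h, img_w, img_h, margin=0.5):
    """Grow a face box by margin * size on each side, clipped to the image."""
    dx = int(w * margin)
    dy = int(h * margin)
    x1 = max(x - dx, 0)
    y1 = max(y - dy, 0)
    x2 = min(x + w + dx, img_w)
    y2 = min(y + h + dy, img_h)
    return x1, y1, x2, y2

# A 50x50 detection well inside a 640x480 frame.
inner = expand_box(100, 80, 50, 50, img_w=640, img_h=480)
# A detection in the corner of a small image, where clipping kicks in.
corner = expand_box(0, 0, 10, 10, img_w=12, img_h=12)
print(inner, corner)
```

Whatever margin is chosen, the point made in the notes above still applies: the same margin should be used when building the training set and when running inference.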

4. Extension

Training the gender model

A binary classification problem can be reformulated as a regression problem, with SSR-Net predicting the confidence. As an example of this extension, the code provides gender regression and a demo.

Training the gender network:

cd ./training_and_testing
sh run_ssrnet_gender.sh

Note that the score lies in [0,1], and the 'V' inside SSR-Net can be changed to 1 for general-purpose regression.
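As a rough illustration of reading a [0,1] regression score as a binary decision, the sketch below clamps a score and applies a threshold. The function name `score_to_label`, the 0.5 threshold, and the label convention (1 for scores at or above the threshold) are assumptions for illustration, not taken from the repository.

```python
def score_to_label(score, threshold=0.5):
    """Map a regression score to a binary label (hypothetical convention)."""
    # Clamp first: a regressor is not guaranteed to stay inside [0, 1].
    clamped = min(max(score, 0.0), 1.0)
    return 1 if clamped >= threshold else 0

labels = [score_to_label(s) for s in (-0.1, 0.3, 0.5, 0.97, 1.2)]
print(labels)
```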

ssr-net's Issues

Network architecture

It seems that the network architecture in SSR_Net.py differs from the paper. For example, each stream of stage 1 has only one pooling layer in the code but two in the paper. @shamangary

Accuracy question about gender estimation

Hello, I use SSR-Net to estimate gender with the WIKI pre-trained model, but the output does not seem to be a confidence: the values for males are greater than 1, and the values for females are not below 0.5 (mostly 0.9999...), so there are many errors with 0.5 as the threshold.
Is there a problem with my test?
Thank you!

Positive mean errors for ages >50yrs

I have trained SSR-Net on my private dataset, and the best model has an MAE of ~4 years.
Apart from heteroscedasticity, the mean error for younger ages is zero, but for ages > 50 I am seeing a large positive mean error. Is there any way to resolve this?

Why stage1 (layer4) has no pooling layer?

Excuse me. I read your code and didn't see any pooling layer (AvgPooling or MaxPooling) in Stage1 (code: layer4), although pooling does exist in Stage3 (code: layer1) and Stage2 (code: layer2).

Why the difference? Thank you.

Error when training the dataset. Please help!

Could someone please help me? I'm getting an error and I cannot understand why. I get it when I run:
python3 create_db.py --output data/imdb_db.mat --db imdb --img_size 64

Does it have something to do with a dependency not being installed?
Please help me!

Full error output:

(EnVee) D:\FYP\FYP Platforms\SSR-Net-master\data>python TYY_IMDBWIKI_create_db.py --db imdb --output imdb.npz
Traceback (most recent call last):
File "TYY_IMDBWIKI_create_db.py", line 61, in <module>
main()
File "TYY_IMDBWIKI_create_db.py", line 34, in main
full_path, dob, gender, photo_taken, face_score, second_face_score, age = get_meta(mat_path, db)
File "D:\FYP\FYP Platforms\SSR-Net-master\data\TYY_utils.py", line 17, in get_meta
meta = loadmat(mat_path)
File "C:\Users\VenuraKetipearachchi\Anaconda3\envs\EnVee\lib\site-packages\scipy\io\matlab\mio.py", line 142, in loadmat
matfile_dict = MR.get_variables(variable_names)
File "C:\Users\VenuraKetipearachchi\Anaconda3\envs\EnVee\lib\site-packages\scipy\io\matlab\mio5.py", line 272, in get_variables
hdr, next_position = self.read_var_header()
File "C:\Users\VenuraKetipearachchi\Anaconda3\envs\EnVee\lib\site-packages\scipy\io\matlab\mio5.py", line 226, in read_var_header
mdtype, byte_count = self._matrix_reader.read_full_tag()
File "mio5_utils.pyx", line 548, in scipy.io.matlab.mio5_utils.VarReader5.read_full_tag
File "mio5_utils.pyx", line 556, in scipy.io.matlab.mio5_utils.VarReader5.cread_full_tag
File "streams.pyx", line 171, in scipy.io.matlab.streams.ZlibInputStream.read_into
File "streams.pyx", line 158, in scipy.io.matlab.streams.ZlibInputStream._fill_buffer
zlib.error: Error -3 while decompressing data: invalid code lengths set

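No predicted ages below 15

When I run predict with the model, children and infants are all predicted as 15+ years old. I see that the megaage-asian dataset covers ages 0-70, so why do my predictions never fall in the children's range? Is it a limitation of the model itself, or did I do something wrong?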

another question

Hi @shamangary, can I use this network to train age and gender at the same time? Also, if gender has 3 classes, i.e. (0: other, 1: female, 2: male), should V in the merge_age function be set to V=1 or V=3?

change input size to 112X112

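Hello, if I need to change the input image size to 112x112, can I simply add a regular 3*3 convolution with stride 2 before the first convolution? The results should also be better than with 64x64, right?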

How did you crop and align MORPH II?

Hi, thanks for sharing this awesome source code. It helps me a lot. Currently, I am trying to reproduce your result on the morph2 dataset. Your script uses the morph2_db_align.npz file to train the network. Because the file name ends with '_align', I suppose you applied a preprocessing step to this dataset. Can you tell me which preprocessing you used?

Question about results

How accurate is your model at predicting the age of older people (60+)? Does it predict them as younger than their real age 95% of the time?

MegaAge-Asian Testing Script

Hi, in your paper you tested the SSR-Net model on the MegaAge-Asian dataset and got the result [CA(3): 0.549, CA(5): 0.741].
I modified the script TYY_demo_mtcnn.py to use your "pre-trained/megaface_asian/ssrnet_3_3_3_64_1.0_1.0/ssrnet_3_3_3_64_1.0_1.0.h5" and read MegaAge-Asian images as input for testing.
But I can't get close to that result. Can you provide the testing script? Thank you!
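For anyone comparing numbers, a minimal sketch of the metric may help. It assumes CA(n) is the usual cumulative accuracy, i.e. the fraction of test images whose absolute age error is at most n years; the function name `cumulative_accuracy` and the sample values are made up, and this is not the authors' evaluation script.

```python
def cumulative_accuracy(predicted, actual, n):
    """Fraction of samples whose absolute age error is at most n years."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= n)
    return hits / len(actual)

pred = [23.4, 31.0, 45.2, 60.9]   # hypothetical model outputs
true = [25, 30, 49, 58]           # hypothetical ground-truth ages
ca3 = cumulative_accuracy(pred, true, 3)
ca5 = cumulative_accuracy(pred, true, 5)
print(ca3, ca5)
```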

negative age

Hi shamangary,
In the age prediction process, the predicted age is negative. Is there a way to avoid this?
Thanks.

the result is very poor

This is amazing work, thank you for it. But the results are very poor except for gender. How can I improve the model?

memory leak?

Hello, I am running the program on a 1080Ti, but the memory usage is very large, about 10 GB. Is it a memory leak? Does this happen to you? Thank you very much.

one train question

SSR-Net uses MAE as the loss function. I tried to write the same loss in TensorFlow,
and I get this error:
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients,

The loss code looks like this:
logits_loss_age = tf.reduce_mean(tf.losses.absolute_difference(tf.cast(self.classifier_logits,tf.int32),age_labels))
optimizer = tf.train.AdamOptimizer(learning_rate,epsilon=1e-4)
train_op = optimizer.minimize(all_loss)

How can I resolve this?

Gender recognition

What gender recognition accuracy did you obtain?

.h5 model to .pb model but output node is strange!

I converted the .h5 model to a .pb model, but the output node is strange: it is pred_a/mul_33, and I can't find this node in the TensorFlow graph. Do you know the reason? Thanks!

I found the pred_a/mul_33 node, but I don't know what the node means.

wrong model folders

Error in lines 76 to 81 of master/training_and_testing/SSRNET_train.py:
if db_name == "wiki":
    weight_file = "wiki_models/"+save_name+"/"+save_name+".h5"
    model.load_weights(weight_file)
elif db_name == "imdb":
    weight_file = "imdb_models/"+save_name+"/"+save_name+".h5"
    model.load_weights(weight_file)
elif db_name == "morph":
    weight_file = "morph_models/"+save_name+"/"+save_name+".h5"
    model.load_weights(weight_file)

MORPH2 dataset

Hello,
How can I get the MORPH dataset? If I want to use it for academic purposes, can I apply for the dataset by sending an official email, or do I have to pay for it?
Thank you.

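Did you detect and crop faces when pre-processing the data?

Hello, in the data pre-processing step (create_db.py), I noticed that you run face detection on the original dataset images, but it seems you do not crop the face region. (My apologies if I misread the code.)
When I train on the MegaAsian dataset, I find that the background outside the face is not small, and I am worried that the "noise" is large. Would training this way affect the results? (After cropping, predictions for children improved somewhat.)
What is your view? Thank you.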

Inference time

Hi shamangary,
How did you evaluate your CPU time? I use the following code in TYY_demo_mtcnn.py to measure the inference time:
start_CPU = timeit.default_timer()
results = model.predict(faces)
end_CPU = timeit.default_timer()
It takes about 20-30 ms of CPU time. Is that correct?
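A single timed call can be noisy, and the first call to a model may include one-time setup cost, so a measurement like the one above is often averaged over several runs after a short warm-up. The sketch below shows that pattern with a stand-in workload; `mean_latency_ms` and `fake_predict` are hypothetical names, and `fake_predict` would be replaced by something like `lambda: model.predict(faces)`.

```python
import timeit

def mean_latency_ms(fn, warmup=3, runs=20):
    """Average wall-clock time of fn() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        fn()  # untimed calls absorb one-time initialisation cost
    start = timeit.default_timer()
    for _ in range(runs):
        fn()
    elapsed = timeit.default_timer() - start
    return elapsed / runs * 1000.0

def fake_predict():
    """Stand-in workload so the sketch runs without a trained model."""
    return sum(i * i for i in range(10000))

latency = mean_latency_ms(fake_predict)
print("mean latency: %.3f ms" % latency)
```

Note that `timeit.default_timer` measures wall-clock time, so the figure also depends on whatever else the machine is doing.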

What are `lambda_local` and `lambda_d` before $\eta$ and $\Delta_k$ respectively?

Hello. In Section 3.3 (Dynamic Range) of the paper,
formula (4)
$$\bar s_k = s_k (1 + \Delta_k) \tag{4}$$
formula (6)
$$\bar i = i + \eta_i^{(k)} \tag{6}$$

but the code is somewhat different: $\eta$ and $\Delta_k$ are multiplied by lambda_local and lambda_d respectively, which are assigned values from [0, 0.25, 0.5, 0.75, 1] in SSRNET_train.py.

Why do lambda_local and lambda_d exist?
Thank you.
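To make the question concrete, here is a small sketch of how the two factors would enter the computation under my reading of formulas (4) and (6) above. It assumes each stage's expected shifted bin index is divided by the running product of scaled bin counts and the sum is multiplied by V; that combination rule, the function name, and the sample numbers are assumptions and are not copied from SSR_Net.py.

```python
def soft_stagewise_regression(probs, eta, delta, V,
                              lambda_local=1.0, lambda_d=1.0):
    """Combine K stages of class probabilities into one regressed value.

    probs[k][i] : probability of bin i at stage k
    eta[k][i]   : shift of bin i at stage k, scaled by lambda_local (eq. 6)
    delta[k]    : scaling of the bin count at stage k, scaled by lambda_d (eq. 4)
    """
    value = 0.0
    denom = 1.0
    for p_k, eta_k, delta_k in zip(probs, eta, delta):
        s_k = len(p_k)
        # Scaled number of bins, accumulated over stages.
        denom *= s_k * (1.0 + lambda_d * delta_k)
        # Expected (shifted) bin index at this stage.
        stage = sum((i + lambda_local * eta_k[i]) * p_k[i] for i in range(s_k))
        value += stage / denom
    return value * V

probs, eta, delta = [[0.0, 1.0]], [[0.5, 0.5]], [1.0]
dynamic = soft_stagewise_regression(probs, eta, delta, V=8)
fixed = soft_stagewise_regression(probs, eta, delta, V=8,
                                  lambda_local=0.0, lambda_d=0.0)
print(dynamic, fixed)
```

In this reading, setting both factors to 0 switches the dynamic range off and falls back to fixed, uniform bins, while 1 applies the shift and scale in full, so the two values act as knobs on how much the dynamic range is allowed to change the result.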

Test image have margin can improve accuracy!

Hello!
I trained a model on MegaAge-Asian and tested it on my own images. My images are faces detected with MTCNN and saved with a margin. When I run MTCNN face detection again before testing gender with the trained model, the results are bad, but when I feed the images with margin directly (without running MTCNN again), the results are very good!
Is this because the training set images include a margin around the face?
Is it reasonable for me to test like this?
Thank you!

Training on IMDB

Hi shamangary,
I use the IMDB data to train the model with the parameters from your paper, but the training and validation losses are very high, about 9.2. Is that correct?

keras version

Hello, may I ask which versions of Keras and Python you used for training and testing? Thank you!

project environment

Hello, would you please give detailed information about the project environment, such as the Keras and TensorFlow versions, CUDA, and cuDNN?
When I run "python3 SSRNET_train.py --input /home/xd/workspace/SSR-Net/data/imdb.npz --db imdb --netType1 4 --netType2 4", the following errors occur:
Epoch 1/90
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/home/xx/.local/lib/python3.5/site-packages/keras/engine/training.py", line 606, in data_generator_task
generator_output = next(self._generator)
File "/home/xx/workspace/SSR-Net/training_and_testing/TYY_generators.py", line 49, in data_generator_reg
yield augment_data(np.array(p)),np.array(q)
File "/home/xx/workspace/SSR-Net/training_and_testing/TYY_generators.py", line 27, in augment_data
images[i] = tf.contrib.keras.preprocessing.image.random_rotation(images[i], 20, row_axis=0, col_axis=1, channel_axis=2)
AttributeError: module 'tensorflow.contrib' has no attribute 'keras'
File "/home/xx/.local/lib/python3.5/site-packages/keras/engine/training.py", line 1852, in fit_generator
str(generator_output))
ValueError: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None
Your reply would be appreciated greatly.

Training error

I have some problems starting training. See the 2 errors below.

Error 1: it looks for pre-trained weights that are not available in the repo
cmd:> python SSRNET_train.py --input ../data/imdb_db.npz --db imdb --netType1 1 --netType2 1

OSError: Unable to open file (Unable to open file: name = 'imdb_models/ssrnet_3_3_3_64_0.25_0.25/ssrnet_3_3_3_64_0.25_0.25.h5', errno = 2, error message = 'no such file or directory', flags = 0, o_flags = 0)

Error 2: I worked around error 1 by replacing the imdb_models folder with one from another pre-trained folder. I picked "ssrnet_3_3_3_64_1.0_1.0", and it throws another error:

UnicodeDecodeError: 'rawunicodeescape' codec can't decode bytes in position 838-839: truncated \uXXXX

Details:

Total params: 40,915
Trainable params: 40,531
Non-trainable params: 384


DEBUG:root:Saving model...
Traceback (most recent call last):
File "SSRNET_train.py", line 136, in <module>
main()
File "SSRNET_train.py", line 95, in main
f.write(model.to_json())
File "C:\Users\default.LAPTOP-2CI68M4P\Anaconda3\envs\xrvision2\lib\site-packages\keras\engine\topology.py", line 2618, in to_json
model_config = self._updated_config()
File "C:\Users\default.LAPTOP-2CI68M4P\Anaconda3\envs\xrvision2\lib\site-packages\keras\engine\topology.py", line 2585, in _updated_config
config = self.get_config()
File "C:\Users\default.LAPTOP-2CI68M4P\Anaconda3\envs\xrvision2\lib\site-packages\keras\engine\topology.py", line 2322, in get_config
layer_config = layer.get_config()
File "C:\Users\default.LAPTOP-2CI68M4P\Anaconda3\envs\xrvision2\lib\site-packages\keras\layers\core.py", line 656, in get_config
function = func_dump(self.function)
File "C:\Users\default.LAPTOP-2CI68M4P\Anaconda3\envs\xrvision2\lib\site-packages\keras\utils\generic_utils.py", line 175, in func_dump
code = marshal.dumps(func.__code__).decode('raw_unicode_escape')
UnicodeDecodeError: 'rawunicodeescape' codec can't decode bytes in position 838-839: truncated \uXXXX

about the results of MegaAgeAsian

In your paper you report results on MegaAge-Asian. Are they on the training set or the validation set? I reimplemented SSR-Net in PyTorch, but the best result over 90 epochs is:

train Loss: 22.0870 CA_3: 0.5108, CA_5: 0.7329
val Loss: 44.7439 CA_3: 0.4268, CA_5: 0.6225

the parameters are like:

batch_size = 50
input_size = 64
num_epochs = 90
learning_rate = 0.001 # originally 0.001
weight_decay = 1e-4 # originally 1e-4
augment = False
optimizer_ft = optim.Adam(params_to_update, lr=learning_rate, weight_decay=weight_decay)
criterion = nn.MSELoss()
lr_scheduler = optim.lr_scheduler.StepLR(optimizer_ft, step_size=30, gamma=0.1)

Thanks in advance~

mtcnn

Hello, I see that you use mtcnn to detect faces in the video demo code and install it with pip. Is this mtcnn the one from "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks" (MTCNN)?
Thank you!
