This is a re-implementation of the FaceNet paper. The starter code is available at https://github.com/davidsandberg/facenet
This code was tested with Python 2.7 and TensorFlow 1.7.0.
Config.py contains all the meta-information for training and testing (e.g. batch_size, learning rate, pretrained_model selection, output_path). Use this file to configure training and run experiments. Config.py is initialized with the parameters from the starter code. Important attributes are listed below:
data_dir (root_dir for training data)
model_base_dir (path to save checkpoints)
log_base_dir (path to save training logs)
embedding_size (face embedding dimension, default=512)
distance_metric (which metric to use, euclidean or cosine similarity)
val_annotation (path to groundtruth annotation for evaluation, a sample file is provided)
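For illustration, a config.py carrying the attributes listed above might look like the following sketch. The attribute names follow the list; the values shown are placeholders, not the repository's actual defaults:

```python
# Illustrative config.py sketch -- attribute names follow the list above;
# the values are examples only, not the repository's defaults.
data_dir = "/data/faces/train"        # root dir for training data
model_base_dir = "/models/facenet"    # path to save checkpoints
log_base_dir = "/logs/facenet"        # path to save training logs
embedding_size = 512                  # face embedding dimension
distance_metric = "cosine"            # "euclidean" or "cosine"
val_annotation = "data/pairs.txt"     # ground-truth annotation for evaluation
```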
As a preprocessing step, we extract the face from a given image. There are two available options for face detection, "HAAR" and "MTCNN". You can choose between them by modifying config.py. The preprocessing step may take a while depending on the data size.
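The detector choice reduces to a simple dispatch on the config value. The sketch below illustrates that dispatch; the stub functions are hypothetical placeholders (the real code would call an OpenCV Haar cascade or an MTCNN model), not the repository's API:

```python
# Sketch of dispatching the face-detection step on the config setting.
# detect_faces_haar / detect_faces_mtcnn are hypothetical stubs standing in
# for the real OpenCV Haar-cascade and MTCNN detectors.
def detect_faces_haar(image_path):
    return "haar:%s" % image_path

def detect_faces_mtcnn(image_path):
    return "mtcnn:%s" % image_path

def detect_faces(image_path, detector="MTCNN"):
    if detector == "HAAR":
        return detect_faces_haar(image_path)
    elif detector == "MTCNN":
        return detect_faces_mtcnn(image_path)
    raise ValueError("unknown detector: %r" % detector)
```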
A trained model that achieves an accuracy of 0.937 is provided (20181104-191806/)
Modify the config.py file, then start training, inference, or validation with the following commands:
For training,
python train.py
For inference,
python predict.py --input_path_a path/to/input_a
--input_path_b path/to/input_b
--multiple_pairs "false"
--out_json path/to/save/predictions
For validation,
python validate_on_lfw.py
Expected training image directory structure (every folder contains the images of one unique person); pass the root path by modifying config.py:
root
├── person1
│ ├── person1_1
│ ├── person1_2
├── person2
│ ├── person2_1
│ ├── person2_2
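A small helper can walk this layout and collect (person, image_path) pairs. This is an illustrative sketch, not a function from the repository:

```python
import os

def list_training_images(root):
    """Yield (person_name, image_path) for the layout above:
    one sub-folder per person, with that person's images inside it."""
    for person in sorted(os.listdir(root)):
        person_dir = os.path.join(root, person)
        if not os.path.isdir(person_dir):
            continue
        for fname in sorted(os.listdir(person_dir)):
            yield person, os.path.join(person_dir, fname)
```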
Once the configuration is done, the training can be started.
Standard data augmentations such as rotation, flipping, and cropping are used. You can toggle each augmentation via flags in the config file. Built-in TensorFlow functions are used to perform the augmentation.
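Conceptually, the flip and crop augmentations do the following. This is a dependency-free sketch on a 2-D list "image"; the repository itself uses TensorFlow ops such as tf.image.random_flip_left_right and tf.image.random_crop:

```python
def horizontal_flip(image):
    # image is a 2-D list (rows of pixel values); reverse each row.
    return [list(reversed(row)) for row in image]

def center_crop(image, out_h, out_w):
    # Keep an out_h x out_w window centered in the image.
    h, w = len(image), len(image[0])
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return [row[left:left + out_w] for row in image[top:top + out_h]]
```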
The prediction module expects four parameters: input_a, input_b, multiple_pairs, and out_json. The default values of multiple_pairs and out_json are "false" and "prediction.json" respectively. If multiple_pairs is true, then input_a and input_b should be CSV files listing the paths of the individual images (sample CSV files are provided). The output is the similarity score of every corresponding pair.
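When multiple_pairs is true, the flow can be sketched as below. Here score_pair is a hypothetical stand-in for the model's similarity computation, and the one-path-per-row CSV layout is an assumption, not the repository's documented format:

```python
import csv
import json

def score_pair(path_a, path_b):
    # Hypothetical stand-in: the real module embeds both faces and
    # returns their similarity score.
    return 1.0 if path_a == path_b else 0.5

def predict_pairs(csv_a, csv_b, out_json):
    # Assumes each CSV holds one image path per row, in the first column.
    with open(csv_a) as fa, open(csv_b) as fb:
        paths_a = [row[0] for row in csv.reader(fa) if row]
        paths_b = [row[0] for row in csv.reader(fb) if row]
    scores = [{"pair": [a, b], "score": score_pair(a, b)}
              for a, b in zip(paths_a, paths_b)]
    with open(out_json, "w") as f:
        json.dump(scores, f, indent=2)
    return scores
```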
validate_on_lfw is forked as-is from the starter code and is used to evaluate the performance of the network. The current version achieves an accuracy of 0.937, using the angle between the embeddings as the similarity measure. The similarity metric can be changed by updating config.py.
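The two metric choices reduce to the following computations on a pair of embedding vectors. This is a dependency-free sketch; the repository computes these over batches with TensorFlow:

```python
import math

def euclidean_distance(a, b):
    # Smaller distance => more similar faces.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def embedding_angle(a, b):
    # Angle between two embeddings (cosine similarity in angular form);
    # smaller angle => more similar faces.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))
```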