yule-li / cosface
TensorFlow implementation of the paper "CosFace: Large Margin Cosine Loss for Deep Face Recognition"
Hi, I am a newbie. Can you help me with how to train and test on my own dataset using CosFace?
I downloaded the preprocessed CASIA-WebFace-112X96 dataset, but it does not include the list_file needed for training. Could you provide it, or explain how to generate one?
When I use four GPUs to train the CosFace model, the following exception occurs:
ValueError: Variable conv1_/conv2d/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
File "networks/sphere_network.py", line 49, in first_conv
network = tf.layers.conv2d(input, num_output, kernel_size = [3, 3], strides = (2, 2), padding = 'same', kernel_initializer = xavier, bias_initializer = zero_init, kernel_regularizer = l2_regularizer, bias_regularizer = l2_regularizer)
File "networks/sphere_network.py", line 14, in infer
network = first_conv(input, 64, name = 'conv1')
File "train/train_multi_gpu.py", line 197, in main
prelogits = network.infer(batch_image_split[i], args.embedding_size)
The line prelogits = network.infer(batch_image_split[i], args.embedding_size) constructs the graph for every GPU, but there is no reuse setting, so it should be modified to:
with tf.variable_scope(name_or_scope='', reuse=tf.AUTO_REUSE):
    prelogits = network.infer(batch_image_split[i], args.embedding_size)
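For context, a minimal sketch of how the per-GPU loop in train/train_multi_gpu.py could be wrapped; the loop structure and the args.num_gpus name are assumptions, while batch_image_split, network, and args.embedding_size come from the traceback above:

import tensorflow as tf

tower_prelogits = []
for i in range(args.num_gpus):  # args.num_gpus is an assumed flag name
    with tf.device('/gpu:%d' % i):
        # Reuse the same weights for every tower instead of creating new ones,
        # which avoids "Variable conv1_/conv2d/kernel already exists".
        with tf.variable_scope(name_or_scope='', reuse=tf.AUTO_REUSE):
            prelogits = network.infer(batch_image_split[i], args.embedding_size)
        tower_prelogits.append(prelogits)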
Hi, I have a question: what is the difference between the following two implementations?
#implemented by py_func
#value = tf.identity(xw)
#subtract the margin and scale it
value = coco_func(xw_norm,y,alpha) * scale
#implemented by tf api
#margin_xw_norm = xw_norm - alpha
#label_onehot = tf.one_hot(y,num_cls)
#value = scale*tf.where(tf.equal(label_onehot,1), margin_xw_norm, xw_norm)
The py_func implementation is more complex because you have to compute the gradient yourself, while the tf API implementation is simple, only 3 lines. Is there any difference between the two?
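For reference, a self-contained sketch of the tf-API branch quoted above (the alpha and scale defaults here are placeholders, not necessarily the repo's settings). The forward values of the two branches should match; the practical difference is that TensorFlow differentiates this version automatically, whereas the py_func version needs a hand-written gradient because TF cannot backpropagate through arbitrary Python code:

import tensorflow as tf

def cos_margin_tf(xw_norm, y, num_cls, alpha=0.35, scale=64.0):
    # xw_norm: [batch, num_cls] cosine similarities; y: [batch] integer labels.
    margin_xw_norm = xw_norm - alpha              # cos(theta) - m
    label_onehot = tf.one_hot(y, num_cls)         # mask of target-class positions
    # Subtract the margin only at the target class, then scale everything.
    value = scale * tf.where(tf.equal(label_onehot, 1), margin_xw_norm, xw_norm)
    return value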
Hi
Thanks for sharing your work. I am working on a Linux machine, so I'm finding it difficult to download the model and data. I was hoping someone could upload the model/data to Google Drive or Dropbox.
Thanks in advance.
Hi, first of all, excellent job! But I have a question. During training, the images are normalized as:
image = tf.subtract(image,127.5)
image = tf.div(image,128.)
However, when testing, you use the prewhiten function:
import numpy as np

def prewhiten(x):
    mean = np.mean(x)
    std = np.std(x)
    std_adj = np.maximum(std, 1.0 / np.sqrt(x.size))
    y = np.multiply(np.subtract(x, mean), 1 / std_adj)
    return y
It is a little different from the above. Will these two different image preprocessing operations degrade performance when testing?
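If you want the two stages to match, one option is to reuse the training normalization at test time instead of prewhiten; a minimal sketch (the function name is hypothetical):

import numpy as np

def fixed_normalize(x):
    # Same operation as the training pipeline: (x - 127.5) / 128,
    # mapping uint8 pixel values to roughly [-1, 1].
    return (np.asarray(x, dtype=np.float32) - 127.5) / 128.0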
Which model should I use, model-20180309-083949.ckpt-60000 or model-20180626-205832.ckpt-60000?
And how should I choose the parameters?
$ python test.py
usage: test.py [-h] [--network_type NETWORK_TYPE] [--fc_bn] [--prewhiten]
[--save_model SAVE_MODEL] [--do_flip DO_FLIP]
[--lfw_batch_size LFW_BATCH_SIZE]
[--embedding_size EMBEDDING_SIZE] [--image_size IMAGE_SIZE]
[--image_height IMAGE_HEIGHT] [--image_width IMAGE_WIDTH]
[--lfw_pairs LFW_PAIRS] [--lfw_file_ext {jpg,png}]
[--lfw_nrof_folds LFW_NROF_FOLDS] [--model_def MODEL_DEF]
lfw_dir model
test.py: error: too few arguments
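For reference, a hypothetical invocation that supplies the two required positional arguments (lfw_dir and model); the paths are placeholders and the flag values are guesses that should match how the checkpoint was trained:

python test.py \
    --network_type sphere_network \
    --image_height 112 --image_width 96 \
    /path/to/lfw-112X96 \
    /path/to/models/model-20180626-205832.ckpt-60000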
When I debug main(parse_arguments(sys.argv[1:])), I get:
NotFoundError (see above for traceback): Key resnet_v2_50/block3/unit_2/bottleneck_v2/conv3/weights not found in checkpoint
[[Node: save/RestoreV2_145 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_145/tensor_names, save/RestoreV2_145/shape_and_slices)]]
Could someone tell me how to solve this error?
I can test your uploaded model with test.sh,
but when I retrain a model with train.sh, I cannot test it with my own model path. So I want to know which network structure your model uses...
Does --network_type sphere_network in test.sh mean I should train the model with #NETWORK=sphere_network? Thanks very much~
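One way to check which architecture a checkpoint was trained with is to list the variable names it stores and see whether they begin with the sphere_network scopes (e.g. conv1_) or with resnet_v2_50. A minimal sketch, with a placeholder checkpoint path:

import tensorflow as tf

# Print every variable name and shape stored in the checkpoint.
for name, shape in tf.train.list_variables('/path/to/model-20180626-205832.ckpt-60000'):
    print(name, shape)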
Thank you so much for this project. I might have missed it, but I could not find a license declaration for this project. What is the license of cosface? Could you add this information to the README and put a dedicated LICENSE file in the root of the repo?
Hi, when I am training on WebFace, I find that the loss does not decrease.
My network is sphere_network and the loss is softmax.
Can anyone tell me the loss value at convergence and how many epochs you trained?
Thanks!
python test.py
2019-05-08 08:59:41.657635: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-08 08:59:41.751877: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-08 08:59:41.752518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce 930M major: 5 minor: 0 memoryClockRate(GHz): 0.941
pciBusID: 0000:04:00.0
totalMemory: 1.96GiB freeMemory: 1.76GiB
2019-05-08 08:59:41.752549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-05-08 08:59:42.368344: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-08 08:59:42.368376: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-05-08 08:59:42.368400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-05-08 08:59:42.368564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1518 MB memory) -> physical GPU (device: 0, name: GeForce 930M, pci bus id: 0000:04:00.0, compute capability: 5.0)
Skipped 6000 image pairs
image size 224
Traceback (most recent call last):
File "test.py", line 158, in
main(parse_arguments(sys.argv[1:]))
File "test.py", line 49, in main
with slim.arg_scope(resnet_v2.resnet_arg_scope(False)):
File "/home/wlc/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 257, in resnet_arg_scope
weights_regularizer=regularizers.l2_regularizer(weight_decay),
File "/home/wlc/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/regularizers.py", line 92, in l2_regularizer
raise ValueError('scale cannot be an integer: %s' % (scale,))
ValueError: scale cannot be an integer: False
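The traceback shows that False is passed as the first positional argument (weight_decay) of resnet_arg_scope, and l2_regularizer rejects boolean/integer scales. A minimal sketch of one possible fix in test.py, assuming the intent was simply to disable weight decay at evaluation time:

import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import resnet_v2

# Pass a float weight decay instead of the boolean False; 0.0 disables
# L2 regularization, which is presumably what is wanted during testing.
with slim.arg_scope(resnet_v2.resnet_arg_scope(weight_decay=0.0)):
    pass  # build the network here exactly as test.py already does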
Hi, in the paper it is stated that "In the testing phase, the testing data is fed into CosFace to extract face features which are later used to compute the cosine similarity score to perform face verification and identification", but it seems like you are using Euclidean distance in test.py. Is there anything I got wrong?
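For what it's worth, if the embeddings are L2-normalized before comparison, the two metrics are equivalent up to a monotonic transform, since ||a - b||^2 = 2 - 2*cos(a, b) for unit-norm vectors, so thresholding the Euclidean distance and thresholding the cosine similarity give the same verification decisions. A small numpy check (embedding size 512 is just an example):

import numpy as np

a = np.random.randn(512); a /= np.linalg.norm(a)
b = np.random.randn(512); b /= np.linalg.norm(b)

cos_sim = np.dot(a, b)
sq_dist = np.sum((a - b) ** 2)
# For unit-norm vectors: ||a - b||^2 == 2 - 2 * cos(a, b)
print(np.isclose(sq_dist, 2 - 2 * cos_sim))  # True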
AssertionError: file ../CASIA-WebFace-112X96/0000045/001.jpg not exist
Can you please include a header to explain cleaned_list.txt?
Allocator (GPU_0_bfc) ran out of memory trying to allocate 12.0 kiB (rounded to 12288) as requested by op tower_0/l2_normalize_1.
I tried to reduce batch_size, max_nrof_epochs, images_per_person, etc. Nothing works. Can anyone help me?
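One setting that is sometimes worth trying with bfc allocator errors (it does not add memory, but it stops TensorFlow from reserving the whole GPU up front and makes it easier to see whether another process is holding memory) is incremental GPU allocation; a minimal TF 1.x sketch, assuming you can change where the session is created:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
sess = tf.Session(config=config)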
The results with my own list file are particularly bad. I hope you can provide your cleaned_list.txt file, thank you very much.
Hello, thanks for sharing this code. But you haven't provided the pairs.txt file! Could you please share it?
Hello,
When I try to restore the model named model-20180626-205832.ckpt-60000 from the Google Drive link, I get the following error.
Unable to open table file models\model-20180626-205832.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
tensorflow: 1.9.0
python: 3.5.5
I think that TensorFlow requires the .data, .index, and .meta files in the same directory. However, the Google Drive link only provides "model-20180626-205832.ckpt-60000.data-00000-of-00001".
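That matches how TF 1.x checkpoints work: the .data-00000-of-00001, .index, and .meta files share one prefix, and the restore call must receive that prefix rather than any single file. A minimal restore sketch, with placeholder paths, assuming all three files are present under models/:

import tensorflow as tf

ckpt_prefix = 'models/model-20180626-205832.ckpt-60000'

with tf.Session() as sess:
    # Rebuild the graph from the .meta file, then load the weights from the prefix.
    saver = tf.train.import_meta_graph(ckpt_prefix + '.meta')
    saver.restore(sess, ckpt_prefix)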