
cosface's People

Contributors

yule-li


cosface's Issues

Training code should be modified if multiple GPUs are used

When I use four GPUs to train the CosFace model, an exception occurs:

ValueError: Variable conv1_/conv2d/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

  File "networks/sphere_network.py", line 49, in first_conv
    network = tf.layers.conv2d(input, num_output, kernel_size = [3, 3], strides = (2, 2), padding = 'same', kernel_initializer = xavier, bias_initializer = zero_init, kernel_regularizer = l2_regularizer, bias_regularizer = l2_regularizer)
  File "networks/sphere_network.py", line 14, in infer
    network = first_conv(input, 64, name = 'conv1')
  File "train/train_multi_gpu.py", line 197, in main
    prelogits = network.infer(batch_image_split[i], args.embedding_size)

The call prelogits = network.infer(batch_image_split[i], args.embedding_size) constructs the graph once per GPU, and there is no reuse setting, so the second tower fails. It should be changed to:

    with tf.variable_scope(name_or_scope='', reuse=tf.AUTO_REUSE):
        prelogits = network.infer(batch_image_split[i], args.embedding_size)
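For context, a self-contained sketch of why tf.AUTO_REUSE fixes this (TF 1.x; the infer stub and input shapes below are illustrative stand-ins for networks/sphere_network.py, not the repo's actual code):

    import tensorflow as tf

    def infer(x):
        # Stand-in for network.infer: a single named conv layer.
        return tf.layers.conv2d(x, 64, [3, 3], strides=(2, 2),
                                padding='same', name='conv1')

    # One input placeholder per GPU tower.
    inputs = [tf.placeholder(tf.float32, [None, 112, 96, 3]) for _ in range(2)]
    outputs = []
    for i, inp in enumerate(inputs):
        with tf.device('/gpu:%d' % i):
            # Without reuse, the second call to infer() raises
            # "Variable conv1/kernel already exists". AUTO_REUSE creates
            # the variables on the first tower and shares them afterwards.
            with tf.variable_scope('', reuse=tf.AUTO_REUSE):
                outputs.append(infer(inp))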

What is the difference between calculating the cos loss with py_func and with the TF API in utils.py, lines 104-109?

Hi, I have a question: what is the difference between the following two implementations?

    # implemented by py_func
    # value = tf.identity(xw)
    # subtract the margin and scale it
    value = coco_func(xw_norm, y, alpha) * scale

    # implemented by tf api
    # margin_xw_norm = xw_norm - alpha
    # label_onehot = tf.one_hot(y, num_cls)
    # value = scale * tf.where(tf.equal(label_onehot, 1), margin_xw_norm, xw_norm)

The py_func implementation is more complex, since you have to compute the gradient yourself, while the TF-API implementation is simple, only three lines. Is there any actual difference between the two?
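For reference, a minimal sketch of the TF-API variant, assuming xw_norm holds the cosine logits cos(theta), y the integer labels, alpha the margin m, and scale the factor s (the default values here are placeholders):

    import tensorflow as tf

    def cos_margin_logits(xw_norm, y, num_cls, alpha=0.25, scale=64.0):
        margin_xw_norm = xw_norm - alpha          # cos(theta) - m
        label_onehot = tf.one_hot(y, num_cls)     # 1.0 at the target class
        # Subtract the margin only at each sample's target class,
        # leave all other logits untouched, then scale by s.
        return scale * tf.where(tf.equal(label_onehot, 1.0),
                                margin_xw_norm, xw_norm)

Functionally the two paths should produce the same forward value; the practical difference is that tf.py_func runs its body in the Python interpreter on CPU, needs a hand-written gradient, and cannot be serialized into the GraphDef, whereas the pure-TF version gets its gradient from automatic differentiation.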

Some details confuse me.

Hi, first of all, excellent job! But I have some questions.

  1. In train_multi_gpu.py, your _parse_function uses (see the sketch after this list for a side-by-side comparison):

    image = tf.subtract(image, 127.5)
    image = tf.div(image, 128.)

However, at test time you use the prewhiten function:

def prewhiten(x):
    mean = np.mean(x)
    std = np.std(x)
    std_adj = np.maximum(std, 1.0/np.sqrt(x.size))
    y = np.multiply(np.subtract(x, mean), 1/std_adj)
    return y  

This is a little different from the above. Will these two different image preprocessing operations degrade performance at test time?

  2. Thank you for providing the pretrained model! Will you release the newest model?
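On point 1, a minimal numpy sketch of the two preprocessing paths side by side: the fixed scaling maps pixels to roughly [-1, 1] regardless of content, while prewhiten standardizes each image by its own mean and standard deviation:

    import numpy as np

    def fixed_scaling(x):
        # Training path (_parse_function): (x - 127.5) / 128
        return (x - 127.5) / 128.0

    def prewhiten(x):
        # Test path: per-image standardization.
        mean, std = np.mean(x), np.std(x)
        std_adj = np.maximum(std, 1.0 / np.sqrt(x.size))
        return (x - mean) / std_adj

    img = np.random.randint(0, 256, (112, 96, 3)).astype(np.float32)
    print(fixed_scaling(img).mean(), prewhiten(img).mean())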

Could you tell me how to solve the error: too few arguments?

    $ python test.py
    usage: test.py [-h] [--network_type NETWORK_TYPE] [--fc_bn] [--prewhiten]
                   [--save_model SAVE_MODEL] [--do_flip DO_FLIP]
                   [--lfw_batch_size LFW_BATCH_SIZE]
                   [--embedding_size EMBEDDING_SIZE] [--image_size IMAGE_SIZE]
                   [--image_height IMAGE_HEIGHT] [--image_width IMAGE_WIDTH]
                   [--lfw_pairs LFW_PAIRS] [--lfw_file_ext {jpg,png}]
                   [--lfw_nrof_folds LFW_NROF_FOLDS] [--model_def MODEL_DEF]
                   lfw_dir model
    test.py: error: too few arguments

Could you tell me how to test the model that you trained?
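The usage string above shows that lfw_dir and model are required positional arguments, so test.py must be given both. A minimal invocation sketch (both paths are placeholders):

    python test.py /path/to/lfw_aligned /path/to/models/checkpoint_dir

test.sh in the repo should show the exact flags the author used.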

When I debug "main(parse_arguments(sys.argv[1:]))"

NotFoundError (see above for traceback): Key resnet_v2_50/block3/unit_2/bottleneck_v2/conv3/weights not found in checkpoint
[[Node: save/RestoreV2_145 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_145/tensor_names, save/RestoreV2_145/shape_and_slices)]]
Could someone tell me how to solve this error?
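The missing resnet_v2_50 key suggests the checkpoint was trained with a different network than the one being built at test time. A quick way to check what a checkpoint actually contains (TF 1.x; the path is a placeholder):

    import tensorflow as tf

    # List every variable stored in the checkpoint; if no resnet_v2_50/*
    # keys appear, the checkpoint was not trained with that network.
    reader = tf.train.NewCheckpointReader('/path/to/model.ckpt-60000')
    for name in sorted(reader.get_variable_to_shape_map()):
        print(name)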

Could you tell me which network structure your trained model uses?

I can test your uploaded model with test.sh, but after I retrain the model with train.sh, I cannot test it from my own model path. So I want to know which structure your model uses. Does --network_type sphere_network in test.sh mean I should have trained the model with #NETWORK=sphere_network? Thanks very much!

License of the project

Thank you so much for this project. I might have missed it, but I could not find a license declaration for this project. What is the license of cosface? Could you add this information to the README and put a dedicated LICENSE file in the root of the repo?

How should this be handled?

python test.py
2019-05-08 08:59:41.657635: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-08 08:59:41.751877: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-08 08:59:41.752518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce 930M major: 5 minor: 0 memoryClockRate(GHz): 0.941
pciBusID: 0000:04:00.0
totalMemory: 1.96GiB freeMemory: 1.76GiB
2019-05-08 08:59:41.752549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-05-08 08:59:42.368344: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-08 08:59:42.368376: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-05-08 08:59:42.368400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-05-08 08:59:42.368564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1518 MB memory) -> physical GPU (device: 0, name: GeForce 930M, pci bus id: 0000:04:00.0, compute capability: 5.0)
Skipped 6000 image pairs
image size 224
    Traceback (most recent call last):
      File "test.py", line 158, in <module>
        main(parse_arguments(sys.argv[1:]))
      File "test.py", line 49, in main
        with slim.arg_scope(resnet_v2.resnet_arg_scope(False)):
      File "/home/wlc/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 257, in resnet_arg_scope
        weights_regularizer=regularizers.l2_regularizer(weight_decay),
      File "/home/wlc/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/regularizers.py", line 92, in l2_regularizer
        raise ValueError('scale cannot be an integer: %s' % (scale,))
    ValueError: scale cannot be an integer: False
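The traceback shows test.py calling resnet_v2.resnet_arg_scope(False). The first positional parameter of resnet_arg_scope is weight_decay, so False reaches l2_regularizer, and because bool is a subclass of int in Python, the integer-scale check fails. A hedged sketch of a fix (the weight-decay value is an assumption; 0.0 simply disables it):

    import tensorflow.contrib.slim as slim
    from tensorflow.contrib.slim.nets import resnet_v2

    # Pass a float weight decay instead of the bool False.
    with slim.arg_scope(resnet_v2.resnet_arg_scope(weight_decay=0.0)):
        pass  # build the network here, as test.py does at line 49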

Cosine similarity or Euclidean distance for evaluation?

Hi, the paper states that "In the testing phase, the testing data is fed into CosFace to extract face features which are later used to compute the cosine similarity score to perform face verification and identification", but it seems like you are using Euclidean distance in test.py. Did I get anything wrong?
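Note that if the embeddings are L2-normalized before comparison, the two metrics rank pairs identically, because ||a - b||^2 = 2 - 2 cos(a, b) for unit vectors. A quick numpy check:

    import numpy as np

    a = np.random.randn(512); a /= np.linalg.norm(a)
    b = np.random.randn(512); b /= np.linalg.norm(b)

    cos_sim = np.dot(a, b)
    sq_dist = np.sum((a - b) ** 2)
    # For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
    assert np.isclose(sq_dist, 2.0 - 2.0 * cos_sim)

So a Euclidean-distance threshold on normalized features is equivalent to a cosine-similarity threshold.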

OOM error

Allocator (GPU_0_bfc) ran out of memory trying to allocate 12.0 kiB (rounded to 12288) as requested by op tower_0/l2_normalize_1.

I have tried reducing batch_size, max_nrof_epochs, images_per_person, and so on, but nothing works. Can anyone help me?
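One thing worth trying (a general TF 1.x suggestion, not something from this repo): let TensorFlow allocate GPU memory on demand instead of grabbing the whole card up front:

    import tensorflow as tf

    config = tf.ConfigProto()
    # Grow GPU memory as needed rather than pre-allocating all of it.
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:
        pass  # run training here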

Trouble when loading the pre-trained model

Hello,

When I try to restore the model named model-20180626-205832.ckpt-60000 from the Google Drive link, I get the following error:

Unable to open table file models\model-20180626-205832.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

tensorflow: 1.9.0
python: 3.5.5

I think TensorFlow requires the .data, .index, and .meta files to be in the same directory. However, the Google Drive link only downloads "model-20180626-205832.ckpt-60000.data-00000-of-00001".
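For reference, a minimal restore sketch (TF 1.x): the Saver must be given the checkpoint prefix, not the .data file, and the .data-00000-of-00001, .index, and .meta files all need to sit next to each other (the path below is a placeholder):

    import tensorflow as tf

    ckpt_prefix = 'models/model-20180626-205832.ckpt-60000'
    # import_meta_graph rebuilds the graph; restore() then loads the weights.
    saver = tf.train.import_meta_graph(ckpt_prefix + '.meta')
    with tf.Session() as sess:
        saver.restore(sess, ckpt_prefix)

Note that the error above also points at models\model-20180626-205832.ckpt without the -60000 step suffix, which will not match the downloaded file's prefix.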
