zitongyu / cdcn
Central Difference Convolutional Networks (CVPR'20)
License: Other
Hello, thank you for open-sourcing your code. I ran a cross-dataset experiment with the CDCN model and got the following results (most of CASIA's real faces are classified as fake):
TP: 7
TN: 196
FP: 74
FN: 83
Accuracy: 0.5638888888888889
FAR: 0.2740740740740741
FRR: 0.9222222222222223
HTER: 0.5981481481481482
The HTER is far from the paper's. Did you run into this problem at the time, and if so, how did you solve it? My second question: when training, do you save the model at the minimum loss, or at some other point? Many thanks, looking forward to your reply!
Hi, @ZitongYu thanks for your code, I really appreciate it.
I just checked the TensorFlow version of the CDC conv op, and I found that the definition of CDC conv in the TensorFlow implementation is not the same as the formula you proposed in your paper. You used
kernel_diff = tf.tile(kernel_diff, [kernel_size, kernel_size, 1, 1])
in the kernel_diff line: you copied the sum of the kernel weights into a 3x3 weight filter, so the final result in the TensorFlow version is not a 1x1 convolution on the center of the receptive field. Can you explain why this happened?
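For reference, here is how I read the paper's formula in a minimal PyTorch sketch (my own re-implementation, not the repo's code; `Conv2dCD` is a hypothetical name). Summing the kernel over its spatial dimensions yields a genuine 1x1 kernel, so the difference term really is a 1x1 convolution at the receptive-field center; tiling that sum back to 3x3, as the TensorFlow snippet does, gives a different operator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Conv2dCD(nn.Module):
    """Central difference convolution: conv(x, w) - theta * conv1x1(x, sum(w))."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1, theta=0.7):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding, bias=False)
        self.theta = theta

    def forward(self, x):
        out_normal = self.conv(x)
        if self.theta == 0:
            return out_normal  # theta=0 reduces to a vanilla convolution
        # collapse each kernel to its spatial sum -> shape (out_ch, in_ch, 1, 1)
        kernel_diff = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        out_diff = F.conv2d(x, kernel_diff, padding=0)
        return out_normal - self.theta * out_diff
```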
Why is the ReLU activation function used as the final activation in the lastconv layers? Shouldn't it be sigmoid, since the values of our depth map need to be in the range 0-255? ReLU is an unbounded function, so wouldn't sigmoid be a better activation for depth-map generation?
Hello, thank you for open-sourcing your code. For the cross-dataset experiments between Replay and CASIA, how is the decision threshold for each test dataset obtained? For the Replay dataset, is it obtained from its dev partition? And for CASIA, do you split the test set into dev and test yourself? Or do you decide each test video's result in some other way? Many thanks, looking forward to your reply!
Hello, a question about the single-modal setting: in Spoofing_train(train_list, image_dir), is image_dir the 3D data generated by PRNet? If so, does training use only the PRNet-generated 3D data rather than the original images?
Hi @ZitongYu,
When using Patch Exchange Augmentation, what face image size is used to extract patches: the originally detected face, or the resized face (e.g. 256x256)?
I trained for 50 epochs and then tested on the test set, and I have a problem with this code:
score_norm = torch.sum(map_x)/torch.sum(val_maps[:,frame_t,:,:])
I found that for spoof samples this denominator is always zero, which then leads to score_norm = inf.
Traceback (most recent call last):
  File "train_CDCN.py", line 445, in <module>
    train_test()
  File "train_CDCN.py", line 409, in train_test
    val_threshold, test_threshold, val_ACC, val_ACER, test_ACC, test_APCER, test_BPCER, test_ACER, test_ACER_test_threshold = performances(map_score_val_filename, map_score_test_filename)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Can you help solve this problem? I'm confused about the meaning of this line of code.
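One possible guard (my own sketch, not code from the repo; `normalized_score` is a hypothetical helper) is to clamp the denominator so an all-zero binary map cannot produce inf:

```python
import torch

def normalized_score(map_x, val_map, eps=1e-8):
    """score_norm = sum(map_x) / sum(val_map), kept finite when the
    PRNet binary map is all zeros (as happens for spoof samples)."""
    denom = torch.sum(val_map).clamp(min=eps)
    return torch.sum(map_x) / denom
```

An alternative is to skip normalization entirely for frames whose mask is empty; which choice matches the paper's intent is something only the authors can confirm.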
parser = argparse.ArgumentParser(description="save quality using landmarkpose model")
parser.add_argument('--gpu', type=int, default=3, help='the gpu id used for predict')
parser.add_argument('--lr', type=float, default=0.00008, help='initial learning rate') #default=0.0001
parser.add_argument('--batchsize', type=int, default=9, help='initial batchsize') #default=7
parser.add_argument('--step_size', type=int, default=20, help='how many epochs lr decays once') # 500 | DPC = 400
parser.add_argument('--gamma', type=float, default=0.5, help='gamma of optim.lr_scheduler.StepLR, decay of lr')
parser.add_argument('--echo_batches', type=int, default=50, help='how many batches display once') # 50
parser.add_argument('--epochs', type=int, default=60, help='total training epochs')
parser.add_argument('--log', type=str, default="CDCNpp_BinaryMask_P1_07", help='log and save model name')
parser.add_argument('--finetune', action='store_true', default=False, help='whether finetune other models')
parser.add_argument('--theta', type=float, default=0.7, help='hyper-parameters in CDCNpp')
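For reference, the `--lr`, `--step_size`, `--gamma`, and `--epochs` flags above map onto the standard PyTorch scheduler like this (a minimal sketch using the stock torch API; the Linear model is just a stand-in for CDCN):

```python
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(4, 1)  # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=8e-5)
scheduler = StepLR(optimizer, step_size=20, gamma=0.5)  # halve lr every 20 epochs

for epoch in range(60):
    # ... one epoch of training would run here ...
    optimizer.step()
    scheduler.step()
# after 60 epochs the lr has been halved three times: 8e-5 -> 1e-5
```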
According to your parser, the total epochs are 60, the lr is 8e-5, and the batch size is 9?
When I train on OULU-NPU Protocol 1, I find the loss hard to converge.
Here is part of my training log. The ACER fluctuates near 5%, though I have tried adjusting the lr.
epoch:345, Val: val_threshold= 0.1000, val_ACC= 0.9878, val_ACER= 0.0139
epoch:345, Test: ACC= 0.9567, APCER= 0.0167, BPCER= 0.1500, ACER= 0.0833
epoch:346, Train: Absolute_Depth_loss= 0.0072, Contrastive_Depth_loss= 0.0031
epoch:347, Train: Absolute_Depth_loss= 0.0077, Contrastive_Depth_loss= 0.0031
epoch:348, Train: Absolute_Depth_loss= 0.0063, Contrastive_Depth_loss= 0.0030
epoch:349, Train: Absolute_Depth_loss= 0.0072, Contrastive_Depth_loss= 0.0031
epoch:350, Train: Absolute_Depth_loss= 0.0073, Contrastive_Depth_loss= 0.0031
epoch:350, Val: val_threshold= 0.1673, val_ACC= 0.9878, val_ACER= 0.0139
epoch:350, Test: ACC= 0.9733, APCER= 0.0063, BPCER= 0.1083, ACER= 0.0573
epoch:351, Train: Absolute_Depth_loss= 0.0075, Contrastive_Depth_loss= 0.0031
epoch:352, Train: Absolute_Depth_loss= 0.0069, Contrastive_Depth_loss= 0.0030
epoch:353, Train: Absolute_Depth_loss= 0.0080, Contrastive_Depth_loss= 0.0031
epoch:354, Train: Absolute_Depth_loss= 0.0071, Contrastive_Depth_loss= 0.0031
epoch:355, Train: Absolute_Depth_loss= 0.0077, Contrastive_Depth_loss= 0.0031
Could you share your train.log or provide any advice?
Hello,
In the test stage of CVPR2020_paper_codes/train_CDCN.py, you use the binary map from PRNet to normalize the score.
Why does score normalization use the depth map corresponding to the test data?
code:
“test_maps = sample_batched['val_map_x'].cuda() # binary map from PRNet”
“score_norm = torch.sum(map_x)/torch.sum(test_maps[:,frame_t,:,:])”
Hi @ZitongYu
I'd like to know how long it took you to train the model for 800 epochs.
In my experiment, training one epoch (batch size 8, 20000 steps) took 12 hours on a single GPU (P100).
That is very long and I think something went wrong; do you have any suggestions for me?
Hi,
I got access to the OULU dataset, and it contains .txt and .avi files. This repo refers to image and .dat files such as 1_1_22_4_107_scene.dat. How can I get these files?
(https://github.com/ZitongYu/CDCN/blob/master/CVPR2020_paper_codes/Load_OULUNPU_train.py#L281)
Thank you!
Hi, thank you for sharing your great work. I found that the binary mask is used in this CVPRW competition. How much do you think this alternative method will affect your algorithm's performance?
for i in range(32):
    for j in range(32):
        if image_x_temp_gray[i, j] > 0:
            binary_mask[i, j] = 1
        else:
            binary_mask[i, j] = 0
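The loop above can be written equivalently in one vectorized line (a sketch; `image_x_temp_gray` is assumed to be the 32x32 grayscale crop from the snippet, and `make_binary_mask` is a hypothetical helper name):

```python
import numpy as np

def make_binary_mask(image_x_temp_gray):
    # 1 where the grayscale value is positive, 0 elsewhere
    return (image_x_temp_gray > 0).astype(np.float32)
```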
Thanks for open-sourcing this. One question: why is the mask generated from the grayscale image here, instead of directly from the depth map?
First of all, thank you very much for providing your code.
Reading through it, I realized that you have some files with depth maps and others with bounding boxes. How did you obtain these, given that the OULU dataset does not include either of them (just the positions of the eyes)?
Thank you
def get_single_image_x(self, image_path, map_path, videoname):
    frames_total = len([name for name in os.listdir(map_path) if os.path.isfile(os.path.join(map_path, name))])
    # randomly choose 1 frame
    for temp in range(500):
        image_id = np.random.randint(1, frames_total-1)
        s = "_%03d_scene" % image_id
        image_name = videoname + s + '.jpg'
        bbox_name = videoname + s + '.dat'
        bbox_path = os.path.join(image_path, bbox_name)
        s = "_%03d_depth1D" % image_id
        map_name = videoname + s + '.jpg'
        map_path2 = os.path.join(map_path, map_name)
/home/ztyu/FAS_dataset/OULU/IJCB_re/OULUtrain_images/ 6_3_20_5_121_depth1D.jpg
According to your code, do we need to generate these black images?
thank you~
Why is this line not assigned to the "temp" variable?
It looks like this line is redundant.
Thanks for taking the time to look at this issue.
If you don't release the pretrained model, this open-source release is worthless.
Hi,
I'm not able to reproduce results on OULU dataset.
Even after 700 epochs, the loss is as low as mentioned in the other issue:
epoch:721, Test: ACC= 0.4100, APCER= 0.7375, BPCER= 0.0000, ACER= 0.3688
epoch:722, mini-batch: 50, lr=0.000050, Absolute_Depth_loss= 0.0279, Contrastive_Depth_loss= 0.0032
epoch:722, mini-batch:100, lr=0.000050, Absolute_Depth_loss= 0.0264, Contrastive_Depth_loss= 0.0030
epoch:722, mini-batch:150, lr=0.000050, Absolute_Depth_loss= 0.0278, Contrastive_Depth_loss= 0.0031
epoch:722, Train: Absolute_Depth_loss= 0.0271, Contrastive_Depth_loss= 0.0031
Differences from original setup:
Can you please tell me what the train list, test list, and val list are in train.py in the CVPR2020 paper codes? Also, some documentation about how to train the model would be helpful.
Hi!
Thank you for your work. Can you please provide the NAS code? If not, please answer the following questions:
Thanks.
By comparing trained CDCN variants, I found that the accuracy of an ordinary convolution (theta = 0) is one point higher than with theta = 0.7. Has anyone else seen the same result?
Nevermind. I got it. Have a nice day.
Hi,
Thank you for sharing your code. I am trying to replicate your SiW_M experiments but am getting different numbers for certain test spoof types. I think this might be due to a difference in how we pre-processed the videos to obtain face-centered frames (and depth masks). Can you please provide details regarding the steps you took to obtain the training (and testing) data for your model from the videos in the SiW_M dataset?
Thank you!
ignore
Thanks for your contribution with the NAS-FAS paper!
I like the idea of aggregating the video input into a Static-Dynamic image, but something about RankSVM is not clear to me.
How did you implement RankSVM to obtain the dynamic image?
Is the dynamic image "D" the weight vector of the SVM algorithm?
I am trying to follow this idea of RankSVM: https://www.cnblogs.com/bentuwuying/p/6683832.html
So, you ended up just preprocessing the frames into mean images S_i (using K frames) and then subtracting them to create the proper input to RankSVM? Did you also subtract in the opposite order to obtain negative labels?
After preparing the data this way, did you just train an SVM with scikit-learn using hinge loss? From what I understood, after training you want to obtain D, which is exactly the parameters of the SVM. So, supposing I use scikit-learn, do I just need to read the fitted parameters and then perform min-max normalization to obtain the Static-Dynamic image?
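To make the question concrete, here is my own reading of the recipe as a scikit-learn sketch (an assumption about the pipeline, not the authors' confirmed code; note that the learned weights live in the fitted `coef_` attribute, not in `get_params()`):

```python
import numpy as np
from sklearn.svm import LinearSVC

def dynamic_image(frames, C=1.0):
    """Rank-pooling sketch: D is the weight vector of a linear SVM
    trained to rank the running-mean frames S_1..S_K by time."""
    K = len(frames)
    flat = np.stack([f.ravel() for f in frames]).astype(np.float64)
    S = np.cumsum(flat, axis=0) / np.arange(1, K + 1)[:, None]  # running means
    X, y = [], []
    for t in range(K):
        for q in range(t):
            X.append(S[t] - S[q]); y.append(1)    # later minus earlier -> +1
            X.append(S[q] - S[t]); y.append(-1)   # opposite direction -> -1
    svm = LinearSVC(C=C, fit_intercept=False, max_iter=10000)
    svm.fit(np.array(X), np.array(y))
    d = svm.coef_.ravel().reshape(frames[0].shape)
    return (d - d.min()) / (d.max() - d.min() + 1e-12)  # min-max normalize
```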
Hello, I used https://github.com/clks-wzz/PRNet-Depth-Generation to generate the depth maps.
It returns the values y1, x1, w, h.
Your code also computes y2 = y1 + w, but normally w should be combined with x; shouldn't it be x2 = x1 + w?
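If the returned order really is (y1, x1, w, h), the conversion to corners would look like this (a sketch of the suggested fix, with a hypothetical helper name, pending the author's confirmation of the coordinate convention):

```python
def bbox_to_corners(y1, x1, w, h):
    # width w extends along x, height h along y
    x2 = x1 + w
    y2 = y1 + h
    return y1, x1, y2, x2
```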
In #14 you mention that the MTCNN method is used for face detection and cropping.
According to https://github.com/timesler/facenet-pytorch/blob/master/models/mtcnn.py, the MTCNN network requires threshold values for face detection.
What threshold values did you choose for the network, and were they adapted or fixed?
Also, with your thresholds, did you encounter empty or wrong face detections, and how did you deal with those images?
Hello, I trained the Track2 single-modal model on my own dataset, which includes 5000+ real and 5000+ fake pics. The number of epochs is 60, and the training log is as follows:
Oulu-NPU, P1:
train from scratch!
epoch:1, Train: Absolute_Depth_loss= 0.2492, Contrastive_Depth_loss= 0.0112
epoch:2, Train: Absolute_Depth_loss= 0.1995, Contrastive_Depth_loss= 0.0083
epoch:3, Train: Absolute_Depth_loss= 0.1734, Contrastive_Depth_loss= 0.0087
epoch:4, Train: Absolute_Depth_loss= 0.1561, Contrastive_Depth_loss= 0.0088
epoch:5, Train: Absolute_Depth_loss= 0.1435, Contrastive_Depth_loss= 0.0089
epoch:6, Train: Absolute_Depth_loss= 0.1334, Contrastive_Depth_loss= 0.0089
epoch:7, Train: Absolute_Depth_loss= 0.1266, Contrastive_Depth_loss= 0.0090
epoch:8, Train: Absolute_Depth_loss= 0.1210, Contrastive_Depth_loss= 0.0091
epoch:9, Train: Absolute_Depth_loss= 0.1133, Contrastive_Depth_loss= 0.0090
epoch:10, Train: Absolute_Depth_loss= 0.1085, Contrastive_Depth_loss= 0.0091
epoch:11, Train: Absolute_Depth_loss= 0.1039, Contrastive_Depth_loss= 0.0092
epoch:12, Train: Absolute_Depth_loss= 0.0988, Contrastive_Depth_loss= 0.0092
epoch:13, Train: Absolute_Depth_loss= 0.0939, Contrastive_Depth_loss= 0.0092
epoch:14, Train: Absolute_Depth_loss= 0.0907, Contrastive_Depth_loss= 0.0091
epoch:15, Train: Absolute_Depth_loss= 0.0875, Contrastive_Depth_loss= 0.0092
epoch:16, Train: Absolute_Depth_loss= 0.0833, Contrastive_Depth_loss= 0.0091
epoch:17, Train: Absolute_Depth_loss= 0.0810, Contrastive_Depth_loss= 0.0091
epoch:18, Train: Absolute_Depth_loss= 0.0791, Contrastive_Depth_loss= 0.0090
epoch:19, Train: Absolute_Depth_loss= 0.0768, Contrastive_Depth_loss= 0.0090
epoch:20, Train: Absolute_Depth_loss= 0.0622, Contrastive_Depth_loss= 0.0084
epoch:21, Train: Absolute_Depth_loss= 0.0591, Contrastive_Depth_loss= 0.0085
epoch:22, Train: Absolute_Depth_loss= 0.0562, Contrastive_Depth_loss= 0.0084
epoch:23, Train: Absolute_Depth_loss= 0.0543, Contrastive_Depth_loss= 0.0085
epoch:24, Train: Absolute_Depth_loss= 0.0532, Contrastive_Depth_loss= 0.0084
epoch:25, Train: Absolute_Depth_loss= 0.0511, Contrastive_Depth_loss= 0.0083
epoch:26, Train: Absolute_Depth_loss= 0.0497, Contrastive_Depth_loss= 0.0083
epoch:27, Train: Absolute_Depth_loss= 0.0486, Contrastive_Depth_loss= 0.0083
epoch:28, Train: Absolute_Depth_loss= 0.0465, Contrastive_Depth_loss= 0.0082
epoch:29, Train: Absolute_Depth_loss= 0.0455, Contrastive_Depth_loss= 0.0082
epoch:30, Train: Absolute_Depth_loss= 0.0445, Contrastive_Depth_loss= 0.0081
epoch:31, Train: Absolute_Depth_loss= 0.0441, Contrastive_Depth_loss= 0.0082
epoch:32, Train: Absolute_Depth_loss= 0.0424, Contrastive_Depth_loss= 0.0081
epoch:33, Train: Absolute_Depth_loss= 0.0413, Contrastive_Depth_loss= 0.0080
epoch:34, Train: Absolute_Depth_loss= 0.0416, Contrastive_Depth_loss= 0.0080
epoch:35, Train: Absolute_Depth_loss= 0.0401, Contrastive_Depth_loss= 0.0079
epoch:36, Train: Absolute_Depth_loss= 0.0396, Contrastive_Depth_loss= 0.0080
epoch:37, Train: Absolute_Depth_loss= 0.0391, Contrastive_Depth_loss= 0.0079
epoch:38, Train: Absolute_Depth_loss= 0.0389, Contrastive_Depth_loss= 0.0079
epoch:39, Train: Absolute_Depth_loss= 0.0355, Contrastive_Depth_loss= 0.0077
epoch:40, Train: Absolute_Depth_loss= 0.0307, Contrastive_Depth_loss= 0.0073
epoch:41, Train: Absolute_Depth_loss= 0.0298, Contrastive_Depth_loss= 0.0072
epoch:42, Train: Absolute_Depth_loss= 0.0289, Contrastive_Depth_loss= 0.0072
epoch:43, Train: Absolute_Depth_loss= 0.0286, Contrastive_Depth_loss= 0.0071
epoch:44, Train: Absolute_Depth_loss= 0.0278, Contrastive_Depth_loss= 0.0071
epoch:45, Train: Absolute_Depth_loss= 0.0266, Contrastive_Depth_loss= 0.0070
epoch:46, Train: Absolute_Depth_loss= 0.0269, Contrastive_Depth_loss= 0.0070
epoch:47, Train: Absolute_Depth_loss= 0.0264, Contrastive_Depth_loss= 0.0070
epoch:48, Train: Absolute_Depth_loss= 0.0266, Contrastive_Depth_loss= 0.0070
epoch:49, Train: Absolute_Depth_loss= 0.0250, Contrastive_Depth_loss= 0.0069
epoch:50, Train: Absolute_Depth_loss= 0.0258, Contrastive_Depth_loss= 0.0070
epoch:51, Train: Absolute_Depth_loss= 0.0245, Contrastive_Depth_loss= 0.0068
epoch:52, Train: Absolute_Depth_loss= 0.0237, Contrastive_Depth_loss= 0.0067
epoch:53, Train: Absolute_Depth_loss= 0.0241, Contrastive_Depth_loss= 0.0069
epoch:54, Train: Absolute_Depth_loss= 0.0233, Contrastive_Depth_loss= 0.0068
epoch:55, Train: Absolute_Depth_loss= 0.0236, Contrastive_Depth_loss= 0.0067
epoch:56, Train: Absolute_Depth_loss= 0.0228, Contrastive_Depth_loss= 0.0067
epoch:57, Train: Absolute_Depth_loss= 0.0226, Contrastive_Depth_loss= 0.0067
epoch:58, Train: Absolute_Depth_loss= 0.0218, Contrastive_Depth_loss= 0.0066
epoch:59, Train: Absolute_Depth_loss= 0.0216, Contrastive_Depth_loss= 0.0066
epoch:60, Train: Absolute_Depth_loss= 0.0193, Contrastive_Depth_loss= 0.0063
When I use the saved model file to test, I find some images perform poorly: fake images are recognized as real. The accuracy is about 24%. Is this overfitting, or is there another reason?
Another question: are the pictures used for training raw images or cropped faces? Is there any difference between them?
Thank you very much.
Does anyone have a trained model they could share? I found that the author's code trains very slowly, mainly because the data processing is time-consuming, so I would like to know whether anyone has a trained model. Thanks for sharing; you can send it to my email: [email protected]
epoch:109, Train: Absolute_Depth_loss= 0.0124, Contrastive_Depth_loss= 0.0041
[110] batch: 50, lr=0.000100, Absolute_Depth_loss= 0.0120, Contrastive_Depth_loss= 0.0039 batch_spend: 18.051 18.055 18.154
[110] batch:100, lr=0.000100, Absolute_Depth_loss= 0.0129, Contrastive_Depth_loss= 0.0043 batch_spend: 36.509 36.514 36.612
[110] batch:150, lr=0.000100, Absolute_Depth_loss= 0.0131, Contrastive_Depth_loss= 0.0042 batch_spend: 54.745 54.749 54.848
The three times are: (record the current time t0) data loading (time 1 is the difference from t0), the forward pass (time 2 is the difference from t0), and backward (time 3 is the difference from t0).
Could somebody kindly send me the OULU-NPU dataset?
My email is [email protected]
Thank you.
Hi, I have one question about the loss in the paper. I see that you used MSE loss and contrastive depth loss. When I use TensorBoard to visualize the loss, I see that the total loss sometimes becomes very high. I don't understand why this happens; is it a normal case? I use CelebA-Spoofing data for training.
I have a problem setting up the dataset and loading it; could you please explain how to do that, @ZitongYu? Thank you in advance.
Could you please describe the development environment or provide a requirements.txt?
When computing the optimal threshold, why is the input 5-dimensional? When I run the program with my own data, the data is 4-dimensional. Is frame_t the channel dimension?
Hello! Sorry to bother you, but could you share the code for the NAS part? Thanks a lot!
I have been wanting to use this dataset recently, but the competition has already ended and I can't apply for it. Could you share it? Sorry for the trouble, and thank you very much!
Does anyone have a trained model they could share? I need it urgently; thanks. You can send it to my email: [email protected]
Since the original databases are all videos, did you extract every frame, or sample one frame every few frames?
Hi author,
I have one question about normalization. Currently I use image data for training, not video as you do. When I run inference on a video by extracting frames, the model can't detect the spoofing object. I notice that you apply normalization to your frames.
I wonder whether this affects the performance of the model?