vision-sjtu / recce

[CVPR2022] End-to-End Reconstruction-Classification Learning for Face Forgery Detection

License: MIT License

Python 100.00%

face-forgery-detection pytorch-implementation reconstruction-classification-learning

recce's Issues

How to split the WildDeepfake dataset?

About processed dataset of WildDeepfake and DFDC

Hello, and thank you very much for your work.
I tried to email you at [email protected] from my institutional address, but the message could not be delivered. Could you send me the processed DFDC and WildDeepfake datasets?
My institutional email is [email protected] and my personal email is [email protected].
Wishing you all the best!

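About the preprocessed FF++ dataset

I have heard that you have already processed the FF++ dataset. Because of CPU limitations I cannot complete the preprocessing myself, so I would very much like to obtain your processed dataset for my research. I would be extremely grateful if this were possible. You can contact me by email at [email protected]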

the result of the first epoch on Celeb-DF is very high

Hello!
Thank you for your excellent work. I used the preprocessed Celeb-DF dataset provided in #9 and training started successfully, but the result of the first epoch is ACC 0.9712, AUC 0.9957, which seems too high so early. Is this normal? I noticed that the program splits the dataset into a train set and a test set before training, so is it necessary to split the dataset manually?
Thank you for your help. I look forward to your answer.
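If a manual split is wanted, one common precaution is to split by video rather than by frame, so that frames from the same video never land in both sets. The sketch below only illustrates that idea. The path layout `<label>/<video_id>/<frame>.png` and the function name `split_by_video` are assumptions made for the example, and this thread does not say how the repository's own split works.

```python
import random
from collections import defaultdict


def split_by_video(frame_paths, test_ratio=0.2, seed=0):
    """Split frame paths into train/test so that no video contributes to both.

    Assumes (hypothetically) that each path ends in '<label>/<video_id>/<frame>'.
    """
    by_video = defaultdict(list)
    for path in frame_paths:
        label, video_id, _ = path.split("/")[-3:]
        by_video[(label, video_id)].append(path)

    # Shuffle whole videos, not individual frames.
    videos = sorted(by_video)
    random.Random(seed).shuffle(videos)
    n_test = max(1, int(len(videos) * test_ratio))
    test_videos = set(videos[:n_test])

    train = [p for v in videos if v not in test_videos for p in by_video[v]]
    test = [p for v in videos if v in test_videos for p in by_video[v]]
    return train, test


# Toy data: 2 labels x 5 videos x 3 frames.
frames = [
    f"{label}/vid{v}/{i:03d}.png"
    for label in ("real", "fake")
    for v in range(5)
    for i in range(3)
]
train, test = split_by_video(frames)
```

With 10 toy videos and `test_ratio=0.2`, two whole videos go to the test set and the other eight to the train set.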

About training and pickle

Hi, I noticed you said "we train a reconstruction network over genuine images only". Does that mean only real images are used during the training phase? Or does it mean that the network input contains both real and fake images, but the reconstruction loss only covers the real ones?
I also ran into an error during training.
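The second reading in the question (real and fake images both pass through the network, but only real images contribute to the reconstruction term) can be sketched as a masked average. This is an illustration of that reading, not a confirmation of what the repository does. The function name and the label convention (real = 0, fake = 1) are assumptions for the example.

```python
def reconstruction_loss_real_only(per_sample_errors, labels, real_label=0):
    """Average the per-sample reconstruction error over real samples only.

    per_sample_errors: one reconstruction error per image in the batch.
    labels: one class label per image (assumed: 0 = real, 1 = fake).
    """
    real_errors = [e for e, y in zip(per_sample_errors, labels) if y == real_label]
    if not real_errors:
        # No real image in this batch, so the reconstruction term is skipped.
        return 0.0
    return sum(real_errors) / len(real_errors)


# Mixed batch: the two fake samples (label 1) are ignored by the loss.
errors = [0.10, 0.80, 0.30, 0.90]
labels = [0, 1, 0, 1]
loss = reconstruction_loss_real_only(errors, labels)
```

Under this reading the classifier still sees both classes, while the reconstruction objective is driven by genuine faces alone.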

The Training Dataset

Could you please send me the training dataset that you used to train the model?
I am keen to try and reproduce the experimental outcomes described in the paper.
My email is [email protected].

code for data preprocessing

Thanks for sharing the code. Could you also share the code for data preprocessing, for example the part that uses RetinaFace to extract faces from videos?

about dataset

Hello, thank you very much for your work. Could you share the processed datasets (including FF++ and Celeb-DF) with me? My email is [email protected].

The data structure of the datasets

Thank you for your crop code. It is very helpful to me. But I have some questions about the structure of the dataset. Would you please provide the data structure of the dataset? The following picture is the dataset structure of F3-NET. Is this the same as the dataset structure of the RECCE framework?

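About the pickle file

Hello, and sorry to bother you. I would like to ask about the structure of the pickle file: how exactly does it store the data? Could you provide it? I have not managed to get this file right when training. Thank you very much!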

How can I get the reconstruction visualization shown in your paper?

data pre-processing and dataset initialization

Hi,

Thanks for sharing the code! I have some questions about data pre-processing and dataset initialization:
1. In the FaceForensics class in faceforensics.py, what is the '.pickle' file? Would you please provide this file?
2. How many frames per video did you extract for face detection and then for training? In particular, how do you balance the amount of data between real and fake videos? That is, do you extract the same number of frames first and balance them during training, or extract different numbers of frames at the beginning to balance the data?

thanks~!
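The two balancing strategies mentioned in the question can be made concrete with a small sketch. It shows one way to pick evenly spaced frame indices from a video and one way to balance classes afterwards by downsampling the larger one. Both helpers are hypothetical, and the thread does not state which strategy, if either, the authors used.

```python
import random


def evenly_spaced_indices(num_frames, num_samples):
    """Pick up to num_samples frame indices spread evenly over a video."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= num_frames:
        return list(range(num_frames))
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def balance_by_downsampling(real_items, fake_items, seed=0):
    """Randomly drop items from the larger class until both classes match."""
    rng = random.Random(seed)
    n = min(len(real_items), len(fake_items))
    return rng.sample(real_items, n), rng.sample(fake_items, n)


indices = evenly_spaced_indices(num_frames=100, num_samples=5)

# Toy imbalance: 4 real frames versus 16 fake frames.
real = [f"real_{i}" for i in range(4)]
fake = [f"fake_{i}" for i in range(16)]
balanced_real, balanced_fake = balance_by_downsampling(real, fake)
```

The alternative from the question, extracting more frames per real video up front, would instead pass a larger `num_samples` for the minority class.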

Unexpected performance drop when not using provided test code.

Reproducing with a bit more data (20 frames per video), I got a fairly good checkpoint. However, when I evaluated the checkpoint without the provided test code, the performance dropped unexpectedly. I wonder if I did something wrong.
Here's what I did:

Load my test data
Perform augmentation: images normalized with mean and std both [0.5, 0.5, 0.5]; images resized to (299, 299)
Labels: Real=0, Fake=1
Put all data into a dataloader
Feed all data from dataloader to the model and calculate the ACC and AUC.
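Two of the steps above are easy to get subtly wrong, so here is a minimal sketch of the arithmetic. It assumes pixel values are already scaled to [0, 1] before normalization and that the model output is a score where higher means fake (label 1). Both points are assumptions for the example, and the repository's own test code may differ.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Apply (value - mean) / std to a pixel value that is already in [0, 1]."""
    return (value - mean) / std


def accuracy(scores, labels, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label.

    Assumes higher scores mean 'fake' (label 1) and lower scores mean 'real' (label 0).
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# With mean = std = 0.5, the range [0, 1] maps to [-1, 1].
low, mid, high = normalize_pixel(0.0), normalize_pixel(0.5), normalize_pixel(1.0)

acc = accuracy(scores=[0.9, 0.2, 0.6, 0.4], labels=[1, 0, 0, 1])
```

If raw 0 to 255 values are normalized with mean and std of 0.5, the inputs end up far outside the range the network saw during training. A flipped label or score convention distorts ACC in a similar way.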

Please let me know if I am missing any important step.
And it will be very helpful if a well-trained checkpoint could be provided.

dataset problem

Thanks for this great work! I wonder if you can share a link to download the DFDC dataset.
I know this problem is unrelated to the method. When I register an account at https://dfdc.ai/sign-up, it always shows "Exceeded daily email limit for the operation or the account. If a higher limit is required, please configure your user pool to use your own Amazon SES configuration for sending email." So I can't download the dataset from the official website :(

pretrained model

I am very interested in your work! How can I train on my own dataset, and could you provide some pretrained models for FF++ or WildDeepfake? Thanks a lot.

about testing performance with pretrained weight

Hi, thanks for your nice work! I noticed that you have provided pre-trained weights for testing. Where should I place them, and do I need to change anything in the scripts? Additionally, the suffix of the model weights in the script is .bin, but the pre-trained weights are .pickle. Is there any difference between them? If yes, what should I do? Thanks again for your nice work; I am looking forward to your reply.

A very nice work!

The author carefully replied to my question, and I reproduced the accuracy of the paper!

about dataset of Celeb-DF , WildDeepfake and DFDC

Hello, thank you very much for your work. Could you share the processed Celeb-DF, DFDC and WildDeepfake datasets with me? Best wishes!

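Could you send me a copy of the processed FF++ dataset?

Hello, thank you very much for your work; it has helped me a lot. Could you send me a copy of the processed FF++ dataset?
Thank you so much!
My email: [email protected]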

FF++c40 pretrained model

Can you provide the weights pretrained on FF++ c40? Thank you very much!

The data structure of the datasets--How do I turn multiple folders of images into pickle files

(RE) root@6fb5978ad687:/RECCE# CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port 12345 train.py
/root/miniconda3/envs/xiao_db2/lib/python3.8/site-packages/torch/distributed/launch.py:180: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
Using debug mode: False.

Loading dataset: 'FaceForensics'...
Loading data from 'FF++ all' of split 'train' and compression 'c40'
Please wait patiently...
Traceback (most recent call last):
File "train.py", line 32, in <module>
trainer = ExpMultiGpuTrainer(config, stage="Train")
File "/data1/ShivamShrirao/RECCE/trainer/exp_mgpu_trainer.py", line 29, in __init__
super(ExpMultiGpuTrainer, self).__init__(config, stage)
File "/data1/ShivamShrirao/RECCE/trainer/abstract_trainer.py", line 38, in __init__
self._train_settings(model_cfg, data_cfg, config_cfg)
File "/data1/ShivamShrirao/RECCE/trainer/exp_mgpu_trainer.py", line 62, in _train_settings
self.train_set = load_dataset(name)(train_options)
File "/data1/ShivamShrirao/RECCE/dataset/faceforensics.py", line 36, in __init__
indices = torch.load(indices)
File "/root/miniconda3/envs/xiao_db2/lib/python3.8/site-packages/torch/serialization.py", line 771, in load
with _open_file_like(f, 'rb') as opened_file:
File "/root/miniconda3/envs/xiao_db2/lib/python3.8/site-packages/torch/serialization.py", line 270, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/root/miniconda3/envs/xiao_db2/lib/python3.8/site-packages/torch/serialization.py", line 251, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'path/to/data/train_c40.pickle'
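The last line of the traceback shows the loader trying to open the literal placeholder 'path/to/data/train_c40.pickle', so the configured path has to point at a real index file before training can start. The expected contents of that file are not documented in this thread. The sketch below is only one plausible shape, a list of (image_path, label) pairs collected from class folders. The folder layout, function names, and label mapping are all assumptions. It uses the standard `pickle` module to stay dependency-free; since the traceback shows the file being read with `torch.load`, a file meant for that loader would presumably be written with `torch.save` instead.

```python
import os
import pickle
import tempfile


def build_index(root, class_to_label):
    """Collect (image_path, label) pairs from '<root>/<class_name>/...'."""
    index = []
    for class_name, label in sorted(class_to_label.items()):
        class_dir = os.path.join(root, class_name)
        for dirpath, _, filenames in sorted(os.walk(class_dir)):
            for name in sorted(filenames):
                if name.lower().endswith((".png", ".jpg", ".jpeg")):
                    index.append((os.path.join(dirpath, name), label))
    return index


def save_index(index, out_path):
    """Serialize the index to disk."""
    with open(out_path, "wb") as f:
        pickle.dump(index, f)


def load_index(path):
    """Read an index written by save_index."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Demo on a throwaway directory with one empty image file per class.
with tempfile.TemporaryDirectory() as tmp:
    for class_name in ("real", "fake"):
        video_dir = os.path.join(tmp, class_name, "video_000")
        os.makedirs(video_dir)
        open(os.path.join(video_dir, "0000.png"), "wb").close()

    index = build_index(tmp, {"real": 0, "fake": 1})
    out_path = os.path.join(tmp, "train_demo.pickle")
    save_index(index, out_path)
    restored = load_index(out_path)
```

Whatever structure the dataset class actually expects should be checked against faceforensics.py before relying on a layout like this.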

about FF++ datasets

Hello, thank you very much for your work. Could you share the processed FF++ datasets with me? My email is [email protected]

dataset of pictures

Can you provide the script for extracting images from the videos?

test results on ff++

Hi, thanks for your excellent work!
I followed your face crop code and used the first 50 frames of each video, then tested ff++_c23 and c40 with the trained weights you provided. The results are lower:
c23:

Test, FINAL LOSS 0.2105, FINAL ACC 0.9169, FINAL AUC 0.9860
# EER: 0.0605(thresh: 0.2734)

c40:

Test, FINAL LOSS 0.4998, FINAL ACC 0.8063, FINAL AUC 0.8546
# EER: 0.2227(thresh: 0.7976)

I also tried using 50 frames sampled from the first 150 frames for testing, and the results are similar.
Can you help me find out what's wrong? Thanks again for your nice work; I am looking forward to your reply.
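For readers comparing numbers like the "EER ... (thresh: ...)" lines above, here is a minimal sketch of how an equal error rate and its threshold can be computed from scores and labels. It assumes higher scores mean fake (label 1), scans every observed score as a candidate threshold, and is not taken from the repository's test code, which may compute EER differently (for example by interpolating the ROC curve).

```python
def equal_error_rate(scores, labels):
    """Return (eer, threshold) at the point where FPR and FNR are closest.

    Assumes label 1 = fake (positive), label 0 = real (negative),
    and that a sample is predicted fake when score >= threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("EER needs both real (0) and fake (1) samples")

    best = None  # (gap between FPR and FNR, eer estimate, threshold)
    for threshold in sorted(set(scores)):
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
        fpr, fnr = fp / negatives, fn / positives
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            best = (gap, (fpr + fnr) / 2, threshold)
    return best[1], best[2]


eer, threshold = equal_error_rate(
    scores=[0.1, 0.4, 0.35, 0.8],
    labels=[0, 0, 1, 1],
)
```

This brute-force scan is quadratic in the number of samples, which is fine for a sanity check but slow for a full test set.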

About the processed FF++ and Celeb datasets

Hello, I am a reader of your paper and need your processed datasets. My email is [email protected]. Thank you!

Can u share preprocessed dataset?

I could not reproduce the preprocessing of the dataset.
Can u share it ?

My email was sent to @XJay18 (email starts with ***)

cc: @chaoma99

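Obtaining the processed datasets

Hello, I am currently working on deepfake detection and am very interested in your work. Could you provide the processed FF++, DFDC and Celeb-DF datasets for research purposes? My email is [email protected]. Thank you!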

reported AUC

Hi,
I have a question about your results in Table 4.
When you report AUC, do your test datasets contain real images?
I ask because computing AUC requires both positive and negative samples.
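The point about needing both classes follows directly from one standard definition of AUC: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as half. A minimal sketch of that pairwise definition, with a hypothetical function name and the assumption that label 1 is the positive (fake) class:

```python
def auc_pairwise(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes label 1 = positive, 0 = negative.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC is undefined without both positive and negative samples")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


auc = auc_pairwise(scores=[0.1, 0.4, 0.35, 0.8], labels=[0, 0, 1, 1])
```

With a test set containing only fake images there are no pairs to compare, so the function raises instead of returning a number.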

Run on multiple GPUs

Training Tricks

Thank you for sharing your great work! I trained on my own preprocessed FF++ dataset, but the results are unsatisfactory. Both c23 and c40 are about 3-4 percentage points lower than the values reported in the paper. Could you share some tricks for the training process?

About wilddeepfake dataset

Thank you for your crop code. It is very helpful to me. But the WildDeepfake team seems to be unreachable. Could you please share a link where the dataset can be downloaded? Google Drive or Baidu cloud disk would both be fine. Thanks a million.

About the link of celeb-DF.

Hi, do you still have the preprocessed Celeb-DF dataset that was provided via the link above? Thank you.

Originally posted by @SaharHusseini in #9 (comment)

Dataset downloading issues

I did not download the full FaceForensics++ dataset. Instead I downloaded a 9GB dataset from Kaggle, extracted faces from the videos in the "deepfakes" directory, and placed them in the training and testing subfolders.
Does this approach also work, or is it wrong?

cannot use with distributed pytorch

The id "id name i gave" already exists in one process, so all the remaining workers stop.

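About obtaining the processed datasets

Hello, I am also quite interested in the RECCE implementation and would like to obtain your processed datasets to try it out. My work email is [email protected], and I have already sent you an email. Thank you for your help.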

cross testing on celeb-DF

Hello,
First, thank you for your excellent work. I am testing your model on the Celeb-DF dataset.
I used your provided model (FF++ c40) to test on celeb-DF.
I cropped the celeb-DF dataset using code #1 and then used your data loader and test.py script for testing.
However, the results I am getting are different from the paper.

Can you advise me if there is something wrong?

Thank you for your help. I look forward to your answer.

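About pretrained weights or resuming training from loaded weights

Hello, I would like to ask whether this code supports training from pretrained weights, or continuing training from a .bin file produced by an earlier training run. If it does, which part of the code or command should be modified?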

Hello, does this project support deployment on fastdeploy?

Hello, does this project support deployment on fastdeploy?
Thank you!

Questions Regarding Model Training Requirements and Dataset

I am interested in your RECCE project and have a few questions regarding the training requirements and dataset preparation:

Minimum hardware requirements for training:
Could you please specify the minimum GPU that can be used for training? Is it possible to train the model using a single 3090 GPU?
Regarding the dataset preparation, I extracted the images using c40, but the dataset is still very large. Could you provide an estimate of the final size of the dataset after extraction?
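Until the authors answer, a rough size estimate can be worked out from three quantities: number of videos, frames kept per video, and average size of one cropped face image. The sketch below just multiplies them. Every number in the example call is a placeholder rather than a figure from the paper or the repository, so substitute values measured on your own crops.

```python
def estimate_dataset_size_gb(num_videos, frames_per_video, avg_kb_per_image):
    """Estimate the on-disk size in GiB of an extracted face-crop dataset."""
    total_kb = num_videos * frames_per_video * avg_kb_per_image
    return total_kb / (1024 ** 2)  # KiB -> GiB


# Placeholder inputs (assumptions): 5000 videos, 50 frames each, 100 KB per crop.
size_gb = estimate_dataset_size_gb(
    num_videos=5000,
    frames_per_video=50,
    avg_kb_per_image=100,
)
```

With these placeholder inputs the estimate comes to roughly 24 GB. Because the result scales linearly with each input, halving the frames per video halves the total.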