hongguliu / deepfake-detection

The PyTorch implementation of Deepfake Detection based on FaceForensics++

Home Page: https://github.com/ondyari/FaceForensics

License: Apache License 2.0

Python 100.00%

deepfake-detection's Introduction

Deepfake-Detection

The PyTorch implementation of Deepfake Detection based on FaceForensics++

The backbone network is XceptionNet. We also provide a PyTorch reimplementation of MesoNet, which you can use in this project.

Install & Requirements

The code has been tested with PyTorch 1.3.1 and Python 3.6. Please refer to requirements.txt for more details.

To install the Python packages

python -m pip install -r requirements.txt

You can install all dependencies at once with the command above, but dlib is easier to install via conda: conda install -c conda-forge dlib

Dataset

If you want to use the open-source FaceForensics++ dataset, you can download it with the script './download-FaceForensics_v3.py', following the instructions in the download section.

You can train the model on full images, but we suggest using only the face region as input.
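As a rough illustration of taking only the face region as input, the sketch below turns a detector's face rectangle into an enlarged square crop that stays inside the image. The helper name square_face_box and the 1.3 margin are assumptions made for this example, not values confirmed by this project; the face rectangle itself would come from whatever detector you use (for example dlib).

```python
def square_face_box(x1, y1, x2, y2, img_w, img_h, scale=1.3):
    """Return (left, top, size) of an enlarged square crop around a face box.

    (x1, y1, x2, y2) is the face rectangle reported by a detector.
    scale=1.3 is an assumed margin, not a value taken from this project.
    """
    # Side of the square: the longer face side, enlarged by the margin.
    size = int(max(x2 - x1, y2 - y1) * scale)
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    # Keep the top-left corner inside the image.
    left = max(cx - size // 2, 0)
    top = max(cy - size // 2, 0)
    # Shrink the square if it would run past the right or bottom edge.
    size = min(size, img_w - left, img_h - top)
    return left, top, size


left, top, size = square_face_box(100, 120, 200, 240, img_w=640, img_h=480)
# An image array `img` of shape (H, W, C) could then be cropped with
# img[top:top + size, left:left + size] and resized to the network input size.
```

Using a square crop with some margin keeps the aspect ratio intact when the face is later resized for the network.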

Pretrained Model

The provided models are only meant to verify that our code works. We suggest you train your own models on your own dataset.

We will upload better-performing models as soon as possible.

We provide some pretrained models based on FaceForensics++:

FF++_c23.pth
FF++_c40.pth

Usage

To test with videos

python detect_from_video.py --video_path ./videos/003_000.mp4 --model_path ./pretrained_model/df_c0_best.pkl -o ./output --cuda

To test with images

python test_CNN.py -bz 32 --test_list ./data_list/Deepfakes_c0_299.txt --model_path ./pretrained_model/df_c0_best.pkl
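The --test_list argument points to a text file listing images and labels. A minimal sketch for building such a file is shown below, assuming each line has the form "image_path label" separated by a space, with 0 for real and 1 for fake. Both the line format and the label convention are assumptions here, so check the dataset loading code before relying on them. The helper name write_image_list and the directory names in the usage comment are hypothetical.

```python
from pathlib import Path


def write_image_list(real_dir, fake_dir, out_path, real_label=0, fake_label=1):
    """Write one 'image_path label' line per image and return the line count.

    Assumed convention: 0 = real, 1 = fake. Paths containing spaces would
    break a whitespace-separated format, so avoid them.
    """
    lines = []
    for folder, label in ((real_dir, real_label), (fake_dir, fake_label)):
        # Walk the folder recursively in a stable, sorted order.
        for img in sorted(Path(folder).rglob("*")):
            if img.suffix.lower() in {".png", ".jpg", ".jpeg"}:
                lines.append(f"{img.as_posix()} {label}")
    Path(out_path).write_text("\n".join(lines) + "\n")
    return len(lines)


# Example call (hypothetical directories):
# write_image_list("./faces/real", "./faces/fake", "./data_list/my_test.txt")
```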

To train a model

python train_CNN.py (please read the code and set the arguments before running)

About

If our project is helpful to you, we hope you can star and fork it. If there are any questions and suggestions, please feel free to contact us.

Thanks for your support.

License

The provided implementation is strictly for academic purposes only. Should you be interested in using our technology for any commercial use, please feel free to contact us.

deepfake-detection's Issues

About the pretrained model

Which model is ffpp_c23.pth?
Xception or Meso4?

1. When I run train_CNN.py, it requires some sort of text files. Can you tell me how to solve this problem?
2. I converted the videos into images, but I am not sure how to label the images I extracted from the videos.

Reference paper for the project

Hello author, for our own project work, could you provide the paper behind this project's detection method as a reference? Thank you very much.

What is the detection of fake videos based on?

Hello, how large is the dataset?

Model differences

What is the difference between:
ffpp_c40.pth
ffpp_c23.pth
deepfake_c0_xception.pkl

Are all of them XceptionNet?
Is the only difference the dataset they were trained on?

Cannot connect to X server

Hey, I have a small issue.
I'm trying to reproduce the experiment with the provided C23 model.
I installed everything correctly in an Anaconda virtual env and I can run detect_from_video.py, but as soon as it loads the model and the video I get this error:
0%| | 0/396 [00:00<?, ?it/s]: cannot connect to X server

Best regards,
Jan

The trained weights are accurate on the test set but perform poorly on test videos

Some details of training?

Hi, I got bad training results with my own setup. I have some questions:
0. I save 1 frame out of every 5 from each mp4. Is that OK?
1. Why choose size 299? If I don't resize to 299, will I get bad results?
2. I only use the HQ data, and for training I treat the YouTube sequences as real and all the manipulated sequences as fake. Is that OK?

Looking forward to your answer.
Thanks a lot.

My DeepFake

Question about dataset size when training with Xception

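Hello honggu. I have recently been training Xception on deepfakes and ran into a problem: my training accuracy is very high, but the validation and test accuracy is very low or does not change. At first I thought my Dataloader code was wrong, but when I use the train dataset as the validation set, the accuracy does improve every epoch. To be clear, I am not using the FF++ dataset or the full DFDC dataset from Kaggle (too large), but the sample dataset provided on Kaggle (about 400 training videos and 400 test videos), and I also balanced the samples after extracting the faces. So I would like to ask, given your experience, whether the problem is caused by my dataset being too small, and whether the full dataset is required to see any effect on validation and test.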

About train loss decreasing while valid loss increases

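Thank you for your open-source work!
I referred to your code in my reproduction. For the dataset setup, the train:valid:test split is 700:150:150, with 100 frames randomly sampled from each video.
However, during binary classification the train loss decreases while the valid loss increases.
Is this way of sampling the data the cause of the problem, or is there some other trick for sampling the dataset?
I think a difference in data distribution between the training and validation sets might cause this. Is that the case? If so, how should it be fixed?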
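For reference, a 700:150:150 split with 100 random frames per video can be sketched as below. This is only one possible reading of the setup described in this issue, not the repository's own code, and the helper names split_videos and sample_frame_indices are hypothetical. The split is done on video IDs rather than on individual frames, so frames from one video never land in two different subsets.

```python
import random


def split_videos(video_ids, n_train=700, n_valid=150, n_test=150, seed=0):
    """Shuffle video IDs and cut them into disjoint train/valid/test lists."""
    ids = sorted(video_ids)
    if len(ids) < n_train + n_valid + n_test:
        raise ValueError("not enough videos for the requested split")
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    valid = ids[n_train:n_train + n_valid]
    test = ids[n_train + n_valid:n_train + n_valid + n_test]
    return train, valid, test


def sample_frame_indices(n_frames, k=100, seed=0):
    """Pick up to k distinct frame indices from a video with n_frames frames."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_frames), min(k, n_frames)))


# Example with 1000 placeholder video IDs ("000" ... "999").
train, valid, test = split_videos([f"{i:03d}" for i in range(1000)])
frames = sample_frame_indices(396)  # frame indices to extract from one video
```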

Test the video with your own training model

Hello, I trained your pretrained model on frames extracted from my 6000+6000 videos and got a high accuracy. But when I use the trained model to test real videos, almost every frame is labeled as fake. At which step did the problem occur?

About training.

How many times do you train to get the best results?
Can I use the old pretrained model to continue training new models?

Some problems with training

Your project is useful! I tried to train a model on my own dataset, but I ran into some problems.

First, my dataset includes fake images and real images, both of which are labeled. Is that right?
After that I prepared some test data, none of which is in the training data.
When I start training, my loss is always 0 and my accuracy is always 1. And when I test a video, the answer is always FALSE.

May I ask why this happens? Thank you so much!

paper

Do you have a paper? Can you provide it?

about training dataset

How are the faces for training obtained? From the officially provided masks, or extracted with an existing model (RetinaFace or others)?

How do you sample the frames in the video?

Testing on my own dataset gives 78% accuracy

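Hello, I found out that you are **, so I asked in Chinese.
I tested my dataset with the code you provided. With ffpp_c23.pth as the pretrained_model the accuracy is 78%, and with deepfake_c0_xception.pkl it is 73%. This is the highest accuracy I got.
I tried freezing the first ten blocks of the model and continuing training from deepfake_c0_xception.pkl. The training accuracy reached 1, but the test accuracy was just over 20%. I also tried training from scratch, and again the training accuracy was 1 while the test accuracy was very low.
My training set consists of 290,000 images per class (after face detection and other processing), obtained by extracting frames from 146 real videos and 146 fake videos (made with my own forgery method).
My test set consists of new videos forged with the same method.
Could you help me? Could you suggest a workable way to improve this?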

./data_list/FaceSwap_c0_train.txt && /data_list/FaceSwap_c0_val.txt

May I know where I can find these txt files, or how I can create them? Thank you. :)

Some confusion on the training result.

Hi, I've tried to train my own Xception model on the LQ data of the FF++ dataset and found that its performance is very unbalanced.
The accuracy is about 50% on real images but nearly 100% on fake images.
I've considered the ratio of fake to real images (4:1), so I only chose a quarter of the fake images to balance the classes. But it doesn't help. The accuracy on real images is still below 60%.
Here are some training details:

Use only c40 data. 720 for train, 140 for validate, 140 for test.
Capture 1 frame per 10 frames each video.
Train on fake data generated by all four methods (DF/F2F/FS/NT)

The result is confusing, and I hope you can give me some advice.
Thanks a lot.
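The sampling and balancing steps described in this issue (keeping 1 frame out of every 10, and subsampling the fake class to match the real class) can be sketched as follows. This illustrates the setup as described, with hypothetical helper names, and is not a fix for the reported imbalance in accuracy.

```python
import random


def every_nth_frame(n_frames, step=10):
    """Indices of the frames kept when saving 1 frame out of every `step`."""
    return list(range(0, n_frames, step))


def balance_by_subsampling(real, fake, seed=0):
    """Randomly drop items from the larger class so both classes match in size."""
    rng = random.Random(seed)
    n = min(len(real), len(fake))
    return rng.sample(real, n), rng.sample(fake, n)


kept = every_nth_frame(25)  # frame indices 0, 10 and 20

# Placeholder file names standing in for extracted face crops (4:1 fake to real).
real_imgs = [f"real_{i}.png" for i in range(10)]
fake_imgs = [f"fake_{i}.png" for i in range(40)]
real_bal, fake_bal = balance_by_subsampling(real_imgs, fake_imgs)
```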

Minor Issue: Spelling mistake on README.md

Hi!

Just wanted to let you know there is a mistake in the pip install command for requirements.txt: the file name requirements.txt is misspelled.

Not a big issue, but someone may copy it directly in their command prompt and get an error.

multiple-gpu case

@HongguLiu
Have you tested your training code with multiple GPUs? I got the RuntimeError: NCCL Error 2: unhandled system error. The single-GPU case works fine for me. Thanks.

How to download images for training?

The text file you mentioned has a list of image paths and labels. However, the FaceForensics++ dataset only has videos. How do you obtain these images?

'test_CNN.py' prediction results are always '1'

Hello @HongguLiu
I tried 'test_CNN.py' to predict images, but the prediction result is always '1'.
I wonder whether the input images need any preprocessing.
Thank you.

Do I need CUDA to run test_CNN.py?

I have installed all the requirements, but I am running on CPU instead of GPU and I can't seem to get test_CNN.py to run.

About pretrain_model

Hi! When I use the provided pretrained model (c40) to test directly on the FF++ dataset, I find that the acc and auc are about 0.70, which is much lower than in the original paper. I use the original Xception as the backbone and only change the last FC layer for these experiments. Could you please point out what I am doing wrong, or whether there is some other problem?
Thanks!

about dataset

I'm reproducing the results, but I can't download the FaceForensics++ dataset. The error says that I don't have access to the download website. Could you please give me the downloaded dataset? I only need the 4 kinds of manipulated data and the original unmanipulated data under C23 and C40. Could you compress it and send it to me? I really need your help.
my e-mail : [email protected]
Thank you.

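Could the author provide an explanation of the principles behind this project's implementation to help with understanding the code?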

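I can run the code successfully now, but I wonder whether the author could provide a description of the principles behind the implementation to help with understanding the code.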