
apvit's People

Contributors

youqingxiaozhua


apvit's Issues

Attention visualization

I want to generate attention visualizations for different expressions, like the ones shown in your TransFER paper. Can you provide some tools? I am new to Paddle, and it is a bit difficult for me to convert the code from PyTorch to Paddle.
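(Note: such heatmaps can usually be approximated by upsampling a per-patch attention map and blending it over the input face. A minimal PyTorch-side sketch, assuming you already have a 1-D attention vector attn with one weight per patch; the function name, the grid size, and the use of OpenCV are assumptions here, not the authors' tooling:)

import cv2
import numpy as np

def overlay_attention(img_bgr, attn, grid=14, alpha=0.5):
    # attn: numpy array of grid*grid patch weights from the transformer
    heat = attn.reshape(grid, grid).astype(np.float32)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)  # normalize to [0, 1]
    heat = cv2.resize(heat, (img_bgr.shape[1], img_bgr.shape[0]))  # upsample to image size
    heat = cv2.applyColorMap(np.uint8(255 * heat), cv2.COLORMAP_JET)
    return cv2.addWeighted(img_bgr, 1 - alpha, heat, alpha, 0)  # blend heatmap over the face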

Some questions about the experimental environment

Hello author, thank you very much for your outstanding contribution. I would like to ask whether it is possible to build a Docker image of the environment you are using and upload it to Docker Hub? That would help my work a lot; if you can, I would be very grateful.

missing files

I can't figure out where to get the 'ir.pdparams' and 'vit.pdparams' files needed to rebuild the model.

datasets

Are there any other datasets that have been preprocessed? Thank you very much.

missing function

The function 'compute_rollout_attention' in trans_fer.py is not defined. It is referenced in the function VisionTransformer.relprop().
"/APViT/Paddle/ppcls/arch/backbone/model_zoo/trans_fer.py"

The preprocessed FERPlus

Thanks for your great work! I ran this model on the RAF-DB data you provided and got 90.8% accuracy, but only 85.5% on the official FERPlus dataset. Could you provide the FERPlus data aligned by MTCNN, or the code for FERPlus preprocessing? I think it is important for reproducing your result on FERPlus. Thank you again!

Environment issues when running the code

Hello, how can I run your code after downloading it? Do I need to follow the full installation instructions provided by the MMClassification and PaddleClas toolboxes? Do I need to install the 'mm2' folder? When I run 'python -m torch.distributed.launch --nproc_per_node=2 train.py configs/apvit/RAF.py --launcher pytorch', I get the errors ModuleNotFoundError: No module named 'mmcv' and torch.distributed.elastic.multiprocessing.errors.ChildFailedError. I would like to do some research based on yours and cite your paper. I hope to receive your reply.

Unable to reproduce results

Thank you for your open source contribution!

I took the model architecture out and trained it with my own training code, following the objective function, optimizer, and optimization strategy in your code, but I did not get good enough results; the highest accuracy was only 86%.
[train epoch 98] loss: 0.057, acc: 0.982: 100%|█████████████████████| 95/95 [01:12<00:00, 1.31it/s]
[valid epoch 98] loss: 0.564, acc: 0.863: 100%|█████████████████████| 23/23 [00:18<00:00, 1.24it/s]
[train epoch 99] loss: 0.056, acc: 0.981: 100%|█████████████████████| 95/95 [01:12<00:00, 1.32it/s]
[valid epoch 99] loss: 0.565, acc: 0.863: 100%|█████████████████████| 23/23 [00:18<00:00, 1.25it/s]

I don't know where the problem lies, and I don't understand what this means:
[screenshot]

Missing parameters and wrong predictions

Hi,
I have downloaded the pretrained model weights (APViT_RAF-3eeecf7d.pth) following the link in the README and tried to run the model on some sample images.

I am loading the model with this code snippet:

import mmcv
import numpy as np
from mmcls.models import build_classifier
from mmcv.runner import load_checkpoint

cfg = mmcv.Config.fromfile("configs/apvit/RAF.py")
cfg.model.pretrained = None

# build the model and load checkpoint
classifier = build_classifier(cfg.model)
load_checkpoint(classifier, "pretrained/APViT_RAF-3eeecf7d.pth", map_location='cpu')
classifier = classifier.to("cuda")
classifier.eval()

but I get some warnings

unexpected key in source state_dict: 
output_layer.0.weight, output_layer.0.bias, output_layer.0.running_mean, output_layer.0.running_var, output_layer.0.num_batches_tracked, output_layer.3.weight, output_layer.3.bias, output_layer.4.weight, output_layer.4.bias, output_layer.4.running_mean, output_layer.4.running_var, output_layer.4.num_batches_tracked, body.21.shortcut_layer.0.weight, body.21.shortcut_layer.1.weight, body.21.shortcut_layer.1.bias, body.21.shortcut_layer.1.running_mean, body.21.shortcut_layer.1.running_var, body.21.shortcut_layer.1.num_batches_tracked, body.21.res_layer.0.weight, body.21.res_layer.0.bias, body.21.res_layer.0.running_mean, body.21.res_layer.0.running_var, body.21.res_layer.0.num_batches_tracked, body.21.res_layer.1.weight, body.21.res_layer.2.weight, body.21.res_layer.3.weight, body.21.res_layer.4.weight, body.21.res_layer.4.bias, body.21.res_layer.4.running_mean, body.21.res_layer.4.running_var, body.21.res_layer.4.num_batches_tracked, body.22.res_layer.0.weight, body.22.res_layer.0.bias, body.22.res_layer.0.running_mean, body.22.res_layer.0.running_var, body.22.res_layer.0.num_batches_tracked, body.22.res_layer.1.weight, body.22.res_layer.2.weight, body.22.res_layer.3.weight, body.22.res_layer.4.weight, body.22.res_layer.4.bias, body.22.res_layer.4.running_mean, body.22.res_layer.4.running_var, body.22.res_layer.4.num_batches_tracked, body.23.res_layer.0.weight, body.23.res_layer.0.bias, body.23.res_layer.0.running_mean, body.23.res_layer.0.running_var, body.23.res_layer.0.num_batches_tracked, body.23.res_layer.1.weight, body.23.res_layer.2.weight, body.23.res_layer.3.weight, body.23.res_layer.4.weight, body.23.res_layer.4.bias, body.23.res_layer.4.running_mean, body.23.res_layer.4.running_var, body.23.res_layer.4.num_batches_tracked

missing keys in source state_dict: projs.0.weight, projs.0.bias

Then I load some images, first using MTCNN to crop around the person's face (to make them more similar to RAF-DB), and process them with these torch transformations, which should replicate the ones in the config files:

from torchvision import transforms

test_preprocess = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[x / 255 for x in [123.675, 116.28, 103.53]],
        std=[x for x in [58.395, 57.12, 57.375]],
    ),
])
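(Note, as an aside to this report: torchvision's ToTensor scales pixels to [0, 1], so replicating an mmcls Normalize that uses mean=[123.675, 116.28, 103.53] and std=[58.395, 57.12, 57.375] on 0-255 images requires dividing both mean and std by 255; above, only the mean is scaled. A sketch of the equivalent transform:)

test_preprocess = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.ToTensor(),  # scales pixels to [0, 1]
    transforms.Normalize(
        mean=[x / 255 for x in [123.675, 116.28, 103.53]],
        std=[x / 255 for x in [58.395, 57.12, 57.375]],  # std scaled too
    ),
])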

and running inference with

out = classifier(tensor_in.to("cuda"), return_loss=False)
out = [np.argmax(o) for o in out]

but what I get is always class 6, no matter what expression the person in the input image has.

Am I doing something wrong in either model loading or preprocessing?

Thanks for your support

Current Environment:

  • Python 3.7.13
  • mmcls 0.25.0
  • mmcv-full 1.7.1
  • torch 1.8.2
  • torchvision 0.9.2

Request for config files

Dear author, thanks for your awesome work!

I found that you implement "Mask Vision Transformer for Facial Expression Recognition in the Wild" and "Facial Expression Recognition with Visual Transformers and Attentional Selective Fusion" in your repository.

Could you provide those config files? Thanks!

validset is testset?

hi,
you set your validation set to the samples of the test set:
https://github.com/youqingxiaozhua/APViT/blob/main/configs/_base_/datasets/RAF.py

[screenshot]

Wouldn't this corrupt the measured performance on the test set, since you are directly picking the model with the best performance on the test set?

Usually the validation set is separate and is used to pick a model; the best model picked on the validation set is then used to report performance on the test set.

thanks

The RAF-DB aligned by MTCNN

Excuse me, could you provide the RAF-DB data that you used for training and testing? I cannot reach the reported 91.98% on the official aligned RAF-DB test set with the APViT_RAF-3eeecf7d.pth weights. I then tried re-aligning the official aligned test set with MTCNN, but that does not work either; I only get 81.42% top-1 accuracy.
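(Note: one common way to produce MTCNN crops is facenet-pytorch; whether this matches the authors' alignment is exactly what this issue asks, so treat it as a sketch only, with the filename as a placeholder:)

from facenet_pytorch import MTCNN
from PIL import Image

# post_process=False keeps raw 0-255 pixels instead of facenet's whitening
mtcnn = MTCNN(image_size=112, margin=0, post_process=False)
img = Image.open("test_0001_aligned.jpg")
face = mtcnn(img)  # (3, 112, 112) float tensor, or None if no face is found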

import error

Hi, I get an error when I run tools/test.py.
The error is: ImportError: cannot import name 'wrap_fp16_model' from 'mmcls.core'
I can't find 'wrap_fp16_model' in 'mmcls.core'.
I would appreciate it if you could help me!
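(A hedged pointer: in mmcv-full 1.x this helper is exported from mmcv.runner rather than mmcls.core, so changing the import in tools/test.py may be enough:)

from mmcv.runner import wrap_fp16_model  # exported by mmcv-full 1.x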

Using the RAF-DB pretrained model for prediction

Hi, I want to use your RAF-DB pre-trained model directly to predict expressions from a live camera feed. Can you please give me instructions for that? I tried to read the README but couldn't figure it out.
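(A sketch rather than the authors' instructions: assuming the classifier and test_preprocess from the "Missing parameters and wrong predictions" issue above are already set up, a minimal webcam loop could look like this. The RAF-DB label order below is the commonly used one and is an assumption, and for good results each frame should also be face-cropped with MTCNN first:)

import cv2
import numpy as np
from PIL import Image

LABELS = ['Surprise', 'Fear', 'Disgust', 'Happiness', 'Sadness', 'Anger', 'Neutral']

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # OpenCV gives BGR; convert to RGB for the PIL-based preprocessing
    img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    tensor_in = test_preprocess(img).unsqueeze(0)
    out = classifier(tensor_in.to("cuda"), return_loss=False)
    pred = int(np.argmax(out[0]))
    cv2.putText(frame, LABELS[pred], (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow("APViT", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()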
