
apvit's People

Contributors

youqingxiaozhua


apvit's Issues

Attention visualization

I want to generate attention visualizations for different expressions, like the ones shown in your TransFER paper. Can you provide some tools? I am new to Paddle, and it is a bit difficult for me to convert the code from PyTorch to Paddle.
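(Note: such heatmaps can usually be approximated by upsampling a per-patch attention map and blending it over the input face. A minimal PyTorch-side sketch, assuming you already have a 1-D attention vector attn with one weight per patch; the function name, the grid size, and the use of OpenCV are assumptions here, not the authors' tooling:)

import cv2
import numpy as np

def overlay_attention(img_bgr, attn, grid=14, alpha=0.5):
    # attn: numpy array of grid*grid patch weights from the transformer
    heat = attn.reshape(grid, grid).astype(np.float32)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)  # normalize to [0, 1]
    heat = cv2.resize(heat, (img_bgr.shape[1], img_bgr.shape[0]))  # upsample to image size
    heat = cv2.applyColorMap(np.uint8(255 * heat), cv2.COLORMAP_JET)
    return cv2.addWeighted(img_bgr, 1 - alpha, heat, alpha, 0)  # blend heatmap over the face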

Some questions about the experimental environment

Hello author, thank you very much for your outstanding contribution. I would like to ask whether it is possible to build a Docker image of the environment you are using and upload it to Docker Hub? That would help my work a lot; if you can, I would be very grateful.

missing files

I can't figure out where to get the 'ir.pdparams' and 'vit.pdparams' files needed to rebuild the model.

datasets

Are there any other datasets that have been preprocessed? Thank you very much.

missing function

The function 'compute_rollout_attention' in trans_fer.py is not defined. It is referenced in the function VisionTransformer.relprop().
"/APViT/Paddle/ppcls/arch/backbone/model_zoo/trans_fer.py"

The preprocessed FERPlus

Thanks for your great work! I ran this model on the RAF-DB data you provided and got 90.8% accuracy, but only 85.5% on the official FERPlus dataset. Could you provide the FERPlus data aligned by MTCNN, or the code for FERPlus preprocessing? I think it is important for reproducing your result on FERPlus. Thank you again!

Environment issues when running the code

Hello, how can I run your code after downloading it? Do I need to follow the full installation instructions provided by the MMClassification and PaddleClas toolboxes? Do I need to install the 'mm2' folder? When I run 'python -m torch.distributed.launch --nproc_per_node=2 train.py configs/apvit/RAF.py --launcher pytorch', I get the errors ModuleNotFoundError: No module named 'mmcv' and torch.distributed.elastic.multiprocessing.errors.ChildFailedError. I would like to do some research based on yours and cite your paper. I hope to receive your reply.

Unable to reproduce results

Thank you for your open source contribution!

I took the model architecture out and trained it with my own training code, following the objective function, optimizer, and optimization strategy in your code, but I did not get good enough results; the highest accuracy was only 86%.
[train epoch 98] loss: 0.057, acc: 0.982: 100%|█████████████████████| 95/95 [01:12<00:00, 1.31it/s]
[valid epoch 98] loss: 0.564, acc: 0.863: 100%|█████████████████████| 23/23 [00:18<00:00, 1.24it/s]
[train epoch 99] loss: 0.056, acc: 0.981: 100%|█████████████████████| 95/95 [01:12<00:00, 1.32it/s]
[valid epoch 99] loss: 0.565, acc: 0.863: 100%|█████████████████████| 23/23 [00:18<00:00, 1.25it/s]

I don't know where the problem lies, and I don't understand what this means:
[screenshot]

Missing parameters and wrong predictions

Hi,
I have downloaded the pretrained model weights (APViT_RAF-3eeecf7d.pth) following the link in the README and tried to run the model on some sample images.

I am loading the model with this code snippet:

import mmcv
import numpy as np
from mmcls.models import build_classifier
from mmcv.runner import load_checkpoint

cfg = mmcv.Config.fromfile("configs/apvit/RAF.py")
cfg.model.pretrained = None

# build the model and load checkpoint
classifier = build_classifier(cfg.model)
load_checkpoint(classifier, "pretrained/APViT_RAF-3eeecf7d.pth", map_location='cpu')
classifier = classifier.to("cuda")
classifier.eval()

but I get some warnings

unexpected key in source state_dict: 
output_layer.0.weight, output_layer.0.bias, output_layer.0.running_mean, output_layer.0.running_var, output_layer.0.num_batches_tracked, output_layer.3.weight, output_layer.3.bias, output_layer.4.weight, output_layer.4.bias, output_layer.4.running_mean, output_layer.4.running_var, output_layer.4.num_batches_tracked, body.21.shortcut_layer.0.weight, body.21.shortcut_layer.1.weight, body.21.shortcut_layer.1.bias, body.21.shortcut_layer.1.running_mean, body.21.shortcut_layer.1.running_var, body.21.shortcut_layer.1.num_batches_tracked, body.21.res_layer.0.weight, body.21.res_layer.0.bias, body.21.res_layer.0.running_mean, body.21.res_layer.0.running_var, body.21.res_layer.0.num_batches_tracked, body.21.res_layer.1.weight, body.21.res_layer.2.weight, body.21.res_layer.3.weight, body.21.res_layer.4.weight, body.21.res_layer.4.bias, body.21.res_layer.4.running_mean, body.21.res_layer.4.running_var, body.21.res_layer.4.num_batches_tracked, body.22.res_layer.0.weight, body.22.res_layer.0.bias, body.22.res_layer.0.running_mean, body.22.res_layer.0.running_var, body.22.res_layer.0.num_batches_tracked, body.22.res_layer.1.weight, body.22.res_layer.2.weight, body.22.res_layer.3.weight, body.22.res_layer.4.weight, body.22.res_layer.4.bias, body.22.res_layer.4.running_mean, body.22.res_layer.4.running_var, body.22.res_layer.4.num_batches_tracked, body.23.res_layer.0.weight, body.23.res_layer.0.bias, body.23.res_layer.0.running_mean, body.23.res_layer.0.running_var, body.23.res_layer.0.num_batches_tracked, body.23.res_layer.1.weight, body.23.res_layer.2.weight, body.23.res_layer.3.weight, body.23.res_layer.4.weight, body.23.res_layer.4.bias, body.23.res_layer.4.running_mean, body.23.res_layer.4.running_var, body.23.res_layer.4.num_batches_tracked

missing keys in source state_dict: projs.0.weight, projs.0.bias

Then I load some images, first using MTCNN to crop around the person's face (to make them more similar to RAF-DB), and process them with these torch transformations, which should replicate the ones in the config files:

from torchvision import transforms

test_preprocess = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[x / 255 for x in [123.675, 116.28, 103.53]],
        std=[x for x in [58.395, 57.12, 57.375]],
    ),
])
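(Note, as an aside to this report: torchvision's ToTensor scales pixels to [0, 1], so replicating an mmcls Normalize that uses mean=[123.675, 116.28, 103.53] and std=[58.395, 57.12, 57.375] on 0-255 images requires dividing both mean and std by 255; above, only the mean is scaled. A sketch of the equivalent transform:)

test_preprocess = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.ToTensor(),  # scales pixels to [0, 1]
    transforms.Normalize(
        mean=[x / 255 for x in [123.675, 116.28, 103.53]],
        std=[x / 255 for x in [58.395, 57.12, 57.375]],  # std scaled too
    ),
])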

and running inference with

out = classifier(tensor_in.to("cuda"), return_loss=False)
out = [np.argmax(o) for o in out]

but what I get is always class 6, no matter what expression the person in the input image has.

Am I doing something wrong in either model loading or preprocessing?

Thanks for your support

Current Environment:

  • Python 3.7.13
  • mmcls 0.25.0
  • mmcv-full 1.7.1
  • torch 1.8.2
  • torchvision 0.9.2

Request for config files

Dear author, thanks for your awesome work!

I found that you implement "Mask Vision Transformer for Facial Expression Recognition in the Wild" and "Facial Expression Recognition with Visual Transformers and Attentional Selective Fusion" in your repository.

Could you provide those config files? Thanks!

validset is testset?

hi,
you set your validation set to the samples of the test set:
https://github.com/youqingxiaozhua/APViT/blob/main/configs/_base_/datasets/RAF.py

[screenshot]

Wouldn't this corrupt the measured performance on the test set, since you are directly picking the model with the best performance on the test set?

Usually the validation set is separate and is used to pick a model; the best model picked on the validation set is then used to report performance on the test set.

thanks

The RAF-DB aligned by MTCNN

Excuse me, could you provide the RAF-DB data that you used for training and testing? I cannot reach the reported 91.98% on the official aligned RAF-DB test set with the APViT_RAF-3eeecf7d.pth weights. I then tried re-aligning the official aligned test set with MTCNN, but that does not work either; I only get 81.42% top-1 accuracy.
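(Note: one common way to produce MTCNN crops is facenet-pytorch; whether this matches the authors' alignment is exactly what this issue asks, so treat it as a sketch only, with the filename as a placeholder:)

from facenet_pytorch import MTCNN
from PIL import Image

# post_process=False keeps raw 0-255 pixels instead of facenet's whitening
mtcnn = MTCNN(image_size=112, margin=0, post_process=False)
img = Image.open("test_0001_aligned.jpg")
face = mtcnn(img)  # (3, 112, 112) float tensor, or None if no face is found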

import error

Hi, I get an error when I run tools/test.py.
The error is: ImportError: cannot import name 'wrap_fp16_model' from 'mmcls.core'
I can't find 'wrap_fp16_model' in 'mmcls.core'.
I would appreciate it if you could help me!
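(A hedged pointer: in mmcv-full 1.x this helper is exported from mmcv.runner rather than mmcls.core, so changing the import in tools/test.py may be enough:)

from mmcv.runner import wrap_fp16_model  # exported by mmcv-full 1.x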

Using the RAF-DB pretrained model for prediction

Hi, I want to use your RAF-DB pre-trained model directly to predict expressions from a live camera feed. Can you please give me instructions for that? I tried to read the README but couldn't figure it out.
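(A sketch rather than the authors' instructions: assuming the classifier and test_preprocess from the "Missing parameters and wrong predictions" issue above are already set up, a minimal webcam loop could look like this. The RAF-DB label order below is the commonly used one and is an assumption, and for good results each frame should also be face-cropped with MTCNN first:)

import cv2
import numpy as np
from PIL import Image

LABELS = ['Surprise', 'Fear', 'Disgust', 'Happiness', 'Sadness', 'Anger', 'Neutral']

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # OpenCV gives BGR; convert to RGB for the PIL-based preprocessing
    img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    tensor_in = test_preprocess(img).unsqueeze(0)
    out = classifier(tensor_in.to("cuda"), return_loss=False)
    pred = int(np.argmax(out[0]))
    cv2.putText(frame, LABELS[pred], (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow("APViT", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()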
