
6drepnet's People

Contributors

ahmednull, fabawi, mucunwuxian, osanseviero, pinto0309, thohemp


6drepnet's Issues

Is there a way to run this model on an Apple M1, which doesn't have CUDA support?

I get the following error when I try to run the SixDRepNet() model on an image.

File "/Users/gurpreetmukker/Desktop/face_detection/face_detection/lib/python3.11/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init raise AssertionError("Torch not compiled with CUDA enabled") AssertionError: Torch not compiled with CUDA enabled

Thanks
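
A minimal device-selection sketch for this situation (whether the packaged SixDRepNet class exposes a device argument is an assumption to verify; with the underlying torch model you can move things yourself):

    import torch

    # Pick the best available backend instead of assuming CUDA.
    if torch.cuda.is_available():
        device = torch.device('cuda')
    elif torch.backends.mps.is_available():  # Apple Silicon (M1/M2)
        device = torch.device('mps')
    else:
        device = torch.device('cpu')

    # Then move the model and inputs: model.to(device), tensor.to(device).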

Can't change GPU id in demo.py

There is a CUDA error when changing the GPU to any id other than 0:

    Traceback (most recent call last):
      File "demo.py", line 136, in <module>
        R_pred = model(img)
      File "/mnt/data2/head_pose_estimation/codes/6DRepNet/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1186, in _call_impl
        return forward_call(*input, **kwargs)
      File "/mnt/data2/head_pose_estimation/codes/6DRepNet/model.py", line 48, in forward
        return utils.compute_rotation_matrix_from_ortho6d(x)
      File "/mnt/data2/head_pose_estimation/codes/6DRepNet/utils.py", line 146, in compute_rotation_matrix_from_ortho6d
        x = normalize_vector(x_raw, use_gpu)  # batch*3
      File "/mnt/data2/head_pose_estimation/codes/6DRepNet/utils.py", line 119, in normalize_vector
        v_mag = torch.max(v_mag, torch.autograd.Variable(torch.FloatTensor([1e-8]).cuda()))
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cuda:0!
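
A hypothetical patch for the quoted normalize_vector (a sketch, not the repo's code): allocate the epsilon constant on the input's device instead of the default cuda:0, which is what trips the error above.

    import torch

    def normalize_vector(v):
        # v: batch*3 tensor on any device
        v_mag = torch.sqrt(v.pow(2).sum(dim=1))        # batch
        eps = torch.tensor([1e-8], device=v.device)    # same device as v
        v_mag = torch.max(v_mag, eps)
        v_mag = v_mag.view(v.shape[0], 1).expand_as(v)
        return v / v_mag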

Euler angle visualization

The demo file runs on the camera. How can I run inference on videos and images instead?

I want the results below, as presented in your paper.
[image attached]

Thanks
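
For reference, a minimal sketch of video inference with the pip package, reusing the predict/draw_axis calls shown elsewhere on this page ('video.mp4' is a placeholder path; for a single image, use cv2.imread instead of the capture loop):

    import cv2
    from sixdrepnet import SixDRepNet

    model = SixDRepNet()                 # weights are downloaded automatically
    cap = cv2.VideoCapture('video.mp4')  # placeholder video path
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        pitch, yaw, roll = model.predict(frame)
        model.draw_axis(frame, yaw, pitch, roll)
        cv2.imshow('pose', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()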

Reference System

Hello, I assume this model is trained in the camera reference system; is that correct (and I suppose it is left-handed, y-down)?
If so, say I have the rotation matrix of a calibrated camera, call it Rc. Can I use Rc*Rd, where Rd is the model's rotation matrix, to register the head pose in 3D space? Or should I use Rw(Rc.T) instead of Rc, for example?

Have you tried something similar? My world reference system is right-handed, y-up, by the way.
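
A small numpy sketch of the composition being asked about (everything here is an assumption to verify, not a confirmed answer): with Rd the head rotation in the camera frame and Rc the camera-to-world rotation, Rc @ Rd expresses the head pose in world coordinates, and conjugating by diag(1, -1, 1) converts a rotation between the y-down and y-up conventions.

    import numpy as np

    Rd = np.eye(3)   # head rotation in the camera frame (placeholder)
    Rc = np.eye(3)   # camera-to-world rotation (placeholder)

    F = np.diag([1.0, -1.0, 1.0])   # y-axis flip (handedness change); F is its own inverse

    # Re-express Rd in the y-up convention, then take it to the world frame.
    R_world = Rc @ (F @ Rd @ F)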

Pre-trained models

Hi,
Thanks for this amazing work; I am really interested in it. I just want to test your network on the two test datasets (AFLW2000 and BIWI).
I am wondering why you provide two .pth files (6DRepNet_300W_LP_AFLW2000.pth and 6DRepNet_300W_LP_BIWI.pth), one per test set. Shouldn't we test the network with a single pretrained model on both test datasets?

I am looking forward to your response,
Thanks

High MAE when testing with a face detector

Hi, thank you for this great work, but I've run into some trouble. I trained the model on my own data and got an MAE of 4.05 on the validation set. However, when I test the trained model combined with RetinaFace on the same (uncropped) images, the MAE turns out to be more than 20. What could be the reason?
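
One common cause is a mismatch between the detector's tight boxes and the looser crops used at training time. A sketch of widening the box before cropping (the margin value is a guess to tune, not something from the repo):

    def loosen_box(x_min, y_min, x_max, y_max, img_w, img_h, margin=0.25):
        """Expand a face box by `margin` of its size on each side, clipped to the image."""
        w, h = x_max - x_min, y_max - y_min
        return (max(0, int(x_min - margin * w)),
                max(0, int(y_min - margin * h)),
                min(img_w, int(x_max + margin * w)),
                min(img_h, int(y_max + margin * h)))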

Pip install failed

When I install the package with pip install SixDRepNet, it returns:

    Collecting SixDRepNet
      Downloading SixDRepNet-0.1.1.tar.gz (23 kB)
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error

      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [6 lines of output]
          Traceback (most recent call last):
            File "<string>", line 36, in <module>
            File "<string>", line 34, in <module>
            File "/private/var/folders/5l/9fsdwp_n1td91scw10n67zwc0000gp/T/pip-install-8xwcmbp1/sixdrepnet_2b06c43f46c5428d9c99677633d23b6e/setup.py", line 23, in <module>
              long_description="".join(open("README.MD", "r").readlines()),
          FileNotFoundError: [Errno 2] No such file or directory: 'README.MD'
          [end of output]

      note: This error originates from a subprocess, and is likely not a problem with pip.
      error: metadata-generation-failed

    × Encountered error while generating package metadata.
    ╰─> See above for output.

    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.
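
A possible workaround, given that the lowercase package name installs successfully in a later issue on this page:

    pip install sixdrepnet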

Preprocessing in train.py and demo.py is different

Thanks for your good work.

I tried to test and train the 6DRepNet model and found some issues.

1. Preprocess code in train.py:

    normalize = transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])

    transformations = transforms.Compose([transforms.Resize(240),
                                          transforms.RandomCrop(224),
                                          transforms.ToTensor(),
                                          normalize])

Preprocess code in demo.py:

    img = frame[y_min:y_max, x_min:x_max]
    # cv2.imshow("crop", img)
    # cv2.waitKey(5)
    img = cv2.resize(img, (244, 244)) / 255.0
    img = img.transpose(2, 0, 1)
    img = torch.from_numpy(img).type(torch.FloatTensor)
    img = torch.Tensor(img).cuda(gpu)

The normalization and the input size are different.

2. I downloaded the pre-trained RepVGG model 'RepVGG-A0-train.pth' from here.

Using the demo.py code to test 9 faces in one image, the outputs are wrong:
all 9 faces get the same yaw, roll, and pitch values.

I also tested the fine-trained models from here, and the pose values look fine.

So what is the difference between the pre-trained RepVGG model and the fine-trained models?
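
A sketch of demo-time preprocessing aligned with the quoted train.py pipeline (RGB input, resize, a 224 center crop as a deterministic stand-in for RandomCrop, ImageNet normalization); 'face.jpg' is a placeholder for a detected face crop:

    import cv2
    from PIL import Image
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize(240),
        transforms.CenterCrop(224),   # deterministic stand-in for RandomCrop
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    crop = cv2.imread('face.jpg')                        # BGR face crop (placeholder)
    rgb = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)          # match PIL's RGB order
    img = preprocess(Image.fromarray(rgb)).unsqueeze(0)  # 1 x 3 x 224 x 224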

Query regarding face pose axis visualisation

I see that to construct the rotation matrix R from yaw, pitch, and roll values, you use the zyx order, i.e. Rz * Ry * Rx,
where Rz is the rotation about the z-axis, Ry the rotation about the y-axis, and Rx the rotation about the x-axis.

But for visualisation, it looks like the order you use is xyz, i.e. Rx * Ry * Rz, and you then use the column vectors of the resulting matrix as axis coordinates (https://github.com/thohemp/6DRepNet/blob/master/utils.py#L54). May I know why this is done? Am I missing something?

Thanks.
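
A quick numpy check of how the two orders relate (a sketch, not an explanation of the repo's intent): transposing a zyx composition gives an xyz composition of the negated angles, so the two conventions differ by a transpose, i.e. by which frame the matrix columns live in.

    import numpy as np

    def rot_x(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

    def rot_y(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

    def rot_z(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

    p, y, r = 0.1, 0.2, 0.3                       # arbitrary pitch, yaw, roll
    R_zyx = rot_z(r) @ rot_y(y) @ rot_x(p)
    # (Rz Ry Rx)^T == Rx(-p) Ry(-y) Rz(-r): an xyz composition of negated angles
    print(np.allclose(R_zyx.T, rot_x(-p) @ rot_y(-y) @ rot_z(-r)))  # True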

Where is the face detection model in test.py?

# Import SixDRepNet
from sixdrepnet import SixDRepNet
import cv2

# Create model
# Weights are automatically downloaded
model = SixDRepNet()

img = cv2.imread('/path/to/image.jpg')

pitch, yaw, roll = model.predict(img)

model.draw_axis(img, yaw, pitch, roll)

cv2.imshow("test_window", img)
cv2.waitKey(0)

Does this code estimate roll, pitch, and yaw from the entire image, without a face detection model?

Will it be more accurate if I pass a cropped face image as img?
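
To the second question: predict() runs on whatever array you pass it, so cropping to the face first should help. A sketch with OpenCV's Haar cascade as a readily available stand-in detector (not the detector the repo uses):

    import cv2
    from sixdrepnet import SixDRepNet

    model = SixDRepNet()
    img = cv2.imread('/path/to/image.jpg')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        face = img[y:y + h, x:x + w]          # crop to the detected face
        pitch, yaw, roll = model.predict(face)
        model.draw_axis(face, yaw, pitch, roll)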

Problem while running the training code

I have the following problem while running the training code. Can you help me? Thanks very much!

    Traceback (most recent call last):
      File "/home/zelong/D/testdemo/6DRepNet/train.py", line 112, in <module>
        model = SixDRepNet(backbone_name='RepVGG-B1g2',
      File "/home/zelong/D/testdemo/6DRepNet/model.py", line 19, in __init__
        checkpoint = torch.load(backbone_file)
      File "/home/zelong/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 594, in load
        with _open_file_like(f, 'rb') as opened_file:
      File "/home/zelong/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 230, in _open_file_like
        return _open_file(name_or_buffer, mode)
      File "/home/zelong/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 211, in __init__
        super(_open_file, self).__init__(open(name, mode))
    FileNotFoundError: [Errno 2] No such file or directory: 'RepVGG-B1g2-train.pth'

Finetuning the model

I am training your model on my own data; training reached epoch 19. I would like to re-run the code to continue training, but I got this error:

Traceback (most recent call last):
  File "train.py", line 124, in <module>
    model = SixDRepNet(backbone_name,
  File "/home/redhwan/2/HPE/RosNet/sixdrepnet/model.py", line 22, in __init__
    backbone.load_state_dict(ckpt)
  File "/home/redhwan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RepVGG:
	Missing key(s) in state_dict: "stage0.se.down.weight", "stage0.se.down.bias", "stage0.se.up.weight", "stage0.se.up.bias", "stage0.rbr_reparam.weight",....

I replaced RepVGG-B1g2-train.pth with 300W_LP_epoch_19.pth (after converting), so I changed deploy=True:

backbone_name = 'RepVGG-B1g2' 
  # backbone_file = 'RepVGG-D2se-200epochs-train.pth'
  backbone_file = '300W_LP_epoch_19.pth'
  model = SixDRepNet(backbone_name,
                     backbone_file,
                     deploy=True,
                     pretrained=True)

Your help, please.

Pretrained weight cannot be downloaded

Hi there,

I am experimenting with the SixDRepNet_Detector, and I am running into the issue that the model's pretrained weights cannot be downloaded.

Here is the error message on Google Colab:

[screenshot attached]

Thank you!

Using my own trained weight files, the test results are much worse

Hello, first of all, thank you very much for your great work, but now I have some problems. Using the pre-trained model you provided for testing, I can reproduce the results in the paper. But when I train the weights myself, the test error is about 10 times that of the original model (Yaw: 35.6810, Pitch: 42.4646, Roll: 24.0692, MAE: 34.0716). The dataset I use is the same as yours, and no other parameters were changed. May I ask what the reason for this could be? Thank you.
@thohemp

BIWI Dataset

Hey,
I cannot reproduce your results on the BIWI dataset.
I'm comparing the X, Y, Z angles obtained from the ground-truth rotation matrix, transformed by the extrinsic calibration, with -pitch, yaw, and roll respectively.
I'm using your pip package. I crop the face with the RetinaFace detector, as you do in demo.py, and pass it to the model.predict() function. I instantiate the model without any parameters, so the path to the weights is the default.
I have spotted one difference. In the README you write:

    The BIWI dataset needs to be preprocessed by a face detector to cut out the faces from the images. You can use the script provided [here](https://github.com/shamangary/FSA-Net/blob/master/data/TYY_create_db_biwi.py). For 7:3 splitting of the BIWI dataset you can use the equivalent script [here](https://github.com/shamangary/FSA-Net/blob/master/data/TYY_create_db_biwi_70_30.py). We set the cropped image size to 256.

However, in model.predict() the crop is resized to 244 (which I believe applies to the longer edge of the picture, with the shorter edge then scaled by the appropriate ratio). Is that intended?

I cannot find more differences, but the mean error is about 25 on X and Y and about 8 on Z.
Can you help me figure it out?

Best,
Jan
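
For reference, a sketch of the comparison step described above using scipy; the axis-order string and the sign mapping are precisely the conventions in question here, so treat both as assumptions to check against the repo's utils:

    import numpy as np
    from scipy.spatial.transform import Rotation

    R_gt = np.eye(3)   # ground-truth rotation after extrinsic calibration (placeholder)
    x, y, z = Rotation.from_matrix(R_gt).as_euler('XYZ', degrees=True)
    pitch, yaw, roll = -x, y, z   # the mapping described in this report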

Licence for Fine-tuned models

Hi.
Thanks for this interesting and wonderful piece of work!

I have a question about licensing, as the title says.
What would be the licence for the fine-tuned models?
Is it MIT like the code, or is it different?

I want to use it as part of a work study; I am not skilled in machine learning and would like to use the model as-is.

I don't intend to publish, redistribute, or incorporate the models into products, but even for research purposes, under my work rules this still counts as commercial use.
So I would like to ask you for more information about the licence for the models.

I look forward to response from you.
Thank you.

Why did you set up the scheduler as MultiStepLR=False?

I am not an expert on schedulers, but I tried to understand this one.

Why did you set up the scheduler with MultiStepLR=False?

I guess the scheduler is meaningless if we set MultiStepLR=False.

Please, explain to me if I understood it wrong.
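
For context, a minimal MultiStepLR setup in PyTorch: the learning rate is multiplied by gamma at each milestone epoch. Presumably (an assumption, not confirmed by the repo) the flag just decides whether such a scheduler is created and stepped at all, in which case False indeed leaves the learning rate constant.

    import torch

    model = torch.nn.Linear(10, 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[10, 20], gamma=0.5)

    for epoch in range(30):
        # ... train one epoch ...
        scheduler.step()   # lr halves after epochs 10 and 20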

Questions regarding Learning full rotation appearance

Hello there!!

Thank you very much for your great work. I am really interested in your work and would like to implement it on images with full rotation appearance.

I have two questions regarding this.
1.) Is your pre-trained model trained on full-rotation-appearance datasets (-180, 180) and capable of predicting head poses on images in which faces cannot be seen?
2.) If the answer to my first question is NO, could you please guide me on which datasets I should use for finetuning the pre-trained model to learn full orientation appearance?

Thank you very much in advance for your consideration

Gap with the results in the paper

Hi,
Thanks for your impressive paper and code. I tried this repo to reproduce its performance: I followed all instructions, trained on 300W-LP using train.py without changing any parameters, then evaluated on AFLW2000 using test.py. The results are below:

    mine:  Yaw: 3.9897, Pitch: 5.0923, Roll: 3.6405, MAE: 4.2408
    yours: Yaw: 3.63,   Pitch: 4.91,   Roll: 3.37,   MAE: 3.97

Are there any other tricks or changes that should be applied to reproduce your results?

RGB inputs or BGR inputs for model.predict(img)?

First, thanks for this great work and for sharing your code.

In the running example,

img = cv2.imread('/path/to/image.jpg')
pitch, yaw, roll = model.predict(img)

the image is loaded as a BGR numpy array, as that is OpenCV's default. However, I think the model was trained on RGB numpy arrays, since the images were opened using PIL.Image.open. Thus, I am wondering if we should convert the BGR arrays into RGB arrays before using them as input for the model.

Would you mind clarifying this?
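
If training really did use PIL (RGB) images, converting the OpenCV frame before prediction would be the safe choice; a sketch, not confirmed behavior of the package:

    import cv2
    from sixdrepnet import SixDRepNet

    model = SixDRepNet()
    img = cv2.imread('/path/to/image.jpg')          # BGR by default
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # reorder channels to RGB
    pitch, yaw, roll = model.predict(img_rgb)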

Model convert

I want to use libtorch for inference; it seems the '.pt' (TorchScript) format is required for C++. How do I convert '.pth' to '.pt'?
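
A minimal TorchScript export sketch: trace the eval-mode model with an example input and save the result as '.pt' for libtorch. The stand-in module and the 224x224 input size are assumptions; in practice you would build SixDRepNet, load the '.pth' state dict, and (for RepVGG) convert to deploy mode first.

    import torch

    model = torch.nn.Conv2d(3, 6, 3)    # stand-in for the loaded SixDRepNet
    model.eval()

    example = torch.randn(1, 3, 224, 224)
    traced = torch.jit.trace(model, example)
    traced.save('sixdrepnet.pt')        # loadable from libtorch in C++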

How to train the pre-trained model

Hi,
I tried to train the model from scratch, but it seems hard to reach performance comparable to fine-tuning from the pre-trained model.
My question is: how did you train the pre-trained model, or how can I reach similar performance from scratch?

Thanks!

Import error with pip package

pip3 install sixdrepnet   #Works!
import sixdrepnet

I get the following error (I am currently running it on colab)

    /usr/local/lib/python3.8/dist-packages/sixdrepnet/regressor.py in <module>
          8 import numpy as np
          9 
    ---> 10 from model import SixDRepNet
         11 import utils
         12 

ModuleNotFoundError: No module named 'model'
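
A hypothetical workaround until the package's absolute imports are fixed: put the installed sixdrepnet directory itself on sys.path, so that "from model import SixDRepNet" inside regressor.py can resolve. The path construction assumes the layout shown in the traceback above.

    import os
    import site
    import sys

    # Add the package directory itself, not just site-packages, to sys.path.
    pkg_dir = os.path.join(site.getsitepackages()[0], 'sixdrepnet')
    sys.path.insert(0, pkg_dir)

    import sixdrepnet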

Is the pretrained model for faces only?

Thanks for the work! Is the pretrained model only for face pictures? If so, is there another pretrained model for other objects, like boxes, bottles, shoes, etc.?
