Comments (14)

vitoralbiero avatar vitoralbiero commented on June 20, 2024

The code to fine-tune on 300W-LP is the same as the code here; we just didn't release the annotations.
If you want to fine-tune yourself, you will need to create 300W-LP ground-truth data, load the pre-trained WIDER-FACE weights when training, and fine-tune on 300W-LP.

If you prefer, the fine-tuned pre-trained model can be found here.

from img2pose.

FunkyKoki avatar FunkyKoki commented on June 20, 2024

Sorry, you missed the link in the first sentence. I cannot see it.

vitoralbiero avatar vitoralbiero commented on June 20, 2024

Edited.

FunkyKoki avatar FunkyKoki commented on June 20, 2024

So would you release the annotations?

As far as I can see, without fine-tuning, the network's performance is very poor.

Performance comparison on AFLW2000:

| Model | Yaw | Pitch | Roll | MAE |
|---|---|---|---|---|
| without fine-tuning | 4.541 | 8.322 | 5.586 | 6.150 |
| with fine-tuning | 3.426 | 5.034 | 3.278 | 3.913 |
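
As a quick sanity check on the numbers above, the MAE column appears to be the plain average of the yaw, pitch, and roll errors. The sketch below only re-does that arithmetic; the helper name `mean_abs_error` is made up for illustration and is not part of img2pose.

```python
# Per-axis errors (degrees) copied from the comparison above:
# (yaw, pitch, roll, reported MAE).
rows = {
    "without fine-tuning": (4.541, 8.322, 5.586, 6.150),
    "with fine-tuning": (3.426, 5.034, 3.278, 3.913),
}


def mean_abs_error(yaw, pitch, roll):
    """Average the three per-axis errors into a single MAE value."""
    return (yaw + pitch + roll) / 3.0


for name, (yaw, pitch, roll, reported) in rows.items():
    computed = mean_abs_error(yaw, pitch, roll)
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```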

vitoralbiero avatar vitoralbiero commented on June 20, 2024

We have no plans to release these annotations at this time.
You can use our instructions to annotate it if you would like.

The performance only appears to be poor because of a problem with Euler angles.
This causes samples that have a small qualitative error, and a small error in other rotation formats, to be computed as having a large error.
You can see an example here.
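
One way this kind of effect can arise is sketched below. It is an illustration under assumptions, not the example linked above: two hypothetical rotations about the y axis, 2 degrees apart but on either side of 90 degrees, are decoded to xyz Euler angles with SciPy's `Rotation`. Because the middle angle of a decoded xyz triple is limited to [-90, 90], the second rotation comes back with the other two angles flipped by 180 degrees, so the per-axis Euler error is large even though the angle between the two rotations is small.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Hypothetical ground truth and prediction: rotations about the y axis
# that differ by only 2 degrees but sit on either side of 90 degrees.
r_gt = Rotation.from_euler("xyz", [0.0, 89.0, 0.0], degrees=True)
r_pred = Rotation.from_euler("xyz", [0.0, 91.0, 0.0], degrees=True)

# Decode both to xyz Euler angles; the middle angle is limited to [-90, 90].
euler_gt = r_gt.as_euler("xyz", degrees=True)
euler_pred = r_pred.as_euler("xyz", degrees=True)

# Per-axis absolute error, as it would enter an Euler-angle MAE.
euler_error = np.abs(euler_pred - euler_gt)

# Geodesic error: the angle of the relative rotation between the two poses.
geodesic_error = np.degrees((r_gt.inv() * r_pred).magnitude())

print("Euler GT:  ", np.round(euler_gt, 3))
print("Euler pred:", np.round(euler_pred, 3))
print("Per-axis Euler error:", np.round(euler_error, 3))
print("Geodesic error (deg):", round(float(geodesic_error), 3))
```

Here the first and third Euler errors come out close to 180 degrees while the geodesic error is about 2 degrees.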

Apart from evaluating with Euler angles, we recommend using the model without fine-tuning.

FunkyKoki avatar FunkyKoki commented on June 20, 2024

Thank you so much. ❤️

FunkyKoki avatar FunkyKoki commented on June 20, 2024

The performance only appears to be poor because of a problem with Euler angles.

So what if we use 'xyz' and 'zxy' to decode the output at the same time, and choose the best as the final result? (I mean, for each face.)

vitoralbiero avatar vitoralbiero commented on June 20, 2024

We could do that, but to be fair to the other models we compare against, we only use xyz.

FunkyKoki avatar FunkyKoki commented on June 20, 2024

Alright, thank you.

FunkyKoki avatar FunkyKoki commented on June 20, 2024

Happy new year!

I have tried it. Even when I choose the minimum error between xyz and zxy decoding, the results are still not very good.

| Model | Yaw | Pitch | Roll | MAE |
|---|---|---|---|---|
| without fine-tuning | 4.541 | 8.322 | 5.586 | 6.150 |
| without fine-tuning (choose minimum of xyz and zxy decoding) | 4.830 | 7.816 | 5.517 | 6.054 |
| with fine-tuning | 3.426 | 5.034 | 3.278 | 3.913 |

So fine-tuning plays an important role in the pose evaluation.

vitoralbiero avatar vitoralbiero commented on June 20, 2024

Thanks, to you too!

Have you also converted the GT to zxy? Just swapping the axes won't work, and the errors will be big.
You'll need to do something like:
gt_zxy = Rotation.from_euler("xyz", pose_target[:3], degrees=True).as_euler("zxy", degrees=True)
Also make sure when converting the prediction to Euler angles in zxy format that z and y have a negative sign and x doesn't.
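
Putting the conversion above together with the per-face minimum discussed earlier, a minimal sketch could look like the following. It assumes that both the prediction and the ground truth are already available as xyz Euler angles in degrees; the function names `euler_mae` and `min_over_conventions` and the sample poses are hypothetical. It does not reproduce the sign handling mentioned above, which depends on how img2pose decodes its predictions.

```python
import numpy as np
from scipy.spatial.transform import Rotation


def euler_mae(pred_xyz_deg, gt_xyz_deg, seq):
    """Per-face mean absolute Euler error after re-expressing both poses in `seq`."""
    pred = Rotation.from_euler("xyz", pred_xyz_deg, degrees=True).as_euler(seq, degrees=True)
    gt = Rotation.from_euler("xyz", gt_xyz_deg, degrees=True).as_euler(seq, degrees=True)
    return np.abs(pred - gt).mean(axis=-1)


def min_over_conventions(pred_xyz_deg, gt_xyz_deg, seqs=("xyz", "zxy")):
    """For each face, keep the smallest MAE across the given Euler conventions."""
    errors = np.stack([euler_mae(pred_xyz_deg, gt_xyz_deg, s) for s in seqs])
    return errors.min(axis=0)


# Hypothetical poses for two faces, as xyz Euler angles in degrees.
gt = np.array([[10.0, 20.0, 30.0], [0.0, 89.0, 0.0]])
pred = np.array([[12.0, 20.0, 30.0], [0.0, 91.0, 0.0]])

print(min_over_conventions(pred, gt))
```

The key point is that the ground truth goes through the same `from_euler` / `as_euler` round trip as the prediction, so both are compared in the same convention rather than with relabelled axes.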

Regardless, the fine-tuning tries to constrain the learned poses to less than 90 degrees, which is why it performs better when tested with Euler angles.

FunkyKoki avatar FunkyKoki commented on June 20, 2024

OK, I retested the model without fine-tuning; the results are as follows:

| Model | Yaw | Pitch | Roll | MAE |
|---|---|---|---|---|
| without fine-tuning | 4.541 | 8.322 | 5.586 | 6.150 |
| without fine-tuning (choose minimum of xyz and zxy decoding), corrected | 4.751 | 5.788 | 3.898 | 4.812 |
| with fine-tuning | 3.426 | 5.034 | 3.278 | 3.913 |

FunkyKoki avatar FunkyKoki commented on June 20, 2024

I still consider this a big gap. 🐰

vitoralbiero avatar vitoralbiero commented on June 20, 2024

If what you care most about is pose evaluation in Euler angles, then sure, use the fine-tuned model.
