Comments (8)

timtensor commented on May 18, 2024

Yes, I notice much better performance, and it is quite stable as well.

from simple-hrnet.

bpeck81 commented on May 18, 2024

One thing I notice is that this implementation of HRNet applies the model to the cropped region of the person returned by YOLO, whereas, from what I can tell, the model in the original paper is applied to the entire image. Losing the background context at prediction time may affect performance.

timtensor commented on May 18, 2024

@bpeck81 Thanks for the info. Actually, when I tried disabling the YOLOv3 detector, performance seemed worse, even for single-person detection. I think it also has some limitations with multi-person detection, but I am not sure. With a set of optimized pre-trained weights, I could only manage detection of 2 persons with decent performance.
Do you think we can improve the performance by changing the frame rate?

stefanopini commented on May 18, 2024

According to the paper, HRNet should perform considerably better than OpenPose when trained and tested on COCO.
However, the OpenPose authors state:

In addition, our paper numbers are not based on the current models that have been released. We released our best model at the time but later found a better one.

Therefore, OpenPose may perform better than HRNet.
In my limited experience, the performance of the two networks is similar.

@bpeck81 In the HRNet paper, the authors state:

This paper is interested in single-person pose estimation

and

We extend the human detection box in height or width to a fixed aspect ratio: height:width = 4:3, and then crop the box from the image, which is resized to a fixed size, 256×192 or 384×288.

and

The two-stage top-down paradigm similar as [47, 11, 72] is used: detect the person instance using a person detector, and then predict detection keypoints.

Therefore, I added a YOLOv3 detector to find person instances, which are then analyzed with HRNet.
With the singleperson option, the person detector is disabled and the whole image is analyzed directly by HRNet.
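
The box extension quoted from the paper can be sketched roughly as follows. This is only an illustration of the height:width = 4:3 rule, not the code used in this repository or by the paper's authors; the function name `extend_box_to_aspect` and the corner-based box format `(x1, y1, x2, y2)` are assumptions made for the example.

```python
def extend_box_to_aspect(x1, y1, x2, y2, aspect_h=4.0, aspect_w=3.0):
    """Grow a box around its centre until height:width == aspect_h:aspect_w.

    Hypothetical helper: the box is given by its top-left (x1, y1) and
    bottom-right (x2, y2) corners and is only ever enlarged, never shrunk.
    """
    w = x2 - x1
    h = y2 - y1
    if w <= 0 or h <= 0:
        raise ValueError("box must have positive width and height")
    cx = x1 + w / 2.0
    cy = y1 + h / 2.0
    target = aspect_h / aspect_w  # desired height / width
    if h / w < target:
        # Box is too wide for the target ratio: extend the height.
        h = w * target
    else:
        # Box is too tall (or already matches): extend the width.
        w = h / target
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


# A square 100x100 detection keeps its width and grows in height.
box = extend_box_to_aspect(100, 50, 200, 150)
```

The extended box can reach outside the image, so a real pipeline would still need to clip or pad before resizing the crop to 256×192 or 384×288; that step is left out here.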

timtensor commented on May 18, 2024

Thank you for answering my questions. As you said, it does not work so well for multiple persons. Perhaps they will release new pretrained weights with better performance. I tried it on a multi-person video, and I was wondering how to tell which array corresponds to which person.
For example, if there are two persons, the output array has shape (2x17x2). Is there an ID associated with each person? Perhaps this belongs in a separate question.

stefanopini commented on May 18, 2024

At the moment, there is no ID associated with each person because I haven't implemented any person tracking functionality.
Therefore, the order of the output matches the order of the YOLO detections.
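
Since the output order can change from frame to frame, one possible workaround on the user side is to match people between consecutive frames by the distance between their mean keypoint positions. The sketch below is not part of this repository: `GreedyTracker`, `centroid`, and the `max_dist` threshold (in pixels) are hypothetical names and values chosen for illustration, and it assumes the per-frame output is a list of persons, each a list of 17 coordinate pairs.

```python
import math


def centroid(person):
    """Mean coordinate over one person's keypoints (list of 2-element pairs)."""
    n = len(person)
    return (sum(p[0] for p in person) / n, sum(p[1] for p in person) / n)


class GreedyTracker:
    """Hypothetical helper: assigns IDs by greedy nearest-centroid matching."""

    def __init__(self, max_dist=50.0):
        self.max_dist = max_dist  # assumed pixel threshold for a valid match
        self.centroids = {}       # id -> centroid seen in the previous frame
        self.next_id = 0

    def update(self, people):
        """Return one ID per person, in the same order as `people`."""
        free = dict(self.centroids)  # previous IDs not yet matched
        ids, current = [], {}
        for person in people:
            c = centroid(person)
            best, best_d = None, self.max_dist
            for pid, pc in free.items():
                d = math.hypot(c[0] - pc[0], c[1] - pc[1])
                if d <= best_d:
                    best, best_d = pid, d
            if best is None:
                # No previous person is close enough: start a new ID.
                best = self.next_id
                self.next_id += 1
            else:
                del free[best]
            ids.append(best)
            current[best] = c
        self.centroids = current
        return ids


tracker = GreedyTracker()
frame1 = [[[10.0, 10.0]] * 17, [[200.0, 200.0]] * 17]
frame2 = [[[205.0, 200.0]] * 17, [[12.0, 10.0]] * 17]  # detection order swapped
ids1 = tracker.update(frame1)
ids2 = tracker.update(frame2)
```

Greedy matching like this is fragile when people cross paths or detections drop out for a few frames, so it should be read as a starting point rather than a tracking solution.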

stefanopini commented on May 18, 2024

@timtensor Could you please check the performance with the latest version of the code?
I have implemented the idea proposed in #14 and, from my (limited) tests, accuracy is noticeably higher now in the multi-person setting.

timtensor commented on May 18, 2024

@stefanopini I will try to test it in the coming days!
