
Comments (6)

fcole commented on July 27, 2024

Hi, yes, the code could be easier to understand, sorry about that. To run the full model, you basically need to fill in the dictionary of values specified here:

return {'img': img,

these correspond to the various buffers mentioned in the loss definitions in the paper. You don't need to create your own HDF5s etc., as long as you can create a dictionary that includes those buffers. Hope that helps.
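For illustration, here is a minimal sketch of such a dictionary, using the buffer names that come up later in this thread (the shapes and dtypes are assumptions, not confirmed against the repo):

import numpy as np

h, w = 288, 512  # example resolution; use your own image size

# Hypothetical skeleton of the input dictionary; fill each buffer with real data.
inputs = {
    'img_1': np.zeros((h, w, 3), np.float32),       # RGB, values in [0, 1]
    'gt_depth': np.zeros((h, w), np.float32),       # ground-truth depth (training only)
    'lr_error': np.zeros((h, w), np.float32),       # left-right consistency error (C_lr)
    'human_mask': np.zeros((h, w), np.float32),     # 1 = human, 0 = background
    'angle_prior': np.zeros((h, w), np.float32),    # parallax-angle prior (C_pa)
    'pp_depth': np.zeros((h, w), np.float32),       # depth from motion parallax (P+P)
    'flow': np.zeros((h, w, 2), np.float32),        # optical flow, same H x W as img_1
    'T_1_G': np.eye(4, dtype=np.float32),           # global -> reference camera, 4x4
    'T_2_G': np.eye(4, dtype=np.float32),           # global -> source camera, 4x4
    'intrinsic': np.eye(3, dtype=np.float32),       # 3x3 camera intrinsics
    'keypoints_img': np.zeros((h, w), np.float32),  # normalized keypoint rendering
}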


rmbashirov commented on July 27, 2024

How do I get gt_depth, human_mask, flow, and keypoints_img in your format on my own data, so that I can run inference with your full model?


rmbashirov commented on July 27, 2024

OK, I realise that providing a full pipeline for running the full model on arbitrary data is almost impossible for you.

Could you instead provide the inference results of your full model on the MC dataset?


fcole commented on July 27, 2024

Unfortunately, we don't have permission to share image-like results (e.g., depth buffers) from the MC dataset. Sorry about that.

For inference, you shouldn't need gt_depth, and the model with keypoint input performs only marginally better than the model without it, so the only things you really need are the flow and the human mask.
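If only the flow and the human mask are needed, one way to produce the mask is an off-the-shelf segmentation model. The sketch below uses torchvision's Mask R-CNN; that choice is mine for illustration and not necessarily what the authors used:

import numpy as np
import torch
import torchvision

# Load a pretrained Mask R-CNN once; COCO class 1 is "person".
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True).eval()

def human_mask(img_float01):
    """img_float01: HxWx3 float32 RGB in [0, 1] -> HxW float32 mask (1 = human)."""
    tensor = torch.from_numpy(img_float01).permute(2, 0, 1)
    with torch.no_grad():
        out = model([tensor])[0]
    mask = np.zeros(img_float01.shape[:2], dtype=np.float32)
    for label, score, m in zip(out['labels'], out['scores'], out['masks']):
        if label.item() == 1 and score.item() > 0.5:  # keep confident person detections
            mask = np.maximum(mask, (m[0].numpy() > 0.5).astype(np.float32))
    return mask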


Tetsujinfr commented on July 27, 2024

Thanks for the pointers above. I looked at the load_tum_hdf5 function and have a few questions.

A) The code reads 11 objects:

- img_1: is it a simple 24-bit RGB numpy matrix of the image we are trying to infer on?

- gt_depth: I can ignore it since I just want to run inference, but can I simply comment out this piece of code and all downstream references to this object, or do I need to fake a dummy input?

- lr_error: what is that? Can I ignore it the same way as gt_depth? It looks like it is used to compute the confidence map, which seems to be a key input to your model, no?

- human_mask: I assume this is a binary mask of the same size as img_1, right? What format is expected: 0.0 = transparent and 1.0 = opaque, i.e. the mask shape? (An RGB black/white image, I assume?)

- angle_prior: what is that? Is it the second image? It looks like it is used to compute the confidence map, which seems to be a key input to your model, no?

- pp_depth: what is that? It looks like it is used to compute the confidence map, which seems to be a key input to your model, no?

- flow: the output of FlowNet2, I assume, but is it a 24-bit RGB image or the raw flow data structure of FlowNet's .flo files? Does it need to have the exact same height × width as img_1?

- T_1_G: what is that? It looks like it is used to compute the confidence map, which seems to be a key input to your model, no?

- T_2_G: same question as for T_1_G.

- intrinsic: same question as for T_1_G.

- keypoints_img: can I just input a keypoint image from OpenPose, for instance? Do the points need to be single pixels? Is there a particular colouring scheme for each point that needs to be followed, or can I just use OpenPose's colouring?

Thanks a lot for your guidance on this.


zhengqili commented on July 27, 2024

Hi, I am the first author of this paper.

- img_1: should be an RGB image with values between 0 and 1.
- lr_error: the left-right consistency error, corresponding to C_lr in Eq. 5 of the supplementary material: http://www.cs.cornell.edu/~zl548/images/mannequin_depth_cvpr2019_supp_doc.pdf
- human_mask: the binary mask, where 1 indicates human and 0 indicates background.
- angle_prior: C_pa in Eq. 5 of the supplementary material.
- pp_depth: depth from motion parallax using the P+P representation in Eq. 4 of the supplementary material.
- T_1_G: the 4×4 homogeneous transformation matrix from global coordinates to the reference image, as described in the paper.
- T_2_G: the 4×4 homogeneous transformation matrix from global coordinates to the source image.
- intrinsic: the 3×3 camera intrinsic matrix.
- keypoints_img: You can use any keypoint detection algorithm you want, but you have to normalize the keypoint indices based on Mask R-CNN. In particular, in https://github.com/roytseng-tw/Detectron.pytorch/blob/master/lib/utils/vis.py, lines 198-199 read:

i1 = kp_lines[l][0]
i2 = kp_lines[l][1]

and you need to normalize these indices using the following code:

final_i1_value = (i1 + 1.0) / 18.0
final_i2_value = (i2 + 1.0) / 18.0
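To make that normalization concrete, here is a hypothetical sketch of building a keypoints_img from detected keypoints (the function name, the (x, y, index) input format, and the single-pixel rendering are my assumptions; the paper may splat larger dots):

import numpy as np

def render_keypoints_img(keypoints, h, w):
    """keypoints: iterable of (x, y, i) with Mask R-CNN keypoint indices
    i in 0..16. Returns an HxW float image where each keypoint pixel holds
    its normalized index (i + 1) / 18 and the background is 0."""
    img = np.zeros((h, w), dtype=np.float32)
    for x, y, i in keypoints:
        x, y = int(round(x)), int(round(y))
        if 0 <= y < h and 0 <= x < w:
            img[y, x] = (i + 1.0) / 18.0  # same normalization as above
    return img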

Please send me an email ([email protected]) for further questions, since I seldom reply on GitHub.

