
Comments (2)

ghenter commented on August 16, 2024

Dear @nlpofwhat,

> I have read your anther paper of Moglow (...) It seems to be a autogressive model totally with locomotion?

The StyleGestures paper builds on our earlier MoGlow preprint, which has been accepted for presentation at SIGGRAPH Asia this year. Since the architectures in the two works are relatively similar, I would personally argue that both are examples of "MoGlow" ("motion Glow") architectures, with the main difference being the application: the original MoGlow preprint demonstrates that the architecture works for different locomotion-generation tasks, while the StyleGestures follow-up work shows that the architecture can be adapted to gesture synthesis and also adds style control.

So, in summary: My view is that MoGlow is an architecture or a method; locomotion synthesis and speech-driven gesture generation are applications for which we have demonstrated that the method gives good results. MoGlow is thus not an "autogressive model totally with locomotion".

> I try to reproduce the results but I can't figure out what the original input data is. For example, you generate gestures by audio then the main input data is audio.

The original MoGlow paper investigates path-based locomotion control from motion-capture data of humans and dogs. For each frame, the main input (the control signal) contains 3 numbers that specify:

  1. The current forward velocity of the character in a body-centric coordinate system.
  2. The current sideways velocity of the character in a body-centric coordinate system.
  3. The current turning velocity of the character.

Together, these 3 numbers act to steer the moving character a bit like driving a car. Over time, the control signal defines a path through space that the root node will follow, along with the character's facing direction at different times. The task of the probabilistic model is then to generate a sequence of poses for the character that represent convincing locomotion along that path.
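To make the control signal concrete, here is a minimal sketch of how such per-frame body-centric velocities could be integrated into a world-space root path and facing direction. The function name, the 20 fps frame rate, and the exact sign/axis conventions are illustrative assumptions, not the paper's actual preprocessing code:

```python
import numpy as np

def integrate_control_signal(controls, dt=1.0 / 20.0):
    """Integrate per-frame (forward, sideways, turning) velocities
    into a 2D root path and a facing direction over time.

    controls: array of shape (T, 3), each row holding the
    body-centric forward velocity, sideways velocity, and turning
    velocity (radians per second) for one frame.
    Returns positions of shape (T+1, 2) and headings of shape (T+1,).
    """
    positions = [np.zeros(2)]
    heading = 0.0
    headings = [heading]
    for fwd, side, turn in np.asarray(controls, dtype=float):
        # Rotate the body-centric velocity into world coordinates
        # using the character's current facing direction.
        c, s = np.cos(heading), np.sin(heading)
        world_vel = np.array([c * fwd - s * side,
                              s * fwd + c * side])
        positions.append(positions[-1] + world_vel * dt)
        # Turning velocity updates the facing direction.
        heading += turn * dt
        headings.append(heading)
    return np.array(positions), np.array(headings)
```

For instance, a constant forward velocity with zero sideways and turning velocity traces a straight line, while a pure turning signal spins the character in place; this is the "driving a car" analogy in code form.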

As you suggest, this is a very different problem from generating gesture motion using speech acoustics as input features. We think it is a strength of the method that the same basic approach gives such compelling results on such disparate problems.

> I can't figure out what the original input data is

If your question is about data availability and coding (e.g., which numbers represent what), the pre-processed locomotion data we used in the paper can be downloaded from: https://kth.box.com/s/quh3rwwl2hedwo32cdg1kq7pff04fjdf. (Edited in 2022: Our university is no longer using Box, invalidating the link originally provided here. Please see the repository readme for the latest files.) Code for training the MoGlow locomotion systems from the article is available in this repository, but may be split off to its own GitHub repo in the near future. Additional information regarding how to use the locomotion training data can be found in issue #1, for instance in this comment as well as this comment and its response. @simonalexanderson knows more about the details of the code and might be able to follow up with any information that I may have missed.

I hope this answers your questions.

from stylegestures.

nlpofwhat commented on August 16, 2024

I've got my answers. Thank you for answering my questions so patiently.

