
Comments (2)

stefan-kohnen commented on August 16, 2024

Hey @simonalexanderson,
Thank you for your answer. Yes, I saw Gesticulator 👍 Accumulating gradients and gradient checkpointing both seem like good ideas, thanks for the hint.
I trained your model with a batch size of 10 to see whether this produces the same effect I got after extending the input feature space with BERT features (the one I called "fast-forwarding").
The clear answer: No, it does not. The motion of the generated gestures looks as smooth as with a batch size of 80. Of course, the results most likely differ in other respects from those of training with a batch size of 80; I didn't check these. But they definitely do not show the same behavior as with the integration of BERT features (i.e., "fast-forwarding").

from stylegestures.

simonalexanderson commented on August 16, 2024

Hi @stefan-kohnen,
Adding BERT features sounds very interesting and I would love to see what the results show. I guess you have seen our work with Gesticulator (Kucherenko et al., 2020), which has a different architecture but also uses semantic features. We have not experimented much with batch sizes, but tend to use the maximum that fits on our GPUs, which is typically 50-100. 10 seems low to me. You may experiment with accumulating gradients over several batches (search for "accumulating gradients") or checkpointing the RNN in the coupling layer: https://github.com/prigoyal/pytorch_memonger/blob/master/tutorial/Checkpointing_for_PyTorch_models.ipynb. The latter should give a large memory improvement with only a few lines of code. Please share if you get any insights on how batch size affects training.
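
To illustrate what accumulating gradients means, here is a minimal pure-Python sketch. It is not taken from the StyleGestures code; the 1-D linear model, the toy data, and the names `mse_grad` and `accumulated_grad` are all assumptions made for illustration. It shows that summing the gradients of several small micro-batches, each scaled by one over the number of micro-batches, reproduces the gradient of a single large batch.

```python
# Illustrative sketch only: gradient accumulation for a 1-D linear model
# y = w * x with a mean-squared-error loss. All names and data are hypothetical.

def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y)^2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Accumulate gradients over equally sized micro-batches.

    Each micro-batch gradient is scaled by 1 / num_micro so that the running
    sum matches the gradient of the mean loss over the full batch.
    """
    assert len(xs) % micro_batch_size == 0, "sketch assumes equal-sized micro-batches"
    num_micro = len(xs) // micro_batch_size
    total = 0.0
    for i in range(num_micro):
        lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / num_micro
    return total


# Toy data: 80 samples, processed as 8 micro-batches of 10.
xs = [0.1 * i for i in range(80)]
ys = [3.0 * x + 1.0 for x in xs]
w = 0.5

full = mse_grad(w, xs, ys)
accum = accumulated_grad(w, xs, ys, micro_batch_size=10)
print(abs(full - accum) < 1e-9)  # the two gradients agree up to rounding
```

In a PyTorch training loop the same idea usually amounts to dividing each micro-batch loss by the number of accumulation steps, calling `loss.backward()` for every micro-batch, and calling `optimizer.step()` followed by `optimizer.zero_grad()` only once per accumulation cycle. The equivalence holds for losses averaged over samples; layers whose output depends on batch statistics, such as batch normalization, still see the small micro-batch.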
