aladdinpersson avatar aladdinpersson commented on May 20, 2024

Hey @zolekode

I will try my best!

  1. How the sequence is produced at training time
    So at training time we have the entire input and target sentences, and all we have to do is: Tokenize --> Numericalize --> Pad (so that all sequences in the batch have equal length). I have separate videos where I go into more detail on the data loading part, and you could check out the torchtext videos for that. After that, both are fed into the transformer, and we use masking so that the network can't cheat by looking ahead in the target sentence (I've also gone into more depth on this in the transformer from scratch video).

  2. How the sequence is produced at test time
    Obviously at test time we don't have the target sentence, only the input sentence, so we output a single word at a time (that's why we have the for i in range(max_length) loop in the translate_sentence function). In the beginning we only have a start token for the target, but in each iteration of the for loop we gain one additional output predicted by the model (we take the highest-probability prediction and append it to our outputs). We keep doing this until we either a) produce an EOS token, or b) reach max_length.
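
To make the two cases a bit more concrete, here is a rough sketch in plain Python (no PyTorch, so it runs anywhere). It is not the code from the repo: the special token ids, the helper names (numericalize, pad_batch, causal_mask, greedy_decode) and the toy_model stand-in are all made up for illustration, and the mask convention (True means "allowed to look") is just one possible choice.

```python
# Illustrative sketch only: every name here (PAD, SOS, EOS, toy_model, ...)
# is hypothetical and not taken from the repository.
PAD, SOS, EOS = 0, 1, 2


def numericalize(tokens, vocab):
    """Map tokens to integer ids and wrap them in SOS/EOS."""
    return [SOS] + [vocab[t] for t in tokens] + [EOS]


def pad_batch(sequences):
    """Right-pad every sequence with PAD up to the longest one in the batch."""
    longest = max(len(s) for s in sequences)
    return [s + [PAD] * (longest - len(s)) for s in sequences]


def causal_mask(size):
    """mask[i][j] is True when target position i may look at position j."""
    return [[j <= i for j in range(size)] for i in range(size)]


def greedy_decode(model, src_ids, max_length=50):
    """Generate one token per step until EOS or max_length is reached.

    model(src_ids, outputs) is assumed to return a list of scores over the
    vocabulary for the next token.
    """
    outputs = [SOS]  # at test time we start with only the start token
    for _ in range(max_length):
        scores = model(src_ids, outputs)
        best = max(range(len(scores)), key=scores.__getitem__)  # argmax
        outputs.append(best)
        if best == EOS:
            break
    return outputs


def toy_model(src_ids, outputs):
    """Stand-in for a trained transformer: copies the source, then emits EOS."""
    step = len(outputs)  # outputs already holds SOS, so step 1 reads src_ids[1]
    next_id = src_ids[step] if step < len(src_ids) else EOS
    scores = [0.0] * 10
    scores[next_id] = 1.0
    return scores


# Training-time preparation: tokenize (here a simple split), numericalize, pad.
vocab = {"hello": 3, "world": 4, "again": 5}
batch = pad_batch([
    numericalize("hello world".split(), vocab),
    numericalize("hello world again".split(), vocab),
])
mask = causal_mask(3)

# Test-time generation: one token per iteration, stopping at EOS.
translation = greedy_decode(toy_model, batch[1])
```

In a real setup the model call would be the transformer's forward pass and the argmax would be taken over its output for the last position, but the loop structure is the same idea: start from the start token, append the best prediction, stop on EOS or when max_length is hit.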

Hopefully that clarifies a little bit :)

/Aladdin

from machine-learning-collection.
