Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

Hey <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url=

Clarification - Inference about machine-learning-collection HOT 1 CLOSED

aladdinpersson commented on May 20, 2024

Clarification - Inference

from machine-learning-collection.

aladdinpersson commented on May 20, 2024

I will try my best!

How the sequence is produced at training time
So at training time we have the entire input and entire target sentences and all we have to do is to: Tokenize --> Numericalize --> Pad (so all are of equal length in the batch). I have separate videos where I go into more details on the data loading part and you could check out the torchtext videos for that. But after that both of these are inputted to the transformer and we utilize masking so that the network doesn't cheat by looking ahead in the target sentence (I've also gone into more depth on this in the transformer from scratch video).
How the sequence is produced at test time
Obviously at test time we don't have the entire target sentence but we have the input sentence, and what we do is that we try to output a single word at a time (that's we have for i in range(max_length)) loop in translate_sentence function. In the beginning we only have a start token for the target, but for each iteration in the for loop we gain one additional output predicted from the model (we take the highest probability prediction and append it to our outputs). We continue doing this in the for loop until we either a) reach a EOS token, or b) continue until max_length is reached.

Hopefully that clarifies a little bit :)

/Aladdin

from machine-learning-collection.

Recommend Projects