Hey @zolekode
I will try my best!
How the sequence is produced at training time
At training time we have the entire input and target sentences, so all we have to do is: Tokenize --> Numericalize --> Pad (so all sequences in the batch have equal length). I have separate videos where I go into more detail on the data loading part, and you could check out the torchtext videos for that. After that, both are fed into the transformer, and we use masking so that the network can't cheat by looking ahead in the target sentence (I've also gone into more depth on this in the transformer from scratch video).
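To make the training-time steps a bit more concrete, here is a rough sketch in plain Python (no torchtext or PyTorch). The helper names (`build_vocab`, `numericalize`, `pad_batch`, `causal_mask`), the special tokens, and the whitespace tokenizer are all made up for illustration and are not the code from the repo:

```python
# Hypothetical special tokens and helpers, for illustration only.
PAD, SOS, EOS, UNK = "<pad>", "<sos>", "<eos>", "<unk>"


def build_vocab(sentences):
    """Map every token to an integer index (special tokens come first)."""
    vocab = {PAD: 0, SOS: 1, EOS: 2, UNK: 3}
    for sentence in sentences:
        for token in sentence.lower().split():  # naive whitespace tokenizer
            vocab.setdefault(token, len(vocab))
    return vocab


def numericalize(sentence, vocab):
    """Tokenize, wrap in start/end tokens, and convert tokens to indices."""
    tokens = [SOS] + sentence.lower().split() + [EOS]
    return [vocab.get(token, vocab[UNK]) for token in tokens]


def pad_batch(batch, pad_idx):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in batch)
    return [seq + [pad_idx] * (max_len - len(seq)) for seq in batch]


def causal_mask(size):
    """mask[i][j] is True when target position i may look at position j."""
    return [[j <= i for j in range(size)] for i in range(size)]


sentences = ["a dog runs", "the cat sat down"]
vocab = build_vocab(sentences)
batch = pad_batch([numericalize(s, vocab) for s in sentences], vocab[PAD])
mask = causal_mask(len(batch[0]))

print(batch[0])  # [1, 4, 5, 6, 2, 0]
print(mask[2])   # [True, True, True, False, False, False]
```

Row `i` of the mask only allows positions `0..i`, which is what stops the network from looking ahead in the target sentence.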
How the sequence is produced at test time
Obviously at test time we don't have the entire target sentence, but we do have the input sentence, so we output a single word at a time (that's why we have the for i in range(max_length) loop in the translate_sentence function). In the beginning we only have a start token for the target, but in each iteration of the for loop we gain one additional output predicted by the model (we take the highest-probability prediction and append it to our outputs). We keep doing this until we either a) reach an EOS token, or b) reach max_length.
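The test-time loop described above can be sketched roughly like this. It is not the actual translate_sentence function: `toy_model` is a made-up stand-in for the transformer (it just copies the source tokens and then emits EOS), and `SOS_IDX`, `EOS_IDX`, and `greedy_decode` are names I'm assuming for the example:

```python
SOS_IDX, EOS_IDX = 1, 2  # assumed indices of the start and end tokens


def toy_model(src, tgt):
    """Stand-in for the transformer: returns one score per vocabulary entry
    for the next token. It copies the source in order, then predicts EOS."""
    vocab_size = 10
    step = len(tgt) - 1  # number of tokens generated so far
    next_token = src[step] if step < len(src) else EOS_IDX
    scores = [0.0] * vocab_size
    scores[next_token] = 1.0
    return scores


def greedy_decode(model, src, max_length=50):
    """Generate one token per iteration, always taking the best-scoring one."""
    outputs = [SOS_IDX]  # in the beginning we only have the start token
    for _ in range(max_length):
        scores = model(src, outputs)
        best = max(range(len(scores)), key=scores.__getitem__)
        outputs.append(best)
        if best == EOS_IDX:  # stop condition a): EOS was predicted
            break
    return outputs  # stop condition b): max_length iterations were used


print(greedy_decode(toy_model, [4, 5, 6]))                # [1, 4, 5, 6, 2]
print(greedy_decode(toy_model, [4, 5, 6], max_length=2))  # [1, 4, 5]
```

With a real model you would feed the source and the outputs so far through the network at each step and take the argmax over the scores for the last position, but the control flow stays the same.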
Hopefully that clarifies a little bit :)
/Aladdin
from machine-learning-collection.