Take a look: chess-transformers.
I develop AI models.
I usually work with Python and PyTorch.
Attention Is All You Need | a PyTorch Tutorial to Transformers
License: MIT License
Hi @sgrvinod
In the xe train function:
predicted_sequences = model(source_sequences, target_sequences, source_sequence_lengths, target_sequence_lengths) # (N, max_target_sequence_pad_length_this_batch, vocab_size)
The target_sequence_lengths passed here still count the <end> token, so multi-head attention will attend over the <end> token.
I think it should be target_sequence_lengths - 1:
predicted_sequences = model(source_sequences, target_sequences, source_sequence_lengths, target_sequence_lengths - 1) # (N, max_target_sequence_pad_length_this_batch, vocab_size)
Please clarify.
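The effect of the proposed change on a length-based key mask can be sketched in plain Python. This is only an illustration, not the tutorial's code: the helper `key_padding_mask` and the example lengths are hypothetical, and it assumes the model hides keys at positions at or beyond each sequence's length and that <end> is the last real token.

```python
def key_padding_mask(lengths, max_len):
    """Return one row per sequence; True marks a position attention may use."""
    return [[pos < length for pos in range(max_len)] for length in lengths]

# Hypothetical batch: lengths count every token, including <end>.
target_sequence_lengths = [5, 3]
max_len = max(target_sequence_lengths)

mask_full = key_padding_mask(target_sequence_lengths, max_len)
mask_minus_one = key_padding_mask(
    [n - 1 for n in target_sequence_lengths], max_len
)

# Print both masks side by side for each sequence.
for full, shorter in zip(mask_full, mask_minus_one):
    print(full, shorter)
```

Under these assumptions, the full lengths leave the <end> position visible as a key, while lengths - 1 masks it the same way as padding.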
I got this error when training a model (SRGAN):
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/drive/My Drive/PyTorch--master/datasets.py", line 67, in __getitem__
img = Image.open(self.images[i], mode='r')
KeyError: 0
I'm training the model in Google Colab.
Thanks.
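A `KeyError: 0` from `self.images[i]` indicates that `self.images` behaves like a mapping without the key `0`, since a list would raise `IndexError` instead. One possible cause, which is an assumption here, is that the image list was loaded as a dict rather than a list of paths. The sketch below uses a hypothetical helper `as_path_list` and made-up paths to show the difference.

```python
def as_path_list(images):
    """Return image paths as a list so integer positions 0..N-1 are valid."""
    if isinstance(images, dict):
        # Assumption: the dict values are the file paths.
        return list(images.values())
    return list(images)

images_as_dict = {"a": "img/a.png", "b": "img/b.png"}  # hypothetical data

try:
    images_as_dict[0]  # integer lookup on a dict with string keys
    error_name = None
except KeyError as exc:
    error_name = type(exc).__name__  # same error type as in the traceback

paths = as_path_list(images_as_dict)
first_path = paths[0]  # positional access now works
```

Printing `type(self.images)` inside the dataset would confirm or rule out this guess.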
Hi @sgrvinod
Thank you for your tutorial on Attention Is All You Need. I have a small question and would appreciate an answer.
In dataloader.py you've grouped the batches by length, so that each batch contains sequences of similar length. Is that necessary? I understand that it speeds up training and reduces memory use. But does it have any effect on model performance if I don't group the data by length?
Thanks
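The padding cost behind this question can be illustrated with a small sketch. It is not the tutorial's loader: the function name and the toy lengths are hypothetical, and it only counts pad tokens when batches are formed with and without sorting by length.

```python
def padding_tokens(lengths, batch_size):
    """Count pad tokens when consecutive sequences are batched together."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        longest = max(batch)
        # Every sequence is padded up to the longest one in its batch.
        total += sum(longest - n for n in batch)
    return total

lengths = [3, 40, 5, 38, 4, 41, 6, 39]  # toy sequence lengths

unsorted_pad = padding_tokens(lengths, batch_size=4)
sorted_pad = padding_tokens(sorted(lengths), batch_size=4)
print(unsorted_pad, sorted_pad)
```

On this toy data, grouping by length cuts the pad tokens from 148 to 12. That shows the compute and memory saving only; whether grouping affects final model quality is a separate question this sketch does not answer.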
Thanks for your tutorial on Attention Is All You Need. I have a small question and would really appreciate an answer. The project only has training datasets, so why do we need a val_loader in train.py? Should I download data for the val set myself?
val_loader = SequenceLoader( data_folder=data_folder, source_suffix='en', target_suffix='de', split='val', tokens_in_batch=tokens_in_batch )
I don't know why youtokentome cannot be imported. Even running pip install youtokentome fails with an error.