In this repository I'll explore various transformer architectures, starting with the original transformer, which I have already implemented in the attention-is-all-you-need.ipynb
notebook. I have also tested a simple training pipeline for this architecture in the training.ipynb
notebook.
This repo is currently a work in progress.