- 🧑‍💻 I'm currently a Lead Deep Learning Engineer at Chattermill; previously a Research Engineer at Ontocord.ai
- 🔬 I also carry out machine learning research for LAION (Stability AI) on the Ezra-1 UltraCluster, LUMI, and JUWELS supercomputers; previously worked on BigScience and the BLOOM evaluation
- 🎓 I did my Master's in Machine Learning & AI at Imperial College London, working on natural language generation
- 🌱 I'm an active contributor to machine learning libraries such as Hugging Face Transformers and Gem-benchmark
- 💬 I sometimes give talks for the NLP study group, the most popular NLP community on meetup.com
- 🔭 I'm currently working on Mixture of Experts and the open-source chat agent OpenAssistant
Need to check that my label smoothing implementation matches how the authors smoothed their objective, since their objective function includes negative sampling.
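As a baseline to check against, here is a minimal sketch of plain label-smoothed cross-entropy (in NumPy, without the negative-sampling term, which would depend on the paper's exact objective): the one-hot target is mixed with a uniform distribution over classes.

```python
import numpy as np

def label_smoothed_nll(logits, target, eps=0.1):
    """Label-smoothed NLL: (1 - eps) * one-hot loss + eps * uniform loss.

    Sketch only: the negative-sampling component from the paper's
    objective is NOT included here.
    """
    # numerically stable log-softmax
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # standard NLL of the gold labels
    nll = -log_probs[np.arange(len(target)), target]
    # uniform (smoothing) component: mean negative log-prob over classes
    smooth = -log_probs.mean(axis=-1)
    return ((1.0 - eps) * nll + eps * smooth).mean()
```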
Really not clear from the paper: "computed as cosine similarity with annealing between the encodings h_x and h_y. It starts at 1 and ends at √d, linearly increasing over the first 10K training batches."
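My reading of the quoted passage, sketched as code: a scaling factor on the cosine similarity that ramps linearly from 1 to √d over the first 10K batches, then holds. The function name and the hold-after-warmup behaviour are my assumptions, not from the paper.

```python
import numpy as np

def similarity_scale(step, d, warmup=10_000):
    """Linear anneal from 1 (step 0) to sqrt(d) (step `warmup`).

    Assumption: the scale is held constant at sqrt(d) after warmup;
    the paper does not say what happens beyond 10K batches.
    """
    frac = min(step / warmup, 1.0)
    return 1.0 + frac * (np.sqrt(d) - 1.0)
```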
Implement BPE from scratch with unk tokens hashed (although this may give worse results on downstream tasks), as it is perhaps not as general as bpemb's 25000.model.
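A minimal from-scratch BPE training sketch (greedy pair merges over word frequencies, `</w>` as end-of-word marker); the unk-token hashing mentioned above is omitted here.

```python
from collections import Counter

def train_bpe(corpus, num_merges):
    """Learn BPE merges: repeatedly merge the most frequent adjacent
    symbol pair across the word-frequency table.

    Sketch only: no unk hashing, no special tokens, whitespace pre-split.
    """
    vocab = Counter()
    for word in corpus.split():
        vocab[tuple(word) + ('</w>',)] += 1
    merges = []
    for _ in range(num_merges):
        # count adjacent symbol pairs, weighted by word frequency
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # rewrite every word with the chosen pair merged
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges
```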
Relative bias addition amounts to row-wise additions of permutations of a subset of the bias vector; need to find a way to get rid of the for loop and do it in one operation. Definitely parallelizable.
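One way the loop could collapse into a single call, assuming a T5-style setup where each row gathers a shifted window of bias buckets (the helper names and bucket layout below are my assumptions): NumPy fancy indexing does the whole 2-D gather at once.

```python
import numpy as np

def make_rel_idx(seq_len, num_buckets):
    """Hypothetical index matrix: row i holds clipped relative positions
    j - i, shifted into [0, num_buckets) — each row is a shifted window
    (a permutation of a subset) of the bias vector's indices."""
    rel = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    return np.clip(rel + num_buckets // 2, 0, num_buckets - 1)

def bias_loop(bias, idx):
    # Loop version: gather the permuted bias slice one row at a time.
    out = np.empty(idx.shape, dtype=bias.dtype)
    for i in range(idx.shape[0]):
        out[i] = bias[idx[i]]
    return out

def bias_vectorized(bias, idx):
    # Same gather as one fancy-indexing call: the 2-D index array
    # broadcasts over the 1-D bias vector, no Python loop.
    return bias[idx]
```

The vectorized form is what you would add directly to the attention logits before softmax.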