kehua1116 / megatron-deepspeed-llama
This project is forked from lydiaxiaohongli/megatron-deepspeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
License: Other