Stella Biderman's Projects
http://nlp.seas.harvard.edu/2018/04/03/attention.html
Code for Auditing Data Provenance in Text-Generation Models (in KDD 2019)
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
Creative Commons Licenses for GitHub
A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Example models using DeepSpeed
Implementation of the plagiarism-detection algorithms behind MOSS
Implementation of E(n)-Equivariant Graph Neural Networks, in PyTorch
A framework for implementing equivariant DL
My personal repo.
Fun stuff with fractal machine learning
Code for the paper "Language Models are Unsupervised Multitask Learners"
An implementation of model parallel GPT-2 and GPT-3-like models, with the ability to scale up to full GPT-3 sizes (and possibly more!), using the mesh-tensorflow library.
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Build Graph Nets in TensorFlow
:exclamation: This is a read-only mirror of the CRAN R package repository. hsmm: Hidden Semi Markov Models
ML has an impact on the climate. But not all models are born equal. Compute your model's emissions with our calculator and add the results to your paper with our generated LaTeX template.
Scrapes data from https://database.lichess.org/ and converts it to JSON
Inference code for LLaMA models
A framework for few-shot evaluation of autoregressive language models.
My Initial Attempt at a Magic: The Gathering Draft AI
MAGMA - a GPT-style multimodal model that can understand any combination of images and language
Ongoing research training transformer language models at scale, including: BERT & GPT-2