Neel Nanda's Projects
Mechanistic Interpretability Visualizations using React
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
A Mechanistic Interpretability Analysis of Grokking
Training Sparse Autoencoders on Language Models
Mover Heads
A very hacky set of functions for getting plotly to do what I want when doing mech interp research, designed to be compatible with PyTorch
Random utils for personal use
Accompanying codebase for neuroscope.io, a website for displaying max-activating dataset examples for language model neurons
A wrapper around Plotly providing various utilities for mechanistic interpretability research