snimu
Name: Sebastian Müller
Type: User
Company: BMW
Bio: Deep Learning at BMW.
Twitter: omouamoua
Location: Munich
I'm playing around with Attention mechanisms
My blog
DSPy: The framework for programming—not prompting—foundation models
Red-Teaming Language Models with DSPy
Embracing the bitter lesson (vision)
Round the gradient during LLM training to different degrees; compare "scaling" of rounding to different significant digits to parameter scaling
Trying out the grokfast algorithm on LLMs
Python: guarantee test coverage, and enforce type and runtime guarantees
Train to 94% on CIFAR-10 in ~6.84 seconds on a single A100, the current world speed record. Or ~95.78% in ~114 seconds (or less!)
Minimalistic, fast, and experimentation-friendly researcher's toolbench for GPT-like models in ~<365 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in ~138 seconds.
CLI controllable version of hlb-gpt by tysam-code
Check out how much of a difference the activation of the value makes vs. keeping it linear as in standard attention
Ablate KAN and Fourier KAN vs. normal Linear Layers in LLMs
How do parameter statistics change over training in LLMs?
1. Train a small LLM; 2. Use its outputs on the training data as labels for training a large LLM, wherever their argmax agrees with the training data.
A framework for few-shot evaluation of language models.
Some experiments with Attention masks
Sort lists with the help of an ANN to allow maximal parallelism in execution.
Extend typehints to include dynamic checks (that might otherwise be dealt with by assertions) in Python.
A better way for LLMs to plan before acting.
Fix issue #19981 on PX4-Autopilot
Apply methods described in "Git Re-basin"-paper [1] to arbitrary models --- [1] Ainsworth et al. (https://arxiv.org/abs/2209.04836)
Results for snimu/rebasin
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with LLMs faster and more controllable.
Performance benchmark for PyTorch models
Easily manipulate torch.Tensors inside highly nested data-structures.
View model summaries in PyTorch!
Executable typehints for Python: make assertions about and/or modify parameters & return values
How much information can we extract from one token?
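The gradient-rounding experiment listed above describes rounding gradients to different numbers of significant digits. As a rough illustration of that operation only, here is a minimal sketch in plain Python. The function names `round_to_significant_digits` and `round_gradients` are hypothetical and not taken from the repository, and the actual project presumably operates on tensors rather than Python lists.

```python
import math


def round_to_significant_digits(value: float, digits: int) -> float:
    """Round `value` to `digits` significant digits (hypothetical helper)."""
    if digits < 1:
        raise ValueError("digits must be >= 1")
    # Zero, inf and nan have no meaningful decimal exponent; return them unchanged.
    if value == 0.0 or not math.isfinite(value):
        return value
    # Position of the leading digit, e.g. -2 for 0.0123 and 5 for 123456.
    exponent = math.floor(math.log10(abs(value)))
    # Keep `digits` digits counted from the leading one.
    return round(value, digits - 1 - exponent)


def round_gradients(grads: list[float], digits: int) -> list[float]:
    """Apply significant-digit rounding to every gradient entry (assumed flat list)."""
    return [round_to_significant_digits(g, digits) for g in grads]
```

Sweeping `digits` from high to low in a training loop would then give one way to compare how much gradient precision a model tolerates, which is the kind of comparison the description hints at.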