Vincent D. Warmerdam
┣━━ 📦 Open Source Packages
┃   ┣━━ bulk - simple bulk labelling interface
┃   ┣━━ embetter - embeddings ready for sklearn
┃   ┣━━ doubtlab - suite of tools to help find bad labels
┃   ┣━━ drawdata - draw datasets in jupyter
┃   ┣━━ scikit-lego - lego bricks for sklearn
┃   ┣━━ scikit-partial - partial_fit() pipelines for sklearn
┃   ┣━━ scikit-bloom - bloom transformers for sklearn
┃   ┣━━ human-learn - rule-based components for sklearn
┃   ┣━━ sentence-models - a different take on textcat
┃   ┣━━ mktestdocs - turn markdown files into pytest tests
┃   ┣━━ lazylines - lightweight utils for .jsonl wrangling
┃   ┣━━ cluestar - inspiration for your first text labels
┃   ┣━━ durations - pytest duration insights
┃   ┣━━ tuilwindcss - tailwindcss for textual tui apps
┃   ┣━━ memo - saves a whole log of time
┃   ┣━━ skedulord - makes cron a bit more fun
┃   ┣━━ icepickle - cool and safe storage for linear models
┃   ┗━━ evol - grammar for genetic heuristics
┣━━ Project Contributions
┃   ┣━━ fairlearn - contributed the CorrelationFilter
┃   ┣━━ polars - contributed the .pipe() method
┃   ┗━━ BERTopic - added lightweight sklearn pipeline support
┣━━ Online Projects
┃   ┣━━ calmcode.io - intermediate developer education
┃   ┣━━ koaning.io - personal blog
┃   ┗━━ dearme.email - reflection via a 30 day delay
┣━━ Popular Talks
┃   ┣━━ Natural Intelligence is All You Need
┃   ┣━━ Group-by statements that save the day
┃   ┣━━ Tools to Improve Training Data
┃   ┣━━ Optimal on Paper, Broken in Reality
┃   ┣━━ Playing by the Rules-Based-Systems
┃   ┣━━ How to Constrain Artificial Stupidity
┃   ┣━━ The Profession of Solving the Wrong Problem
┃   ┣━━ Winning with Simple, even Linear, Models
┃   ┗━━ Untitled12.ipynb
┣━━ 🔬 Random Experiments
┃   ┣━━ scikit-prune - prune scikit learn pipelines
┃   ┣━━ gitlit - tracking github action times across open source
┃   ┣━━ sentimany - many sentiment models, one repo
┃   ┣━━ tokenwiser - sklearn token tricks
┃   ┣━━ clumper - functional API for lists of dicts
┃   ┗━━ whatlies - exploration tools for word embeddings
┗━━ 👨‍💻 Employer
    ┣━━ 🌲 :probabl. - scikit-learn and friends
    ┃   ┣━━ scikit-churn - safety rails for churn work
    ┃   ┗━━ scikit-playtime - rethinking pipelines
    ┣━━ 💥 Explosion - developer tools for nlp
    ┃   ┣━━ prodigy-hf - Prodigy integration for the HuggingFace stack
    ┃   ┣━━ prodigy-pdf - Annotate PDFs via Prodigy
    ┃   ┣━━ prodigy-ann - ANN techniques to find relevant subsets
    ┃   ┣━━ prodigy-segment - Prodigy integration for Segment Anything
    ┃   ┣━━ prodigy-lunr - Search techniques to find relevant subsets
    ┃   ┣━━ prodigy-whisper - Transcribe audio with OpenAI's whisper models
    ┃   ┣━━ prodigy-tui - Prodigy from the terminal
    ┃   ┗━━ cluestar - inspiration for your first text labels
    ┗━━ 🤖 Rasa - conversational software provider
        ┣━━ nlu examples - custom nlu components for Rasa
        ┣━━ taipo - data augmentation tools
        ┗━━ algo whiteboard - nlp education

Follow me on twitter @fishnets88
koaning
Name: vincent d warmerdam
Type: User
Company: @explosion
Bio: Solving problems involving data. Mostly NLP these days. AskMeAnything[tm].
Twitter: fishnets88
Location: Amsterdam
Blog: https://koaning.io