The ActiveDP project automatically labels large datasets for training ML models. It combines two paradigms, Active Learning and Data Programming, to generate labels with high accuracy and coverage.
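To make the combination concrete, here is a toy sketch of the general idea only. It is not ActiveDP's implementation: the labeling functions, the majority vote, and the `toy_model` fallback are all hypothetical stand-ins. Data programming contributes cheap rule-based votes that may abstain, and a model trained on actively queried labels is assumed to cover the examples the rules leave unlabeled.

```python
from collections import Counter

ABSTAIN = -1  # data-programming convention: a rule may decline to vote

# Hypothetical labeling functions for a spam task (1 = spam, 0 = not spam).
def lf_has_link(text):
    return 1 if "http" in text.lower() else ABSTAIN

def lf_subscribe(text):
    return 1 if "subscribe" in text.lower() else ABSTAIN

def lf_praise(text):
    return 0 if "love this" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [lf_has_link, lf_subscribe, lf_praise]

def weak_label(text):
    """Majority vote over the labeling functions that did not abstain."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    # A tie between the top two labels is treated as an abstention.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]

def combine(texts, fallback_predict):
    """Keep the weak label where one exists; otherwise ask the fallback model."""
    labels = []
    for text in texts:
        y = weak_label(text)
        labels.append(y if y != ABSTAIN else fallback_predict(text))
    return labels

def toy_model(text):
    # Stand-in for a classifier trained on actively queried labels (assumption).
    return 1 if text.isupper() else 0

texts = ["Check http://spam.example", "I love this song", "FREE MONEY", "nice video"]
labels = combine(texts, toy_model)
print(labels)
```

In this sketch the rules label the first two texts and the fallback model labels the last two, which is how the two sources together can reach full coverage. How ActiveDP actually selects queries and weighs the two sources is not shown here.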
pip install torch torchaudio torchvision
pip install scikit-learn pandas tqdm optuna sentence-transformers snorkel
pip install wandb matplotlib nltk cdt alipy
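After installing, a quick check that the packages are importable can save a failed run. The sketch below is not part of the project. The import names are assumptions based on the pip names above (for example, `scikit-learn` is imported as `sklearn` and `sentence-transformers` as `sentence_transformers`).

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above; some differ from the pip name.
REQUIRED_MODULES = [
    "torch", "torchaudio", "torchvision",
    "sklearn", "pandas", "tqdm", "optuna", "sentence_transformers", "snorkel",
    "wandb", "matplotlib", "nltk", "cdt", "alipy",
]

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates each module without importing it, so the check stays fast even for heavy packages such as `torch`.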
python icws.py --dataset Youtube --filter-method Glasso --al-model logistic --use-valid-labels
Datasets used for evaluation can be downloaded here