This repo studies dataset complexity across a variety of SE and ML datasets, using multiple metrics.
data/
: Store the data here. Due to size restrictions, we cannot upload all our data in this repo.
output/
: Output for each metric and dataset.
src/
: Source code.
- Install the project with Poetry:
poetry install
- Please set up pre-commit:
poetry run pre-commit install
- We use ruff to lint our code: see the instructions for installation.