Coresets

This library implements coreset generation for k-Means and (Bayesian) Gaussian mixture models. It also provides extended versions of the corresponding algorithms that support weighted data sets.

To get started, take a look at:

examples/intro.ipynb
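
As a rough illustration of what coreset generation for k-Means involves, here is a minimal pure-Python sketch of the lightweight coreset sampling scheme from the Bachem, Lucic and Krause paper on lightweight coresets listed under References. It is not this library's API: `lightweight_coreset` and its parameters are hypothetical names chosen for this example. Each point is sampled with probability `q(x) = 1/(2n) + d(x, mean)^2 / (2 * sum of squared distances)` and receives the weight `1 / (m * q(x))`.

```python
import random


def lightweight_coreset(points, m, seed=0):
    """Sample m weighted points (hypothetical helper, not the library's API)."""
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Mean of the full data set.
    mean = [sum(p[j] for p in points) / n for j in range(dim)]

    # Squared Euclidean distance of every point to the mean.
    sq_dist = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dist)

    # Sampling distribution: half uniform, half proportional to squared distance.
    if total > 0:
        q = [0.5 / n + 0.5 * d / total for d in sq_dist]
    else:
        # All points coincide, so fall back to uniform sampling.
        q = [1.0 / n] * n

    # Draw m indices with replacement and weight each draw by 1 / (m * q).
    idx = rng.choices(range(n), weights=q, k=m)
    sample = [points[i] for i in idx]
    weights = [1.0 / (m * q[i]) for i in idx]
    return sample, weights


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    coreset, coreset_weights = lightweight_coreset(data, m=3)
    for point, weight in zip(coreset, coreset_weights):
        print(point, round(weight, 3))
```

The returned points and weights are meant to be passed to a clustering algorithm that accepts per-sample weights, which is the role of the weighted algorithm variants mentioned above.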

Setup

pip install -r requirements.txt
python setup.py build_ext --inplace

Running tests

From the project root, run:

python -m pytest tests/ 

References

The implementation of the library is based on the following works:

Bachem, O., Lucic, M., & Krause, A. (2017). Practical coreset constructions for machine learning. arXiv preprint arXiv:1703.06476.

Bachem, O., Lucic, M., & Krause, A. (2017). Scalable and distributed clustering via lightweight coresets. arXiv preprint arXiv:1702.08248.

Lucic, M., Faulkner, M., Krause, A., & Feldman, D. (2018). Training Gaussian Mixture Models at Scale via Coresets. Journal of Machine Learning Research, 18, Art-No.

Borsos, Z., Bachem, O., & Krause, A. (2017). Variational inference for DPGMM with coresets. Advances in Approximate Bayesian Inference.

coresets's People

Contributors

sebascuri, zalanborsos

coresets's Issues

k_means import from scikit-learn not found in test

I have scikit-learn installed, but the tests fail at collection:

Traceback:
../../importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_mixtures.py:6: in <module>
    from algorithms import WeightedGaussianMixture, WeightedBayesianGaussianMixture
algorithms/__init__.py:3: in <module>
    from algorithms.weighted_kmeans import WeightedKMeans
algorithms/weighted_kmeans.py:2: in <module>
    from sklearn.cluster.k_means_ import _init_centroids
E   ModuleNotFoundError: No module named 'sklearn.cluster.k_means_'
__________________________________________ ERROR collecting tests/test_sensitivity.py __________________________________________
ImportError while importing test module '/home/icb/karin.hrovatin/miniconda3/envs/subsampling/lib/python3.8/site-packages/coresets/tests/test_sensitivity.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_sensitivity.py:7: in <module>
    from coresets import *
coresets/__init__.py:4: in <module>
    from coresets.k_means_coreset import KMeansLightweightCoreset, KMeansCoreset, KMeansUniformCoreset
coresets/k_means_coreset.py:1: in <module>
    from sklearn.cluster.k_means_ import _init_centroids
E   ModuleNotFoundError: No module named 'sklearn.cluster.k_means_'
________________________________________ ERROR collecting tests/test_weighted_kmeans.py ________________________________________
ImportError while importing test module '/home/icb/karin.hrovatin/miniconda3/envs/subsampling/lib/python3.8/site-packages/coresets/tests/test_weighted_kmeans.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_weighted_kmeans.py:5: in <module>
    from algorithms import weighted_kmeans
algorithms/__init__.py:3: in <module>
    from algorithms.weighted_kmeans import WeightedKMeans
algorithms/weighted_kmeans.py:2: in <module>
    from sklearn.cluster.k_means_ import _init_centroids
E   ModuleNotFoundError: No module named 'sklearn.cluster.k_means_'
=================================================== short test summary info ====================================================
ERROR tests/test_k_means_coresets.py
ERROR tests/test_mixtures.py
ERROR tests/test_sensitivity.py
ERROR tests/test_weighted_kmeans.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 4 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
====================================================== 4 errors in 1.35s =======================================================
(subsampling) [karin.hrovatin@icb-lisa coresets]$ pip freeze|grep scikit-learn
scikit-learn==0.24.1
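
The failing line in every traceback is `from sklearn.cluster.k_means_ import _init_centroids`. scikit-learn renamed its private modules with a leading underscore in release 0.22 (`k_means_` became `_kmeans`), and the old import paths are no longer available in 0.24.1, which matches the error above. Private helpers such as `_init_centroids` carry no stability guarantee either, so the helper may be missing even under the new module path. The sketch below is a hypothetical diagnostic (`locate_private_helper` is not part of this library or of scikit-learn) that reports where, if anywhere, the helper can still be imported from in the current environment.

```python
import importlib

# Candidate locations, oldest first. Both are private scikit-learn modules,
# so neither is guaranteed to exist or to expose the helper.
CANDIDATE_MODULES = ("sklearn.cluster.k_means_", "sklearn.cluster._kmeans")


def locate_private_helper(name="_init_centroids", modules=CANDIDATE_MODULES):
    """Return (module_name, helper) for the first module exposing `name`.

    Returns (None, None) when no candidate can be imported or none of them
    defines the helper.
    """
    for module_name in modules:
        try:
            module = importlib.import_module(module_name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
        helper = getattr(module, name, None)
        if helper is not None:
            return module_name, helper
    return None, None


if __name__ == "__main__":
    found_in, helper = locate_private_helper()
    if helper is None:
        print("_init_centroids is not importable from the installed scikit-learn")
    else:
        print(f"_init_centroids found in {found_in}")
```

If nothing is found, two options seem reasonable: install a scikit-learn release that still ships the old module path, or port the two affected files (`algorithms/weighted_kmeans.py` and `coresets/k_means_coreset.py`) to a public initializer. To my knowledge, `sklearn.cluster.kmeans_plusplus` was added as a public function in 0.24, but whether it is a drop-in replacement here has not been checked.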

Push to PyPI

Hello,
Could we publish this package on PyPI so that it can be installed with pip?

Also, do you plan to add other types of coresets, such as Bayesian coresets?

Cheers,
Fred
