Comments (5)

TomAugspurger commented on August 17, 2024

dask/dask-ml#158

Can you provide a reproducible example of a failure? This succeeds for me:

import sklearn.datasets
import sklearn.cluster
from sklearn.externals import joblib
from distributed import Client

# Start a local cluster; the "dask" joblib backend below sends work to it.
client = Client()

X, y = sklearn.datasets.make_blobs()

model = sklearn.cluster.DBSCAN(eps=0.5, min_samples=3)

with joblib.parallel_backend("dask"):
    model.fit(X)

from dask-tutorial.

ameyyadav09 commented on August 17, 2024

from dask.distributed import Client
from sklearn.externals.joblib import parallel_backend
from sklearn.datasets import make_blobs
from sklearn.cluster import DBSCAN
import datetime

if __name__ == '__main__':
    X, y = make_blobs(n_samples = 150000, n_features = 2, centers = 3, cluster_std = 2.1)
    
    client = Client()
    now = datetime.datetime.now()
    model = DBSCAN(eps = 0.5, min_samples = 30)
    with parallel_backend('dask'):
        model.fit(X)
    print(datetime.datetime.now() - now)

Below is my output:

distributed.worker - WARNING - Compute Failed
Function: <sklearn.externals.joblib._dask.Batch object at 0x7f884869b1d0>
args: (array([[ 3.12448708, -4.43752312],
[ 4.89858449, -3.96334534],
[-9.70246128, 7.82301076],
...,
[ 6.25643046, -3.93627323],
[10.77439621, -5.29284763],
[-7.0445401 , 11.64406627]]))
kwargs: {}
Exception: TimeoutError('Timeout',)

and I had to stop the program manually (with Ctrl+C). Am I doing something wrong?
Also, the code you posted above fails on Windows. I tried the same code on Linux and it ran fine. Could this be related to the OS as well?

mrocklin commented on August 17, 2024

Worked for me

In [1]: from dask.distributed import Client
   ...: from sklearn.externals.joblib import parallel_backend
   ...: from sklearn.datasets import make_blobs
   ...: from sklearn.cluster import DBSCAN
   ...: 

In [2]: import datetime
   ...: 

In [3]:     X, y = make_blobs(n_samples = 150000, n_features = 2, centers = 3, cluster_std = 2.1)
   ...:     
   ...:     client = Client()
   ...:     now = datetime.datetime.now()
   ...:     model = DBSCAN(eps = 0.5, min_samples = 30)
   ...:     with parallel_backend('dask'):
   ...:         model.fit(X)
   ...:     print(datetime.datetime.now() - now)
   ...: 
0:00:12.678909

You might try updating scikit-learn, dask, and distributed to see if that helps.
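
Before upgrading, it can help to record which versions are actually installed, so that a failing environment can be compared with a working one. Here is a minimal sketch using only the standard library. The `installed_versions` helper name is made up for illustration, and the package list is an assumption based on the libraries used in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI (assumed to be the relevant ones here).
PACKAGES = ("scikit-learn", "dask", "distributed", "joblib")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Posting this output from both the Windows and the Linux machine would make the two setups easier to compare.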

iremozen-edremit commented on August 17, 2024

Hi @mrocklin,
I tried to run exactly the same code as you, but it just kept running for a very long time, so I shut it down. What do you think the problem is?

TomAugspurger commented on August 17, 2024

Hard to say in the abstract. You might try increasing the verbosity on the sklearn estimator and looking at the logs.
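
As far as I know, `DBSCAN` itself does not take a `verbose` argument, so one way to get more detail is to raise the log level of the `distributed` loggers before creating the `Client`. The sketch below uses only the standard `logging` module. The `enable_debug_logs` helper name is hypothetical, and the logger name `"distributed"` is inferred from the `distributed.worker` prefix in the warning earlier in the thread.

```python
import logging


def enable_debug_logs(logger_name="distributed", level=logging.DEBUG):
    """Attach a stream handler and lower the threshold for one logger tree."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate output on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    enable_debug_logs()
    # Child loggers such as "distributed.worker" inherit the level set above.
    logging.getLogger("distributed.worker").debug("debug logging is on")
```

This only affects the process it runs in; worker processes that are started separately may need their own logging configuration.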