Comments (7)
My guess is that this is due to mismatched versions between the scheduler and client. Can you verify that your client process has the same version of distributed that is in the environment that you're giving to the YarnCluster?
from dask-yarn.
I just checked the distributed versions on the client and environment sides, and they appear to be the same:
pip freeze | grep distributed
distributed==1.23.3
As a matter of fact, I made the Anaconda environment tar.gz file a few days ago, so it's fresh. I also checked the tornado and skein versions, and they are also the same.
Update: I tried the same code under Python 3 and it works great. The scenario I tried initially was as follows: I installed Anaconda2 (Python 2), created the environment for YARN from this Anaconda, called the client from Anaconda's Python 2, and got the error I described above. Repeating the same steps with Anaconda 3 doesn't produce any errors. So the problem is with Python 2.
As indicated in the scheduler log, Dask on YARN uses Python 3 (environment/lib/python3.6) even though the environment was created and packed into the tar.gz file under Python 2.
Ah, that's it then. This has nothing to do with dask-yarn; this is general across all of Dask: you can't use Python 3 for the scheduler when you're using Python 2 for the client. Versions of all packages (including Python) should be the same across the scheduler, client, and workers.
As it's indicated in the scheduler log the dask on yarn uses Python 3 (environment/lib/python3.6) even though the environment was created and packed into tar.gz file under Python 2.
An environment created with conda and conda-pack is completely self-contained, so I don't see how an environment that packaged Python 2 could be running Python 3. I suspect the environment you packaged is not the same as the one you're running locally. Is there a possibility that you may have created or packaged the environment incorrectly?
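To illustrate the rule that Python and all package versions should match across the scheduler, client, and workers, here is a minimal sketch of a comparison helper. The function name `find_mismatches` and the shape of the input dictionary are assumptions made for this example, not part of dask-yarn or distributed, and the version numbers in the sample data are only illustrative.

```python
def find_mismatches(versions):
    """Return {package: {role: version}} for packages whose versions differ.

    `versions` is a hypothetical mapping of a role name (for example
    "client" or "scheduler") to a {package: version} dict.
    """
    # Collect every package name reported by any role.
    packages = set()
    for pkgs in versions.values():
        packages.update(pkgs)

    mismatches = {}
    for pkg in sorted(packages):
        # A package missing on one side shows up as None and counts as a mismatch.
        seen = {role: pkgs.get(pkg) for role, pkgs in versions.items()}
        if len(set(seen.values())) > 1:
            mismatches[pkg] = seen
    return mismatches


# Illustrative data only: same distributed release, different Python.
example = {
    "client": {"python": "2.7", "distributed": "1.23.3"},
    "scheduler": {"python": "3.6", "distributed": "1.23.3"},
}
print(find_mismatches(example))
```

With the sample data, only the `python` entry is reported, which is the situation described in this thread: matching `distributed` versions do not help if the interpreters differ. Against a live cluster, `Client.get_versions(check=True)` in distributed is meant to perform a comparable check.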
Well, I think the environment was packed correctly in some sense. I just discovered that Anaconda2 creates environments with Python 3 as the default version, even though Python 2 would seem to be the more logical default for an environment created from Anaconda2.
At the end of the day, the mystery is solved and the issue can be closed.
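One way to catch this kind of surprise before submitting to YARN is to look inside the packed archive for its `lib/pythonX.Y` directory, the same path that appeared in the scheduler log. The sketch below uses only the standard library; the helper name `packed_python_versions` and the assumption that the archive contains a `lib/pythonX.Y/` directory are mine, based on the path quoted above.

```python
import re
import tarfile

# Matches a "lib/pythonX.Y" path component, e.g. "lib/python3.6/site.py".
_PY_DIR = re.compile(r"(?:^|/)lib/python(\d+\.\d+)(?:/|$)")


def packed_python_versions(archive_path):
    """Return the sorted Python X.Y versions found in a packed environment.

    Assumes the archive is a tar file (optionally gzip-compressed) whose
    member paths include a lib/pythonX.Y directory.
    """
    versions = set()
    with tarfile.open(archive_path) as tar:  # compression is detected automatically
        for name in tar.getnames():
            match = _PY_DIR.search(name)
            if match:
                versions.add(match.group(1))
    return sorted(versions)


# Hypothetical usage; the file name is an assumption:
# print(packed_python_versions("environment.tar.gz"))
```

If the result does not match the major and minor version of the interpreter running the client, the packed environment and the local one have diverged.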
Glad to hear you figured things out.
Is it possible to install dask-yarn on python2?
SergeySmith said he did it.