SCOOP (Scalable COncurrent Operations in Python)
Home Page: https://github.com/soravux/scoop
License: GNU Lesser General Public License v3.0
Future exceptions are not propagating tracebacks correctly
Patch attached.
Original issue reported on code.google.com by [email protected]
on 4 Jun 2013 at 6:59
Attachments:
I'm running a series of experiments with scoop on a slurm cluster.
Tonight some of my tasks seem to have run out of memory:
Bad address (bundled/zeromq/src/tcp.cpp:244)
Traceback (most recent call last):
File "/software/python/2.7.12/lib/python2.7/logging/__init__.py", line 872, in emit
stream.write(ufs % msg)
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/codecs.py", line 706, in write
return self.writer.write(data)
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/codecs.py", line 370, in write
self.stream.write(data)
IOError: [Errno 12] Cannot allocate memory
...
Traceback (most recent call last):
File "/software/python/2.7.12/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/software/python/2.7.12/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_control.py", line 231, in runController
future = execQueue.pop()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_types.py", line 320, in pop
self.updateQueue()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_types.py", line 343, in updateQueue
for future in self.socket.recvFuture():
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 279, in recvFuture
received = self._recv()
File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 188, in _recv
thisFuture = pickle.loads(msg[1])
IndexError: list index out of range
The main issue here is that SCOOP did not completely terminate, but remained running in a locked-up state (0 load) for hours.
Hi, it would be nice if constants that were once set via shared.setConst(myconst=42)
could also be deleted, e.g. via a function shared.delConst('myconst')
which propagates the deletion to all the workers.
This would be useful for constants which are huge but no longer needed: deleting them would free memory on all workers.
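A minimal sketch of the requested semantics, simulated here with plain per-worker dicts (set_const/del_const below are illustrative stand-ins; SCOOP's real API only provides shared.setConst and shared.getConst, and the broadcast here is simulated in-process):

```python
# Simulate the proposal: each worker holds its own copy of the shared
# constants; delConst would broadcast a delete so every copy is freed.
workers = [{} for _ in range(3)]  # stand-ins for per-worker constant stores

def set_const(**kwargs):
    """Mimic shared.setConst: replicate the constant to every worker."""
    for store in workers:
        store.update(kwargs)

def del_const(name):
    """Mimic the proposed shared.delConst: remove the constant everywhere."""
    for store in workers:
        store.pop(name, None)

set_const(myconst=42)
del_const('myconst')  # all three copies are gone, freeing the memory
```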
What steps will reproduce the problem?
1. Run futures.map with a function with both iterator and keyword arguments
2. return value from function
What is the expected output? What do you see instead?
It should run correctly, giving the program's output. Instead I get
TypeError: submit() got an unexpected keyword argument
What version of the product are you using? On what operating system?
0.7.0 RC
Please provide any additional information below.
I've attached a futures.py that seems to have fixed the problem.
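A common workaround while keyword support is missing is to bind the keyword arguments with functools.partial before mapping. The builtin map below stands in for scoop.futures.map, which takes the same call shape (scale and factor are made-up names for illustration):

```python
from functools import partial

def scale(x, factor=1):
    return x * factor

# Under SCOOP one would call futures.map(partial(scale, factor=3), data);
# plain map demonstrates the same binding locally.
results = list(map(partial(scale, factor=3), [1, 2, 4]))
print(results)  # [3, 6, 12]
```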
Original issue reported on code.google.com by [email protected]
on 8 Oct 2013 at 2:35
Attachments:
python -m 'scoop'
does not start scoop properly. I also get 'Be sure to start your program with
the '-m scoop' parameter. You can find further information in the
documentation.' when I actually try to run something using futures.map
(meteng)megatron-5390:examples niko$ python -m 'scoop'
[2014-02-18 14:22:28,739] launcher INFO SCOOP 0.7.0 release on darwin using
Python 2.7.5 (default, Aug 25 2013, 00:04:04) [GCC 4.2.1 Compatible Apple LLVM
5.0 (clang-500.0.68)], API: 1013
[2014-02-18 14:22:28,739] launcher INFO Deploying 4 worker(s) over 1
host(s).
[2014-02-18 14:22:28,740] launcher INFO Worker distribution:
[2014-02-18 14:22:28,740] launcher INFO 127.0.0.1: 3 + origin
[2014-02-18 14:22:29,019] __init__ INFO Launching advertiser...
[2014-02-18 14:22:29,020] __init__ INFO Advertiser launched.
Exception in thread Thread-3:
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 808, in __bootstrap_inner
self.run()
File "/Users/niko/.virtualenvs/meteng/lib/python2.7/site-packages/scoop/discovery/minusconf.py", line 279, in run
self._init_advertiser()
File "/Users/niko/.virtualenvs/meteng/lib/python2.7/site-packages/scoop/discovery/minusconf.py", line 252, in _init_advertiser
super(ConcurrentAdvertiser, self)._init_advertiser()
File "/Users/niko/.virtualenvs/meteng/lib/python2.7/site-packages/scoop/discovery/minusconf.py", line 185, in _init_advertiser
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, struct.pack('@I', 1))
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 22] Invalid argument
and here is some pip freeze output in case it helps:
Flask==0.10.1
Flask-Cache==0.12
Flask-Login==0.2.9
Flask-Migrate==1.2.0
-e [email protected]:biosustain/flask-presst.git@58505be0f7dd3b6efdae74eeea81e63e3106b18f#egg=Flask_Presst-master
Flask-Principal==0.4.0
Flask-RESTful==0.2.10
Flask-Redis==0.0.3
Flask-SQLAlchemy==1.0
Flask-Script==0.6.6
Flask-WTF==0.9.2
Jinja2==2.7.2
Mako==0.9.1
Markdown==2.3.1
MarkupSafe==0.18
Mosek==7.0.90
PdbSublimeTextSupport==0.2
PyDrive==1.0.0
PyYAML==3.10
Pygments==1.6
## !! Could not determine repository location
RESTfulCOBRA==0.1.0
SQLAlchemy==0.9.1
Sphinx==1.2.1
Unidecode==0.04.14
WTForms==1.0.5
Werkzeug==0.9.4
alembic==0.6.2
amqp==1.0.13
aniso8601==0.82
anyjson==0.3.3
argparse==1.2.1
astroid==1.0.1
beautifulsoup4==4.3.2
benchmark==0.1.5
billiard==2.7.3.34
biopython==1.62
blessings==1.5.1
blinker==1.3
bokeh==0.3
cameo==v0.0.0
celery==3.0.24
celery-with-redis==3.0
-e [email protected]:phantomas1234/cobrapy.git@37768297c38d99b32429dcff1bed9ebaa2182de5#egg=cobra-master
columnize==0.3.6
coverage==3.7
cplex==12.5.1.0
cvxopt==1.1.6
dataset==0.4.0
deap==1.0.0rc2
dill==0.2b1
distribute==0.7.3
docutils==0.11
flask-sse==0.1
framed==0.0.0
gdata==2.0.18
gevent==0.13.8
glpk==0.3
google-api-python-client==1.2
greenlet==0.4.1
gunicorn==18.0
gurobipy==5.5.0
honcho==0.5.0
httplib2==0.8
import-relative==0.2.3
inspyred==1.0
ipdb==0.8
ipdbplugin==1.4
ipython==2.0.0-dev
ipython-cluster-helper==0.2.10
iso8601==0.1.8
itsdangerous==0.23
kombu==2.5.16
logilab-common==0.60.1
matplotlib==1.3.1
networkx==1.8.1
nose==1.3.0
nose-progressive==1.5
numpy==1.7.1
numpydoc==0.4
-e [email protected]:biosustain/optlang.git@ace6f3ce05acbb52dafb3663c9c54756c50af413#egg=optlang-master
pandas==0.13.0
piprot==0.2.0
plotly==0.5.7
ply==3.4
progressbar==2.3
psycopg2==2.5.2
pyDOE==0.3
pydbgr==0.2.6
pyficache==0.2.3
pylint==1.1.0
pymongo==2.6.3
-e git+https://github.com/Midnighter/pyorganism.git@88a57182ed382bcd5a508a252d8ddc76678d90c1#egg=pyorganism-niko_branch
pyparsing==2.0.1
python-dateutil==2.2
python-ldap==2.4.13
python-memcached==1.53
python-slugify==0.0.7
python-termstyle==0.1.10
pytz==2013.9
pyzmq==13.1.0
radar==0.3
readline==6.2.4.1
redis==2.9.1
rednose==0.4.1
requests==2.2.1
scipy==0.13.0
scoop==0.7.0.release
six==1.5.2
smartypants==1.8.3
sphinx-bootstrap-theme==0.3.6
sphinx-rtd-theme==0.1.5
sse==1.2
sympy==0.7.3
tornado==3.1.1
tracer==0.3.2
-e [email protected]:phantomas1234/escher.git@00cbd54c142ffd7dd5e635f721223d94b18c7282#egg=visbio-master
wsgiref==0.1.2
yaposib==0.3.2
Original issue reported on code.google.com by [email protected]
on 18 Feb 2014 at 1:27
Output of pip install scoop:
Collecting scoop
Could not find a version that satisfies the requirement scoop (from versions: 0.7.0.release, 0.7.1.release)
Some externally hosted files were ignored as access to them may be unreliable (use --allow-external scoop to allow).
No matching distribution found for scoop
Python version is 2.7.9
I want to run a simple script on a cluster whose network transmission rate is relatively low. The processes in my script, when created with the multiprocessing module, have little communication with each other.
Will SCOOP need a lot of network IO during execution, which would create a lot of overhead here?
What steps will reproduce the problem?
run scoop as a module and --log log.txt
What is the expected output? What do you see instead?
Write all logging to a file. Instead:
python3 -m scoop -n 2 --log log.txt test.py
[-h] [--hosts [Address [Address ...]] | --hostfile FileName]
[--path PATH] [--nice NiceLevel] [--verbose] [--quiet]
[-n NumberOfWorkers] [-b NumberOfBrokers] [--tunnel]
[--external-hostname Address] [--python-interpreter Path]
[--pythonpath PYTHONPATH] [--prolog PROLOG] [--profile]
[--backend {ZMQ,TCP}]
[executable] ...
python3 -m scoop: error: unrecognized arguments: --log
What version of the product are you using? On what operating system?
0.7.2 on Linux
Please provide any additional information below.
The devel version (0.7.2) seems to have lost the log file option on the command line. This option is still mentioned in the documentation:
http://scoop.readthedocs.org/en/0.7/usage.html
Original issue reported on code.google.com by [email protected]
on 25 Mar 2014 at 9:13
What steps will reproduce the problem?
1. dunno, I get it after several hours of usage
What is the expected output? What do you see instead?
nothing
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 298, in <module>
b.main()
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 285, in run
futures_startup()
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 266, in futures_startup
run_name="__main__"
File "/home/ed/projects/equalog/scoop/scoop/futures.py", line 65, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/ed/projects/equalog/scoop/scoop/_control.py", line 259, in runController
future = execQueue.pop()
File "/home/ed/projects/equalog/scoop/scoop/_types.py", line 352, in pop
self.updateQueue()
File "/home/ed/projects/equalog/scoop/scoop/_types.py", line 375, in updateQueue
for future in self.socket.recvFuture():
File "/home/ed/projects/equalog/scoop/scoop/_comm/scoopzmq.py", line 353, in recvFuture
received = self._recv()
File "/home/ed/projects/equalog/scoop/scoop/_comm/scoopzmq.py", line 237, in _recv
thisFuture = pickle.loads(msg[1])
cPickle.UnpicklingError: pickle data was truncated
What version of the product are you using? On what operating system?
pip reports 0.7.2.dev, but I am using the latest from hg.
Original issue reported on code.google.com by [email protected]
on 16 Nov 2014 at 7:16
Hello,
I apologize if this is due to some misconfiguration on my part (most likely), but I can't seem to be able to import scoop on Mac OS X. I get the following error:
(gen) nettrino$ python
Python 3.5.1 (default, Apr 18 2016, 11:46:32)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
import scoop
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/nettrino/projects/test/scoop.py", line 1, in <module>
from scoop import futures
ImportError: cannot import name 'futures'
Here are my pip packages (I'm in a virtual environment with Python 3, but the same happens with Python 2):
greenlet (0.4.9)
numpy (1.11.0)
pip (8.1.1)
pyzmq (15.2.0)
scoop (0.7.1.1)
setuptools (20.9.0)
wheel (0.29.0)
nettrino$ python --version
Python 3.5.1
Any hints would be much appreciated
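The traceback suggests that a local file named scoop.py (/Users/nettrino/projects/test/scoop.py) is shadowing the installed package. A quick way to check which file Python actually imports, a generic Python 3 diagnostic rather than anything SCOOP-specific:

```python
import importlib.util

# If the reported origin points into your project directory rather than
# site-packages, rename the local scoop.py (and delete any stale .pyc).
spec = importlib.util.find_spec("scoop")
print(spec.origin if spec else "scoop is not importable here")
```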
Update the link to scripts (http://scoop.readthedocs.org/en/0.7/usage.html#use-with-a-scheduler) to the submission scripts on github (https://github.com/soravux/scoop/tree/master/examples/submit_files).
When $SLURM_JOB_NODELIST is e.g. "nodes[006,011]" I get the following error:
File "/python27/lib/python2.7/site-packages/scoop/utils.py", line 209, in parseSLURM
bmin,bmax = rng.split('-')
ValueError: need more than 1 value to unpack
If $SLURM_JOB_NODELIST is e.g. "nodes[021-022]" the workers are deployed over the two hosts.
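The failure comes from unconditionally splitting each bracket entry on '-'. Below is a sketch of a parser that accepts both comma-separated single nodes and ranges; it is a simplified stand-in for scoop.utils.parseSLURM, which handles more nodelist forms than this:

```python
import re

def expand_nodelist(nodelist):
    """Expand "nodes[006,011]" or "nodes[021-022]" into host names."""
    m = re.match(r"(.+?)\[([^\]]+)\]$", nodelist)
    if not m:
        return [nodelist]  # bare host name, nothing to expand
    prefix, body = m.groups()
    hosts = []
    for part in body.split(','):
        if '-' in part:
            lo, hi = part.split('-')
            # zero-pad to the width of the range start, as SLURM does
            hosts += ["%s%0*d" % (prefix, len(lo), i)
                      for i in range(int(lo), int(hi) + 1)]
        else:
            hosts.append(prefix + part)
    return hosts
```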
What I don't understand is how to use SCOOP in existing projects.
How can I push data and get data if I want to call it from an existing project?
It seems SCOOP cannot be used as an RPC mechanism; does it have to be integrated with some other RPC library?
Original issue reported on code.google.com by phyo.arkarlwin
on 6 Mar 2013 at 3:15
I have noted the following problem.
I have code in my .bashrc that echoes a status message. When ssh-ing to this machine while dispatching jobs, SCOOP appears to be confused by this message and unable to launch jobs on this host effectively (I get an error message specifically mentioning the .bashrc status message). From what I could see, only one job per node could be launched in such a case. Is there any way to fix this without having to disable the status message? (The message is rather useful when ssh-ing to the machine in other contexts.) Upon disabling the output, the error message disappears and the scheduling works fine.
Hi,
I am trying to interpret the graphics produced by bench/process_debug.py after running my application with the --debug flag. It seems like a useful tool, but I want to interpret the results correctly.
I ran the app on a single machine with 4 CPUs and 4 workers. The density_debug plot shows that worker 0 mostly has a density of 2, whereas the other workers seem to have a density of 1. Does a density of 2 on a worker process mean that at a given time 2 tasks are competing for the same resources within a single process? Is it correct to deduce that in this case worker 0 is the root worker, which runs both the root future and a future that consumes tasks?
Next question is about the timeline_debug plot: as it also shows a metric per worker process as a function of time, I was wondering what the differences are between this plot and the density_debug plot. From the timeline_debug visualization it seems that the workers are far less busy than in the density_debug plot. Or should I interpret the blue bars differently here?
Any hints on this?
Thanks!
I am using decorated functions within the DEAP framework to limit the tree size and it seems SCOOP does not support pickling decorated functions.
@apply_decorator
def mutUniform(*args, **kwargs):
    return gp.mutUniform(*args, **kwargs)
toolbox.register("mutate", mutUniform, expr=toolbox.expr_mut, pset=pset)
output:
[2017-02-21 11:47:39,593] scoopzmq (b'127.0.0.1:57909') WARNING Pickling Error: Can't pickle <function mutUniform at 0x7f8dc2e7a730>: it's not the same object as __main__.mutUniform
scoop._comm.scoopexceptions.ReferenceBroken: This element could not be pickled: FutureId(worker=b'127.0.0.1:57909', rank=1):partial(0,)=None.
[2017-02-21 11:47:42,399] scoopzmq (b'127.0.0.1:57909') ERROR A worker exited unexpectedly. Read the worker logs for more information. SCOOP pool will now shutdown.
My code works fine with the built-in Python 3 multiprocessing pool.map function, by the way.
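The "not the same object as __main__.mutUniform" error appears when the object being pickled is not the one bound to its name at module level. A standard-library sketch of how to avoid that (apply_decorator here is a stand-in for the DEAP limiter with a simplified body, and this does not cover every SCOOP pickling case):

```python
import pickle
from functools import wraps

def apply_decorator(func):
    # functools.wraps copies __name__, __qualname__ and __module__ onto
    # the wrapper, so pickling by name resolves to the decorated object
    # that actually lives at module level.
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@apply_decorator
def mutUniform(x):  # simplified body; the real one delegates to gp.mutUniform
    return x + 1

# Round-trips because getattr(module, 'mutUniform') is this very wrapper.
restored = pickle.loads(pickle.dumps(mutUniform))
```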
I have been using SCOOP with DEAP for while now, and I really enjoy how simple it is. I have run into one annoying problem that I am hoping is a simple fix.
When I run a script that uses SCOOP at home (or on most other internet connections) it works fine. I was running into trouble when I tried to run the same scripts on my work's internet connection. When I switched to the public internet of the building next door, it began working again. I get the same SCOOP error on my work's internet as when I disable my internet connection altogether.
I believe I read somewhere that SCOOP uses networking to communicate between workers. I also know that SCOOP has the capability of pinging out to other machines on the local network to see if it is being called as part of a cluster. I assume something in this realm is the source of the issue.
I am hoping there is a simple way to tell SCOOP to only use the cores on the local machine, and not to attempt to reach out across the network, as some connections (or no connections) make that impossible.
If such a workaround does or does not exist, I think it would be nice for the error reported to the user to explain more about what has gone wrong, and perhaps point to specific conditions which SCOOP must be run under.
Or perhaps I am completely wrong about what is happening here. Any help would be much appreciated.
Below is the error that I get when I run a script that uses SCOOP (python -m scoop my_file.py
) at work or with my internet connection disabled:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scoop/__main__.py", line 18, in <module>
from .launcher import main
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scoop/launcher.py", line 31, in <module>
from scoop import utils
My program used to pass kwargs arguments to the function using futures.submit
when I was using version 0.6.2. When I upgraded to 0.7.2 this possibility was no
longer there. I checked the source code and noticed that most functions stopped
processing kwargs arguments.
Since my program relied heavily on this feature, I re-enabled kwargs support in
futures; see the attached patch.
Original issue reported on code.google.com by [email protected]
on 4 Jun 2014 at 5:31
Attachments:
I am trying to run with a single host and multiple workers on Python 3.6.0. After the program starts, the main worker keeps running and all the other workers show the error below:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 223, in prepare
_fixup_main_from_name(data['init_main_from_name'])
File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 249, in _fixup_main_from_name
alter_sys=True)
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 201, in run_module
mod_name, mod_spec, code = _get_module_details(mod_name)
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 136, in _get_module_details
raise error("No module named %s" % mod_name)
ImportError: No module named SCOOP_WORKER
When the same code is run in Python 2.7, this ImportError does not appear and the code runs properly.
What is causing this issue to occur only in Python 3.6.0?
What is SCOOP_WORKER?
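One thing worth ruling out: on Windows, Python 3's multiprocessing uses the spawn start method, so child processes re-import the main module, and any pool or SCOOP entry point must sit behind an `if __name__ == "__main__":` guard. A sketch with stdlib multiprocessing, since the same rule applies (square is an illustrative function):

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # Without this guard, each spawned child would re-run the pool
    # creation on import and recurse. A SCOOP entry point has the same
    # requirement on spawn-based platforms.
    with mp.Pool(2) as pool:
        print(pool.map(square, range(4)))
```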
When installing scoop with its setup.py, it requests argparse>=1.1 in install_requires. That downloads and installs argparse 1.2.1 even when using Python 2.7, which already ships its own argparse 1.1.
This is a serious problem, because argparse 1.2.1 is old and buggy, and after it is installed it is used instead of the system one, causing hard-to-track-down troubles such as the one described at http://stackoverflow.com/questions/29374044/
So, please change the requirements using a conditional append, e.g.
if sys.version_info < something:
    install_requires.append('argparse>=1.1')
Moreover, you may want to investigate why that requirement pulls argparse 1.2.1 when PyPI has 1.2.2 and 1.3 available.
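A sketch of the suggested conditional append for setup.py (the other requirement names are illustrative; argparse entered the stdlib in Python 2.7/3.2, so only older interpreters need the backport):

```python
import sys

# Only require the argparse backport where the stdlib lacks it.
install_requires = ["greenlet>=0.3.4", "pyzmq>=13.0.0"]  # illustrative deps
if sys.version_info < (2, 7):
    install_requires.append("argparse>=1.1")
```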
Dear Yannick,
Would it be possible to set the 'ssh' part of BASE_SSH (class Host, launch/workerLaunch.py) with an environment variable (e.g. SCOOP_SSH) if it's defined?
This is helpful in HPC environments that require the use of a wrapper around ssh (passing all arguments down to ssh).
Thank you!
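A sketch of the requested override (SCOOP_SSH is the variable name proposed in this request, not an existing SCOOP setting; the flags shown merely mirror typical SCOOP ssh options):

```python
import os

def base_ssh_command():
    """Build the ssh command used to reach remote hosts, honouring an
    optional SCOOP_SSH wrapper taken from the environment."""
    ssh = os.environ.get("SCOOP_SSH", "ssh")
    return [ssh, "-x", "-oStrictHostKeyChecking=no"]
```

With this in place, something like `SCOOP_SSH=/opt/site/ssh-wrapper python -m scoop ...` would route every remote launch through the site's wrapper.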
For some reason, I'm getting an error because python is trying to serialize greenlets:
Traceback (most recent call last):
File "/home/apps/Logiciels/Python/python-3.5.1/lib/python3.5/runpy.py", line 170, in _run_module_as_main
"__main__", mod_spec)
File "/home/apps/Logiciels/Python/python-3.5.1/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/home/lochar/python/lib/python3.5/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/lochar/python/lib/python3.5/site-packages/scoop/_control.py", line 230, in runController
execQueue.sendResult(future)
File "/home/lochar/python/lib/python3.5/site-packages/scoop/_types.py", line 384, in sendResult
self.socket.sendResult(future)
File "/home/lochar/python/lib/python3.5/site-packages/scoop/_comm/scoopzmq.py", line 319, in sendResult
pickle.HIGHEST_PROTOCOL,
TypeError: cannot serialize 'greenlet.greenlet' object
I'm not using greenlets, so I'm not sure why I'm getting this. I'm really just trying to get started with SCOOP; I've never used map/reduce nor written parallelized code before, so I might be making an obvious error. Here is my code:
import os
import re
from scoop import futures as fut

def get_sideeffect_sentences(f):
    sideeffect = re.compile(r"\bside\s+effects?\b")
    tagsplit = re.compile(r"\s*<[hp]>\s*")
    for ln in open(f):
        for sent in tagsplit.split(ln):
            if sideeffect.search(sent.lower()):
                yield sent

def writeout(x, y):
    print(y, file=outf)

if __name__ == "__main__":
    scratch = os.getenv("SCRATCH")
    outf = open(scratch + "/now/sideeffect_sents.txt", "w")
    lsfiles = [scratch + "/now/text/" + f for f in os.listdir(scratch + "/now/text")]
    fut.mapReduce(get_sideeffect_sentences, writeout, lsfiles)
    outf.close()
The command is "python -m scoop -n 8". I noticed there are Québécois-sounding last names in the docs, so perhaps it might help to say I've been trying that on Briarée (Calcul Québec computer).
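One detail worth noting, without certainty that it explains the greenlet error above: get_sideeffect_sentences is a generator function, and generator objects cannot be pickled, so values returned through SCOOP must be plain picklable data (returning a list instead may behave better). A minimal demonstration:

```python
import pickle

def numbers():
    yield 1
    yield 2

# A generator object carries a live frame and cannot be serialized.
try:
    pickle.dumps(numbers())
except TypeError as exc:
    print("cannot pickle a generator:", exc)

# Materializing the results first makes them picklable.
data = pickle.loads(pickle.dumps(list(numbers())))
```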
What steps will reproduce the problem?
1. I have no idea, this is the 2nd time I've seen it though, while running SCOOP
for about a couple of days
What version of the product are you using? On what operating system?
pip reports 0.7.2.dev, but I am using the latest from hg.
Please provide any additional information below.
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 298, in <module>
b.main()
File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 285, in run
futures_startup()
File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 266, in futures_startup
run_name="__main__"
File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/futures.py", line 65, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_control.py", line 259, in runController
future = execQueue.pop()
File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_types.py", line 352, in pop
self.updateQueue()
File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_types.py", line 375, in updateQueue
for future in self.socket.recvFuture():
File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 350, in recvFuture
received = self._recv()
File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 251, in _recv
if thisFuture.sendResultBack:
AttributeError: 'set' object has no attribute 'sendResultBack'
Original issue reported on code.google.com by [email protected]
on 14 Nov 2014 at 9:20
I couldn't use an ssh host name. I checked the parameters passed to getaddrinfo: scoop.BROKER.externalHostname returned thx and scoop.BROKER.task_port returned a random int.
How can I use another computer through ssh with a ~/.ssh/config file? As far as I can tell from the code, it is not supported. ssh thx works properly.
thx@thx-Prime:~/workspace/scoop$ python -m scoop --host thx -vv scoop_test.py
[2016-12-21 12:11:44,449] launcher INFO SCOOP 0.7.2.0 on linux2 using Python 2.7.12 (default, Oct 21 2016, 22:26:43) [GCC 5.4.0 20160609], API: 1013
[2016-12-21 12:11:44,449] launcher INFO Deploying 1 worker(s) over 1 host(s).
[2016-12-21 12:11:44,449] launcher DEBUG Using hostname/ip: "thx" as external broker reference.
[2016-12-21 12:11:44,449] launcher DEBUG The python executable to execute the program with is: /home/thx/.pyenv/versions/2.7.12/bin/python.
[2016-12-21 12:11:44,449] launcher INFO Worker distribution:
[2016-12-21 12:11:44,449] launcher INFO thx: 0 + origin
[2016-12-21 12:11:44,449] brokerLaunch DEBUG Launching remote broker: ssh -tt -x -oStrictHostKeyChecking=no -oBatchMode=yes -oUserKnownHostsFile=/dev/null -oServerAliveInterval=300 thx /home/thx/.pyenv/versions/2.7.12/bin/python -m scoop.broker.__main__ --echoGroup --echoPorts --backend ZMQ
[2016-12-21 12:11:44,750] brokerLaunch DEBUG Foreign broker launched on ports 46544, 45126 of host thx.
[2016-12-21 12:11:44,750] launcher DEBUG Initialising remote origin worker 1 [thx].
[2016-12-21 12:11:44,751] launcher DEBUG thx: Launching '/home/thx/.pyenv/versions/2.7.12/bin/python -m scoop.launch.__main__ 1 3 --size 1 --workingDirectory "/home/thx/workspace/scoop" --brokerHostname 127.0.0.1 --externalBrokerHostname thx --taskPort 46544 --metaPort 45126 --origin --backend=ZMQ -vvv scoop_test.py'
Warning: Permanently added '192.168.21.10' (ECDSA) to the list of known hosts.
Launching 1 worker(s) using /bin/bash.
Executing '['/home/thx/.pyenv/versions/2.7.12/bin/python', '-m', 'scoop.bootstrap.__main__', '--size', '1', '--workingDirectory', '/home/thx/workspace/scoop', '--brokerHostname', '127.0.0.1', '--externalBrokerHostname', 'thx', '--taskPort', '46544', '--metaPort', '45126', '--origin', '--backend=ZMQ', '-vvv', 'scoop_test.py']'...
Traceback (most recent call last):
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 298, in <module>
b.main()
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 285, in run
futures_startup()
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 266, in futures_startup
run_name="__main__"
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/futures.py", line 65, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_control.py", line 199, in runController
execQueue = FutureQueue()
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_types.py", line 264, in __init__
self.socket = Communicator()
File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 70, in __init__
info = socket.getaddrinfo(scoop.BROKER.externalHostname, scoop.BROKER.task_port)[0]
socket.gaierror: [Errno -2] Name or service not known
Exception AttributeError: "'FutureQueue' object has no attribute 'socket'" in <bound method FutureQueue.__del__ of <scoop._types.FutureQueue object at 0x7f5abe703a90>> ignored
Connection to 192.168.21.10 closed.
[2016-12-21 12:11:45,344] launcher INFO Root process is done.
[2016-12-21 12:11:45,345] workerLaunch DEBUG Closing workers on thx (1 workers).
[2016-12-21 12:11:45,345] brokerLaunch DEBUG Closing broker on host thx.
Warning: Permanently added '192.168.21.10' (ECDSA) to the list of known hosts.
Hi,
is there a way to run with a different user and a different port?
When I try to give [email protected] -p yyyy, it ends with
'ssh: connect to host xx.xx.xx.xx -p yyyy port 22: Connection timed out\r\n'
And I see command lines like setenv PYTHONPATH /home/user1/root/lib/: which obviously will not work if user1 != user2.
Maybe there could be some local program (like in jug) that knows the local settings...
Thank you
What version of the product are you using? On what operating system?
pip reports 0.7.2.dev, but I am using the latest from hg.
Please provide any additional information below.
SCOOP appears to hang for a couple of seconds after a futures.map_as_completed, on the call self.errors = self.workers[0].subprocesses[0].wait() in launcher.py.
Is there any way to speed this up?
Original issue reported on code.google.com by [email protected]
on 11 Nov 2014 at 9:47
I'm using SCOOP to distribute simulation jobs. For that, I need to distribute some files to the hosts in use. Is it possible to get a list of the currently used hosts during execution? If I have each thread rsync the files to local storage, this may end in too many connections to the origin host.
Thanks ;)
Is there an easy way to run a function on each worker?
I have a scenario where I'd like to trigger writing stats on program termination on each worker. Each of the workers has a local cache etc., and syncing the stats during computation would cause a lot of needless overhead.
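One workaround that uses nothing SCOOP-specific: register an atexit hook at import time. Since every worker imports the main module, each process then flushes its own stats on termination (STATS and the file name here are illustrative):

```python
import atexit
import json
import os
import tempfile

STATS = {"cache_hits": 0}  # this worker's local counters (illustrative)

def dump_stats(path=None):
    """Append this worker's stats as one JSON line; the pid tells the
    workers apart when they all append to the same file."""
    if path is None:
        path = os.path.join(tempfile.gettempdir(), "worker_stats.jsonl")
    with open(path, "a") as fh:
        fh.write(json.dumps({"pid": os.getpid(), "stats": STATS}) + "\n")

# Runs in every process that imported this module, i.e. every worker.
atexit.register(dump_stats)
```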
What steps will reproduce the problem?
1. Run a mutli-node scoop run using using full domain names in the --hosts line
e.g.,
python -m scoop.__main__ --backend ZMQ -vv --hosts node1.default.domain
node2.default.domain -n 32 scoopCode.py
What is the expected output?
I would expect this command
python -m scoop.__main__ --backend ZMQ -vv --hosts node1.default.domain
node2.default.domain -n 32 scoopCode.py
to do the same this as this command
python -m scoop.__main__ --backend ZMQ -vv --hosts node1 node2 -n 32
scoopCode.py
What do you see instead?
using long host names I get the following error
ERROR:root:Error while launching SCOOP subprocesses:
ERROR:root:Traceback (most recent call last):
File "/pkg/suse11/python/scoop/0.7.2/lib/python2.7/site-packages/scoop-0.7.2.dev-py2.7.egg/scoop/launcher.py", line 469, in main
rootTaskExitCode = thisScoopApp.run()
File "/pkg/suse11/python/scoop/0.7.2/lib/python2.7/site-packages/scoop-0.7.2.dev-py2.7.egg/scoop/launcher.py", line 258, in run
backend=self.backend,
File "/pkg/suse11/python/scoop/0.7.2/lib/python2.7/site-packages/scoop-0.7.2.dev-py2.7.egg/scoop/launch/brokerLaunch.py", line 148, in __init__
"SSH process stderr:\n{stderr}".format(**locals()))
Exception: Could not successfully launch the remote broker.
Requested remote broker ports, received:
Port number decoding error:
need more than 1 value to unpack
SSH process stderr:
Connection to cl2n091.default.domain closed.
But it runs perfectly fine with only the short host names.
What version of the product are you using?
Python 2.7.5
Scoop version 0.7.2
On what operating system?
SUSE Linux 11
Please provide any additional information below.
I am actually trying to run this on our SGI cluster (SGI-customized SUSE 11),
which uses PBS Pro as the scheduler. If I submit a job without the hosts line,
SCOOP correctly detects the hosts PBS has given the job, but it gets the full
hostnames. If I submit a multi-node interactive job and manually provide the
short names it works fine, but this is really not ideal, as it should be able
to go through the batch system properly.
Original issue reported on code.google.com by [email protected]
on 27 Aug 2014 at 10:22
I have a problem running SCOOP on OS X El Capitan.
Here is the self-explanatory output I get:
import sys
print sys.version
2.7.12 (default, Oct 14 2016, 15:23:34)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)]
from scoop import futures
gaierror Traceback (most recent call last)
in ()
----> 1 from scoop import futures
/Users/username/Homebrew/lib/python2.7/site-packages/scoop/futures.py in ()
24
25 import scoop
---> 26 from ._types import Future, CallbackType
27 from . import _control as control
28 from .fallbacks import (
/Users/username/Homebrew/lib/python2.7/site-packages/scoop/_types.py in ()
21 import greenlet
22 import scoop
---> 23 from scoop._comm import Communicator, Shutdown
24
25 # Backporting collection features
/Users/username/Homebrew/lib/python2.7/site-packages/scoop/_comm/init.py in ()
20
21 if scoop.CONFIGURATION.get('backend', 'ZMQ') == 'ZMQ':
---> 22 from .scoopzmq import ZMQCommunicator as Communicator
23 else:
24 from .scooptcp import TCPCommunicator as Communicator
/Users/username/Homebrew/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py in ()
29
30 import scoop
---> 31 from .. import shared, encapsulation, utils
32 from ..shared import SharedElementEncapsulation
33 from .scoopexceptions import Shutdown, ReferenceBroken
/Users/username/Homebrew/lib/python2.7/site-packages/scoop/shared.py in ()
22 import time
23
---> 24 from . import encapsulation, utils
25 import scoop
26 from .fallbacks import ensureScoopStartedProperly, NotStartedProperly
/Users/username/Homebrew/lib/python2.7/site-packages/scoop/utils.py in ()
41
42 localHostnames.extend([
---> 43 ip for ip in socket.gethostbyname_ex(socket.gethostname())[2]
44 if not ip.startswith("127.")][:1]
45 )
gaierror: [Errno 8] nodename nor servname provided, or not known
Any clue?
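For reference, the failing call can be reproduced and guarded outside SCOOP; on macOS the usual fix is adding the machine's hostname to /etc/hosts. A minimal sketch of the lookup utils.py performs, with a fallback instead of a crash:

```python
import socket

# socket.gethostbyname_ex raises socket.gaierror when the local
# hostname cannot be resolved -- common on macOS when the hostname
# is missing from /etc/hosts.
try:
    ips = [ip for ip in socket.gethostbyname_ex(socket.gethostname())[2]
           if not ip.startswith("127.")]
except socket.gaierror:
    ips = []  # fall back to loopback-only operation
```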
Is there any documentation for how to use scoop with SLURM?
One of the main things I'm wondering about is whether to provide a hosts file to scoop or not when running it from SLURM. Does it automatically figure out the hosts and run simulations on them otherwise?
#!/bin/bash
#SBATCH [email protected]
#SBATCH --mail-type=ALL
#SBATCH --nodes=7
#SBATCH --ntasks=72
#SBATCH --time=99:00:00
#SBATCH --mem=10G
#SBATCH --output=python_job_slurm.out
# Which one is correct?
python -m scoop --hostfile hosts.txt my-script.py
python -m scoop my-script.py
and I run it with sbatch python.slurm
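Not documentation, but for what it's worth: inside a SLURM allocation you can build the hostfile yourself from the environment. `scontrol show hostnames "$SLURM_JOB_NODELIST"` is the authoritative expander; below is a pure-Python sketch (a hypothetical helper, handling only the basic bracket-range form) of the same expansion:

```python
import re

def expand_nodelist(nodelist):
    """Expand a simple SLURM nodelist like 'node[01-03],head' into
    individual hostnames. Covers only the basic bracket-range form;
    `scontrol show hostnames` handles the full syntax."""
    hosts = []
    for part in re.findall(r'[^,\[]+(?:\[[^\]]*\])?', nodelist):
        m = re.match(r'(.+)\[(\d+)-(\d+)\]$', part)
        if m:
            prefix, lo, hi = m.group(1), m.group(2), m.group(3)
            width = len(lo)  # preserve zero-padding, e.g. node01
            hosts.extend('%s%0*d' % (prefix, width, i)
                         for i in range(int(lo), int(hi) + 1))
        else:
            hosts.append(part)
    return hosts
```

Writing one hostname per line to hosts.txt then gives you something to pass to `--hostfile`.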
What version of the product are you using? On what operating system?
$ python -V
Python 2.7.7
$ lsb_release -a
LSB Version:
:base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd6
4:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.4 (Santiago)
Release: 6.4
Codename: Santiago
What steps will reproduce the problem?
1. Download 0.7.1 tarball, open it and cd into the directory
2. Verify that pyzmq is available:
python -c "import zmq; print 'It is there'"
3. Try to install scoop with:
python setup.py install --prefix=$MY_INSTALL_DIR
What is the expected output? What do you see instead?
(....a bunch of irrelevant output....)
error: command 'gcc' failed with exit status 1
Failed with default libzmq, trying again with /usr/local
************************************************
Configure: Autodetecting ZMQ settings...
Custom ZMQ dir: /usr/local
Assembler messages:
Fatal error: can't create
build/temp.linux-x86_64-2.7/scratch/tmp/easy_install-L0czGy/pyzmq-14.3.1/temp/ti
mer_create9prql0.o: No such file or directory
build/temp.linux-x86_64-2.7/scratch/vers.c:4:17: fatal error: zmq.h: No such
file or directory
#include "zmq.h"
^
compilation terminated.
The problem is that zmq.h is in a non-standard location, available to python
but not to scoop.
What is the recommended way of telling scoop where zmq is located?
PS: this is related to issue 8, but that was closed without being assigned to
anyone, dismissed as a pyzmq installation problem (which it is not: both pyzmq
and zmq are correctly installed and used by other software on this system).
Original issue reported on code.google.com by [email protected]
on 15 Jul 2014 at 5:04
While the simulation is running over multiple hosts, if one host goes down, the entire simulation seems to stall. Ideally, all the other simulations should finish running. Only the simulations that were running on the node that went down should be lost.
What steps will reproduce the problem?
1. Install nosetests (1.1.2) and scoop (0.5.3)
2. Just run 'python -m scoop /usr/bin/nosetests' in any directory. (This does
not have to be a project directory.)
3. This is the error message I get:
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/toon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 86, in <module>
user_module = __import__(os.path.basename(executable)[:-3])
ImportError: No module named nosete
What is the expected output? What do you see instead?
nosetests should end with an error.
What version of the product are you using? On what operating system?
scoop: 0.5.3
nosetests: 1.1.2
python: 2.7.3
OS: linux (3.2.0-32-generic) ubuntu 12.04
Original issue reported on code.google.com by [email protected]
on 15 Nov 2012 at 11:19
Using Python 3.5.1 and the latest stable version of SCOOP (i.e. the one available through pip, version 0.7, revision 1.1), any time I run any of the example files, e.g. python -m scoop -vv --hostfile hostfile rssDoc.py, it hangs at the end. I see a message such as "Closing workers on solomon (8 workers)", and nothing more. Debugging with pdb showed me that it was blocking as it tried to read process.stdout in the close function of workerLaunch.py.
I solved this by adding a function that makes a stream non-blocking before reading it, and using it on both process.stdout and process.stderr:
import fcntl
import os

def _makeStreamNonBlocking(self, stream):
    flags = fcntl.fcntl(stream.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(stream.fileno(), fcntl.F_SETFL, flags | os.O_NDELAY)

def close(self):
    """Connection(s) cleanup."""
    # Ensure everything is cleaned up on exit
    scoop.logger.debug('Closing workers on {0}.'.format(self))
    # Output child processes stdout and stderr to console
    for process in self.subprocesses:
        if process.stdout is not None:
            self._makeStreamNonBlocking(process.stdout)
            sys.stdout.write(process.stdout.read().decode("utf-8"))
            sys.stdout.flush()
        if process.stderr is not None:
            self._makeStreamNonBlocking(process.stderr)
            sys.stderr.write(process.stderr.read().decode("utf-8"))
            sys.stderr.flush()
I see that the code I modified doesn't even exist in the current version on GitHub. Should I simply favor the GitHub version over the "currently stable" one?
First off, if this isn't an appropriate place to ask a question, just let me know and feel free to delete this issue. I couldn't find a discussion forum anywhere.
Using Python's multiprocessing Pool class, you can specify maxtasksperchild to create a clean environment for each run of the function. Is there a way to do this in SCOOP using futures.map?
Thanks!
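SCOOP does not, to my knowledge, expose an equivalent of maxtasksperchild. A workaround sketch (names are mine): run the function body in a throwaway process from inside the mapped callable, so each call gets fresh interpreter state. This adds process-spawn overhead per call, and `func` and its result must be picklable.

```python
from multiprocessing import Process, Queue

def _target(q, func, args):
    # Child-process entry point: compute and ship the result back.
    q.put(func(*args))

def run_in_fresh_process(func, *args):
    """Execute func(*args) in a short-lived child process and return
    its result, mimicking maxtasksperchild=1 for a single call."""
    q = Queue()
    p = Process(target=_target, args=(q, func, args))
    p.start()
    result = q.get()
    p.join()
    return result
```

Inside a SCOOP-mapped function you would call `run_in_fresh_process(real_work, x)` instead of `real_work(x)`.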
Passing huge lists / iterables into map or map_as_completed will first
"register" them all for computation, and only once the input is exhausted
compute them in parallel.
Try running the following with python -m scoop example.py
and notice how nothing is printed for a long time:
#from scoop.futures import map as parallel_map
from scoop.futures import map_as_completed as parallel_map

def square(x):
    return x * x

if __name__ == '__main__':
    squares = parallel_map(square, range(1000000))
    for sq in squares:
        print(sq)
I think the reason is in https://github.com/soravux/scoop/blob/master/scoop/futures.py#L94 . In Python 3, map
returns an iterable. Even for Python 2 it would be nice if the internal function batched the input.
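Until that happens, a batching wrapper on the caller's side is a sketch of the same idea (the helper name is mine, not SCOOP's): submit the input in chunks, so results stream out before the whole iterable has been consumed. `map_func` would be scoop's map here; any map-like callable works.

```python
from itertools import islice

def batched_map(map_func, func, iterable, batch_size=1024):
    # Pull batch_size items at a time and only then hand them to the
    # parallel map, so results flow before the input is exhausted.
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        for result in map_func(func, batch):
            yield result
```

Smaller batches print sooner; larger batches amortize dispatch overhead better.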
I'm trying to use SCOOP on a cluster that uses SLURM. I'm trying to run the example you provide in the documentation (the helloworld example). I've run the example on the head node with a few CPUs and it works (so installation seems correct, up to some level at least), but when I run it through sbatch it returns the following error:
EXECUTE PYTHON .PY FILE
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/user/.local/lib/python2.7/site-packages/scoop/main.py", line 21, in
main()
File "/home/user/.local/lib/python2.7/site-packages/scoop/launcher.py", line 454, in main
args.external_hostname = [utils.externalHostname(hosts)]
File "/home/user/.local/lib/python2.7/site-packages/scoop/utils.py", line 101, in externalHostname
hostname = hosts[0][0]
IndexError: list index out of range
END OF JOBS
In the documentation I read scoop is compatible with slurm, is there a particular configuration step that is not documented (the SSH keys are already configured)?
Thanks,
The link for the example startup scripts still points to google code in http://scoop.readthedocs.io/en/0.7/usage.html#use-with-a-scheduler
Is there a way to distribute a Python function using SCOOP from within Python?
The current workflow is to call your python function using python -m scoop ... from the command line.
Can I distribute a particular function in my python script without modifying the command line interface?
I am trying to modify an existing script and I do not want to change the command line interface.
Thanks
In our research group several people share a cluster. As each of them uses different libraries, and even different versions of them, we each use Python virtualenvs to avoid conflicts.
I'm not entirely sure how to set SCOOP up so it first activates the virtualenv remotely; maybe someone can help. I guess it could be achieved with a special SSH authorized_keys file with a command argument, but that might be quite cumbersome to set up... is there some other option I'm missing?
If I put the following code in test.py:
import scoop

def boom():
    return 0 / 0

if __name__ == '__main__':
    boom()
and then run it with python -m scoop test.py
it gives me the following output:
$ python -m scoop test.py
[2015-05-08 19:14:31,741] launcher INFO SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan 7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-05-08 19:14:31,742] launcher INFO Deploying 8 worker(s) over 1 host(s).
[2015-05-08 19:14:31,742] launcher INFO Worker distribution:
[2015-05-08 19:14:31,742] launcher INFO 127.0.0.1: 7 + origin
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/_control.py", line 253, in runController
raise future.exceptionValue
ZeroDivisionError: integer division or modulo by zero
[2015-05-08 19:14:32,265] launcher (127.0.0.1:50283) INFO Root process is done.
[2015-05-08 19:14:32,266] launcher (127.0.0.1:50283) INFO Finished cleaning spawned subprocesses.
$
As you can see, the traceback information was swallowed, making it very hard to understand the error.
Currently I use the following as a workaround; maybe it would make sense to embed this in SCOOP?
import sys
import scoop
from functools import wraps

def exception_stack_catcher(func):
    @wraps(func)
    def exception_stack_wrapper(*args, **kwds):
        try:
            return func(*args, **kwds)
        except Exception as e:
            scoop.logger.exception(e)
            raise e, None, sys.exc_info()[2]
    return exception_stack_wrapper

@exception_stack_catcher
def boom():
    return 0 / 0

if __name__ == '__main__':
    boom()
Output:
$ python -m scoop test.py
[2015-05-08 20:14:06,534] launcher INFO SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan 7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-05-08 20:14:06,535] launcher INFO Deploying 8 worker(s) over 1 host(s).
[2015-05-08 20:14:06,535] launcher INFO Worker distribution:
[2015-05-08 20:14:06,535] launcher INFO 127.0.0.1: 7 + origin
[2015-05-08 20:14:06,753] test (127.0.0.1:58913) ERROR integer division or modulo by zero
Traceback (most recent call last):
File "test.py", line 9, in exception_stack_wrapper
return func(*args, **kwds)
File "test.py", line 17, in boom
return 0 / 0
ZeroDivisionError: integer division or modulo by zero
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/Users/joern/venv/lib/python2.7/site-packages/scoop/_control.py", line 253, in runController
raise future.exceptionValue
ZeroDivisionError: integer division or modulo by zero
[2015-05-08 20:14:07,070] launcher (127.0.0.1:51173) INFO Root process is done.
[2015-05-08 20:14:07,071] launcher (127.0.0.1:51173) INFO Finished cleaning spawned subprocesses.
I encounter an error while using scoop:
How do I solve it?
I am using:
Python 2.7.5
scoop (0.7.1.1)
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/_control.py", line 253, in runController
raise future.exceptionValue
ValueError: invalid literal for int() with base 10: '7.'
What steps will reproduce the problem?
1. dunno, I get it after several hours of usage
What is the expected output? What do you see instead?
nothing
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 298, in <module>
b.main()
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 285, in run
futures_startup()
File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 266, in futures_startup
run_name="__main__"
File "/home/ed/projects/equalog/scoop/scoop/futures.py", line 65, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/home/ed/projects/equalog/scoop/scoop/_control.py", line 259, in runController
future = execQueue.pop()
File "/home/ed/projects/equalog/scoop/scoop/_types.py", line 352, in pop
self.updateQueue()
File "/home/ed/projects/equalog/scoop/scoop/_types.py", line 375, in updateQueue
for future in self.socket.recvFuture():
File "/home/ed/projects/equalog/scoop/scoop/_comm/scoopzmq.py", line 353, in recvFuture
received = self._recv()
File "/home/ed/projects/equalog/scoop/scoop/_comm/scoopzmq.py", line 237, in _recv
thisFuture = pickle.loads(msg[1])
cPickle.UnpicklingError: pickle data was truncated
What version of the product are you using? On what operating system?
pip reports 0.7.2.dev, but I am using the latest from hg.
Original issue reported on code.google.com by [email protected]
on 16 Nov 2014 at 7:16
What steps will reproduce the problem?
1. Download 0.7.1 tarball, open it and cd into the directory
2. Try to install it with:
python setup.py install --prefix=$MY_INSTALL_DIR
It tells me to use the --zmq option.
3. Using the --zmq option says there is no such option:
python setup.py install --prefix=$MY_INSTALL_DIR
--zmq=/glade/apps/opt/zeromq/3.2.2/intel/12.1.5/
What is the expected output? What do you see instead?
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python
setup.py install --prefix=$MY_INSTALL_DIR
(....a bunch of irrelevant output....)
error: command 'gcc' failed with exit status 1
Failed with default libzmq, trying again with /usr/local
************************************************
Configure: Autodetecting ZMQ settings...
Custom ZMQ dir: /usr/local
Assembler messages:
Fatal error: can't create
build/temp.linux-x86_64-2.7/scratch/tmp/easy_install-L0czGy/pyzmq-14.3.1/temp/ti
mer_create9prql0.o: No such file or directory
build/temp.linux-x86_64-2.7/scratch/vers.c:4:17: fatal error: zmq.h: No such
file or directory
#include "zmq.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
************************************************
Warning: Failed to build or run libzmq detection test.
If you expected pyzmq to link against an installed libzmq, please check to make
sure:
* You have a C compiler installed
* A development version of Python is installed (including headers)
* A development version of ZMQ >= 2.1.4 is installed (including headers)
* If ZMQ is not in a default location, supply the argument --zmq=<path>
* If you did recently install ZMQ to a default location,
try rebuilding the ld cache with `sudo ldconfig`
or specify zmq's location with `--zmq=/usr/local`
You can skip all this detection/waiting nonsense if you know
you want pyzmq to bundle libzmq as an extension by passing:
`--zmq=bundled`
I will now try to build libzmq as a Python extension
unless you interrupt me (^C) in the next 10 seconds...
9...error: Setup script exited with interrupted
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python
setup.py install --prefix=$MY_INSTALL_DIR
--zmq=/glade/apps/opt/zeromq/3.2.2/intel/12.1.5/
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
error: option --zmq not recognized
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python
setup.py --zmq=/glade/apps/opt/zeromq/3.2.2/intel/12.1.5/ install
--prefix=$MY_INSTALL_DIR
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
error: option --zmq not recognized
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python -c
"import zmq; print 'It is there'"
It is there
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $
What version of the product are you using? On what operating system?
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python -V
Python 2.7.7
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ lsb_release
-a
LSB Version:
:base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd6
4:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.4 (Santiago)
Release: 6.4
Codename: Santiago
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 10 Jul 2014 at 9:55
There is an issue in utils.py when both SLURM and Python 3 are used. On line 208, subprocess.check_output
returns a bytestring, and on the next line that bytestring is split by a unicode string, resulting in an error. The solution is to add the following after line 208:
if sys.version_info.major > 2:
    hostsstr = hostsstr.decode()
scoopzmq (127.0.0.1:52993) DEBUG 127.0.0.1:52993: Could not send result directly to peer 127.0.0.1:57612, routing through broker.
SCOOP is a pretty cool lib for easy parallelization, which is why I use it in many of my projects.
However, there are many bugfixes in master that aren't in the latest release, 0.7.1.1, and I often find myself "rediscovering" them because 0.7.1.1 is what gets installed rather than master.
Would it be possible to have a new release soonish?
It's great that SCOOP tries to log and return errors to the main process, but it seems to assume that there's no unicode in the exception:
# coding: utf-8
from scoop.futures import map as parallel_map

def ex(i):
    raise Exception('foo')

if __name__ == '__main__':
    for i in range(5):
        try:
            print list(parallel_map(ex, range(100)))
        except:
            print 'error handling %d' % i
This works as expected:
$ python -m scoop scoop_exceptions.py
[2015-07-02 19:30:35,840] launcher INFO SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan 7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-07-02 19:30:35,840] launcher INFO Deploying 8 worker(s) over 1 host(s).
[2015-07-02 19:30:35,840] launcher INFO Worker distribution:
[2015-07-02 19:30:35,841] launcher INFO 127.0.0.1: 7 + origin
error handling 0
error handling 1
error handling 2
error handling 3
error handling 4
[2015-07-02 19:30:36,524] launcher (127.0.0.1:50126) INFO Root process is done.
[2015-07-02 19:30:36,524] launcher (127.0.0.1:50126) INFO Finished cleaning spawned subprocesses.
If however I change the 'foo' to u'jörn', my name once more breaks the world:
# coding: utf-8
from scoop.futures import map as parallel_map

def ex(i):
    raise Exception(u'jörn')

if __name__ == '__main__':
    for i in range(5):
        try:
            print list(parallel_map(ex, range(100)))
        except:
            print 'error handling %d' % i
output with 2 processes:
$ python -m scoop -n2 scoop_exceptions.py
[2015-07-02 19:33:45,435] launcher INFO SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan 7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-07-02 19:33:45,435] launcher INFO Deploying 2 worker(s) over 1 host(s).
[2015-07-02 19:33:45,435] launcher INFO Worker distribution:
[2015-07-02 19:33:45,435] launcher INFO 127.0.0.1: 1 + origin
[2015-07-02 19:33:45,558] scoopzmq (127.0.0.1:65251) ERROR A worker exited unexpectedly. Read the worker logs for more information. SCOOP pool will now shutdown.
error handling 0
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 210, in runController
future = future._switch(future)
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_types.py", line 124, in _switch
return self.greenlet.switch(future)
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 133, in runFuture
tb=traceback.format_exc(),
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
b.main()
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
self.run()
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
futures_startup()
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
run_name="__main__"
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
result = _controller.switch(rootFuture, *args, **kargs)
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 249, in runController
future = future._switch(future)
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_types.py", line 124, in _switch
return self.greenlet.switch(future)
File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 133, in runFuture
tb=traceback.format_exc(),
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
[2015-07-02 19:33:45,880] launcher (127.0.0.1:50200) INFO Root process is done.
[2015-07-02 19:33:45,880] launcher (127.0.0.1:50200) INFO Finished cleaning spawned subprocesses.
Being able to use locks in the SCOOP framework, analogous to multiprocessing locks, to synchronize workers would be nice.
This may especially be useful if the submitted jobs involve writing into the same file to prevent corruption of data I/O.
Do you think this feature could be implemented?