
scoop's Issues

scoop locks up if out of memory

I'm running a series of experiments with scoop on a slurm cluster.

Tonight some of my tasks seem to have run out of memory:

Traceback (most recent call last):
  File "/software/python/2.7.12/lib/python2.7/logging/__init__.py", line 872, in emit
Bad address (bundled/zeromq/src/tcp.cpp:244)
    stream.write(ufs % msg)
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/codecs.py", line 706, in write
    return self.writer.write(data)
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/codecs.py", line 370, in write
    self.stream.write(data)
IOError: [Errno 12] Cannot allocate memory
...
Traceback (most recent call last):
  File "/software/python/2.7.12/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/software/python/2.7.12/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_control.py", line 231, in runController
    future = execQueue.pop()
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_types.py", line 320, in pop
    self.updateQueue()
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_types.py", line 343, in updateQueue
    for future in self.socket.recvFuture():
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 279, in recvFuture
    received = self._recv()
  File "/home/hees/graph-pattern-learner/venv/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 188, in _recv
    thisFuture = pickle.loads(msg[1])
IndexError: list index out of range

The main issue here is that SCOOP did not terminate completely; it remained running in a locked-up state (0 load) for hours.

Delete shared constants

Hi, it would be nice if constants that were once set via shared.setConst(myconst=42)
could also be deleted, e.g. via a function shared.delConst('myconst') that propagates the deletion to all workers.

This would be useful in case of constants which are huge but which are no longer needed. Deleting them would free memory on all workers.
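
For illustration, the proposed usage might look like this (delConst is the suggested addition and does not exist in SCOOP; setConst/getConst do):

from scoop import futures, shared

def work(_):
    # Workers use the big constant while it is needed
    return len(shared.getConst('myconst'))

if __name__ == '__main__':
    shared.setConst(myconst=list(range(10 ** 6)))
    results = list(futures.map(work, range(4)))
    # Proposed API: free the constant on all workers once it is obsolete
    shared.delConst('myconst')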

Problem passing kwargs using futures.map

What steps will reproduce the problem?
1.  Call futures.map on a function that takes both iterated and keyword arguments
2.  Return a value from the function


What is the expected output? What do you see instead?

The program should run correctly and produce its output. Instead I get:

TypeError: submit() got an unexpected keyword argument 
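
For reference, a minimal sketch of the failing call (the function and keyword names here are made up):

from scoop import futures

def scale(x, factor=1):
    return x * factor

if __name__ == '__main__':
    # Forwarding a keyword argument through futures.map triggers the TypeError
    print(list(futures.map(scale, range(10), factor=2)))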

What version of the product are you using? On what operating system?
0.7.0 RC

Please provide any additional information below.
I've attached a futures.py that seems to have fixed the problem.

Original issue reported on code.google.com by [email protected] on 8 Oct 2013 at 2:35

Attachments:

Scoop not working on OS X 10.9 Python 2.7.5

python -m 'scoop' 

does not start scoop properly. I also get 'Be sure to start your program with the '-m scoop' parameter. You can find further information in the documentation.' when I actually try to run something using futures.map.

(meteng)megatron-5390:examples niko$ python -m 'scoop'
[2014-02-18 14:22:28,739] launcher  INFO    SCOOP 0.7.0 release on darwin using Python 2.7.5 (default, Aug 25 2013, 00:04:04) [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)], API: 1013
[2014-02-18 14:22:28,739] launcher  INFO    Deploying 4 worker(s) over 1 host(s).
[2014-02-18 14:22:28,740] launcher  INFO    Worker distribution: 
[2014-02-18 14:22:28,740] launcher  INFO       127.0.0.1:   3 + origin
[2014-02-18 14:22:29,019] __init__  INFO    Launching advertiser...
[2014-02-18 14:22:29,020] __init__  INFO    Advertiser launched.
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/Users/niko/.virtualenvs/meteng/lib/python2.7/site-packages/scoop/discovery/minusconf.py", line 279, in run
    self._init_advertiser()
  File "/Users/niko/.virtualenvs/meteng/lib/python2.7/site-packages/scoop/discovery/minusconf.py", line 252, in _init_advertiser
    super(ConcurrentAdvertiser, self)._init_advertiser()
  File "/Users/niko/.virtualenvs/meteng/lib/python2.7/site-packages/scoop/discovery/minusconf.py", line 185, in _init_advertiser
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, struct.pack('@I', 1))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 22] Invalid argument

and here is some pip freeze output in case it helps

Flask==0.10.1
Flask-Cache==0.12
Flask-Login==0.2.9
Flask-Migrate==1.2.0
-e [email protected]:biosustain/flask-presst.git@58505be0f7dd3b6efdae74eeea81e63e3106b18f#egg=Flask_Presst-master
Flask-Principal==0.4.0
Flask-RESTful==0.2.10
Flask-Redis==0.0.3
Flask-SQLAlchemy==1.0
Flask-Script==0.6.6
Flask-WTF==0.9.2
Jinja2==2.7.2
Mako==0.9.1
Markdown==2.3.1
MarkupSafe==0.18
Mosek==7.0.90
PdbSublimeTextSupport==0.2
PyDrive==1.0.0
PyYAML==3.10
Pygments==1.6
## !! Could not determine repository location
RESTfulCOBRA==0.1.0
SQLAlchemy==0.9.1
Sphinx==1.2.1
Unidecode==0.04.14
WTForms==1.0.5
Werkzeug==0.9.4
alembic==0.6.2
amqp==1.0.13
aniso8601==0.82
anyjson==0.3.3
argparse==1.2.1
astroid==1.0.1
beautifulsoup4==4.3.2
benchmark==0.1.5
billiard==2.7.3.34
biopython==1.62
blessings==1.5.1
blinker==1.3
bokeh==0.3
cameo==v0.0.0
celery==3.0.24
celery-with-redis==3.0
-e [email protected]:phantomas1234/cobrapy.git@37768297c38d99b32429dcff1bed9ebaa2182de5#egg=cobra-master
columnize==0.3.6
coverage==3.7
cplex==12.5.1.0
cvxopt==1.1.6
dataset==0.4.0
deap==1.0.0rc2
dill==0.2b1
distribute==0.7.3
docutils==0.11
flask-sse==0.1
framed==0.0.0
gdata==2.0.18
gevent==0.13.8
glpk==0.3
google-api-python-client==1.2
greenlet==0.4.1
gunicorn==18.0
gurobipy==5.5.0
honcho==0.5.0
httplib2==0.8
import-relative==0.2.3
inspyred==1.0
ipdb==0.8
ipdbplugin==1.4
ipython==2.0.0-dev
ipython-cluster-helper==0.2.10
iso8601==0.1.8
itsdangerous==0.23
kombu==2.5.16
logilab-common==0.60.1
matplotlib==1.3.1
networkx==1.8.1
nose==1.3.0
nose-progressive==1.5
numpy==1.7.1
numpydoc==0.4
-e [email protected]:biosustain/optlang.git@ace6f3ce05acbb52dafb3663c9c54756c50af413#egg=optlang-master
pandas==0.13.0
piprot==0.2.0
plotly==0.5.7
ply==3.4
progressbar==2.3
psycopg2==2.5.2
pyDOE==0.3
pydbgr==0.2.6
pyficache==0.2.3
pylint==1.1.0
pymongo==2.6.3
-e git+https://github.com/Midnighter/pyorganism.git@88a57182ed382bcd5a508a252d8ddc76678d90c1#egg=pyorganism-niko_branch
pyparsing==2.0.1
python-dateutil==2.2
python-ldap==2.4.13
python-memcached==1.53
python-slugify==0.0.7
python-termstyle==0.1.10
pytz==2013.9
pyzmq==13.1.0
radar==0.3
readline==6.2.4.1
redis==2.9.1
rednose==0.4.1
requests==2.2.1
scipy==0.13.0
scoop==0.7.0.release
six==1.5.2
smartypants==1.8.3
sphinx-bootstrap-theme==0.3.6
sphinx-rtd-theme==0.1.5
sse==1.2
sympy==0.7.3
tornado==3.1.1
tracer==0.3.2
-e [email protected]:phantomas1234/escher.git@00cbd54c142ffd7dd5e635f721223d94b18c7282#egg=visbio-master
wsgiref==0.1.2
yaposib==0.3.2


Original issue reported on code.google.com by [email protected] on 18 Feb 2014 at 1:27

Installation fails

Output of pip install scoop:

Collecting scoop
  Could not find a version that satisfies the requirement scoop (from versions: 0.7.0.release, 0.7.1.release)
  Some externally hosted files were ignored as access to them may be unreliable (use --allow-external scoop to allow).
  No matching distribution found for scoop

Python version is 2.7.9
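
For what it's worth, the pip output itself points at the flag for externally hosted releases; on pip versions of that era the following may work (this is only what the error message suggests, not a verified fix):

pip install --allow-external scoop scoop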

Will scoop create a lot of overhead in clusters

I want to run a simple script on a cluster whose network transmission rate is relatively low. The processes in my script, if created via the multiprocessing module, would have little communication with each other.

Will scoop need a lot of network I/O during execution, which would create a lot of overhead here?

--log option missing in the devel version

What steps will reproduce the problem?
run scoop as a module and --log log.txt

What is the expected output? What do you see instead?
Write all logging to a file. Instead:

python3 -m scoop -n 2 --log log.txt test.py
       [-h] [--hosts [Address [Address ...]] | --hostfile FileName]
       [--path PATH] [--nice NiceLevel] [--verbose] [--quiet]
       [-n NumberOfWorkers] [-b NumberOfBrokers] [--tunnel]
       [--external-hostname Address] [--python-interpreter Path]
       [--pythonpath PYTHONPATH] [--prolog PROLOG] [--profile]
       [--backend {ZMQ,TCP}]
       [executable] ...
python3 -m scoop: error: unrecognized arguments: --log

What version of the product are you using? On what operating system?
0.7.2 on Linux

Please provide any additional information below.
The devel version (0.7.2) seems to have lost the log file option on the command
line. This option is still mentioned in the documentation:
http://scoop.readthedocs.org/en/0.7/usage.html



Original issue reported on code.google.com by [email protected] on 25 Mar 2014 at 9:13

Unpickling error

What steps will reproduce the problem?
1. Unknown; it appears after several hours of usage

What is the expected output? What do you see instead?
nothing

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 298, in <module>
    b.main()
  File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 285, in run
    futures_startup()
  File "/home/ed/projects/equalog/scoop/scoop/bootstrap/__main__.py", line 266, in futures_startup
    run_name="__main__"
  File "/home/ed/projects/equalog/scoop/scoop/futures.py", line 65, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/home/ed/projects/equalog/scoop/scoop/_control.py", line 259, in runController
    future = execQueue.pop()
  File "/home/ed/projects/equalog/scoop/scoop/_types.py", line 352, in pop
    self.updateQueue()
  File "/home/ed/projects/equalog/scoop/scoop/_types.py", line 375, in updateQueue
    for future in self.socket.recvFuture():
  File "/home/ed/projects/equalog/scoop/scoop/_comm/scoopzmq.py", line 353, in recvFuture
    received = self._recv()
  File "/home/ed/projects/equalog/scoop/scoop/_comm/scoopzmq.py", line 237, in _recv
    thisFuture = pickle.loads(msg[1])
cPickle.UnpicklingError: pickle data was truncated


What version of the product are you using? On what operating system?
pip reports 0.7.2.dev, but I am using the latest from hg.

Original issue reported on code.google.com by [email protected] on 16 Nov 2014 at 7:16

Cannot import in Mac OS X 10.11.4

Hello,

I apologize if this is due to some misconfiguration on my part (most likely), but I can't seem to import scoop on Mac OS X. I get the following error:

(gen) nettrino$ python
Python 3.5.1 (default, Apr 18 2016, 11:46:32)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> import scoop
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/nettrino/projects/test/scoop.py", line 1, in <module>
    from scoop import futures
ImportError: cannot import name 'futures'

Here are my pip packages (I'm in a virtual environment with Python 3, but the same happens with Python 2):
greenlet (0.4.9)
numpy (1.11.0)
pip (8.1.1)
pyzmq (15.2.0)
scoop (0.7.1.1)
setuptools (20.9.0)
wheel (0.29.0)

nettrino$ python --version
Python 3.5.1

Any hints would be much appreciated

Allocation issue with SLURM

When $SLURM_JOB_NODELIST is e.g. "nodes[006,011]" I get the following error:

File "/python27/lib/python2.7/site-packages/scoop/utils.py", line 209, in parseSLURM
bmin,bmax = rng.split('-')
ValueError: need more than 1 value to unpack

If $SLURM_JOB_NODELIST is e.g. "nodes[021-022]" the workers are deployed over the two hosts.
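
For reference, a sketch of nodelist expansion that accepts both comma-separated items and ranges (a standalone illustration, independent of SCOOP's actual parseSLURM):

import re

def expand_slurm_nodelist(nodelist):
    # Expand e.g. 'nodes[006,011,021-023]' into individual hostnames
    match = re.match(r'(\w+)\[([^\]]+)\]', nodelist)
    if not match:
        return [nodelist]
    prefix, body = match.groups()
    hosts = []
    for rng in body.split(','):
        if '-' in rng:
            bmin, bmax = rng.split('-')
            width = len(bmin)  # keep the zero-padding of the node numbers
            hosts.extend('{0}{1:0{2}d}'.format(prefix, i, width)
                         for i in range(int(bmin), int(bmax) + 1))
        else:
            hosts.append(prefix + rng)
    return hosts

print(expand_slurm_nodelist('nodes[006,011]'))   # ['nodes006', 'nodes011']
print(expand_slurm_nodelist('nodes[021-022]'))   # ['nodes021', 'nodes022']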

How to call it from other projects

What I don't understand is how to use scoop in existing projects.

How can I push data and get data if I want to call it from an existing project?

It seems SCOOP cannot be used as an RPC mechanism; do I have to integrate it with some other RPC library?


Original issue reported on code.google.com by phyo.arkarlwin on 6 Mar 2013 at 3:15

scoop confused by bashrc echo

I have noted the following problem.

I have code in my .bashrc that echoes a status message. When ssh-ing to this machine while dispatching jobs, SCOOP appears to be confused by this message and unable to launch jobs on this host effectively (I get an error message specifically mentioning the bashrc status message; from what I could see, only one job per node could be launched in such a case). Is there any way to fix this without having to disable the status message? (The message is rather useful when sshing to the machine in other contexts.) Upon disabling the output, the error message disappears and the scheduling works fine.
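
A common shell-level workaround (not SCOOP-specific) is to guard the echo so it only runs for interactive shells, e.g. in .bashrc:

# Only print the status message for interactive shells;
# non-interactive ssh sessions (like SCOOP's) get clean output.
if [[ $- == *i* ]]; then
    echo "Status: $(hostname) ready"
fi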

Interpreting process_debug.py info

Hi,
I am trying to interpret the graphics produced by bench/process_debug.py after running my application with the --debug flag. It seems like a useful tool, but I want to interpret the results correctly.

I ran the app on a single machine with 4 CPUs and 4 workers.

The density_debug plot shows that worker 0 mostly has a density of '2' whereas the other workers seem to have a density of '1'.
[density_debug plot]

Does a density of 2 on a worker process mean that at a given time 2 tasks are competing for the same resources within a single process? Is it correct to deduce that in this case, worker 0 refers to the root worker that runs both root future as well as a future that consumes tasks?

Next question is about the timeline_debug plot:
[timeline_debug plot]

As this plot also shows a metric per worker process as a function of time, I was wondering what the differences are between this plot and the density_debug plot. From the visualization of the timeline_debug plot it seems that the workers are far less busy than the density_debug plot suggests. Or should I interpret the blue bars differently here?

Any hints on this?
Thanks!

Can't pickle decorated function

I am using decorated functions within the DEAP framework to limit the tree size and it seems SCOOP does not support pickling decorated functions.

@apply_decorator
def mutUniform(*args, **kwargs):
    return gp.mutUniform(*args, **kwargs)

toolbox.register("mutate", mutUniform, expr=toolbox.expr_mut, pset=pset)

output:

[2017-02-21 11:47:39,593] scoopzmq  (b'127.0.0.1:57909') WARNING Pickling Error: Can't pickle <function mutUniform at 0x7f8dc2e7a730>: it's not the same object as __main__.mutUniform
scoop._comm.scoopexceptions.ReferenceBroken: This element could not be pickled: FutureId(worker=b'127.0.0.1:57909', rank=1):partial(0,)=None.
[2017-02-21 11:47:42,399] scoopzmq  (b'127.0.0.1:57909') ERROR   A worker exited unexpectedly. Read the worker logs for more information. SCOOP pool will now shutdown.

My code works fine with the built-in Python 3 multiprocessing Pool.map function, by the way.
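
For context, pickle serializes plain functions by module-qualified name and checks that the name still resolves to the same object, which is exactly what the warning says; a minimal standalone demonstration:

import pickle

def f():
    return 42

pickle.dumps(f)  # fine: '__main__.f' resolves to this very object

original = f

def f():  # rebinding the name, as a decorator returning a new object does
    return 43

try:
    pickle.dumps(original)
except pickle.PicklingError as e:
    print(e)  # Can't pickle ...: it's not the same object as __main__.f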

SCOOP crashes depending on network connection.

I have been using SCOOP with DEAP for while now, and I really enjoy how simple it is. I have run into one annoying problem that I am hoping is a simple fix.

When I run a script that uses SCOOP at home (or on most other internet connections) it works fine. I ran into trouble when I tried to run the same scripts on my work's internet connection. When I switched to the public internet of the building next door, it began working again. I get the same SCOOP error on my work's internet as I do when I disable my internet connection altogether.

I believe I read somewhere that SCOOP uses networking to communicate between workers. I also know that SCOOP has the capability of pinging out to other machines on the local network to see if it is being called as part of a cluster. I assume something in this realm is the source of the issue.

I am hoping there is a simple way to tell SCOOP to only use the cores on the local machine, and not to attempt to reach out across the network, as some connections (or no connections) make that impossible.

If such a workaround does or does not exist, I think it would be nice for the error reported to the user to explain more about what has gone wrong, and perhaps point to specific conditions which SCOOP must be run under.

Or perhaps I am completely wrong about what is happening here. Any help would be much appreciated.

Below is the error that I get when I run a script that uses SCOOP (python -m scoop my_file.py) at work or with my internet connection disabled:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scoop/__main__.py", line 18, in <module>
    from .launcher import main
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scoop/launcher.py", line 31, in <module>
    from scoop import utils

kwargs are not accepted in submit call anymore

My program used to pass a kwargs argument to the function using futures.submit
when I was using version 0.6.2. When I upgraded to 0.7.2 this possibility was no
longer there. I checked the source code and noticed that most functions stopped
processing the kwargs argument.

Since my program relied heavily on this feature, I re-enabled kwargs support in
futures; see the attached patch.

Original issue reported on code.google.com by [email protected] on 4 Jun 2014 at 5:31

Attachments:

ImportError with workers on Python 3.6.0

I am trying to run with a single host and multiple workers on Python 3.6.0. After the program starts, the main worker keeps running and all the other workers show the error below:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 223, in prepare
    _fixup_main_from_name(data['init_main_from_name'])
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 249, in _fixup_main_from_name
    alter_sys=True)
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 201, in run_module
    mod_name, mod_spec, code = _get_module_details(mod_name)
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 136, in _get_module_details
    raise error("No module named %s" % mod_name)
ImportError: No module named SCOOP_WORKER

When the same code is run in Python 2.7, this ImportError does not appear and the code runs properly.

What is causing this issue to only occur in Python 3.6.0?

What is SCOOP_WORKER?

annoying bug in setup.py

When installing scoop with its setup.py, it requests argparse>=1.1 in install_requires. That downloads and installs argparse 1.2.1 even when using Python 2.7, which already ships its own argparse 1.1.

This is a serious problem, because argparse 1.2.1 is old and buggy, and once installed it is used instead of the system one, causing hard-to-track-down troubles such as the one described at http://stackoverflow.com/questions/29374044/

So, please change the requirements using a conditional append, e.g.

    if sys.version_info < something:
        install_requires.append('argparse>=1.1')

Moreover, you may want to investigate why that requirement pulls argparse 1.2.1 when on PyPI there is 1.2.2 and 1.3 available.

allow BASE_SSH to be set with ssh compatible wrapper

Dear Yannick,

Would it be possible to set the 'ssh' part of BASE_SSH (class Host, launch/workerLaunch.py) from an environment variable (e.g. SCOOP_SSH) when it is defined?
This would be helpful in HPC environments that require a wrapper around ssh (passing all arguments down to ssh).
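
A minimal sketch of the proposed behavior (SCOOP_SSH is the suggested variable; the exact BASE_SSH flags below are illustrative, not copied from workerLaunch.py):

import os

# Use a site-provided wrapper when SCOOP_SSH is set, plain ssh otherwise
ssh_executable = os.environ.get('SCOOP_SSH', 'ssh')
BASE_SSH = [ssh_executable, '-x', '-n', '-oStrictHostKeyChecking=no']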

Thank you!

TypeError: cannot serialize 'greenlet.greenlet' object

For some reason, I'm getting an error because python is trying to serialize greenlets:

Traceback (most recent call last):
  File "/home/apps/Logiciels/Python/python-3.5.1/lib/python3.5/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/apps/Logiciels/Python/python-3.5.1/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/_control.py", line 230, in runController
    execQueue.sendResult(future)
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/_types.py", line 384, in sendResult
    self.socket.sendResult(future)
  File "/home/lochar/python/lib/python3.5/site-packages/scoop/_comm/scoopzmq.py", line 319, in sendResult
    pickle.HIGHEST_PROTOCOL,
TypeError: cannot serialize 'greenlet.greenlet' object

I'm not using greenlets, so I'm not sure why I'm getting this. I'm just getting started with scoop; I've never used mapreduce nor written parallelized code before, so I might be making an obvious error. Here is my code:

import os
import re

from scoop import futures as fut

def get_sideeffect_sentences(f):
    sideeffect = re.compile(r"\bside\s+effects?\b")
    tagsplit =  re.compile(r"\s*<[hp]>\s*")
    for ln in open(f):
        for sent in tagsplit.split(ln):
            if sideeffect.search(sent.lower()):
                yield sent

def writeout(x, y):
    print(y, file=outf)

if __name__ == "__main__":
    scratch = os.getenv("SCRATCH")
    outf = open(scratch + "/now/sideeffect_sents.txt", "w")

    lsfiles = [ scratch + "/now/text/" + f for f in os.listdir(scratch + "/now/text")  ]
    fut.mapReduce(get_sideeffect_sentences, writeout, lsfiles)

    outf.close()

The command is "python -m scoop -n 8". I noticed there are Québécois-sounding last names in the docs, so perhaps it helps to say I've been trying this on Briarée (a Calcul Québec computer).

Traceback in SCOOP

What steps will reproduce the problem?
1. I have no idea; this is the 2nd time I've seen it, though, while running SCOOP for about a couple of days

What version of the product are you using? On what operating system?
pip reports 0.7.2.dev, but I am using the latest from hg.

Please provide any additional information below.

  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 298, in <module>
    b.main()
  File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 285, in run
    futures_startup()
  File "/home/ed/projects/equalog/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 266, in futures_startup
    run_name="__main__"
  File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/futures.py", line 65, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_control.py", line 259, in runController
    future = execQueue.pop()
  File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_types.py", line 352, in pop
    self.updateQueue()
  File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_types.py", line 375, in updateQueue
    for future in self.socket.recvFuture():
  File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 350, in recvFuture
    received = self._recv()
  File "/home/ed/projects/equalog/venv/local/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 251, in _recv
    if thisFuture.sendResultBack:
AttributeError: 'set' object has no attribute 'sendResultBack'


Original issue reported on code.google.com by [email protected] on 14 Nov 2014 at 9:20

socket.gaierror: [Errno -2] Name or service not known

I can't use an ssh host name.

I checked the parameters passed to the getaddrinfo method:

scoop.BROKER.externalHostname returned thx and scoop.BROKER.task_port returned a random int.

How can I use another computer through ssh with a ~/.ssh/config file? It looks unsupported, as far as I can tell from tracing the process.

Environment

  • Ubuntu 16.04 x64
  • Python 2.7.12
  • Scoop 0.7.2.0
  • ssh thx is properly working.
thx@thx-Prime:~/workspace/scoop$ python -m scoop --host thx -vv scoop_test.py
[2016-12-21 12:11:44,449] launcher  INFO    SCOOP 0.7 2.0 on linux2 using Python 2.7.12 (default, Oct 21 2016, 22:26:43) [GCC 5.4.0 20160609], API: 1013
[2016-12-21 12:11:44,449] launcher  INFO    Deploying 1 worker(s) over 1 host(s).
[2016-12-21 12:11:44,449] launcher  DEBUG   Using hostname/ip: "thx" as external broker reference.
[2016-12-21 12:11:44,449] launcher  DEBUG   The python executable to execute the program with is: /home/thx/.pyenv/versions/2.7.12/bin/python.
[2016-12-21 12:11:44,449] launcher  INFO    Worker distribution:
[2016-12-21 12:11:44,449] launcher  INFO       thx:     0 + origin
[2016-12-21 12:11:44,449] brokerLaunch DEBUG   Launching remote broker: ssh -tt -x -oStrictHostKeyChecking=no -oBatchMode=yes -oUserKnownHostsFile=/dev/null -oServerAliveInterval=300 thx /home/thx/.pyenv/versions/2.7.12/bin/python -m scoop.broker.__main__ --echoGroup --echoPorts --backend ZMQ
[2016-12-21 12:11:44,750] brokerLaunch DEBUG   Foreign broker launched on ports 46544, 45126 of host thx.
[2016-12-21 12:11:44,750] launcher  DEBUG   Initialising remote origin worker 1 [thx].
[2016-12-21 12:11:44,751] launcher  DEBUG   thx: Launching '/home/thx/.pyenv/versions/2.7.12/bin/python -m scoop.launch.__main__ 1 3 --size 1 --workingDirectory "/home/thx/workspace/scoop" --brokerHostname 127.0.0.1 --externalBrokerHostname thx --taskPort 46544 --metaPort 45126 --origin --backend=ZMQ -vvv scoop_test.py'
Warning: Permanently added '192.168.21.10' (ECDSA) to the list of known hosts.
Launching 1 worker(s) using /bin/bash.
Executing '['/home/thx/.pyenv/versions/2.7.12/bin/python', '-m', 'scoop.bootstrap.__main__', '--size', '1', '--workingDirectory', '/home/thx/workspace/scoop', '--brokerHostname', '127.0.0.1', '--externalBrokerHostname', 'thx', '--taskPort', '46544', '--metaPort', '45126', '--origin', '--backend=ZMQ', '-vvv', 'scoop_test.py']'...
Traceback (most recent call last):
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 298, in <module>
    b.main()
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 285, in run
    futures_startup()
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 266, in futures_startup
    run_name="__main__"
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/futures.py", line 65, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_control.py", line 199, in runController
    execQueue = FutureQueue()
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_types.py", line 264, in __init__
    self.socket = Communicator()
  File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 70, in __init__
    info = socket.getaddrinfo(scoop.BROKER.externalHostname, scoop.BROKER.task_port)[0]
socket.gaierror: [Errno -2] Name or service not known
Exception AttributeError: "'FutureQueue' object has no attribute 'socket'" in <bound method FutureQueue.__del__ of <scoop._types.FutureQueue object at 0x7f5abe703a90>> ignored
Connection to 192.168.21.10 closed.
[2016-12-21 12:11:45,344] launcher  INFO    Root process is done.
[2016-12-21 12:11:45,345] workerLaunch DEBUG   Closing workers on thx (1 workers).
[2016-12-21 12:11:45,345] brokerLaunch DEBUG   Closing broker on host thx.
Warning: Permanently added '192.168.21.10' (ECDSA) to the list of known hosts.

remote with different user/port

Hi,
is there a way to run with a different user and a different port?
When I try to give [email protected] -p yyyy, it ends with
'ssh: connect to host xx.xx.xx.xx -p yyyy port 22: Connection timed out\r\n'

I also see command lines like setenv PYTHONPATH /home/user1/root/lib/:
which will obviously not work if user1 != user2.
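
If SCOOP hands the host name straight to the ssh client, a host alias in ~/.ssh/config may at least cover the user/port part (values below are placeholders):

Host worker1
    HostName xx.xx.xx.xx
    User user2
    Port yyyy

and then pass worker1 as the host to scoop.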

Maybe there could be some local program (like in jug) that knows the local settings...
Thank you

scoop hanging on wait()

What version of the product are you using? On what operating system?
pip reports 0.7.2.dev, but I am using the latest from hg.

Please provide any additional information below.
Scoop appears to hang for a couple of seconds after doing a
futures.map_as_completed, on the call self.errors =
self.workers[0].subprocesses[0].wait() in launcher.py.

Is there any way to speed this up?


Original issue reported on code.google.com by [email protected] on 11 Nov 2014 at 9:47

Get list of hosts used for execution

I'm using scoop to distribute simulation jobs. For that, I need to distribute some files to the hosts being used. Is it possible to get a list of the hosts currently used in the execution? If I use each thread to rsync the files to the local memory, this may end in too many connections to the origin host.

Thanks ;)

how to run a task on each worker?

Is there an easy way to run a function on each worker?

I have a scenario where I'd like to trigger writing stats on program termination on each worker. Each of the workers has a local cache etc., and syncing the stats during computation would cause a lot of needless overhead.

Problem with long hostname resolution

What steps will reproduce the problem?

1. Run a multi-node scoop run using full domain names in the --hosts line, e.g.,
python -m scoop.__main__ --backend ZMQ -vv --hosts node1.default.domain node2.default.domain -n 32 scoopCode.py


What is the expected output?

I would expect this command
python -m scoop.__main__ --backend ZMQ -vv --hosts node1.default.domain node2.default.domain -n 32 scoopCode.py

to do the same thing as this command
python -m scoop.__main__ --backend ZMQ -vv --hosts node1 node2 -n 32 scoopCode.py

What do you see instead?

Using long host names, I get the following error:

ERROR:root:Error while launching SCOOP subprocesses:
ERROR:root:Traceback (most recent call last):
  File "/pkg/suse11/python/scoop/0.7.2/lib/python2.7/site-packages/scoop-0.7.2.dev-py2.7.egg/scoop/launcher.py", line 469, in main
    rootTaskExitCode = thisScoopApp.run()
  File "/pkg/suse11/python/scoop/0.7.2/lib/python2.7/site-packages/scoop-0.7.2.dev-py2.7.egg/scoop/launcher.py", line 258, in run
    backend=self.backend,
  File "/pkg/suse11/python/scoop/0.7.2/lib/python2.7/site-packages/scoop-0.7.2.dev-py2.7.egg/scoop/launch/brokerLaunch.py", line 148, in __init__
    "SSH process stderr:\n{stderr}".format(**locals()))
Exception: Could not successfully launch the remote broker.
Requested remote broker ports, received:

Port number decoding error:
need more than 1 value to unpack
SSH process stderr:
Connection to cl2n091.default.domain closed.

But it runs perfectly fine with only the short host names.


What version of the product are you using? 

Python 2.7.5
Scoop version 0.7.2


On what operating system?

SUSE Linux 11 


Please provide any additional information below.

I am actually trying to run this on our SGI cluster (SGI-customized SUSE 11); it
uses PBS Pro as the scheduler. If I submit a job without the hosts line, scoop
detects the hosts PBS has given the job correctly, but it provides the full
hostnames. If I submit a multi-node interactive job and manually provide the
short names it works fine, but this is really not ideal, as it should be able to
go through the batch system properly.

Original issue reported on code.google.com by [email protected] on 27 Aug 2014 at 10:22

gaierror: [Errno 8] nodename nor servname provided, or not known

I have a problem running scoop on OS X El Capitan.

Here is the self-explanatory output I get:

import sys
print sys.version

2.7.12 (default, Oct 14 2016, 15:23:34)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)]

from scoop import futures

gaierror                                  Traceback (most recent call last)
<ipython-input-1> in <module>()
----> 1 from scoop import futures

/Users/username/Homebrew/lib/python2.7/site-packages/scoop/futures.py in <module>()
     24
     25 import scoop
---> 26 from ._types import Future, CallbackType
     27 from . import _control as control
     28 from .fallbacks import (

/Users/username/Homebrew/lib/python2.7/site-packages/scoop/_types.py in <module>()
     21 import greenlet
     22 import scoop
---> 23 from scoop._comm import Communicator, Shutdown
     24
     25 # Backporting collection features

/Users/username/Homebrew/lib/python2.7/site-packages/scoop/_comm/__init__.py in <module>()
     20
     21 if scoop.CONFIGURATION.get('backend', 'ZMQ') == 'ZMQ':
---> 22 from .scoopzmq import ZMQCommunicator as Communicator
     23 else:
     24 from .scooptcp import TCPCommunicator as Communicator

/Users/username/Homebrew/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py in <module>()
     29
     30 import scoop
---> 31 from .. import shared, encapsulation, utils
     32 from ..shared import SharedElementEncapsulation
     33 from .scoopexceptions import Shutdown, ReferenceBroken

/Users/username/Homebrew/lib/python2.7/site-packages/scoop/shared.py in <module>()
     22 import time
     23
---> 24 from . import encapsulation, utils
     25 import scoop
     26 from .fallbacks import ensureScoopStartedProperly, NotStartedProperly

/Users/username/Homebrew/lib/python2.7/site-packages/scoop/utils.py in <module>()
     41
     42 localHostnames.extend([
---> 43 ip for ip in socket.gethostbyname_ex(socket.gethostname())[2]
     44 if not ip.startswith("127.")][:1]
     45 )

gaierror: [Errno 8] nodename nor servname provided, or not known

Any clue?

Using scoop with SLURM

Is there any documentation for how to use scoop with SLURM?

One of the main things I'm wondering about is whether to provide a hosts file to scoop when running it from SLURM. Does it automatically figure out the hosts and run simulations on them otherwise?

#!/bin/bash
#SBATCH [email protected]
#SBATCH --mail-type=ALL
#SBATCH --nodes=7
#SBATCH --ntasks=72
#SBATCH --time=99:00:00
#SBATCH --mem=10G
#SBATCH --output=python_job_slurm.out

# Which one is correct?
python -m scoop --hostfile hosts.txt my-script.py
python -m scoop my-script.py

and I run it with sbatch python.slurm

Scoop does not find zmq

What version of the product are you using? On what operating system?

$ python -V
Python 2.7.7

$ lsb_release -a
LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 6.4 (Santiago)
Release:        6.4
Codename:       Santiago

What steps will reproduce the problem?
1. Download 0.7.1 tarball, open it and cd into the directory 

2. Verify that pyzmq is available:

python -c "import zmq; print 'It is there'"

3. Try to install scoop with:

python setup.py install --prefix=$MY_INSTALL_DIR

What is the expected output? What do you see instead?

(....a bunch of irrelevant output....)

error: command 'gcc' failed with exit status 1

Failed with default libzmq, trying again with /usr/local
************************************************
Configure: Autodetecting ZMQ settings...
    Custom ZMQ dir:       /usr/local
Assembler messages:
Fatal error: can't create build/temp.linux-x86_64-2.7/scratch/tmp/easy_install-L0czGy/pyzmq-14.3.1/temp/timer_create9prql0.o: No such file or directory
build/temp.linux-x86_64-2.7/scratch/vers.c:4:17: fatal error: zmq.h: No such file or directory
 #include "zmq.h"
                 ^
compilation terminated.

The problem is that zmq.h is in a non-standard location, available to python but not to scoop.

What is the recommended way of telling scoop where zmq is located?

PS: this is related to issue 8, but that one was closed without anybody being assigned to it, dismissed as a pyzmq installation problem (which it is not: both pyzmq and zmq are correctly installed and used by other software on this system).
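
One approach that generally works in this situation is to build pyzmq against the custom libzmq first, so scoop's setup finds the requirement already satisfied (--zmq is pyzmq's own build option, not a SCOOP feature; the prefix path is a placeholder):

pip install pyzmq --install-option="--zmq=/path/to/zeromq/prefix"
python setup.py install --prefix=$MY_INSTALL_DIR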

Original issue reported on code.google.com by [email protected] on 15 Jul 2014 at 5:04

Run stalls if one host goes down

While the simulation is running over multiple hosts, if one host goes down, the entire simulation seems to stall. Ideally, all the other simulations should finish running. Only the simulations that were running on the node that went down should be lost.

Can not use scoop with nosetests

What steps will reproduce the problem?
1. Install nosetests (1.1.2) and scoop (0.5.3)
2. Just run 'python -m scoop /usr/bin/nosetests' in any directory. (This does not have to be a project directory.)
3. This is the error message I get:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/toon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 86, in <module>
    user_module = __import__(os.path.basename(executable)[:-3])
ImportError: No module named nosete


What is the expected output? What do you see instead?

nosetests should end with an error.

What version of the product are you using? On what operating system?

scoop: 0.5.3
nosetests: 1.1.2
python: 2.7.3
OS: linux (3.2.0-32-generic) ubuntu 12.04

Original issue reported on code.google.com by [email protected] on 15 Nov 2012 at 11:19

SCOOP hangs on attempting to read remote stdout (on stable version - 0.7, r1.1)

Using Python 3.5.1 and the latest stable version of SCOOP (i.e. the one available through pip: version 0.7, revision 1.1), any time I run any of the example files, e.g. python -m scoop -vv --hostfile hostfile rssDoc.py, it hangs at the end. I see a message such as "Closing workers on solomon (8 workers)" and nothing more. Debugging through pdb showed me that it was blocking as it tried to read process.stdout in the close function of workerLaunch.py.

I solved this by making a function to make the stream non-blocking before reading it, and used that on both process.stdout and process.stderr:

import fcntl
import os

def _makeStreamNonBlocking(self, stream):
    flags = fcntl.fcntl(stream.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(stream.fileno(), fcntl.F_SETFL, flags | os.O_NDELAY)

def close(self):
    """Connection(s) cleanup."""
    # Ensure everything is cleaned up on exit
    scoop.logger.debug('Closing workers on {0}.'.format(self))

    # Output child processes stdout and stderr to console
    for process in self.subprocesses:
        if process.stdout is not None:
            self._makeStreamNonBlocking(process.stdout)
            sys.stdout.write(process.stdout.read().decode("utf-8"))
            sys.stdout.flush()

        if process.stderr is not None:
            self._makeStreamNonBlocking(process.stderr)
            sys.stderr.write(process.stderr.read().decode("utf-8"))
            sys.stderr.flush()


I see the code I modified doesn't even exist in the current version on GitHub. Should I simply favor the GitHub version over the "currently stable" one?

max tasks per child

First off, if this isn't an appropriate place to ask a question, just let me know and feel free to delete this issue. I couldn't find a discussion forum anywhere.

Using Python's multiprocessing Pool class, you can specify maxtasksperchild to create a clean environment for each run of the function. Is there a way to do this in scoop using futures.map?

Thanks!

map should iterate

Passing huge lists / iterables into map or map_as_completed will first "register" them all for computation, and only compute them in parallel after the whole input has been exhausted.

Try running the following with python -m scoop example.py and notice how nothing is printed for a long time:

#from scoop.futures import map as parallel_map
from scoop.futures import map_as_completed as parallel_map


def square(x):
    return x * x

if __name__ == '__main__':
    squares = parallel_map(square, range(1000000))
    for sq in squares:
        print(sq)

I think the reason is in https://github.com/soravux/scoop/blob/master/scoop/futures.py#L94 . In Python 3, map returns an iterable. Even for Python 2 it would be cool if the internal function batched the input.
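
Until map iterates lazily, a workaround is to feed SCOOP the input in chunks; a sketch (chunked_parallel_map is a hypothetical helper, not part of SCOOP):

from itertools import islice

from scoop.futures import map_as_completed

def chunked_parallel_map(func, iterable, chunk_size=1024):
    # Register at most chunk_size futures at a time so results start
    # flowing before the whole input is consumed
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        for result in map_as_completed(func, chunk):
            yield result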

Error with SLURM

I'm trying to use scoop on a cluster that uses SLURM, running the example you provide in the documentation (the helloworld example). I've run the example on the head node with a few CPUs and it works (so the installation seems correct, up to some level at least), but when I run it through sbatch it returns the following error:

EXECUTE PYTHON .PY FILE
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/user/.local/lib/python2.7/site-packages/scoop/__main__.py", line 21, in <module>
    main()
  File "/home/user/.local/lib/python2.7/site-packages/scoop/launcher.py", line 454, in main
    args.external_hostname = [utils.externalHostname(hosts)]
  File "/home/user/.local/lib/python2.7/site-packages/scoop/utils.py", line 101, in externalHostname
    hostname = hosts[0][0]
IndexError: list index out of range
END OF JOBS

In the documentation I read that scoop is compatible with SLURM; is there a particular configuration step that is not documented (the SSH keys are already configured)?

Thanks,

Call SCOOP from within Python

Is there a way to distribute a Python function using SCOOP from within Python?

The current workflow is to call your python function using python -m scoop ... from the command line.

Can I distribute a particular function in my python script without modifying the command line interface?

I am trying to modify an existing script and I do not want to change the command line interface.

Thanks

doc is missing how to remotely run scoop in a virtualenvironment

In our research group several people share a cluster. As each of them uses different libraries, and even different versions of them, we each use Python virtual environments to avoid conflicts.

I'm not entirely sure how to set scoop up so it first activates the virtual environment remotely; maybe someone can help. I guess it could be achieved with a special ssh authorized_keys file with a command arg, but it might be quite cumbersome to set up... is there some other option I'm missing?
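
One avenue worth trying, based on the launcher options visible in the usage text quoted in the --log issue above (--python-interpreter, --prolog), is to point SCOOP directly at the virtualenv's interpreter instead of activating the environment (paths are placeholders):

python -m scoop --python-interpreter /path/to/venv/bin/python your_script.py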

scoop swallows original exception's tracebacks

If I put the following code in test.py:

import scoop

def boom():
    return 0 / 0

if __name__ == '__main__':
    boom()

and then run it with python -m scoop test.py, it gives me the following output:

$ python -m scoop test.py
[2015-05-08 19:14:31,741] launcher  INFO    SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan  7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-05-08 19:14:31,742] launcher  INFO    Deploying 8 worker(s) over 1 host(s).
[2015-05-08 19:14:31,742] launcher  INFO    Worker distribution:
[2015-05-08 19:14:31,742] launcher  INFO       127.0.0.1:   7 + origin
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/_control.py", line 253, in runController
    raise future.exceptionValue
ZeroDivisionError: integer division or modulo by zero
[2015-05-08 19:14:32,265] launcher  (127.0.0.1:50283) INFO    Root process is done.
[2015-05-08 19:14:32,266] launcher  (127.0.0.1:50283) INFO    Finished cleaning spawned subprocesses.
$

As you can see, the traceback information was swallowed, making it very hard to understand the error.

Currently I use the following as a workaround; maybe it would make sense to embed this in scoop?

import sys
import scoop
from functools import wraps

def exception_stack_catcher(func):
    @wraps(func)
    def exception_stack_wrapper(*args, **kwds):
        try:
            return func(*args, **kwds)
        except Exception as e:
            scoop.logger.exception(e)
            raise e, None, sys.exc_info()[2]
    return exception_stack_wrapper

@exception_stack_catcher
def boom():
    return 0 / 0

if __name__ == '__main__':
    boom()

Output:

$ python -m scoop test.py
[2015-05-08 20:14:06,534] launcher  INFO    SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan  7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-05-08 20:14:06,535] launcher  INFO    Deploying 8 worker(s) over 1 host(s).
[2015-05-08 20:14:06,535] launcher  INFO    Worker distribution:
[2015-05-08 20:14:06,535] launcher  INFO       127.0.0.1:   7 + origin
[2015-05-08 20:14:06,753] test (127.0.0.1:58913) ERROR   integer division or modulo by zero
Traceback (most recent call last):
  File "test.py", line 9, in exception_stack_wrapper
    return func(*args, **kwds)
  File "test.py", line 17, in boom
    return 0 / 0
ZeroDivisionError: integer division or modulo by zero
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/Users/joern/venv/lib/python2.7/site-packages/scoop/_control.py", line 253, in runController
    raise future.exceptionValue
ZeroDivisionError: integer division or modulo by zero
[2015-05-08 20:14:07,070] launcher  (127.0.0.1:51173) INFO    Root process is done.
[2015-05-08 20:14:07,071] launcher  (127.0.0.1:51173) INFO    Finished cleaning spawned subprocesses.

ValueError: invalid literal for int() with base 10: '7.'

I encountered an error while using scoop. How do I solve it?

I am using:
Python 2.7.5
scoop (0.7.1.1)

Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/home/s/shixudon/.local/lib/python2.7/site-packages/scoop/_control.py", line 253, in runController
    raise future.exceptionValue
ValueError: invalid literal for int() with base 10: '7.'

Unpickling error

  • Merged into: #14

Specify zmq location on the command line

What steps will reproduce the problem?
1. Download 0.7.1 tarball, open it and cd into the directory 

2. Try to install it with:

python setup.py install --prefix=$MY_INSTALL_DIR

It tells me to use the --zmq option.

3. Using the --zmq option says there is no such option:

python setup.py install --prefix=$MY_INSTALL_DIR --zmq=/glade/apps/opt/zeromq/3.2.2/intel/12.1.5/


What is the expected output? What do you see instead?

ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python setup.py install --prefix=$MY_INSTALL_DIR

(....a bunch of irrelevant output....)

error: command 'gcc' failed with exit status 1

Failed with default libzmq, trying again with /usr/local
************************************************
Configure: Autodetecting ZMQ settings...
    Custom ZMQ dir:       /usr/local
Assembler messages:
Fatal error: can't create build/temp.linux-x86_64-2.7/scratch/tmp/easy_install-L0czGy/pyzmq-14.3.1/temp/timer_create9prql0.o: No such file or directory
build/temp.linux-x86_64-2.7/scratch/vers.c:4:17: fatal error: zmq.h: No such file or directory
 #include "zmq.h"
                 ^
compilation terminated.

error: command 'gcc' failed with exit status 1

************************************************
Warning: Failed to build or run libzmq detection test.

If you expected pyzmq to link against an installed libzmq, please check to make 
sure:

    * You have a C compiler installed
    * A development version of Python is installed (including headers)
    * A development version of ZMQ >= 2.1.4 is installed (including headers)
    * If ZMQ is not in a default location, supply the argument --zmq=<path>
    * If you did recently install ZMQ to a default location,
      try rebuilding the ld cache with `sudo ldconfig`
      or specify zmq's location with `--zmq=/usr/local`

You can skip all this detection/waiting nonsense if you know
you want pyzmq to bundle libzmq as an extension by passing:

    `--zmq=bundled`

I will now try to build libzmq as a Python extension
unless you interrupt me (^C) in the next 10 seconds...

 9...error: Setup script exited with interrupted


ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python setup.py install --prefix=$MY_INSTALL_DIR --zmq=/glade/apps/opt/zeromq/3.2.2/intel/12.1.5/


usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

error: option --zmq not recognized

ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python setup.py --zmq=/glade/apps/opt/zeromq/3.2.2/intel/12.1.5/ install --prefix=$MY_INSTALL_DIR

usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

error: option --zmq not recognized


ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python -c "import zmq; print 'It is there'"
It is there
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ 

What version of the product are you using? On what operating system?
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ python -V
Python 2.7.7
ddvento@geyser07 /glade/scratch/ddvento/build/scoop-0.7.1.release $ lsb_release -a
LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 6.4 (Santiago)
Release:        6.4
Codename:       Santiago


Please provide any additional information below.
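
A possible workaround until setup.py forwards the flag (a sketch, under the assumption that pyzmq's own setup.py accepts a configure --zmq step, as its detection output above and its documentation suggest): build pyzmq against the custom ZMQ first, then install SCOOP, which should pick up the already-installed pyzmq.

# Hypothetical workaround, reusing the paths from this report:
tar xzf pyzmq-14.3.1.tar.gz && cd pyzmq-14.3.1
python setup.py configure --zmq=/glade/apps/opt/zeromq/3.2.2/intel/12.1.5
python setup.py install --prefix=$MY_INSTALL_DIR
cd ../scoop-0.7.1.release
python setup.py install --prefix=$MY_INSTALL_DIR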


Original issue reported on code.google.com by [email protected] on 10 Jul 2014 at 9:55

Python 3 unicode TypeError in utils.py

There is an issue in utils.py when both SLURM and Python 3 are used. On line 208, subprocess.check_output returns a bytestring, and on the next line that bytestring is split by a unicode string, resulting in a TypeError. The solution is to add the following after line 208:

if sys.version_info.major > 2:
    hostsstr = hostsstr.decode()
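
For context, a self-contained sketch of the pattern (the exact SLURM command and variable names in utils.py may differ; scontrol show hostnames is an assumption here):

import subprocess
import sys

# Expand the SLURM node list into one hostname per line.
hostsstr = subprocess.check_output(["scontrol", "show", "hostnames"])
if sys.version_info.major > 2:
    # On Python 3, check_output returns bytes; decode before splitting on a str.
    hostsstr = hostsstr.decode()
hosts = hostsstr.split("\n")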

next release?

SCOOP is a pretty cool lib for easy parallelization, which is why I use it in many of my projects...

However, there are many bugfixes in master that aren't in the latest release, 0.7.1.1, and I often find myself "rediscovering" them because 0.7.1.1 is what gets installed rather than master...

Would it be possible to have a next release soonish?

unicode exception crashes workers

It's great that SCOOP tries to log and return errors to the main process, but it seems to assume that no unicode is returned in the exception:

# coding: utf-8
from scoop.futures import map as parallel_map

def ex(i):
    # Every task raises; SCOOP should propagate the exception to the caller.
    raise Exception('foo')

if __name__ == '__main__':
    for i in range(5):
        try:
            print list(parallel_map(ex, range(100)))
        except:
            print 'error handling %d' % i

This works as expected:

$ python -m scoop scoop_exceptions.py
[2015-07-02 19:30:35,840] launcher  INFO    SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan  7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-07-02 19:30:35,840] launcher  INFO    Deploying 8 worker(s) over 1 host(s).
[2015-07-02 19:30:35,840] launcher  INFO    Worker distribution:
[2015-07-02 19:30:35,841] launcher  INFO       127.0.0.1:   7 + origin
error handling 0
error handling 1
error handling 2
error handling 3
error handling 4
[2015-07-02 19:30:36,524] launcher  (127.0.0.1:50126) INFO    Root process is done.
[2015-07-02 19:30:36,524] launcher  (127.0.0.1:50126) INFO    Finished cleaning spawned subprocesses.

If, however, I change the 'foo' to u'jörn', my name once more breaks the world:

# coding: utf-8
from scoop.futures import map as parallel_map

def ex(i):
    # Same as before, except the exception message is now non-ASCII unicode.
    raise Exception(u'jörn')


if __name__ == '__main__':
    for i in range(5):
        try:
            print list(parallel_map(ex, range(100)))
        except:
            print 'error handling %d' % i

Output with 2 processes:

$ python -m scoop -n2 scoop_exceptions.py
[2015-07-02 19:33:45,435] launcher  INFO    SCOOP 0.7.1 release on darwin using Python 2.7.9 (default, Jan  7 2015, 11:50:42) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)], API: 1013
[2015-07-02 19:33:45,435] launcher  INFO    Deploying 2 worker(s) over 1 host(s).
[2015-07-02 19:33:45,435] launcher  INFO    Worker distribution:
[2015-07-02 19:33:45,435] launcher  INFO       127.0.0.1:   1 + origin
[2015-07-02 19:33:45,558] scoopzmq  (127.0.0.1:65251) ERROR   A worker exited unexpectedly. Read the worker logs for more information. SCOOP pool will now shutdown.
error handling 0
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 210, in runController
    future = future._switch(future)
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_types.py", line 124, in _switch
    return self.greenlet.switch(future)
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 133, in runFuture
    tb=traceback.format_exc(),
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 249, in runController
    future = future._switch(future)
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_types.py", line 124, in _switch
    return self.greenlet.switch(future)
  File "/Users/joern/Desktop/test/venv/lib/python2.7/site-packages/scoop/_control.py", line 133, in runFuture
    tb=traceback.format_exc(),
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
[2015-07-02 19:33:45,880] launcher  (127.0.0.1:50200) INFO    Root process is done.
[2015-07-02 19:33:45,880] launcher  (127.0.0.1:50200) INFO    Finished cleaning spawned subprocesses.
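
The crash comes from the tb=traceback.format_exc() call shown above: in Python 2, format_exc() ends up calling str() on the exception value, which raises UnicodeEncodeError for non-ASCII unicode messages. A possible defensive fix (a minimal sketch, not SCOOP's actual code) is to fall back to repr() when that happens:

import sys
import traceback

def safe_format_exc():
    # Format the current exception without letting unicode messages crash us.
    exc_type, exc_value, exc_tb = sys.exc_info()
    try:
        return "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    except UnicodeEncodeError:
        # str(exc_value) failed on non-ASCII unicode; repr() is ASCII-safe in
        # Python 2, so build the last line from it instead.
        frames = "".join(traceback.format_tb(exc_tb))
        return frames + "%s: %r\n" % (exc_type.__name__, exc_value)

Calling safe_format_exc() where runFuture currently calls traceback.format_exc() would let the worker report the error instead of dying.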

Locks for Synchronization

Being able to use locks in the SCOOP framework, like multiprocessing's locks, to synchronize workers would be nice.

This would be especially useful when submitted jobs write to the same file, to prevent data corruption.

Do you think this feature could be implemented?
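
In the meantime, a minimal sketch of a workaround for the shared-file case, assuming all workers run on Unix hosts that see the same filesystem (note that flock over NFS can be unreliable):

import fcntl

def append_line(path, line):
    # Serialize writes from concurrent SCOOP workers with a POSIX advisory lock.
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the exclusive lock is held
        try:
            f.write(line + "\n")
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

Each worker can call append_line() on the shared file; the kernel serializes the writers, so no SCOOP-level lock is needed for this particular case.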
