Comments (11)
Thanks for looking into this! Clearly a bug
I merged #55 for now as that one is easier to reason about, but hope to merge #56 as well
from mode.
Also broken on: 3.8.0, using just the builtin examples, on master:
jbooth@m /home/jbooth/oss/mode (master) $ python examples/tutorial.py
[2020-01-02 10:54:16,276] [16560] [INFO] [^Worker]: Starting...
[2020-01-02 10:54:16,276] [16560] [INFO] [^-App]: Starting...
[2020-01-02 10:54:16,333] [16560] [WARNING] WRITING GRAPH TO image.png
[2020-01-02 10:54:16,354] [16560] [INFO] [^--Websockets]: Starting...
[2020-01-02 10:54:16,354] [16560] [WARNING] STARTING WEBSOCKET SERVER
[2020-01-02 10:54:16,354] [16560] [INFO] [^--Webserver]: Starting...
[2020-01-02 10:54:16,355] [16560] [WARNING] examples/tutorial.py:53: DeprecationWarning: Application.make_handler(...) is deprecated, use AppRunner API instead
handler = self._handler = self._app.make_handler()
[2020-01-02 10:54:16,356] [16560] [INFO] [^--Webserver]: Serving on port 8000
[2020-01-02 10:54:17,357] [16560] [INFO] [^--UserCache]: Starting...
^C-INT- -INT- -INT- -INT- -INT- -INT-
[2020-01-02 10:54:25,363] [16560] [INFO] [^Worker]: Signal received: Signals.SIGINT (2)
[2020-01-02 10:54:25,364] [16560] [INFO] [^Worker]: Stopping...
[2020-01-02 10:54:25,364] [16560] [INFO] [^-App]: Stopping...
[2020-01-02 10:54:25,365] [16560] [INFO] [^--UserCache]: Stopping...
[2020-01-02 10:54:25,366] [16560] [INFO] [^--Webserver]: Stopping...
[2020-01-02 10:54:25,372] [16560] [INFO] [^--Webserver]: Closing server
[2020-01-02 10:54:25,372] [16560] [INFO] [^--Webserver]: Waiting for server to close handle
[2020-01-02 10:54:25,373] [16560] [INFO] [^--Webserver]: Shutting down web application
[2020-01-02 10:54:25,373] [16560] [INFO] [^--Webserver]: Waiting for handler to shut down
[2020-01-02 10:54:25,374] [16560] [INFO] [^--Webserver]: Cleanup
^C-INT- -INT- -INT- -INT- -INT- -INT-
[2020-01-02 10:55:17,589] [16560] [INFO] [^--Websockets]: Stopping...
[2020-01-02 10:55:17,589] [16560] [INFO] [^Worker]: Gathering service tasks...
[2020-01-02 10:55:17,589] [16560] [INFO] [^Worker]: Gathering all futures...
[2020-01-02 10:55:18,591] [16560] [INFO] [^Worker]: Closing event loop
^CException ignored in: <module 'threading' from '/home/jbooth/.asdf/installs/python/3.8.0/lib/python3.8/threading.py'>
Traceback (most recent call last):
File "/home/jbooth/.asdf/installs/python/3.8.0/lib/python3.8/threading.py", line 1388, in _shutdown
lock.acquire()
KeyboardInterrupt:
jbooth@m /home/jbooth/oss/mode (master) $
from mode.
Also broken in python 3.6.9, but it exits fine in python 3.6.3. Aha! Going to trace that down to a single python version.
from mode.
3.6.6 passes. 3.6.7 fails, requiring two ctrl-C's. 3.6.7 passes if I let it sit and idle long enough before the ctrl-C?
3.6.7 is heisenbug. Sometimes ctrl-C exits cleanly. Sometimes not. This is true back to at least 3.6.3.
from mode.
Okay, having walked away in the middle of a test, here's what I've learned: there's two branch points for what looks like failure.
An early ctrl-C will result in a hanging process, for exactly one minute. In general, in the example, if you let it run until it prints the REMOVING EXPIRED USERS
line and quickly ctrl-C you can avoid the first hang. If you hit the first hang it is between these two:
jbooth@m /home/jbooth/oss/mode (master) $ python --version;python examples/tutorial.py
Python 3.7.4
[2020-01-02 12:42:47,116] [10916] [INFO] [^Worker]: Starting...
[2020-01-02 12:42:47,116] [10916] [INFO] [^-App]: Starting...
[2020-01-02 12:42:47,178] [10916] [WARNING] WRITING GRAPH TO image.png
[2020-01-02 12:42:47,198] [10916] [INFO] [^--Websockets]: Starting...
[2020-01-02 12:42:47,198] [10916] [WARNING] STARTING WEBSOCKET SERVER
[2020-01-02 12:42:47,199] [10916] [INFO] [^--Webserver]: Starting...
[2020-01-02 12:42:47,199] [10916] [WARNING] examples/tutorial.py:53: DeprecationWarning: Application.make_handler(...) is deprecated, use AppRunner API instead
handler = self._handler = self._app.make_handler()
[2020-01-02 12:42:47,201] [10916] [INFO] [^--Webserver]: Serving on port 8000
^C[2020-01-02 12:42:47,800] [10916] [INFO] [^--UserCache]: Starting...
-INT- -INT- -INT- -INT- -INT- -INT-
[2020-01-02 12:42:47,801] [10916] [INFO] [^Worker]: Signal received: Signals.SIGINT (2)
[2020-01-02 12:42:47,802] [10916] [INFO] [^Worker]: Stopping...
[2020-01-02 12:42:47,802] [10916] [INFO] [^-App]: Stopping...
[2020-01-02 12:42:47,803] [10916] [INFO] [^--UserCache]: Stopping...
[2020-01-02 12:42:47,804] [10916] [INFO] [^--Webserver]: Stopping...
[2020-01-02 12:42:48,203] [10916] [INFO] [^--Webserver]: Closing server
[2020-01-02 12:42:48,203] [10916] [INFO] [^--Webserver]: Waiting for server to close handle
[2020-01-02 12:42:48,204] [10916] [INFO] [^--Webserver]: Shutting down web application
[2020-01-02 12:42:48,204] [10916] [INFO] [^--Webserver]: Waiting for handler to shut down
[2020-01-02 12:42:48,205] [10916] [INFO] [^--Webserver]: Cleanup
... a minute passes ...
[2020-01-02 12:43:47,865] [10916] [INFO] [^--Websockets]: Stopping...
After that timeout, there's a second failure point, based around the threading changes in 3.7.4. In 3.7.4 and later with those changes, we hang as the interpreter is trying to exit:
[2020-01-02 12:43:47,865] [10916] [INFO] [^--Websockets]: Stopping...
[2020-01-02 12:43:47,866] [10916] [INFO] [^Worker]: Gathering service tasks...
[2020-01-02 12:43:47,866] [10916] [INFO] [^Worker]: Gathering all futures...
[2020-01-02 12:43:48,869] [10916] [INFO] [^Worker]: Closing event loop
... the interpreter hangs (hard) here ...
and then hang what looks like indefinitely (multi-minute) until a second ctrl-C is thrown, at which point the exception in the threading module happens:
[2020-01-02 12:43:47,865] [10916] [INFO] [^--Websockets]: Stopping...
[2020-01-02 12:43:47,866] [10916] [INFO] [^Worker]: Gathering service tasks...
[2020-01-02 12:43:47,866] [10916] [INFO] [^Worker]: Gathering all futures...
[2020-01-02 12:43:48,869] [10916] [INFO] [^Worker]: Closing event loop
^CException ignored in: <module 'threading' from '/home/jbooth/.asdf/installs/python/3.7.4/lib/python3.7/threading.py'>
Traceback (most recent call last):
File "/home/jbooth/.asdf/installs/python/3.7.4/lib/python3.7/threading.py", line 1308, in _shutdown
lock.acquire()
KeyboardInterrupt
jbooth@m /home/jbooth/oss/mode (master) $ date
Thu 02 Jan 2020 12:50:10 PM CST
from mode.
Debugging this is super confusing in part because mode's WorkerThread is overriding threading.Thread's _is_stopped attribute. This means if you look at the thread state in the repl/pdb, it will erroneously report the thread as stopped because self._is_stopped
is a threading.Event which is not falsey. That seems to make WorkerThread.stop
incorrect -- self.is_alive()
is always false, so it'll never execute the self.join()
from mode.
Trying to get better debugging by running the example under python -X dev examples/tutorial.py
I can't even get it launched. Python throws RuntimeErrors about non-thread-safe asyncio loop usage.
from mode.
After fixing the thread safety issues, it tentatively looks like the thread_loop isn't getting properly stopped and closed. When it isn't closed, the thread doesn't properly stop. The unstopped thread prevents interpreter exit. This doesn't fix @r313pp's crashing example (although it also not properly exits post-crash via ctrl-c!), but it does fix the tutorial so it properly exits on ctrl-c. It is causing test failures I haven't yet investigated. I'll try to hit both of those, probably next week, and get a branch and PR up.
diff --git a/mode/threads.py b/mode/threads.py
index 4a5c7ca..635d5c8 100644
--- a/mode/threads.py
+++ b/mode/threads.py
@@ -189,6 +189,9 @@ class ServiceThread(Service):
# shutdown here, since _shutdown_thread will not execute.
self.set_shutdown()
raise
+ finally:
+ self.loop.stop()
+ self.loop.close()
async def stop(self) -> None:
if self._started.is_set():
@@ -208,8 +211,6 @@ class ServiceThread(Service):
await self.on_thread_stop()
self.set_shutdown()
await self._default_stop_futures()
- if self._thread is not None:
- self._thread.stop()
await self._default_stop_exit_stacks()
async def _serve(self) -> None:
diff --git a/mode/utils/futures.py b/mode/utils/futures.py
index 0f8e45a..1e280d5 100644
--- a/mode/utils/futures.py
+++ b/mode/utils/futures.py
@@ -155,7 +155,7 @@ def maybe_set_result(fut: Optional[asyncio.Future],
result: Any) -> bool:
"""Set future result if not already done."""
if fut is not None and not fut.done():
- fut.set_result(result)
+ notify(fut, result)
return True
return False
@@ -165,4 +165,5 @@ def notify(fut: Optional[asyncio.Future], result: Any = None) -> None:
# can be used to turn a Future into a lockless, single-consumer condition,
# for multi-consumer use asyncio.Condition
if fut is not None and not fut.done():
- fut.set_result(result)
+ loop = fut.get_loop()
+ loop.call_soon_threadsafe(fut.set_result, result)
diff --git a/mode/utils/locks.py b/mode/utils/locks.py
index 5c42ee4..eba6e64 100644
--- a/mode/utils/locks.py
+++ b/mode/utils/locks.py
@@ -45,12 +45,13 @@ class Event:
All coroutines waiting for it to become true are awakened.
Coroutine that call wait() once the flag is true will not block at all.
"""
+ from .futures import notify
+
if not self._value:
self._value = True
- for fut in self._waiters:
- if not fut.done():
- fut.set_result(True)
+ for fut in self._waiters.copy():
+ notify(fut, True)
def clear(self) -> None:
"""Reset the internal flag to false.
from mode.
@jbooth-mastery have you tried my PR #55?
Although I haven't addressed correct loop closing problem(may be I'll do it by the end of this week), but threadsafe async calls and thread hang are fixed, at least in my cases.
from mode.
I'd missed your branch. It doesn't seem to work correctly for me:
jbooth@m /home/jbooth/oss/mode (service-thread-hangs-fix) $ python --version;python -X dev -W default examples/tutorial.py
Python 3.8.0
/home/jbooth/oss/mode/.direnv/python-3.8.0/lib/python3.8/site-packages/aiohttp/helpers.py:107: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
def noop(*args, **kwargs): # type: ignore
[2020-01-06 09:29:17,035] [13861] [INFO] [^Worker]: Starting...
[2020-01-06 09:29:17,036] [13861] [INFO] [^-App]: Starting...
[2020-01-06 09:29:17,110] [13861] [WARNING] WRITING GRAPH TO image.png
[2020-01-06 09:29:17,136] [13861] [INFO] [^--Websockets]: Starting...
[2020-01-06 09:29:17,136] [13861] [WARNING] STARTING WEBSOCKET SERVER
[2020-01-06 09:29:17,137] [13861] [WARNING] Executing <Task pending name='Task-1' coro=<Service.start() running at /home/jbooth/oss/mode/mode/services.py:732> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fcc01a2d510>()] created at /home/jbooth/oss/mode/mode/threads.py:160> cb=[_run_until_complete_cb() at /home/jbooth/.asdf/installs/python/3.8.0/lib/python3.8/asyncio/base_events.py:180] created at /home/jbooth/oss/mode/mode/worker.py:263> took 0.103 seconds
[2020-01-06 09:29:17,138] [13861] [INFO] [^--Webserver]: Starting...
[2020-01-06 09:29:17,138] [13861] [WARNING] examples/tutorial.py:53: DeprecationWarning: Application.make_handler(...) is deprecated, use AppRunner API instead
handler = self._handler = self._app.make_handler()
[2020-01-06 09:29:17,141] [13861] [INFO] <Server sockets=(<asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET6, type=SocketKind.SOCK_STREAM, proto=6, laddr=('::', 8000, 0, 0)>, <asyncio.TransportSocket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 8000)>)> is serving
[2020-01-06 09:29:17,141] [13861] [INFO] [^--Webserver]: Serving on port 8000
[2020-01-06 09:29:17,141] [13861] [WARNING] /home/jbooth/oss/mode/mode/services.py:719: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
done, pending = await asyncio.wait(
[2020-01-06 09:29:17,142] [13861] [INFO] [^--UserCache]: Starting...
[2020-01-06 09:29:17,142] [13861] [WARNING] /home/jbooth/oss/mode/mode/services.py:625: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
await asyncio.wait_for(
^C-INT- -INT- -INT- -INT- -INT- -INT-
[2020-01-06 09:29:23,735] [13861] [INFO] [^Worker]: Signal received: Signals.SIGINT (2)
[2020-01-06 09:29:23,736] [13861] [INFO] [^Worker]: Stopping...
[2020-01-06 09:29:23,737] [13861] [INFO] [^-App]: Stopping...
[2020-01-06 09:29:23,738] [13861] [INFO] [^--UserCache]: Stopping...
[2020-01-06 09:29:23,740] [13861] [WARNING] /home/jbooth/oss/mode/mode/services.py:903: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
await asyncio.wait(
[2020-01-06 09:29:23,746] [13861] [INFO] [^--Webserver]: Stopping...
[2020-01-06 09:29:23,747] [13861] [ERROR] [^-App]: Error while stopping child <Webserver: stopping >: RuntimeError('Non-thread-safe operation invoked on an event loop other than the current one')
Traceback (most recent call last):
File "/home/jbooth/oss/mode/mode/services.py", line 854, in _default_stop_children
await asyncio.shield(child.stop())
File "/home/jbooth/oss/mode/mode/threads.py", line 196, in stop
await super().stop()
File "/home/jbooth/oss/mode/mode/services.py", line 832, in stop
self._stopped.set()
File "/home/jbooth/oss/mode/mode/utils/locks.py", line 53, in set
fut.set_result(True)
File "/home/jbooth/.asdf/installs/python/3.8.0/lib/python3.8/asyncio/base_events.py", line 713, in call_soon
self._check_thread()
File "/home/jbooth/.asdf/installs/python/3.8.0/lib/python3.8/asyncio/base_events.py", line 750, in _check_thread
raise RuntimeError(
RuntimeError: Non-thread-safe operation invoked on an event loop other than the current one
[2020-01-06 09:29:23,749] [13861] [INFO] [^--Websockets]: Stopping...
[2020-01-06 09:29:23,751] [13861] [INFO] [^Worker]: Gathering service tasks...
[2020-01-06 09:29:23,751] [13861] [INFO] [^Worker]: Gathering all futures...
/home/jbooth/oss/mode/mode/worker.py:318: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
await asyncio.sleep(1.0, loop=self.loop)
[2020-01-06 09:29:24,753] [13861] [INFO] [^Worker]: Closing event loop
... process hangs ...
I'm going to try to fix the tests I broke and then will look at merging your changes into my branch and will put it up as a PR. I'm hoping to get that all done today or tomorrow as this bug is a blocker for my team.
from mode.
I'll update the conflicts out of it and let's see how it looks?
from mode.
Related Issues (20)
- Missing documentation at readthedocs.org HOT 2
- 'mode.Worker' overrides logging configuration HOT 2
- ServiceProxy and Supervisor doesn't handle _crash_reason
- CancelledError does not cancel 'mode.Worker' HOT 2
- Faust build fails due to import error in utils/logging.py
- Missing import: typing_extensions HOT 4
- add `add_timer` functionality
- Debugging with pdb looks weird
- The loop argument to asyncio.Queue is deprecated since Python 3.8, and scheduled for removal in Python 3.10 HOT 2
- aioeventlet and aiogevent are unmaintained
- Potential MethodQueue bug? HOT 1
- Services should be singletons HOT 1
- asyncio.get_event_loop is not supported by python 3.10 HOT 2
- Q: root/package logger configuration HOT 3
- Is this project abandoned? HOT 1
- ANN: Please migrate away from pytest-openfiles
- Q: multiprocessing/fork HOT 2
- Failed to start example service.py HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from mode.