
jupyter_kernel_mgmt's Introduction

Jupyter Kernel management

This is an experimental refactoring of the machinery for launching and using Jupyter kernels.

Some notes on the components, and how they differ from their counterparts in jupyter_client:

KernelClient

Communicate with a kernel over ZeroMQ sockets.

Conceptually quite similar to the KernelClient in jupyter_client, but the implementation differs a bit:

  • Shutting down a kernel nicely has become a method on the client, since it is primarily about sending a message and waiting for a reply.
  • The main client class now uses a Tornado IOLoop, and the blocking interface is a wrapper around this. This avoids writing mini event loops which discard any message but the one they're looking for.
  • Message (de)serialisation and sending/receiving are now part of the separate jupyter_protocol package.
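The split between the IOLoop-based core and the blocking wrapper can be sketched roughly as below. This uses plain asyncio as a stand-in for the Tornado IOLoop, and both class names are illustrative rather than the actual jupyter_kernel_mgmt API:

```python
import asyncio

class AsyncClient:
    """Stand-in for the IOLoop-based kernel client (illustrative only)."""
    async def kernel_info(self):
        await asyncio.sleep(0)  # pretend to await a ZMQ reply message
        return {"status": "ok"}

class BlockingClient:
    """Blocking facade: each call runs the async method to completion on a
    private event loop, so no per-call mini event loop is needed and no
    incoming message has to be discarded along the way."""
    def __init__(self, async_client):
        self._client = async_client
        self._loop = asyncio.new_event_loop()

    def kernel_info(self):
        return self._loop.run_until_complete(self._client.kernel_info())

bc = BlockingClient(AsyncClient())
print(bc.kernel_info())  # {'status': 'ok'}
```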

KernelManager

Do 'subprocess-ish' things to a kernel - knowing if it has died, interrupting it with a signal, and forcefully terminating it.

Greatly reduced in scope relative to jupyter_client. In particular, the manager is no longer responsible for launching a kernel: that machinery has been separated (see below). The plan is to have parallel async managers, but I haven't really worked this out yet.

The main manager to work with a subprocess is in jupyter_kernel_mgmt.subproc.manager. I have an implementation using the Docker API in my separate jupyter_docker_kernels package. ManagerClient also implements the manager interface (see below).

KernelNanny and ManagerClient

KernelNanny will expose the functionality of a manager using more ZMQ sockets, which I have called nanny_control and nanny_events.

ManagerClient wraps the network communications back into the KernelManager Python interface, so a client can use it as the manager for a remote kernel. It probably needs a better name.

Discovering and launching kernels

A kernel type may be, for instance, Python in a specific conda environment. Each kernel type has an ID, e.g. spec/python3 or ssh/mydesktop.

The plan is that third parties can implement different ways of finding kernel types. They expose a kernel provider, which would know about e.g. conda environments in general and how to find them.
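As a sketch, a third-party provider for conda environments might look like the following. The id attribute and method names mirror the provider model described here, but the exact signatures are an assumption, and the discovery logic is a placeholder:

```python
class CondaKernelProvider:
    """Hypothetical provider that would discover conda environments."""
    id = 'conda'  # kernel type IDs become e.g. 'conda/base'

    def find_kernels(self):
        # A real implementation would scan conda environments here.
        for env in ['base', 'science']:
            yield env, {'language_info': {'name': 'python'},
                        'display_name': 'Python (conda: %s)' % env}

    def launch(self, name, cwd=None):
        raise NotImplementedError("would launch a kernel in env %r" % name)

provider = CondaKernelProvider()
print(dict(provider.find_kernels()))
```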

Kernel providers written so far:

The common interface to providers is jupyter_kernel_mgmt.discovery.KernelFinder.

To launch a kernel, you pass its type ID to the launch method:

from jupyter_kernel_mgmt.discovery import KernelFinder
kf = KernelFinder.from_entrypoints()
connection_info, manager = kf.launch('spec/python3')

This returns the connection info dict (to be used by a client) and an optional manager object. If manager is None, the connection info should include sockets for a kernel nanny, so ManagerClient can be used. For now, it's possible to have neither.
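A client could handle the (connection_info, manager) pair along these lines; the nanny socket key name and the ManagerClient import path below are assumptions for illustration:

```python
def resolve_manager(connection_info, manager):
    """Pick how to manage the launched kernel, per the rules above."""
    if manager is not None:
        return manager  # the provider gave us a manager directly
    if 'nanny_control_port' in connection_info:  # hypothetical key name
        from jupyter_kernel_mgmt.managerclient import ManagerClient  # assumed path
        return ManagerClient(connection_info)
    return None  # neither a manager nor nanny sockets: unmanaged kernel

print(resolve_manager({}, None))  # None
```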

Automatically restarting kernels

TODO, see issue #1.

jupyter_kernel_mgmt's People

Contributors

bfroehle, blink1073, captainsafia, carreau, ccordoba12, craigcitro, echarles, ellisonbg, emilienschultz, eteq, filmor, fperez, gbrandonp, ivanov, jasongrout, jdemeyer, kevin-bates, mariusvniekerk, minrk, pankajp, remram44, rgbkrk, rolweber, sn6uv, sylvaincorlay, szhorvat, takluyver, tkf, willingc, wking


jupyter_kernel_mgmt's Issues

[PROPOSAL] Parameterized Kernel Launch

This proposal formalizes the changes that introduced launch parameters by defining kernel launch parameter metadata and how it is to be returned from kernel providers and interpreted by client applications. This feature is known as Parameterized Kernel Launch (a.k.a. Parameterized Kernels). The name includes 'launch' because many of the parameters really apply to the environment in which the kernel will run and are not actual parameters to the kernel. Things like memory, cpus, and gpus are examples of "environmental" parameters.

I'm using this repository as the primary location because this proposal relies on the Kernel Provider model introduced in this library. That said, the proposal affects other repositories, namely jupyter_server, jupyterlab, and notebook once jupyter_server is adopted as the primary backend server.

Launch Parameter Schema

The set of available launch parameters for a given kernel will be conveyed from the server to the client application via the kernel type information (formerly known as the kernelspec) as JSON returned from the /api/kernelspecs REST endpoint. When available, launch parameter metadata will be included within the existing metadata stanza under launch_parameter_schema, and will consist of JSON schema that describes each available parameter. Because this is pure JSON schema, this information can convey required values, default values, choice lists, etc. and be easily consumed by applications. (Although I'd prefer to avoid this, we could introduce a custom schema if we find the generic schema metadata is not sufficient.)

   "metadata": {
       "launch_parameter_schema": {
         "$schema": "http://json-schema.org/draft-07/schema#",
         "title": "Available parameters for kernel type 'Spark - Scala (Kubernetes)'",
         "properties": {
           "cpus": {"type": "number", "minimum": 0.5, "maximum": 8.0, "default": 4.0, "description": "The number of CPUs to use for this kernel"},
           "memory": {"type": "integer", "minimum": 2, "maximum": 1024, "default": 8, "description": "The number of GB to reserve for memory for this kernel"}
         },
         "required": ["cpus"]
       }
    }

Because the population of the metadata.launch_parameter_schema entry is a function of the provider, how the provider determines what to include is an implementation detail. The requirement is that metadata.launch_parameter_schema contain valid JSON schema. However, since nearly 100% of kernels today are based on kernelspec information located in kernel.json, this proposal will also address how the KernelSpecProvider goes about composing metadata.launch_parameter_schema and acting on the returned parameter values.

KernelSpecProvider Schema Population

I believe we should support two forms of population, referential and embedded.

Referential Schema Population

Referential schema population is intended for launch parameters that are shared across kernel configurations, typically the aforementioned "environmental" parameters. When the KernelSpecProvider loads the kernel.json file, it will look for a key under metadata named launch_parameter_schema_file. If the key exists and its value is an existing file, that file's contents will be loaded into a dictionary object.

Embedded Schema Population

Once the referential population step has taken place, the KernelSpecProvider will check whether metadata.launch_parameter_schema exists and contains a value. If so, the KernelSpecProvider will load that value and use it to update the dictionary resulting from the referential population step. This allows per-kernel parameter information to override the shared parameter information. For example, some kernel types may require more CPUs than are generally available to other kernel types.

KernelSpecProvider will then use the merged dictionaries from the two population steps as the value for metadata.launch_parameter_schema that is returned from its find_kernels() method and, ultimately, the /api/kernelspecs REST API. Any entry for metadata.launch_parameter_schema_file will not appear in the returned payload.
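The two population steps can be sketched as a simple load-then-update. The helper name below is hypothetical and not part of KernelSpecProvider's real API, and resolving the schema file relative to the kernel directory is an assumption:

```python
import json
import os

def compose_launch_schema(kernel_dir, metadata):
    """Merge referential (shared file) and embedded (per-kernel) schema."""
    schema = {}
    # 1. Referential: load the shared schema file if one is referenced.
    ref = metadata.get('launch_parameter_schema_file')
    if ref:
        path = os.path.join(kernel_dir, ref)  # assumed resolution rule
        if os.path.exists(path):
            with open(path) as f:
                schema = json.load(f)
    # 2. Embedded: per-kernel schema entries override the shared ones.
    embedded = metadata.get('launch_parameter_schema')
    if embedded:
        schema.update(embedded)
    return schema

print(compose_launch_schema('.', {'launch_parameter_schema': {'title': 'demo'}}))
# {'title': 'demo'}
```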

Client Applications

Parameter-aware applications that retrieve kernel type information from /api/kernelspecs will recognize the existence of any metadata.launch_parameter_schema values. When a kernel type is selected and contains launch parameter schema information, the application should construct a dialog from the schema that prompts for parameter values. Required values should be noted and default values should be pre-filled. (We will need to emphasize that all required values have reasonable defaults, but how that is handled is more a function of the kernel provider.)

Once the application has obtained the desired set of parameters, it will create an entry in the JSON body of the /api/kernels POST request that is a dictionary of name/value pairs. The key under which this set of pairs resides will be named launch_params. The kernels handler will then pass this dictionary to the framework, where the kernel provider launch method will act on it.

   "launch_params": {
       "cpus": 4,
       "memory": 512
    }

Note that applications that are unaware of launch_parameter_schema will still behave in a reasonable manner provided the kernel provider applies reasonable default values to any required parameters.

In addition, it would be beneficial if the set of parameter name/value pairs could be added into the notebook metadata so that subsequent launch attempts could use those values in the pre-filled dialog.

Kernel Provider Launch

Once the kernel provider launch method is called, the provider should validate the parameters and their values against the schema. Any validation errors should result in a failure to launch - although the decision to fail the launch will be a function of the provider. The provider will need to differentiate between "environmental" parameters and actual kernel parameters and apply the values appropriately. jupyter_kernel_mgmt will likely provide a helper method for validation.
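In lieu of the eventual helper, a provider's validation step might look like the following minimal hand-rolled check against the published schema; a real implementation would more likely use the jsonschema package, and this function is a sketch, not the helper jupyter_kernel_mgmt would actually provide:

```python
def validate_launch_params(params, schema):
    """Collect validation errors for launch_params against a launch schema.
    Only required-ness and numeric bounds are checked in this sketch."""
    errors = []
    for name in schema.get('required', []):
        if name not in params:
            errors.append('missing required parameter: %s' % name)
    for name, value in params.items():
        prop = schema.get('properties', {}).get(name)
        if prop is None:
            continue  # a stricter provider might reject unknown parameters
        if 'minimum' in prop and value < prop['minimum']:
            errors.append('%s below minimum %s' % (name, prop['minimum']))
        if 'maximum' in prop and value > prop['maximum']:
            errors.append('%s above maximum %s' % (name, prop['maximum']))
    return errors

schema = {'properties': {'cpus': {'type': 'number', 'minimum': 0.5,
                                  'maximum': 8.0}},
          'required': ['cpus']}
print(validate_launch_params({'cpus': 16}, schema))
# ['cpus above maximum 8.0']
```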

Note: Since KernelSpecProvider will be the primary provider, at least initially, applications that wish to take advantage of kernel launch parameters may want to create their own providers. Fortunately, we've provided a mechanism whereby KernelSpecProvider can be extended such that much of the discovery and launch machinery can be reused. In these cases, the kernel.json file would need to be prefixed with the new provider id so that KernelSpecProvider doesn't include those same kernel types in its set.

Virtual Kernel Types

One of the advantages of kernel launch parameters is that one could conceivably have a single kernel configured, yet allow for a plethora of configuration options based on the parameter values - as @rgbkrk points out here - since this facility essentially fabricates kernel types that, today, would require a separate type for each set of options.

References

#22
jupyter/jupyter_client#434
jupyter-server/enterprise_gateway#640
https://paper.dropbox.com/doc/Day-1-Kernels-jupyter_client-IPython-Notebook-server--ApyJEjYtqrjfoPg1QpbxZfcpAg-MyS7d8X4wkkhRQy7wClXY
#9

cc (based on inclusion in related threads): @takluyver @SylvainCorlay @Zsailer @lresende @rolweber @jasongrout @blink1073 @echarles @minrk @rgbkrk @MSeal @Carreau

Decide & document required kernel type info

What information should we require kernel providers to give for each kernel_type when .find_kernels() is called? What optional fields should be standardised?

So far, the built in kernel providers offer something like:

{
  "language_info": {"name": "python"},
  "display_name": "Python 3",
  "argv": ["/usr/bin/python3", "-m", "..."],
  "resource_dir": "path/to/resources",
}
  • display_name and language_info.name are probably minimum requirements.
  • resource_dir is used by the notebook to display logos and for custom css/js, but it shouldn't be required.
  • argv may not exist or may not be meaningful for all kernel providers (e.g. if they start processes in a container or on another host).
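A finder-side sanity check for the suggested minimum could look like this; the required set below follows the bullets above and is a suggestion, not a settled decision:

```python
REQUIRED_FIELDS = ('display_name', 'language_info')

def missing_fields(kernel_type_info):
    """Return any required fields a provider's find_kernels() entry lacks."""
    return [f for f in REQUIRED_FIELDS if f not in kernel_type_info]

info = {
    "language_info": {"name": "python"},
    "display_name": "Python 3",
    "argv": ["/usr/bin/python3", "-m", "ipykernel_launcher"],
    "resource_dir": "path/to/resources",
}
print(missing_fields(info))  # []
```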

Async kernel launch fails on Windows

The asynchronous support changes are causing issues on Windows. Regardless of which event loop is used, we wind up encountering different NotImplementedError exceptions. I have focused on Python 3.7 - since we fail even sooner on Python 3.5 and 3.6. Below are the test failures for a single test (test_start_new_kernel) which triggers a launch. All launch-related tests fail with the same issue(s). I can't determine if there are follow-on issues once this issue gets resolved.

event loop: WindowsSelectorEventLoop (default)

When the WindowsSelectorEventLoop is used, either explicitly or by default, the call to asyncio.create_subprocess_exec() fails when calling the internal method _make_subprocess_transport().

____________________________ test_start_new_kernel ____________________________
    async def test_start_new_kernel():
>       km, kc = await start_new_kernel(make_ipkernel_cmd(), startup_timeout=TIMEOUT)
jupyter_kernel_mgmt\tests\test_async_manager.py:27: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
jupyter_kernel_mgmt\subproc\launcher.py:337: in start_new_kernel
    info, km = await launcher.launch()
jupyter_kernel_mgmt\subproc\launcher.py:83: in launch
    kernel = await asyncio.create_subprocess_exec(*args, **kw)
c:\miniconda36-x64\lib\asyncio\subprocess.py:217: in create_subprocess_exec
    stderr=stderr, **kwds)
c:\miniconda36-x64\lib\asyncio\base_events.py:1533: in subprocess_exec
    bufsize, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_WindowsSelectorEventLoop running=False closed=False debug=False>
protocol = <SubprocessStreamProtocol>
args = ('c:\\miniconda36-x64\\python.exe', '-m', 'ipykernel_launcher', '-f', 'C:\\Users\\appveyor\\AppData\\Roaming\\jupyter\\runtime\\kernel-653c313c-5cd573defee08b306ab6eec6.json')
shell = False, stdin = -1, stdout = None, stderr = None, bufsize = 0
extra = None
kwargs = {'cwd': 'C:\\projects\\jupyter-kernel-mgmt', 'env': {'7ZIP': '"C:\\Program Files\\7-Zip\\7z.exe"', 'ALLUSERSPROFILE': 'C:\\ProgramData', 'APPDATA': 'C:\\Users\\appveyor\\AppData\\Roaming', 'APPVEYOR': 'True', ...}}
    async def _make_subprocess_transport(self, protocol, args, shell,
                                         stdin, stdout, stderr, bufsize,
                                         extra=None, **kwargs):
        """Create subprocess transport."""
>       raise NotImplementedError
E       NotImplementedError
c:\miniconda36-x64\lib\asyncio\base_events.py:463: NotImplementedError

event loop: WindowsProactorEventLoop

When the WindowsProactorEventLoop is used, by explicitly configuring the event loop policy via asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy()), the launch completes, but we fail to set up the kernel client instance: instantiation of IOLoopKernelClient fails because the ProactorEventLoop doesn't implement add_reader().

____________________________ test_start_new_kernel ____________________________
    async def test_start_new_kernel():
>       km, kc = await start_new_kernel(make_ipkernel_cmd(), startup_timeout=TIMEOUT)
jupyter_kernel_mgmt\tests\test_async_manager.py:27: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
jupyter_kernel_mgmt\subproc\launcher.py:338: in start_new_kernel
    kc = IOLoopKernelClient(info, manager=km)
jupyter_kernel_mgmt\client.py:51: in __init__
    self.streams[channel] = s = zmqstream.ZMQStream(socket, self.ioloop)
c:\miniconda36-x64\lib\site-packages\zmq\eventloop\zmqstream.py:127: in __init__
    self._init_io_state()
c:\miniconda36-x64\lib\site-packages\zmq\eventloop\zmqstream.py:546: in _init_io_state
    self.io_loop.add_handler(self.socket, self._handle_events, self.io_loop.READ)
c:\miniconda36-x64\lib\site-packages\tornado\platform\asyncio.py:99: in add_handler
    self.asyncio_loop.add_reader(fd, self._handle_events, fd, IOLoop.READ)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <ProactorEventLoop running=False closed=False debug=False>, fd = 800
callback = <bound method BaseAsyncIOLoop._handle_events of <tornado.platform.asyncio.AsyncIOMainLoop object at 0x000000D7B3FF4780>>
args = (800, 1)
    def add_reader(self, fd, callback, *args):
>       raise NotImplementedError
E       NotImplementedError
c:\miniconda36-x64\lib\asyncio\events.py:505: NotImplementedError

Since we know kernels can be launched synchronously on Windows using the standard subprocess module, and we can visually confirm that WindowsSelectorEventLoop implements add_reader() - which is also backed by the fact that sync kernel launches work - it's probably in our best interest to use the standard subprocess module for kernel launches with WindowsSelectorEventLoop.

Also, because asyncio switches the default event loop to WindowsProactorEventLoop in Python 3.8, we'll want to go ahead and "pin" the selector event loop until a better solution is found.
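Pinning the selector loop could be as simple as the following snippet, run before any event loop is created (asyncio.WindowsSelectorEventLoopPolicy exists on Windows from Python 3.7 onward):

```python
import asyncio
import sys

# On Windows, force the selector loop so that add_reader() (needed by the
# Tornado/pyzmq streams) is available; Python 3.8 otherwise defaults to
# the proactor loop there. This is a no-op on other platforms.
if sys.platform == 'win32':
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

loop = asyncio.new_event_loop()
name = type(loop).__name__
loop.close()
print(name)
```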

Create KernelTypeApp application

We need to create an equivalent to KernelSpecApp. We should call it KernelTypeApp for two reasons:

  1. It will be creating Kernel Type definitions for the KernelSpec provider. So while it's creating kernel specifications, they will be cognizant of Kernel Types and any of the features that the new kernel management machinery might produce.
  2. We need to be able to coexist with jupyter_client applications on the same installation and KernelSpecApp is jupyter_client's tool. If we shared the name, then the console scripts would collide and last (or first, not sure with python deployments) writer would win.

Test suite failures

Hi,

I've been trying to update this package to 0.5.1 in Guix, and I have 8 failures that I'm having some difficulty interpreting/understanding:

============================= test session starts ==============================
platform linux -- Python 3.9.9, pytest-6.2.5, py-1.10.0, pluggy-0.13.1 -- /gnu/store/j3cx0yaqdpw0mxizp5bayx93pya44dhn-python-wrapper-3.9.9/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/.hypothesis/examples')
rootdir: /tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1
plugins: hypothesis-6.0.2, asyncio-0.17.2
asyncio: mode=legacy
collecting ... collected 38 items

jupyter_kernel_mgmt/tests/test_async_manager.py::test_get_connect_info PASSED [  2%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_execute_interactive PASSED [  5%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_history PASSED   [  7%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_inspect PASSED   [ 10%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_complete PASSED  [ 13%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_kernel_info PASSED [ 15%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_comm_info PASSED [ 18%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_shutdown PASSED  [ 21%]
jupyter_kernel_mgmt/tests/test_client_blocking.py::test_shutdown ERROR   [ 21%]
jupyter_kernel_mgmt/tests/test_client_loop.py::test_history FAILED       [ 23%]
jupyter_kernel_mgmt/tests/test_client_loop.py::test_inspect FAILED       [ 26%]
jupyter_kernel_mgmt/tests/test_client_loop.py::test_complete FAILED      [ 28%]
jupyter_kernel_mgmt/tests/test_client_loop.py::test_kernel_info FAILED   [ 31%]
jupyter_kernel_mgmt/tests/test_client_loop.py::test_comm_info FAILED     [ 34%]
jupyter_kernel_mgmt/tests/test_client_loop.py::test_shutdown FAILED      [ 36%]
jupyter_kernel_mgmt/tests/test_discovery.py::test_ipykernel_provider PASSED [ 39%]
jupyter_kernel_mgmt/tests/test_discovery.py::test_meta_kernel_finder PASSED [ 42%]
jupyter_kernel_mgmt/tests/test_discovery.py::test_kernel_spec_provider PASSED [ 44%]
jupyter_kernel_mgmt/tests/test_discovery.py::test_kernel_spec_provider_subclass PASSED [ 47%]
jupyter_kernel_mgmt/tests/test_discovery.py::test_kernel_launch_params PASSED [ 50%]
jupyter_kernel_mgmt/tests/test_discovery.py::test_load_config FAILED     [ 52%]
jupyter_kernel_mgmt/tests/test_discovery.py::test_discovery_main PASSED  [ 55%]
jupyter_kernel_mgmt/tests/test_kernelapp.py::test_kernelapp_lifecycle PASSED [ 57%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_find_kernel_specs PASSED [ 60%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_get_kernel_spec PASSED [ 63%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_find_all_specs PASSED [ 65%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_kernel_spec_priority PASSED [ 68%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_install_kernel_spec PASSED [ 71%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_install_kernel_spec_prefix PASSED [ 73%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_cant_install_kernel_spec PASSED [ 76%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_remove_kernel_spec PASSED [ 78%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_remove_kernel_spec_app PASSED [ 81%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_validate_kernel_name PASSED [ 84%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_provider_find_kernel_specs PASSED [ 86%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_provider_get_kernel_spec PASSED [ 89%]
jupyter_kernel_mgmt/tests/test_kernelspec.py::test_provider_find_all_specs PASSED [ 92%]
jupyter_kernel_mgmt/tests/test_localinterfaces.py::test_load_ips PASSED  [ 94%]
jupyter_kernel_mgmt/tests/test_manager.py::test_signal_kernel_subprocesses FAILED [ 97%]
jupyter_kernel_mgmt/tests/test_restarter.py::test_reinstantiate PASSED   [100%]

==================================== ERRORS ====================================
______________________ ERROR at teardown of test_shutdown ______________________

self = <jupyter_kernel_mgmt.client.IOLoopKernelClient object at 0x7ffff3aaa460>
timeout = 5.0

    @gen.coroutine
    def shutdown_or_terminate(self, timeout=5.0):
        """Ask the kernel to shut down, and terminate it if it takes too long.
    
        The kernel will be given up to timeout seconds to respond to the
        shutdown message, then the same timeout to terminate.
        """
        if not self.manager:
            raise RuntimeError(
                "Cannot terminate a kernel without a KernelManager")
        try:
>           yield gen.with_timeout(timedelta(seconds=timeout), self.shutdown())

jupyter_kernel_mgmt/client.py:286: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <tornado.gen.Runner object at 0x7ffff2a335b0>

    def run(self):
        """Starts or resumes the generator, running until it reaches a
        yield point that is not ready.
        """
        if self.running or self.finished:
            return
        try:
            self.running = True
            while True:
                future = self.future
                if not future.done():
                    return
                self.future = None
                try:
                    orig_stack_contexts = stack_context._state.contexts
                    exc_info = None
    
                    try:
>                       value = future.result()
E                       tornado.util.TimeoutError: Timeout

/gnu/store/3i7wwfpmlvr8f6158cgjc8x3y4ybhj3q-python-tornado-5.1.1/lib/python3.9/site-packages/tornado/gen.py:1133: TimeoutError

During handling of the above exception, another exception occurred:

setup_env = None

    @pytest.fixture
    def kernel_client(setup_env):
        # Instantiate KernelFinder directly, so tests aren't affected by entrypoints
        # from other installed packages
        finder = KernelFinder([IPykernelProvider()])
        with run_kernel_blocking('pyimport/kernel', finder=finder) as kc:
>           yield kc

jupyter_kernel_mgmt/tests/test_client_blocking.py:21: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/gnu/store/b6j1qw1a5rkbfvcy7lc9fm95abbzpa4x-python-3.9.9/lib/python3.9/contextlib.py:126: in __exit__
    next(self.gen)
jupyter_kernel_mgmt/hl.py:78: in run_kernel_blocking
    kc.shutdown_or_terminate()
jupyter_kernel_mgmt/client.py:347: in wrapped
    return loop.run_sync(lambda: method(self.loop_client, *args, **kwargs),
/gnu/store/3i7wwfpmlvr8f6158cgjc8x3y4ybhj3q-python-tornado-5.1.1/lib/python3.9/site-packages/tornado/ioloop.py:576: in run_sync
    return future_cell[0].result()
/gnu/store/3i7wwfpmlvr8f6158cgjc8x3y4ybhj3q-python-tornado-5.1.1/lib/python3.9/site-packages/tornado/gen.py:1141: in run
    yielded = self.gen.throw(*exc_info)
jupyter_kernel_mgmt/client.py:290: in shutdown_or_terminate
    yield self.manager.kill()
/gnu/store/3i7wwfpmlvr8f6158cgjc8x3y4ybhj3q-python-tornado-5.1.1/lib/python3.9/site-packages/tornado/gen.py:1133: in run
    value = future.result()
jupyter_kernel_mgmt/subproc/manager.py:83: in kill
    self.kernel.kill()
/gnu/store/b6j1qw1a5rkbfvcy7lc9fm95abbzpa4x-python-3.9.9/lib/python3.9/asyncio/subprocess.py:144: in kill
    self._transport.kill()
/gnu/store/b6j1qw1a5rkbfvcy7lc9fm95abbzpa4x-python-3.9.9/lib/python3.9/asyncio/base_subprocess.py:153: in kill
    self._check_proc()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_UnixSubprocessTransport closed pid=138 returncode=0 stdin=<_UnixWritePipeTransport closed fd=23 closed>>

    def _check_proc(self):
        if self._proc is None:
>           raise ProcessLookupError()
E           ProcessLookupError

/gnu/store/b6j1qw1a5rkbfvcy7lc9fm95abbzpa4x-python-3.9.9/lib/python3.9/asyncio/base_subprocess.py:142: ProcessLookupError
=================================== FAILURES ===================================
_________________________________ test_history _________________________________

kernel_client = <async_generator._impl.AsyncGenerator object at 0x7ffff2740040>

    async def test_history(kernel_client):
>       reply = await kernel_client.history(session=0)
E       AttributeError: 'AsyncGenerator' object has no attribute 'history'

jupyter_kernel_mgmt/tests/test_client_loop.py:35: AttributeError
_________________________________ test_inspect _________________________________

kernel_client = <async_generator._impl.AsyncGenerator object at 0x7ffff29b5850>

    async def test_inspect(kernel_client):
>       reply = await kernel_client.inspect('who cares')
E       AttributeError: 'AsyncGenerator' object has no attribute 'inspect'

jupyter_kernel_mgmt/tests/test_client_loop.py:40: AttributeError
________________________________ test_complete _________________________________

kernel_client = <async_generator._impl.AsyncGenerator object at 0x7ffff273f460>

    async def test_complete(kernel_client):
>       reply = await kernel_client.complete('who cares')
E       AttributeError: 'AsyncGenerator' object has no attribute 'complete'

jupyter_kernel_mgmt/tests/test_client_loop.py:45: AttributeError
_______________________________ test_kernel_info _______________________________

kernel_client = <async_generator._impl.AsyncGenerator object at 0x7ffff29b6880>

    async def test_kernel_info(kernel_client):
>       reply = await kernel_client.kernel_info()
E       AttributeError: 'AsyncGenerator' object has no attribute 'kernel_info'

jupyter_kernel_mgmt/tests/test_client_loop.py:50: AttributeError
________________________________ test_comm_info ________________________________

kernel_client = <async_generator._impl.AsyncGenerator object at 0x7ffff27404c0>

    async def test_comm_info(kernel_client):
>       reply = await kernel_client.comm_info()
E       AttributeError: 'AsyncGenerator' object has no attribute 'comm_info'

jupyter_kernel_mgmt/tests/test_client_loop.py:55: AttributeError
________________________________ test_shutdown _________________________________

kernel_client = <async_generator._impl.AsyncGenerator object at 0x7ffff29b5820>

    async def test_shutdown(kernel_client):
>       reply = await kernel_client.shutdown()
E       AttributeError: 'AsyncGenerator' object has no attribute 'shutdown'

jupyter_kernel_mgmt/tests/test_client_loop.py:60: AttributeError
_______________________________ test_load_config _______________________________

setup_test = None

    async def test_load_config(setup_test):
        # create fake application
        app = ProviderApplication()
        app.launch_instance(argv=["--ProviderConfig.my_argv=['xxx','yyy']"])
    
        kf = discovery.KernelFinder(providers=[TestConfigKernelProvider()])
        dummy_kspecs = list(kf.find_kernels())
    
        count = 0
        found_argv = []
        for name, spec in dummy_kspecs:
            if name == 'config/sample':
                found_argv = spec['argv']
                count += 1
    
        assert count == 1
>       assert found_argv == ['xxx', 'yyy']
E       assert ["['xxx','yyy']"] == ['xxx', 'yyy']
E         At index 0 diff: "['xxx','yyy']" != 'xxx'
E         Right contains one more item: 'yyy'
E         Full diff:
E         - ['xxx', 'yyy']
E         ?        -
E         + ["['xxx','yyy']"]
E         ? ++             ++

jupyter_kernel_mgmt/tests/test_discovery.py:277: AssertionError
_______________________ test_signal_kernel_subprocesses ________________________

signal_kernel_client = <async_generator._impl.AsyncGenerator object at 0x7ffff2998eb0>

    @pytest.mark.skipif(sys.platform.startswith('win'), reason="Windows")
    @pytest.mark.asyncio
    async def test_signal_kernel_subprocesses(signal_kernel_client):
        kc = signal_kernel_client
        async def execute(cmd):
            reply = await kc.execute(cmd)
            content = reply.content
            assert content['status'] == 'ok'
            return content
    
        N = 5
        for i in range(N):
>           await execute("start")

jupyter_kernel_mgmt/tests/test_manager.py:49: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cmd = 'start'

    async def execute(cmd):
>       reply = await kc.execute(cmd)
E       AttributeError: 'AsyncGenerator' object has no attribute 'execute'

jupyter_kernel_mgmt/tests/test_manager.py:42: AttributeError
=============================== warnings summary ===============================
jupyter_kernel_mgmt/discovery.py:26
  /tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/discovery.py:26: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
    def find_kernels(self):

jupyter_kernel_mgmt/discovery.py:72
  /tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/discovery.py:72: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
    def find_kernels(self):

jupyter_kernel_mgmt/discovery.py:116
  /tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/discovery.py:116: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
    def find_kernels(self):

jupyter_kernel_mgmt/discovery.py:174
  /tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/discovery.py:174: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
    def find_kernels(self):

../../../gnu/store/0q70jn8vcyvs40z0bcll80zj3aacsw0x-python-pytest-asyncio-0.17.2/lib/python3.9/site-packages/pytest_asyncio/plugin.py:191
  /gnu/store/0q70jn8vcyvs40z0bcll80zj3aacsw0x-python-pytest-asyncio-0.17.2/lib/python3.9/site-packages/pytest_asyncio/plugin.py:191: DeprecationWarning: The 'asyncio_mode' default value will change to 'strict' in future, please explicitly use 'asyncio_mode=strict' or 'asyncio_mode=auto' in pytest configuration file.
    config.issue_config_time_warning(LEGACY_MODE, stacklevel=2)

jupyter_kernel_mgmt/tests/test_discovery.py:29
  /tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/tests/test_discovery.py:29: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
    def find_kernels(self):

jupyter_kernel_mgmt/tests/test_discovery.py:97
  /tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/tests/test_discovery.py:97: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
    def find_kernels(self):

jupyter_kernel_mgmt/tests/test_client_blocking.py::test_execute_interactive
  /gnu/store/0q70jn8vcyvs40z0bcll80zj3aacsw0x-python-pytest-asyncio-0.17.2/lib/python3.9/site-packages/pytest_asyncio/plugin.py:317: DeprecationWarning: '@pytest.fixture' is applied to <fixture monkeypatch, file=/gnu/store/rj9gyaqi2ijkll5495jgp7kbfqh02167-python-pytest-6.2.5/lib/python3.9/site-packages/_pytest/monkeypatch.py, line=29> in 'legacy' mode, please replace it with '@pytest_asyncio.fixture' as a preparation for switching to 'strict' mode (or use 'auto' mode to seamlessly handle all these fixtures as asyncio-driven).
    warnings.warn(

jupyter_kernel_mgmt/tests/test_client_blocking.py::test_execute_interactive
  /gnu/store/0q70jn8vcyvs40z0bcll80zj3aacsw0x-python-pytest-asyncio-0.17.2/lib/python3.9/site-packages/pytest_asyncio/plugin.py:317: DeprecationWarning: '@pytest.fixture' is applied to <fixture kernel_client, file=/tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/tests/test_client_blocking.py, line=15> in 'legacy' mode, please replace it with '@pytest_asyncio.fixture' as a preparation for switching to 'strict' mode (or use 'auto' mode to seamlessly handle all these fixtures as asyncio-driven).
    warnings.warn(

jupyter_kernel_mgmt/tests/test_discovery.py::test_kernel_launch_params
  /gnu/store/0q70jn8vcyvs40z0bcll80zj3aacsw0x-python-pytest-asyncio-0.17.2/lib/python3.9/site-packages/pytest_asyncio/plugin.py:317: DeprecationWarning: '@pytest.fixture' is applied to <fixture caplog, file=/gnu/store/rj9gyaqi2ijkll5495jgp7kbfqh02167-python-pytest-6.2.5/lib/python3.9/site-packages/_pytest/logging.py, line=475> in 'legacy' mode, please replace it with '@pytest_asyncio.fixture' as a preparation for switching to 'strict' mode (or use 'auto' mode to seamlessly handle all these fixtures as asyncio-driven).
    warnings.warn(

jupyter_kernel_mgmt/tests/test_kernelspec.py::test_install_kernel_spec
  /gnu/store/0q70jn8vcyvs40z0bcll80zj3aacsw0x-python-pytest-asyncio-0.17.2/lib/python3.9/site-packages/pytest_asyncio/plugin.py:317: DeprecationWarning: '@pytest.fixture' is applied to <fixture installable_kernel, file=/tmp/guix-build-python-jupyter-kernel-mgmt-0.5.1.drv-0/jupyter_kernel_mgmt-0.5.1/jupyter_kernel_mgmt/tests/test_kernelspec.py, line=56> in 'legacy' mode, please replace it with '@pytest_asyncio.fixture' as a preparation for switching to 'strict' mode (or use 'auto' mode to seamlessly handle all these fixtures as asyncio-driven).
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=========================== short test summary info ============================
FAILED jupyter_kernel_mgmt/tests/test_client_loop.py::test_history - Attribut...
FAILED jupyter_kernel_mgmt/tests/test_client_loop.py::test_inspect - Attribut...
FAILED jupyter_kernel_mgmt/tests/test_client_loop.py::test_complete - Attribu...
FAILED jupyter_kernel_mgmt/tests/test_client_loop.py::test_kernel_info - Attr...
FAILED jupyter_kernel_mgmt/tests/test_client_loop.py::test_comm_info - Attrib...
FAILED jupyter_kernel_mgmt/tests/test_client_loop.py::test_shutdown - Attribu...
FAILED jupyter_kernel_mgmt/tests/test_discovery.py::test_load_config - assert...
FAILED jupyter_kernel_mgmt/tests/test_manager.py::test_signal_kernel_subprocesses
ERROR jupyter_kernel_mgmt/tests/test_client_blocking.py::test_shutdown - Proc...
============= 8 failed, 30 passed, 11 warnings, 1 error in 12.93s ==============

Any idea what these failures may be related to? I'm currently testing with the following direct dependencies (and Python 3.9.9):

$ ./pre-inst-env guix show python-jupyter-kernel-mgmt
name: python-jupyter-kernel-mgmt
version: 0.5.1
outputs: out
systems: x86_64-linux
dependencies: [email protected] [email protected] [email protected]
+ [email protected] [email protected] [email protected] [email protected]
+ [email protected] [email protected] [email protected] [email protected]
+ [email protected]
homepage: https://jupyter.org
license: Modified BSD
synopsis: Discover, launch, and communicate with Jupyter kernels  
description: This package is an experimental refactoring of the machinery for launching and using
+ Jupyter kernels.
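The deprecation warnings above all point at generator-style coroutines. A minimal sketch of the migration they ask for (the class and return shape here are illustrative, not the project's actual code): drop the `@coroutine` decorator and declare `find_kernels` with `async def` directly.

```python
import asyncio

# Hypothetical provider, sketching the "async def" replacement for the
# deprecated @coroutine-decorated generator style.
class ExampleKernelProvider:
    async def find_kernels(self):
        # Return an iterable of (kernel_type_id, attributes) pairs.
        return [("spec/python3", {"display_name": "Python 3"})]

found = asyncio.run(ExampleKernelProvider().find_kernels())
```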

jupyter_kernel_mgmt.discovery main returns Error

python -m jupyter_kernel_mgmt.discovery returns the following stacktrace

  File "/opt/datalayer/opt/miniconda3/envs/datalayer/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/datalayer/opt/miniconda3/envs/datalayer/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/echar4/datalayer/repos/jupyter-kernel-mgmt/jupyter_kernel_mgmt/discovery.py", line 204, in <module>
    main()
  File "/Users/echar4/datalayer/repos/jupyter-kernel-mgmt/jupyter_kernel_mgmt/discovery.py", line 198, in main
    found_kernels = run_sync(kf.find_kernels())
  File "/Users/echar4/datalayer/repos/jupyter-kernel-mgmt/jupyter_kernel_mgmt/util.py", line 18, in run_sync
    return asyncio.get_event_loop().run_until_complete(coro_method)
  File "/opt/datalayer/opt/miniconda3/envs/datalayer/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "/Users/echar4/datalayer/repos/jupyter-kernel-mgmt/jupyter_kernel_mgmt/discovery.py", line 181, in find_kernels
    yield kernel_type, attributes
RuntimeError: Task got bad yield: ('pyimport/kernel', {'language_info': {'name': 'python', 'version': '3.7.3', 'mimetype': 'text/x-python', 'codemirror_mode': {'name': 'ipython', 'version': 3}, 'pygments_lexer': 'ipython3', 'nbconvert_exporter': 'python', 'file_extension': '.py'}, 'display_name': 'Python 3', 'argv': ['/opt/datalayer/opt/miniconda3/envs/datalayer/bin/python', '-m', 'ipykernel_launcher', '-f', '{connection_file}'], 'resource_dir': '/opt/datalayer/opt/miniconda3/envs/datalayer/lib/python3.7/site-packages/ipykernel/resources'})
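The "Task got bad yield" error arises when a generator-based coroutine yields plain values, such as the `(kernel_type, attributes)` tuple above; asyncio interprets every `yield` as "await this future". A sketch of one possible fix (not necessarily the project's chosen one): make `find_kernels` an async *generator*, which may yield values legitimately, and have callers collect it with `async for`.

```python
import asyncio

# Async generator: yielding plain tuples is allowed here, unlike in a
# generator-based coroutine driven by run_until_complete().
async def find_kernels():
    yield ("pyimport/kernel", {"display_name": "Python 3"})

async def collect():
    # Callers must iterate with `async for` rather than awaiting directly.
    return [item async for item in find_kernels()]

found = asyncio.run(collect())
```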

Open discussion of various topics...

@takluyver - it's not clear to me where to open these discussions. I believe a Jupyter Server discourse category will be created soon and I'd like to see a Kernel Providers sub-category within it. Until then, I thought I'd open (for now) a single issue with multiple topics that, I believe, are worthy of discussion.

I'm happy to start working on these, but wanted to have the discussion first. I'm also copying @minrk, @rgbkrk, and @Carreau in case they have opinions. Anyone else, please feel free as well.

Whitelisting
Whitelisting is important in scenarios where many (dozens) of "kernelspecs" are configured. It provides a mechanism for administrators to isolate/limit which kernel types are available to the end user. I believe we need to preserve this functionality. Coupled with this PR in traitlets, admins could modify the whitelist without having to restart the server.

The current implementation doesn't consider whitelisting, nor is there a location from which to associate the trait. Was its omission intentional? If so, could you explain why?

Parameters
This came up in the Jupyter Server Design workshop. Each provider that supports parameterization will need to return the parameter metadata corresponding to their provided kernel. Rather than expose a new field returned from the REST API, I propose we make a parameters stanza within the already supported metadata stanza. The format of the parameters will be JSON which adheres to a schema - the idea being that consuming applications will have what they need to appropriately prompt for values. Frontend applications that are interested in presenting parameters and gathering their values would look for the parameters stanza in metadata. Those that aren't will continue as they do today.

This implies that metadata should be (optionally) returned from find_kernels depending on whether the provider supports parameters or needs to convey other information in that stanza.

I don't think there's any need to return argv. That strikes me as purely server and, for that matter, provider-specific.

Besides the parameter values, other information will need to be fed to the provider methods - so we probably want **kwargs on init or launch. Examples of this are configuration-related items and other items included in the JSON body of the kernel start request.
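A sketch of the proposed layout (field names and schema shape are illustrative, not a settled spec): the parameter metadata sits in a "parameters" stanza inside the existing "metadata" stanza, expressed as JSON Schema so frontends know how to prompt for values.

```python
# Hypothetical find_kernels attributes for a parameterized kernel type.
kernel_type_info = {
    "display_name": "Python 3 (YARN)",
    "metadata": {
        "parameters": {
            "type": "object",
            "properties": {
                "memory_mb": {"type": "integer", "default": 1024},
                "queue": {"type": "string", "default": "default"},
            },
        },
    },
}

# Frontends that care look for the stanza; others simply ignore it.
params_schema = kernel_type_info["metadata"].get("parameters")
```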

Kernel Session Persistence
We need the ability to persist relevant information associated with a running kernel in order to reconnect to it from another server instance. The use-case here is high availability. While HA functionality may be provided as an extension to Jupyter Server, methods to save and load that information are necessary, as is the ability to "launch" from the persisted state.

Async Kernels
I think we should ONLY support async kernels in this new implementation. This implies that Jupyter Server will need to support async kernels at its MappingKernelManager level. We can leverage the async kernel management work in the PRs already open on jupyter_client and notebook.

If this isn't the right mechanism to use for discussion, please let me know where that can take place.

Thank you.

How unique should display names be?

We currently give kernel types relatively simple display names like "Python 3". The ability to have multiple independent kernel providers makes it more likely that multiple kernel types will have the same name, which is confusing for users.

I see two options for how we deal with this:

  1. Have interfaces listing kernels show the kernel type ID alongside the kernel name, e.g. "Python 3 (spec/python3)".
  2. Encourage kernel providers to provide more detailed display names which are more likely to be unique, e.g. "Python 3 (from kernelspec)" or "Python 3 (using server's Python)".

I favour option 1: while it might be a bit more intimidating for new users who only have one or two kernels available, it easily provides an unambiguous way to identify each kernel type. The identifiers will also probably be visible in other contexts (e.g. command line options). Finally, referring to things (variables, files, servers, etc.) by precise identifiers rather than vague descriptions is a key general concept which programmers have to get used to.

Unique kernel IDs

This came up on #42, and we wanted to spin out a separate issue. Questions include:

  1. What part of the code should be responsible for assigning a kernel ID? The application? JKM? The kernel itself?
  2. Are IDs persistent across restarts? What about if you checkpoint/restore a kernel process?

This is also going to play into a larger question about kernel discovery; I'll open a separate issue about that.


@kevin-bates responded to some of my questions already on #42:

Should a restarted kernel have the same ID as the original?

I think we must retain the kernel id on restarts. The MappingKernelManager relies on this fact - otherwise it would remove the current slot and create a new slot for the new kernel manager. In addition, I believe kernel providers should be able to do what they want within their kernel manager, but the kernel-id, IMO, is the public key for identifying a kernel throughout its lifetime (where restarts are considered within its lifetime).

I think I mentioned this before, but I believe KernelFinder.launch() should take an optional kernel_id=None parameter. This parameter would be honored on initial launch as well, and the restarter would then use it during restarts: self.kernel_finder.launch(self.kernel_type, cwd, kernel_id=self.kernel_manager.kernel_id). (We'd want to extend KernelManager's initializer in a similar manner - with an optional kernel_id parameter.)
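The signature proposed above can be sketched as follows (this is a hypothetical shape, not the current API): launch() accepts an optional kernel_id, generates one when the caller doesn't supply it, and a restarter passes the old id back in so it stays stable across restarts.

```python
import uuid

# Hypothetical finder illustrating the optional kernel_id parameter.
class ExampleKernelFinder:
    def launch(self, kernel_type, cwd=None, kernel_id=None):
        # Generate an id only when the caller didn't provide one.
        kernel_id = kernel_id or str(uuid.uuid4())
        # ... real provider launch machinery would run here ...
        connection_info = {"kernel_type": kernel_type}
        return connection_info, kernel_id

finder = ExampleKernelFinder()
_, kid = finder.launch("spec/python3", cwd="/tmp")
# A restart reuses the same id:
_, kid_after_restart = finder.launch("spec/python3", cwd="/tmp", kernel_id=kid)
```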

What about if you can checkpoint/restore a kernel process - should the restored process have the same ID? What about if you restore the same checkpoint to two processes?

If these questions are directed at persisted sessions that get "revived" I think the id of the process is orthogonal to the id of the kernel. When/how is process id used by clients?

In EG, when a persisted session is "revived", for example if the server came down, and another started, EG does not restart the kernel process, but instead, re-establishes connection to the kernel process. It can do this because the kernel is remote and is (probably) still running.

If we want the server to be able to support active/active scenarios (which we plan to do for EG), retaining the kernel's id across restarts is paramount to that - otherwise restarts would require communicating the old and new kernel ids to other servers, etc. and I don't think we want to go there - especially since the current behavior is to retain the id.

Who should be able to discover what kernels? What part is responsible for advertising them? When should kernels be run without being advertised?

I guess I'm not familiar enough with the direct kernel application approach. For applications like notebook and jupyter-server, the list of running kernels should come directly from the MappingKernelManager. Once (well, if) we add the notion of users and roles, we could then apply filtering on the results of the /api/kernels request.

Failures on tornado 4.x

When I build this on the openSUSE build system for Tumbleweed, I am getting lots of TypeError: 'Future' object is not iterable and AttributeError: module 'tornado.util' has no attribute 'TimeoutError'. I'm guessing it is an incompatibility with the tornado version, which is stuck on the v4.5 branch due to later versions having incompatibilities with salt. c.f. https://build.opensuse.org/package/show/devel:languages:python/python-tornado

https://build.opensuse.org/package/show/home:jayvdb:jupyter/python-jupyter-kernel-mgmt

Unfortunately most of the other parts of openSUSE's jupyter packages are completely untested, so I am on shaky ground (and that might take a while to fix).

However, it seems this project should either declare a minimum version of tornado, or a minimum version of jupyter_core/ipykernel/etc. that itself requires a new enough tornado.

[   19s] ======================================================================
[   19s] ERROR: test_inspect (jupyter_kernel_mgmt.tests.test_client.TestKernelClient)
[   19s] ----------------------------------------------------------------------
[   19s] Traceback (most recent call last):
[   19s]   File "/home/abuild/rpmbuild/BUILD/jupyter_kernel_mgmt-0.4.0/jupyter_kernel_mgmt/client.py", line 299, in wait_for_ready
[   19s]     yield from gen.with_timeout(second, reply_fut)
[   19s] TypeError: 'Future' object is not iterable
[   19s]
[   19s] During handling of the above exception, another exception occurred:
[   19s]
[   19s] Traceback (most recent call last):
[   19s]   File "/home/abuild/rpmbuild/BUILD/jupyter_kernel_mgmt-0.4.0/jupyter_kernel_mgmt/tests/test_client.py", line 28, in setUp
[   19s]     self.km, self.kc = start_new_kernel(kernel_cmd=make_ipkernel_cmd())
[   19s]   File "/home/abuild/rpmbuild/BUILD/jupyter_kernel_mgmt-0.4.0/jupyter_kernel_mgmt/subproc/launcher.py", line 322, in start_new_kernel
[   19s]     kc.wait_for_ready(timeout=startup_timeout)
[   19s]   File "/home/abuild/rpmbuild/BUILD/jupyter_kernel_mgmt-0.4.0/jupyter_kernel_mgmt/client.py", line 403, in wait_for_ready
[   19s]     loop.run_sync(lambda: self.loop_client.wait_for_ready(), timeout=timeout)
[   19s]   File "/usr/lib64/python3.7/site-packages/tornado/ioloop.py", line 457, in run_sync
[   19s]     return future_cell[0].result()
[   19s]   File "/usr/lib64/python3.7/site-packages/tornado/concurrent.py", line 237, in result
[   19s]     raise_exc_info(self._exc_info)
[   19s]   File "<string>", line 4, in raise_exc_info
[   19s]   File "/usr/lib64/python3.7/site-packages/tornado/gen.py", line 307, in wrapper
[   19s]     yielded = next(result)
[   19s]   File "/home/abuild/rpmbuild/BUILD/jupyter_kernel_mgmt-0.4.0/jupyter_kernel_mgmt/client.py", line 300, in wait_for_ready
[   19s]     except tornado.util.TimeoutError:
[   19s] AttributeError: module 'tornado.util' has no attribute 'TimeoutError'
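Both tracebacks are consistent with running on Tornado 4.x: if I recall correctly, tornado.util.TimeoutError and natively awaitable Tornado Futures both arrived in Tornado 5.0, so one fix is simply declaring a floor in setup.py. A sketch (the exact minimum version is an assumption based on the errors above, and should be verified against the Tornado changelog):

```python
# setup.py fragment (sketch): fail at install time on Tornado 4.x
# instead of with obscure runtime errors like the ones above.
install_requires = [
    "tornado>=5.0",
]
```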

Allow for multiple providers for kernelspec-based kernels

Hi @takluyver. I'd like to enable the ability to subclass KernelSpecProvider such that different kernel providers could still be based on kernelspecs yet have different launch and management capabilities.

The idea would be that non-default kernelspecs would include a provider_id indicator in the metadata of the kernel.json file. This indicator would match that of the provider performing the search. kernel.json files with no such entry or an entry = 'spec' would be handled by KernelSpecProvider.
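A sketch of a kernel.json carrying the proposed indicator (the field name follows the description above; treat it as a proposal, not a finalized format):

```python
# Hypothetical kernel.json content for a non-default provider.
kernel_json = {
    "display_name": "Python 3 (YARN)",
    "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "metadata": {"provider_id": "yarn"},
}

# A KernelSpecProvider subclass would claim specs whose provider_id
# matches its own id; files with no entry, or with "spec", stay with
# the default KernelSpecProvider.
provider_id = kernel_json.get("metadata", {}).get("provider_id", "spec")
```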

To better accomplish this, it would be good to merge KernelSpecProvider with KernelSpecManager. Is there a reason these are different - other than ease of initial POC? Otherwise, conveyance of self.id is a bit of a hassle, and it makes sense (IMO) to have this functionality isolated in the provider.

Each of the subclasses of KernelSpecProvider would then have the option of implementing their own launch() methods, etc., while the search for kernelspecs could be shared.

Since 100% of kernel launches today are based on kernelspecs, applications/features wishing to leverage 'provider-type' functionality - while still adhering to kernelspecs as a foundation - could then be implemented.

I am happy to perform the merge/reorganization, but would like your input/opinion prior to spending time on this.

Controlling kernels & providers for testing and exclusion

In integrating this into the notebook, the need to test kernel machinery has come up. Having tests pick up whatever kernel providers happen to be installed is not ideal: you want a more predictable environment.

This also reminded me that the old machinery allows for a whitelist of kernels to expose, and people may well want this in the new machinery.

  1. Including/excluding kernels and kernel providers should probably be configured at the application level and passed in to KernelFinder. We could control this by exact names (spec/python3), glob patterns (spec/*) or regexes (spec/.*). If we're using patterns to select them, we may want to disallow or recommend against using special pattern characters in kernel type IDs.
  2. For testing, we may want to go further, overriding the normal entrypoints discovery and specifying one or several kernel providers which may not expose entrypoints, so that the kernels available for testing can be precisely controlled, and can do strange things if needed. I'm not sure whether this should be configured by the application, or by e.g. an environment variable which the KernelFinder machinery picks up directly.
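Point 1 above can be sketched with stdlib glob matching (the parameter shape is hypothetical; how the patterns actually reach KernelFinder is exactly what's up for discussion):

```python
from fnmatch import fnmatch

# Keep a kernel type if it matches any allowed pattern; exact names
# and globs both work, since fnmatch treats a plain string literally.
def type_allowed(kernel_type, patterns):
    return any(fnmatch(kernel_type, p) for p in patterns)

kernel_types = ["spec/python3", "spec/ir", "ssh/mydesktop"]
visible = [k for k in kernel_types if type_allowed(k, ["spec/*"])]
```

Note that `fnmatch`'s `*` also matches slashes, which is one reason to think about whether special pattern characters should be allowed in kernel type IDs.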

Document jupyter_kernel_mgmt

As discussed during jupyter_server meeting with @takluyver @Zsailer @kevin-bates we need to document this repo (at least an architecture diagram).

This is also valid for https://github.com/takluyver/jupyter_protocol which, by the way, already has a docs folder with basic content that is probably outdated, and with content that should be moved here (e.g. https://github.com/takluyver/jupyter_protocol/blob/master/docs/kernel_providers.rst).

Either we document both repos separately, or we document everything here. I believe it would be best to document each repo separately. Thoughts?

Convey additional parameters to kernel launch

This represents part of the effort to add support for parameterized kernels. The idea is that parameters will be conveyed from the front-end and be "processed" during the kernel's launch. In some cases, these parameters will be used by the kernel itself, but, I believe, most of the parameters will be used to create the right environment in which the kernel will run. Of course this portion will be the responsibility of the provider. We should just make sure the provider gets the information it needs via the kernel management framework.

Kernel instance discovery (discussion)

How should discovery of running kernel instances work? (See also #43, which we'll probably want to tackle first.)

At present, most kernels are started by the launcher writing a local connection file. This is used to start the kernel, but then it also works as a crude way for other applications on the same system to discover and connect to the kernel.

But if a kernel is launched remotely, there may not be a connection file on the launching system for it. And I'd like to redesign the start method for local kernels, which probably won't require a connection file at all. Since #42, jupyter-kernel writes its own connection file intended for clients to connect to the kernel it has launched, even if the launch didn't create a connection file. But this is an inelegant solution.

What kernels should applications be able to discover (besides the ones they have launched themselves)?

  • Any kernels running locally? (Under the same user account)
  • Any kernels launched by local applications? (which may include kernels running remotely or in VMs/containers)
  • Any kernel instance it could have launched itself? (i.e. tied to available kernel type providers)
  • Or even broader? E.g. if I'm on a cluster, should it be possible for a Jupyter application on one node to discover and connect to a kernel on another node, using some shared resource? What about zeroconf discovery on local networks?
  • Maybe kernel instance discovery should be pluggable, so people can experiment with different strategies for this kind of thing? Extensibility adds complexity, though.
  • Should the application launching a kernel be able to decide whether other applications can find it, or whether it should be 'private' to that application? If so, should the APIs for launching a kernel default to private or shared?

Format for kernel type ID?

Kernel type identifiers are currently slash delimited, provider/id, e.g. spec/python3. Everything after the first slash is passed to the provider to look up; I had thought this could support delegation, such as a kernel provider which can talk to different providers on a remote host, like ssh/mydesktop/spec/python3.
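The split described above is a one-line operation: only the first slash separates the provider prefix from the rest, which the provider interprets (and may itself split again to delegate).

```python
# Everything after the first slash is handed to the "ssh" provider,
# which can delegate on the next slash in turn.
provider, _, rest = "ssh/mydesktop/spec/python3".partition("/")
```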

However, in working on supporting kernel type identifiers in the notebook server, it's come up that slashes are not very convenient in either URLs or HTML ids. The problems are not insurmountable, but if we want to change or restrict this interface, it's much easier to do while it's relatively new.

Options:

  1. Switch to another delimiter. Dots . or colons : are obvious candidates, but are still awkward to use in HTML IDs because they have meaning in CSS selectors. Hyphens - and underscores _ are technically easier, but they don't feel like delimiters, and people may want to use them in the delimited names.
  2. Restrict the characters allowed between delimiters, so there's an obvious way to escape a kernel type identifier (e.g. if underscores are disallowed, spec/python3 can be escaped as spec_python3). There are plenty of unicode possibilities if we don't mind making the escaped versions hard to type manually.
  3. Limit identifiers to one delimiter - this would allow URL routing to distinguish the kernel ID from any following URL components.
  4. Leave it as is, and work around the issues some other way (e.g. for URLs, percent-encode slashes as %2F).

I think I lean towards 4 or 2.
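Option 4 in practice, as a quick sketch: percent-encode the slash whenever a kernel type ID has to appear as a single URL path component, and decode it on the way back in.

```python
from urllib.parse import quote, unquote

# safe="" forces the slash itself to be encoded (the default leaves
# "/" untouched, which is the problem here).
encoded = quote("spec/python3", safe="")
decoded = unquote(encoded)
```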

cc @minrk @Carreau for input.

For reference, HTML 5 IDs can contain any characters except space. The practical restrictions for IDs come more from CSS selectors and jQuery - jQuery needs escapes for :.[],=@. The reserved characters for URIs from RFC 3986 are:

":" / "/" / "?" / "#" / "[" / "]" / "@"
"!" / "$" / "&" / "'" / "(" / ")"
"*" / "+" / "," / ";" / "="

Add ability to convey application configuration to providers

Kernel providers, for the most part, will likely require configuration to perform their operations. For example, the YarnKernelProvider may need to know the URL of the YARN resource manager against which applications (kernels) are submitted. Rather than ask administrators to add this information to the specific kernel specification (however that provider implements it), it would be good to have a central default-value capability - where use of the Config class in traitlets makes sense.

As a result, we should add the ability for applications that leverage the kernel provider framework to convey the application's configuration to the providers. The provider would then be responsible for locating its "section" of the configurable classes or the configurable item itself within the set of configurations.
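A sketch of the idea using plain dicts (the real mechanism would use traitlets Config objects; all names here are hypothetical): the application hands its full configuration to each provider, and the provider locates its own section by class name.

```python
# Application-level configuration, keyed by configurable class name,
# mirroring how traitlets Config sections are addressed.
app_config = {
    "YarnKernelProvider": {"yarn_endpoint": "http://rm.example.com:8088"},
    "KernelSpecProvider": {},
}

class YarnKernelProvider:
    id = "yarn"

    def __init__(self, config=None):
        # The provider is responsible for finding its own section.
        section = (config or {}).get(type(self).__name__, {})
        self.yarn_endpoint = section.get("yarn_endpoint")

provider = YarnKernelProvider(config=app_config)
```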

Kernel Providers need to know the kernel's Id

Kernel providers need to discover where launched kernels are located (e.g. remote kernels launched across a cluster), and the obvious key to use is the kernel's id. As a result, kernel_finder.launch should return the tuple of connection_info, manager, and id, or (preferably) make id a required attribute of the returned KernelManager instance. Or the kernel's id should be plumbed through the launch sequence. Since that's a little messy and the former provides far greater flexibility, I'd vote for letting the provider create the id and set it on the KernelManager. It's useful having it on the KernelManager anyway (logging, troubleshooting, etc.).

Some additional background: Enterprise Gateway has clients that convey the kernel id in the request because they also want to associate the launched kernel with resources they set up prior to the kernel's launch. This will likely be addressed via kernel parameters, but the important point is that the kernel provider needs the kernel id - either to create it itself, or to have it conveyed.

This topic will likely come into play when talking about HA capabilities, where a new server instance has taken over (Active/Passive) or is simultaneously in use (Active/Active) and the request for an existing kernel requires a kernel provider instance.

EDIT: I plan to submit a PR at some point once I finish migration of EG kernels to this model. If someone would rather address this, I'm good with that. 😃

Fix KernelApp

While testing out the change for PR #39 I discovered that KernelApp (and test_kernelapp) have fallen through the cracks. This is because the test was launching the KernelApp from jupyter_client rather than from jupyter_kernel_mgmt!

When looking into KernelApp, it has quite a few issues. I've started making the necessary changes, and, for the most part, gotten the test to pass despite the fact that the KernelApp doesn't work completely.

I have pushed those changes here, but I'm hoping someone can fill in the blanks. I don't fully understand the purpose of the application. For example, it strikes me as odd that BlockingKernelClient is created only once KernelApp's shutdown method is called (via signal handlers).

Hmm, I'll go ahead and create a WIP PR with hopes that it can be a collaborative effort.

Thanks.
