Comments (30)

njsmith avatar njsmith commented on June 14, 2024 3

Another approach to consider would be:

class Todo:
    datagrams_to_send: List[Tuple[bytes, NetworkAddress]]
    events_received: List[Event]
    max_update_time: float

class QUICConnection:
    def update(
        self,
        *,
        events_to_send: List[Event] = (),
        datagrams_received: List[Tuple[bytes, NetworkAddress]] = (),
        current_time: float,
    ) -> Todo:
        ...

So the idea is that from time to time the user gives the sans-io library an update on what's going on from its side. This includes zero-or-more new high-level Events that it wants to happen, and zero-or-more new datagrams that it has received. It always includes the current time, on whatever clock it's using. The sans-io library responds by giving the user some homework: here are zero-or-more high-level events that happened for you to deal with, here are zero-or-more datagrams that need to be sent, and in any case you better call update again once your clock reaches return_value.max_update_time, whether you have anything new to tell me or not.
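To make the shape of that exchange concrete, here is a minimal sketch of one round trip through an update()-style API. ToyConnection, its echo behaviour, the 5-second wake-up interval, and the use of plain strings as events are all assumptions made for illustration; none of this is aioquic's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

NetworkAddress = Tuple[str, int]  # assumption: a (host, port) pair
Event = str  # assumption: real events would be richer objects


@dataclass
class Todo:
    datagrams_to_send: List[Tuple[bytes, NetworkAddress]] = field(default_factory=list)
    events_received: List[Event] = field(default_factory=list)
    max_update_time: float = 0.0


class ToyConnection:
    """Hypothetical stand-in for a sans-io connection: it echoes datagrams."""

    WAKEUP_INTERVAL = 5.0  # assumption: ask to be called again within 5 seconds

    def __init__(self, peer: NetworkAddress) -> None:
        self._peer = peer

    def update(
        self,
        *,
        events_to_send: Sequence[Event] = (),
        datagrams_received: Sequence[Tuple[bytes, NetworkAddress]] = (),
        current_time: float,
    ) -> Todo:
        todo = Todo(max_update_time=current_time + self.WAKEUP_INTERVAL)
        # Each outgoing event becomes one datagram addressed to the peer.
        for event in events_to_send:
            todo.datagrams_to_send.append((event.encode("ascii"), self._peer))
        # Each incoming datagram produces an event and is echoed back.
        for data, addr in datagrams_received:
            todo.events_received.append(f"received {len(data)} bytes")
            todo.datagrams_to_send.append((data, addr))
        return todo


peer = ("192.0.2.1", 4433)
conn = ToyConnection(peer)
todo = conn.update(datagrams_received=[(b"ping", peer)], current_time=100.0)
for data, addr in todo.datagrams_to_send:
    print("would send", data, "to", addr)
print("call update() again by", todo.max_update_time)
```

The caller never hands the connection a socket or an event loop; it only reports what happened and acts on the returned Todo.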

I can think of one major reason why this might not be a good idea. If you want to expose ancillary state variables on the sans-io connection object, and you think that users might want to refer to those state variables while they're looping through the new events that arrived, then you really want an h11-style next_event API that dribbles out the events one-at-a-time.

These issues are super subtle, and we're still collectively figuring out the best way to cope with them as a design community. But to catch up with previous discussions, I highly recommend reading both of these discussions:

CC: @Lukasa

from aioquic.

jlaine avatar jlaine commented on June 14, 2024 2

Both the QUIC and HTTP APIs are now sans-io so unless anyone objects I'll close this issue tomorrow. Feel free to open additional issues for changes to the API.

jlaine avatar jlaine commented on June 14, 2024 1

@njsmith thanks for the input!

I've started ripping out the asyncio-specifics in the sans-io branch. For now there is a next_event method, but once I get my test coverage back up I'll give this update() method approach a try.

jlaine avatar jlaine commented on June 14, 2024

Hi Seth!

At the moment the coupling to I/O is fairly weak: if you're willing to deal with the nitty-gritty details you can instantiate QuicConnection and pass it anything that looks like a DatagramTransport. This is already a requirement for me, as the end goal is to use aioquic on top of aioice (to support Interactive Connectivity Establishment - for WebRTC) instead of directly on a DatagramTransport.

I do however want to make sure the API for regular users is as simple as the current connect() and serve() coroutines, as otherwise it is fairly complex.

What are your specific requirements?

sethmlarson avatar sethmlarson commented on June 14, 2024

My requirements are that of a library author. :) Basically want to be able to use synchronous or any flavor of asynchronous (asyncio, trio, curio) datagram transport by abstracting away the serializing and de-serializing logic. I'll take a close look at what can already be accomplished with the QuicConnection interface. Thank you for this library! :)

jlaine avatar jlaine commented on June 14, 2024

I'd definitely like the Python ecosystem to be able to leverage a common QUIC base, the protocol is frighteningly complex and I don't think we'd benefit from fragmentation. Feel free to poke around and let me know if you see API decisions which are likely to be problematic.

At the moment I am trying to ensure maximum interoperability with other implementations and locking this down with unit tests. I'll try and keep the public API surface to a minimum, and we can revisit the internal APIs based on your feedback.

By the way, if you want to help out: one blocker currently for interop tests is the lack of HTTP/3 support in my client / server examples. One stumbling block here is supporting the QPACK header compression, which is not identical to HPACK. We can however leverage the massive Huffman table from: https://github.com/python-hyper/hpack

sethmlarson avatar sethmlarson commented on June 14, 2024

Here are some issues that I've identified:

  • Use of the dataclasses module restricts to Python 3.7+
  • Having async/await as a primary import restricts to Python 3.6+
  • The QuicConnection object uses async within its mutators
  • Because this protocol deals with individual datagrams and the sizes being very important (compared to other HTTP versions) we'd need to portion off calls to data_to_send() to be datagram_to_send(). :)

tomchristie avatar tomchristie commented on June 14, 2024

I'd be super-invested in this too.

I think the Sans-I/O pattern isn't just valuable in that it allows alternate concurrency models to build with it, but that it's also just plain more resilient, understandable, and testable. (Especially when approached as "feed events in, get bytes out" style, rather than through callbacks)

Worth having a bit of a look at some of the projects currently under the https://github.com/python-hyper/ umbrella. Particularly the h11 API, as being one of the simpler-ish cases.

My motivation here is looking towards HTTP/3 - since I can see how that'd fit in really nicely with https://github.com/encode/httpcore

jlaine avatar jlaine commented on June 14, 2024

Regarding your comments:

  • dataclasses: is there a backport package somewhere? Currently aioquic is 100% typed and if possible I'd quite like to keep it that way. I'd rather not have attrs.

  • async / await in QuicConnection: that doesn't seem too hard to fix, there are 2 such calls. We could rename QuicConnection to QuicConnectionProtocol and expose a different object for the user-facing API

The rest is a bit abstract, maybe some WIP PRs would steer me in the right direction?

sethmlarson avatar sethmlarson commented on June 14, 2024

I'm catching you at a bad time for myself to provide a PR (going to be without internet connection for the next 3 days) but I will return to this with more direction. Thank you for your openness to this idea. :)

jlaine avatar jlaine commented on June 14, 2024

I have read the h2 documentation and indeed like the sans-io pattern, so I will rework aioquic to provide the asyncio-specific wrapper in an aioquic.asyncio module and keep the core asyncio-free.

One thing which differs from h2 is that QUIC does not rest on a reliable transport. As such, timers are needed for retransmissions. What would be the right API to inform the user that they need to arm a timer? Should I use an event for this? How should I express "when" the timer needs to fire, since I don't know the time reference of the event loop?

Similarly, receive_datagram would not be the only method which can return events. There would also need to be a handle_timer which can also return events (though fewer).

@tomchristie any thoughts?

jlaine avatar jlaine commented on June 14, 2024

What I've got in mind so far:

  • receive_datagram(data, addr) -> List[Event] : equivalent of h2's receive_data

  • datagrams_to_send() -> List[Tuple[bytes, NetworkAddress]]

  • get_timer() -> Optional[float] to be called after sending data, tells you when the next timer needs to be set

  • handle_timer() -> List[Event] to be called when the timer fires
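As a rough illustration of how a caller might drive these four methods, here is a sketch built around a stub. StubConnection, its "ack:" replies, and the fixed timer value are hypothetical placeholders, not aioquic code; only the method names and return types come from the list above.

```python
from typing import Callable, List, Optional, Tuple

NetworkAddress = Tuple[str, int]  # assumption: a (host, port) pair
Event = str  # assumption: real events would be richer objects


class StubConnection:
    """Hypothetical object exposing the four proposed methods."""

    def __init__(self) -> None:
        self._outgoing: List[Tuple[bytes, NetworkAddress]] = []
        self._timer: Optional[float] = None

    def receive_datagram(self, data: bytes, addr: NetworkAddress) -> List[Event]:
        self._outgoing.append((b"ack:" + data, addr))
        self._timer = 1.0  # placeholder deadline on the caller's clock
        return ["datagram_received"]

    def datagrams_to_send(self) -> List[Tuple[bytes, NetworkAddress]]:
        pending, self._outgoing = self._outgoing, []
        return pending

    def get_timer(self) -> Optional[float]:
        return self._timer

    def handle_timer(self) -> List[Event]:
        self._timer = None
        return ["timer_fired"]


def on_datagram(
    conn: StubConnection,
    data: bytes,
    addr: NetworkAddress,
    send: Callable[[bytes, NetworkAddress], None],
) -> Tuple[List[Event], Optional[float]]:
    """Feed one datagram in, flush the output, and report the next timer."""
    events = conn.receive_datagram(data, addr)
    for payload, dest in conn.datagrams_to_send():
        send(payload, dest)
    return events, conn.get_timer()


sent: List[Tuple[bytes, NetworkAddress]] = []
stub = StubConnection()
events, timer = on_datagram(
    stub, b"x", ("192.0.2.1", 4433), lambda p, d: sent.append((p, d))
)
timer_events = stub.handle_timer()
```

Here send could just as well be a blocking sock.sendto or an asyncio transport's sendto; the connection object does not need to know which.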

sethmlarson avatar sethmlarson commented on June 14, 2024

If you take a look at h11 it's implemented with next_event() as the singular interface for grabbing events from the state machine. What do you think about having the handle_timer() callback or the like inject events into the stream of events that would be received by next_event()?

And if you already have a method that is called before each send what are your thoughts on having the Optional[float] coming out of datagrams_to_send()?

jlaine avatar jlaine commented on June 14, 2024

I'll read the h11 and get back to you, many thanks for the insights.

One thing which irks me is that even in the simple form which returns a list of datagrams, the API consumer code is going to be a lot less pretty than the h2 equivalent:

socket.sendall(connection.data_to_send())

At the very least it's going to be:

for data, addr in connection.datagrams_to_send():
    sock.sendto(data, addr)

With a timer this would be:

datagrams, timeout = connection.datagrams_to_send()
for data, addr in datagrams:
    sock.sendto(data, addr)
update_or_stop_timer(timeout)

.. and this would need calling after datagram reception, or calling any mutators.

sethmlarson avatar sethmlarson commented on June 14, 2024

I think regardless of what we do we aren't going to have the same niceness of other sans-I/O libraries. I think the single source for events and single source of "stuff you the implementer must do" is nice though.

jlaine avatar jlaine commented on June 14, 2024

OK I've just landed a patch on master which moves anything asyncio-related into the aioquic.asyncio module. There might still be a couple of kinks, but the (new) QuicConnectionProtocol._send_pending() method illustrates the steps required:

def _send_pending(self) -> None:

The general idea is that after calling a mutator (data received, data to send, close connection, send a ping..) you need to:

  • fetch events from the event buffer using next_event()
  • send out datagrams returned by datagrams_to_send()
  • set a timer as returned by get_timer()
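The three steps above can be sketched as a single helper. This is not the body of _send_pending; FakeConnection and the callback parameters are hypothetical, and the sketch assumes get_timer() returns an absolute deadline on the caller's clock, or None when no timer is needed.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Tuple

NetworkAddress = Tuple[str, int]  # assumption: a (host, port) pair


class FakeConnection:
    """Hypothetical connection pre-loaded with two events and one datagram."""

    def __init__(self) -> None:
        self._events: Deque[str] = deque(["handshake_completed", "stream_data"])
        self._datagrams: List[Tuple[bytes, NetworkAddress]] = [
            (b"\x01", ("192.0.2.1", 4433))
        ]

    def next_event(self) -> Optional[str]:
        return self._events.popleft() if self._events else None

    def datagrams_to_send(self) -> List[Tuple[bytes, NetworkAddress]]:
        pending, self._datagrams = self._datagrams, []
        return pending

    def get_timer(self) -> Optional[float]:
        return 12.5  # placeholder deadline


def process_pending(
    conn: FakeConnection,
    handle_event: Callable[[str], None],
    sendto: Callable[[bytes, NetworkAddress], None],
    set_timer: Callable[[Optional[float]], None],
) -> None:
    # 1. Drain the event buffer until it reports that it is empty.
    event = conn.next_event()
    while event is not None:
        handle_event(event)
        event = conn.next_event()
    # 2. Send out whatever datagrams the connection has queued.
    for data, addr in conn.datagrams_to_send():
        sendto(data, addr)
    # 3. Re-arm (or clear) the timer according to the connection.
    set_timer(conn.get_timer())


seen: List[str] = []
sent: List[Tuple[bytes, NetworkAddress]] = []
timers: List[Optional[float]] = []
process_pending(
    FakeConnection(), seen.append, lambda d, a: sent.append((d, a)), timers.append
)
```

In an asyncio wrapper, set_timer might cancel any previous handle and schedule a new one with loop.call_at; a synchronous caller could instead turn the deadline into a socket timeout.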

@sethmlarson is this going in the direction you want?

sethmlarson avatar sethmlarson commented on June 14, 2024

Given that the _send_pending doesn't observe/use any state of QuicConnection while the events are being evaluated do you think an approach like @njsmith suggested is possible? It seems like self._loop.time() is being piped into the state machine in a couple of places, better to minimize the number of mutators if that's possible.

jlaine avatar jlaine commented on June 14, 2024

Regarding the .update() API I'm wondering if the operations passed as input should really be called "events". Shouldn't we have distinct terminologies for what is fed in / out of the state machine?

sethmlarson avatar sethmlarson commented on June 14, 2024

Since the interface is symmetrical once you're done with setting up the connection (correct me if I'm wrong here) do we need a distinction at the event class level?

njsmith avatar njsmith commented on June 14, 2024

Well, having a uniform set of Events for both input and output worked well for h11 and wsproto, so that's why I wrote it that way. I don't know much about QUIC, so I don't know if it's similar or different somehow.

Part of why it works for those protocols is that when you switch from client to server, most of the input events turn into output events and vice-versa.

jlaine avatar jlaine commented on June 14, 2024

I think I need to read some more code, I'm confused as to what these "input events" are. Would this include "commands" such as "send this data on a stream for me" or "tear down the connection"? Or is it only "a timer fired" or "a datagram was received"?

njsmith avatar njsmith commented on June 14, 2024

Ah right sorry. "Event" is a term of art here. In h11, event objects are specifically a representation of things that can happen at the level of the abstract protocol semantics. Stuff like "a request with these headers" or "a part of a message body". You can think of h11 as a big transducer: you give it bytes and it tells you what events just happened, or you tell it what events you want to make happen and it tells you what bytes to send.
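The transducer idea can be shown with a toy newline-delimited protocol. This is deliberately not h11's API: LineTransducer and the Line event are invented for this sketch, which only shows the same event type flowing in both directions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Line:
    """The only event in this toy protocol: one complete line of text."""

    text: str


class LineTransducer:
    """Hypothetical sans-io object: bytes in -> events out, events in -> bytes out."""

    def __init__(self) -> None:
        self._buffer = b""

    def receive_data(self, data: bytes) -> List[Line]:
        # Buffer partial input and emit one event per complete line.
        self._buffer += data
        *complete, self._buffer = self._buffer.split(b"\n")
        return [Line(raw.decode("utf-8")) for raw in complete]

    def send(self, event: Line) -> bytes:
        # The caller decides when and how these bytes reach the network.
        return event.text.encode("utf-8") + b"\n"


transducer = LineTransducer()
first = transducer.receive_data(b"hel")       # no complete line yet
second = transducer.receive_data(b"lo\nwor")  # completes "hello"
wire = transducer.send(Line("hi"))
```

Because Line is used for both input and output, swapping the client and server roles does not require a second set of event classes.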

If you read through the h11 tutorial then it emphasizes this way of thinking: https://h11.readthedocs.io/en/latest/basic-usage.html

Or here's the reference docs on events: https://h11.readthedocs.io/en/latest/api.html#events

njsmith avatar njsmith commented on June 14, 2024

Just happened to notice that Cloudflare's QUIC library is sans-io, so their API might be useful to compare to:

> The application is responsible for fetching data from the network (e.g. via sockets), passing it to quiche, and sending the data that quiche generates back into the network. The application also needs to handle timers, with quiche telling it when to wake-up (this is required for retransmitting lost packets once the corresponding retransmission timeouts expire for example). This leaves the application free to decide how to best implement the I/O and event loop support

https://blog.cloudflare.com/enjoy-a-slice-of-quic-and-rust/

jlaine avatar jlaine commented on June 14, 2024

Thanks for the link @njsmith !

A couple of points I noted:

  • The send and recv APIs only seem to pass in / out data, and no network address. I've reached out to the authors about this as I don't understand how this works. To the best of my knowledge the QUIC stack needs to be aware of the network path to handle things such as path validation or switching to a server's preferred address. The "address" doesn't need to be interpreted by the QUIC stack, it can be opaque, but it does need to support an equivalent of __eq__.

  • The timeout / timer API is interesting, it returns the duration (from the current time) when a timeout needs to occur. This simplifies the API surface (as you don't need to pass in the "current time"). It also means that the QUIC stack has its own notion of time which may or may not be based on the same clock as the API user's.

  • There is no "events" API at all.

    • To receive data from streams, the connection object provides a sort of iterator on the "readable" streams which can then be queried individually. I'm guessing this eases handling back-pressure as you explicitly know when the app consumes data.

    • I don't know how the app is informed of other events such as "the handshake completed", "the remote peer closed the connection", "the connection died following an idle timeout" or "the stream got reset". It looks like the app might need to poll conn.is_established, conn.is_closed and the like?

  • The HTTP/3 layer is instantiated by explicitly passing in the QUIC connection, as opposed to the HTTP/3 layer creating the QUIC connection itself. This is a choice I've been struggling with: on one hand letting the HTTP/3 layer manage the QUIC connection makes the API simpler, on the other hand I don't see how you can implement dual support for HTTP/3 and another protocol (e.g. legacy HTTP - which is used in interop tests) based on the ALPN negotiation if the HTTP/3 layer "owns" the underlying QUIC connection.

njsmith avatar njsmith commented on June 14, 2024

> The timeout / timer API is interesting, it returns the duration (from the current time) when a timeout needs to occur

Hmm, yeah, that seems bad. IMO you definitely need to let the user control the clock. People care about this. That sounds really difficult if you never pass in the time from outside. I guess in theory you could do it by having the API report back "in X seconds please give me this token", and then the user has to keep track of a bunch of tokens and give them back at the right moments, but... that sounds way more awkward and inefficient than just letting the user tell you what time it is on their favorite clock.
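One practical payoff of letting the caller supply the time is that timer logic can be exercised with made-up timestamps, with no sleeping and no real clock. The RetransmitTimer class below is a hypothetical sketch of that idea, not code from aioquic or quiche.

```python
from typing import Optional


class RetransmitTimer:
    """Hypothetical timer state that never reads a clock itself."""

    def __init__(self, timeout: float) -> None:
        self._timeout = timeout
        self._deadline: Optional[float] = None

    def packet_sent(self, now: float) -> None:
        # The deadline is expressed on whatever clock the caller uses.
        self._deadline = now + self._timeout

    def get_timer(self) -> Optional[float]:
        return self._deadline

    def handle_timer(self, now: float) -> bool:
        """Return True if the deadline has passed and a retransmit is due."""
        if self._deadline is not None and now >= self._deadline:
            self._deadline = None
            return True
        return False


timer = RetransmitTimer(timeout=0.5)
timer.packet_sent(now=1000.0)         # a fake timestamp chosen by the test
too_early = timer.handle_timer(now=1000.2)
on_time = timer.handle_timer(now=1000.5)
```

The same object works unchanged with time.monotonic(), an event loop's time(), or a simulated clock, because it only ever compares numbers the caller gave it.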

> To receive data from streams, the connection object provides a sort of iterator on the "readable" streams which can then be queried individually. I'm guessing this eases handling back-pressure as you explicitly know when the app consumes data.

Interesting! The way h2 handles this is that it gives back events saying "here's some data", but the transport layer doesn't actually move the receive window until the application says "okay I processed that data", in a second step.
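The two-step pattern described here can be modelled with a small bookkeeping class. This is a toy sketch and not h2's implementation; ReceiveWindow and its method names are assumptions chosen for illustration.

```python
class ReceiveWindow:
    """Hypothetical flow-control window that reopens only on acknowledgement."""

    def __init__(self, size: int) -> None:
        self.available = size  # bytes the peer is still allowed to send
        self._unacked = 0      # bytes delivered to the app but not yet processed

    def data_received(self, nbytes: int) -> None:
        # Step 1: data arrives and the window shrinks; the app gets an event.
        if nbytes > self.available:
            raise ValueError("peer exceeded the receive window")
        self.available -= nbytes
        self._unacked += nbytes

    def acknowledge(self, nbytes: int) -> int:
        # Step 2: the app says it processed the data and the window reopens.
        if nbytes > self._unacked:
            raise ValueError("acknowledged more than was received")
        self._unacked -= nbytes
        self.available += nbytes
        return nbytes  # credit that could now be advertised to the peer


window = ReceiveWindow(size=100)
window.data_received(60)
after_receive = window.available
credit = window.acknowledge(60)
after_ack = window.available
```

If the application is slow to acknowledge, the window stays small and the peer is throttled, which is how back-pressure arises in this model.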

LPardue avatar LPardue commented on June 14, 2024

Contributor to the quiche project here. Thanks for sharing your notes @jlaine.

@njsmith wrote:

> Interesting! The way h2 handles this is that it gives back events saying "here's some data", but the transport layer doesn't actually move the receive window until the application says "okay I processed that data", in a second step.

It's worth highlighting that quiche's HTTP/3 API works a little differently to the transport API. An application that uses quiche will generally read by calling in order: socket.read(), quiche_conn.read(), quiche_h3_poll(). The HTTP/3 poll() method is responsible for iterating over readable transport streams and converting them into HTTP; i.e. it returns event objects relating to reception of Headers or payload data. We recently reworked things a little to behave like you describe: when a Data event happens the application calls recv_body() to actually pull the data out of the receive window.

sethmlarson avatar sethmlarson commented on June 14, 2024

Woo! 🎉 Thanks @jlaine! :) This will make this library easy to use for HTTP clients.

salotz avatar salotz commented on June 14, 2024

Should advertise this and link to the sans-I/O "manifesto" page https://sans-io.readthedocs.io/

jlaine avatar jlaine commented on June 14, 2024

@salotz you mean like this?

https://aioquic.readthedocs.io/en/latest/design.html#sans-io-apis

salotz avatar salotz commented on June 14, 2024

🤕 yep! I only looked at the README :P
