Comments (20)

tikue commented on May 5, 2024

I've been doing some experiments with serde_cbor. It looks promising.

  1. It's compact: for small structs it is about the same size as bincode. Annotating the fields with serde(rename) (see below) is optional, but short names shrink each field's encoded size, and a fixed wire name keeps the API stable when a field is renamed in the source.
  2. Unknown fields encountered during deserialization are ignored. This isn't cbor-specific, except that cbor serializes field names whereas bincode doesn't (so with bincode the order and number of fields matter).
  3. Missing fields during deserialization can be initialized with the Default trait, allowing struct fields to be added backwards-compatibly.
    1. Enums are semi-supported; the gaps come from serde's currently limited enum support. What works: if an enum impls Default, then a struct field of that enum type can be omitted. However, unknown variants (e.g. adding a third variant to a 2-variant enum) currently cause deserialization to fail; that would/will be fixed by serde-rs/serde#912.

Example

extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_cbor as cbor;

#[derive(Debug, Default, Deserialize, Serialize)]
// The default annotation allows the omission of fields;
// they'll be set using the Default trait.
#[serde(default)]
struct Foo {
    #[serde(rename = "0")]
    i: i32,
    #[serde(rename = "1")]
    bar: Bar,
    #[serde(rename = "3")]
    qux: Qux,
}

#[derive(Debug, Default, Deserialize, Serialize)]
#[serde(default)]
struct FooBig {
    #[serde(rename = "0")]
    renamed_i: i32,
    #[serde(rename = "2")]
    j: i32,
    #[serde(rename = "1")]
    renamed_bar: BarBig,
    #[serde(rename = "3")]
    qux: QuxBig,
}

#[derive(Debug, Default, Deserialize, Serialize)]
#[serde(default)]
struct Bar {
    #[serde(rename = "0")]
    baz: String,
}

#[derive(Debug, Default, Deserialize, Serialize)]
#[serde(default)]
struct BarBig {
    #[serde(rename = "1")]
    z: [u8; 1],
    #[serde(rename = "0")]
    baz: String,
}

#[derive(Debug, Deserialize, Serialize)]
enum Qux {
    #[serde(rename = "0")]
    Unknown,
    #[serde(rename = "1")]
    First {
        #[serde(rename = "0")]
        x: i32,
    },
}

#[derive(Debug, Deserialize, Serialize)]
enum QuxBig {
    #[serde(rename = "0")]
    Unknown,
    #[serde(rename = "1")]
    First {
        #[serde(rename = "0")]
        #[serde(default)]
        x: i32,
        #[serde(rename = "1")]
        #[serde(default)]
        y: u64,
    },
}

impl Default for Qux {
    fn default() -> Self {
        Qux::Unknown
    }
}

impl Default for QuxBig {
    fn default() -> Self {
        QuxBig::Unknown
    }
}

fn main() {
    let foo = FooBig {
        renamed_i: 4,
        j: 5,
        renamed_bar: BarBig {
            z: [1],
            baz: "baz".to_string(),
        },
        qux: QuxBig::First { x: -5, y: 10 },
    };
    println!("Original: {:#?}", foo);
    println!();

    let bytes = cbor::to_vec(&foo).unwrap();
    println!("Serialized: {:?} (len: {})", bytes, bytes.len());

    let foo: Foo = cbor::from_slice(&bytes).unwrap();
    println!("big to small: {:#?}", foo);
    println!();

    let bytes = cbor::to_vec(&foo).unwrap();
    let foo: FooBig = cbor::from_slice(&bytes).unwrap();
    println!("small to big: {:#?}", foo);
}

from tarpc.

tikue commented on May 5, 2024

The current version of tarpc uses bincode for serialization. I believe but can't currently verify that the size of an enum on the wire is u32 + sizeof(variant), so I think adding new methods is backwards-compatible.
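To make the "u32 + sizeof(variant)" reasoning concrete, here is a minimal hand-rolled sketch of such a layout. It is not bincode's API and has not been checked against bincode's actual output; `RequestV1`, `RequestV2`, and the encode/decode helpers are hypothetical names used only for illustration. Under that assumed layout, a message for an existing method has the same bytes whether or not the enum later gains variants, so a newer decoder can still read it.

```rust
// Hand-rolled model of a "u32 tag + payload" enum layout.
// This is NOT bincode's API; all names here are hypothetical.

#[derive(Debug, PartialEq)]
enum RequestV1 {
    Method1(u8),
}

#[derive(Debug, PartialEq)]
enum RequestV2 {
    Method1(u8),
    Method2(u64), // appended after V1 clients shipped
}

/// Splits a message into its little-endian u32 tag and the remaining payload.
fn split_tag(bytes: &[u8]) -> Option<(u32, &[u8])> {
    if bytes.len() < 4 {
        return None;
    }
    let (head, rest) = bytes.split_at(4);
    Some((u32::from_le_bytes([head[0], head[1], head[2], head[3]]), rest))
}

fn encode_v1(req: &RequestV1) -> Vec<u8> {
    match req {
        RequestV1::Method1(x) => {
            let mut out = 0u32.to_le_bytes().to_vec();
            out.push(*x);
            out
        }
    }
}

fn encode_v2(req: &RequestV2) -> Vec<u8> {
    match req {
        RequestV2::Method1(x) => {
            let mut out = 0u32.to_le_bytes().to_vec();
            out.push(*x);
            out
        }
        RequestV2::Method2(y) => {
            let mut out = 1u32.to_le_bytes().to_vec();
            out.extend_from_slice(&y.to_le_bytes());
            out
        }
    }
}

fn decode_v1(bytes: &[u8]) -> Option<RequestV1> {
    let (tag, rest) = split_tag(bytes)?;
    match tag {
        0 => rest.first().map(|x| RequestV1::Method1(*x)),
        _ => None, // a tag this version does not know
    }
}

fn decode_v2(bytes: &[u8]) -> Option<RequestV2> {
    let (tag, rest) = split_tag(bytes)?;
    match tag {
        0 => rest.first().map(|x| RequestV2::Method1(*x)),
        1 => {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(rest.get(..8)?);
            Some(RequestV2::Method2(u64::from_le_bytes(buf)))
        }
        _ => None,
    }
}
```

In this model, `decode_v2(&encode_v1(&RequestV1::Method1(7)))` yields `Some(RequestV2::Method1(7))`, while an old decoder given a `Method2` message returns `None` rather than misreading it. The in-memory size of the enum plays no role; only the tag and the payload of the variant actually sent go on the wire.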

I've been working on-and-off on a new version of tarpc internals that makes the transport 100% pluggable. This will allow users to encode the messages however they'd like. It's still very much work-in-progress at this moment, and I haven't made a branch on github yet. I like this avenue of exploration, though, because it would allow for users to define their own backwards-compatible wire protocols. This would free me to focus development efforts on the client and server implementations and higher-level features.

Boscop commented on May 5, 2024

@tikue Thanks for the quick reply, I'm looking forward to a better way to evolve the protocol.

But is it true that all signatures of the service!{} methods are mapped to an enum? (One case per method, args as members of the case.)

tikue commented on May 5, 2024

cc @shaladdle

Hey, thanks for asking this! This is definitely something I (and I feel comfortable speaking for @shaladdle as well) have been thinking about, but there are a lot of moving pieces and the best approach remains elusive. I think in the long run tarpc is going to have to take a firmer stance. For now, introducing new versions of RPCs seems like a good, simple solution, if not the most elegant.

One long-term solution might be to require args to impl Default and mandate that new args only be appended to the end of the RPC args list. Then something like #[serde(default)] described in bincode-org/bincode#179 could work. This wouldn't allow you to remove args, but the client could always just supply the default.
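The append-only idea can be sketched without any serialization crate. The following is a minimal illustration, assuming a hypothetical positional encoding of fixed-width little-endian u32 args; `AddArgs`, `encode_old_args`, and `decode_args` are invented names, not tarpc or bincode APIs. An arg that is missing from the end of the payload falls back to its Default value.

```rust
// Hypothetical positional arg encoding; not tarpc's or bincode's format.

#[derive(Debug, Default, PartialEq)]
struct AddArgs {
    x: u32,
    y: u32,
    // Appended in a later version of the RPC; older clients never send it.
    scale: u32,
}

/// Reads one little-endian u32 from the front of `bytes`, or returns the
/// default (0) and drains the input when fewer than 4 bytes remain.
fn read_u32_or_default(bytes: &mut &[u8]) -> u32 {
    if bytes.len() < 4 {
        *bytes = &[];
        return u32::default();
    }
    let (head, rest) = bytes.split_at(4);
    *bytes = rest;
    u32::from_le_bytes([head[0], head[1], head[2], head[3]])
}

/// What an old client, compiled before `scale` existed, would send.
fn encode_old_args(x: u32, y: u32) -> Vec<u8> {
    let mut out = x.to_le_bytes().to_vec();
    out.extend_from_slice(&y.to_le_bytes());
    out
}

/// New server-side decoder: trailing args that are absent become defaults.
fn decode_args(mut bytes: &[u8]) -> AddArgs {
    let x = read_u32_or_default(&mut bytes);
    let y = read_u32_or_default(&mut bytes);
    let scale = read_u32_or_default(&mut bytes);
    AddArgs { x, y, scale }
}
```

Here `decode_args(&encode_old_args(2, 3))` produces `AddArgs { x: 2, y: 3, scale: 0 }`. The scheme only works because new args go at the end: inserting one in the middle would shift every later arg.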

tikue commented on May 5, 2024

I haven't done any performance comparisons yet. I doubt there would be many surprises, but I do want to see if cbor can handle Vec optimally.

Boscop commented on May 5, 2024

This is great. We are deploying clients on embedded devices that only have internet access via prepaid SIM cards, so we want to reduce the traffic as much as possible. JSON has a lot of overhead due to its textual format and the need to base64-encode binary data.
By "if cbor can handle Vec optimally" do you mean storing the object keys only once for all items in the Vec? It should be possible by writing a custom serde serializer/deserializer for Vecs.

Btw, do you know the encoded size of objects in CBOR vs Proto? Is CBOR encoded data smaller?

Btw, this project also looks interesting: https://github.com/tailhook/probor

tikue commented on May 5, 2024

Oh, sorry, I meant the handling of Vec<u8>, which naively might be serialized by iterating over each individual byte, rather than writing all bytes in one go.
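The difference shows up on the wire as well as in speed. Below is a minimal sketch of the two CBOR encodings of the same bytes, written by hand from the CBOR head rules (major type 4 for an array of unsigned integers, major type 2 for a byte string). It does not call serde_cbor, and which form a given serializer picks for Vec<u8> is not shown here; the helper names are hypothetical.

```rust
// Hand-written CBOR heads; helper names are hypothetical.

/// Appends a CBOR head: 3-bit major type plus the shortest argument encoding.
fn write_head(out: &mut Vec<u8>, major: u8, value: u64) {
    let mt = major << 5;
    if value < 24 {
        out.push(mt | value as u8);
    } else if value <= 0xff {
        out.push(mt | 24);
        out.push(value as u8);
    } else if value <= 0xffff {
        out.push(mt | 25);
        out.extend_from_slice(&(value as u16).to_be_bytes());
    } else if value <= 0xffff_ffff {
        out.push(mt | 26);
        out.extend_from_slice(&(value as u32).to_be_bytes());
    } else {
        out.push(mt | 27);
        out.extend_from_slice(&value.to_be_bytes());
    }
}

/// Element-by-element: an array (major type 4) of unsigned ints (major type 0).
fn encode_as_array(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    write_head(&mut out, 4, data.len() as u64);
    for &b in data {
        write_head(&mut out, 0, u64::from(b));
    }
    out
}

/// All at once: a byte string (major type 2) followed by the raw bytes.
fn encode_as_byte_string(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len() + 9);
    write_head(&mut out, 2, data.len() as u64);
    out.extend_from_slice(data);
    out
}
```

For the input `[1, 2, 200]` the array form is 5 bytes and the byte-string form is 4, because every element of 24 or above needs a second byte in the array form. For 300 bytes of `0xff` the gap is 603 versus 303 bytes.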

Boscop commented on May 5, 2024

Does that mean that tarpc will switch to CBOR soon?
(I'd prefer CBOR over bincode because of the ability to evolve the API better).

tikue commented on May 5, 2024

I think switching away from bincode to another serde-compatible binary format that supports API evolution is imminent. CBOR is a good fit but I'm going to shop around a bit / discuss with @shaladdle, etc.

Boscop commented on May 5, 2024

Maybe the Protobuf or Thrift binary format would be better than CBOR, if in the case of Vec<T> they encode the keys of T only once.

tikue commented on May 5, 2024

I haven't seen a protobuf or thrift crate that supports serde. Have you? The main requirements for tarpc's serialization protocol are:

  1. Serde compatible
  2. Supports API evolution
  3. Fast
  4. Compact

Boscop commented on May 5, 2024

What are the contenders, other than CBOR?
Maybe it would make sense to use a modified CBOR (with the optimized Vec<T> encoding)?
Since all Vecs will be homogeneous.

tikue commented on May 5, 2024

CBOR is only the first I've spent time looking into. Will update here over the coming days as I investigate the various options.

Boscop commented on May 5, 2024

I haven't seen a protobuf or thrift crate that supports serde. Have you?

Just found this (not complete yet but looks promising):
https://github.com/dflemstr/rq/tree/master/serde-protobuf
There is also serde-avro in the same repo:
https://github.com/dflemstr/rq/tree/master/serde-avro

Both can only deserialize right now.

Btw, why not msgpack?
https://github.com/3Hren/msgpack-rust

Edit: This person recommends ASN.1:

Personally, despite my love/hate relationship with it, I'd probably use ASN.1 for most RPC and message transmission purposes

There are a couple of ASN.1 crates but none with serde support afaik:
serde-rs/serde#292

There are some interesting graphs in this answer:

Uber recently evaluated several of these libraries on their engineering blog
The winner for them? MessagePack + zlib for compression

tikue commented on May 5, 2024

Thanks, I'll look into those further! I also just saw this Reddit post about ssmarshal.

Edit: Ah, never mind, this is a non-self-describing format, like bincode.

tikue commented on May 5, 2024

And now there's prost!

Boscop commented on May 5, 2024

When the top-level Message type is an enum, how do I deprecate old message cases while clients are deployed using different versions (but none are using those message cases anymore)?

Assuming I have previously checked that no deployed clients are still using a version of the protocol that makes them send those Message cases.
So the iteration for changing certain Message cases is:

  • Add new Message cases to the enum
  • Change client code to use new Message cases instead of old ones
  • Push updates to clients (rather: let them ask for it, because with tarpc, clients have to do a request first)
  • When all clients have updated to the new Message cases: deprecate/remove the old cases

They can't just be removed from the enum because it would invalidate the tag id (newly compiled code would map a different tag id to a message case than deployed client code), right?

Should all "deleted" message cases be replaced by DeprecatedN?
E.g. if I have:

enum Message {
    Foo(Foo),
    Bar(Bar),
    Baz(Baz),
}

And I want to add a field to the Foo case and "remove" the Bar case, should I turn it into:

enum Message {
    Deprecated0,
    Deprecated1,
    Baz(Baz),
    Foo2(Foo, Bla),
}

to keep the tag id the same between newly-compiled server code and deployed client code?
(So new cases can only be added at the end. To get a new tag id.)
Or how should I "remove" them?
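The tag-shift concern can be checked with a small sketch. It assumes a format that identifies each variant by its position in the enum declaration, and it drops the payloads so that plain `as u32` casts can stand in for the wire tag; `MessageV1`, `NaiveV2`, and `PlaceholderV2` are hypothetical names for the three versions discussed above.

```rust
// Payloads are omitted so the declaration-order index can be read with `as`.
// Assumption: the wire tag of a variant equals its declaration index.

/// What the deployed clients were compiled against.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MessageV1 {
    Foo, // tag 0
    Bar, // tag 1
    Baz, // tag 2
}

/// Naive cleanup: deleting variants renumbers everything after them.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NaiveV2 {
    Baz,  // tag 0, which old clients use for Foo
    Foo2, // tag 1, which old clients use for Bar
}

/// Placeholder cleanup: retired slots stay occupied, new cases go at the end.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PlaceholderV2 {
    Deprecated0, // was Foo, tag 0
    Deprecated1, // was Bar, tag 1
    Baz,         // still tag 2
    Foo2,        // new tag 3
}

/// Returns the tag of `Baz` in the old, naive, and placeholder versions.
fn baz_tags() -> (u32, u32, u32) {
    (
        MessageV1::Baz as u32,
        NaiveV2::Baz as u32,
        PlaceholderV2::Baz as u32,
    )
}
```

`baz_tags()` returns `(2, 0, 2)`: only the placeholder version keeps `Baz` where deployed clients expect it. A format that identifies variants by name instead of by index would not have this particular problem, at the cost of larger messages.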

Boscop commented on May 5, 2024

Are all signatures of the service!{} methods mapped to an enum?

When I have an enum like enum Service1 { Method1(u8) } that I'm serializing with bincode but then I try to deserialize into enum Service2 { Method1(u8), Method2(u128) }, will it work, even though the enums have different sizes?

With the current tarpc version, is it safe to add more methods to a service!{} on the server, while clients are deployed? "Safe" meaning that the deployed clients can still communicate with the server (calling the methods that were part of the service!{} when their software was compiled).

Or will that make the server and client protocol incompatible due to the differences in the binary size of the enum that the service!{} represents?

tikue commented on May 5, 2024

Sorry, yes, that is indeed the case.

tikue commented on May 5, 2024

I believe, now that tarpc is transport-agnostic, that this is an external concern. Closing but feel free to reopen if you think I've missed this!
