native_model's Introduction

Native model

Adds interoperability on top of serialization formats like bincode, postcard, etc.

See concepts for more details.

Goals

  • Interoperability: Allows different applications to work together, even if they are using different versions of the data model.
  • Data Consistency: Ensure that the data is decoded into the expected model.
  • Flexibility: You can use any serialization format you want. More details here.
  • Performance: A minimal overhead (encode: ~20 ns, decode: ~40 ps). More details here.

Usage

       Application 1 (DotV1)        Application 2 (DotV1 and DotV2)
                |                                  |
   Encode DotV1 |--------------------------------> | Decode DotV1 to DotV2
                |                                  | Modify DotV2
   Decode DotV1 | <--------------------------------| Encode DotV2 back to DotV1
                |                                  |
// Application 1
let dot = DotV1(1, 2);
let bytes = native_model::encode(&dot).unwrap();

// Application 1 sends bytes to Application 2.

// Application 2
// We are able to decode the bytes directly into a new type DotV2 (upgrade).
let (mut dot, source_version) = native_model::decode::<DotV2>(bytes).unwrap();
assert_eq!(dot, DotV2 { 
    name: "".to_string(), 
    x: 1, 
    y: 2 
});
dot.name = "Dot".to_string();
dot.x = 5;
// For interoperability, we encode the data with the version compatible with Application 1 (downgrade).
let bytes = native_model::encode_downgrade(dot, source_version).unwrap();

// Application 2 sends bytes to Application 1.

// Application 1
let (dot, _) = native_model::decode::<DotV1>(bytes).unwrap();
assert_eq!(dot, DotV1(5, 2));
  • Full example here.

Serialization format

You can use default serialization formats via the feature flags, like:

[dependencies]
native_model = { version = "0.1", features = ["bincode_2_rc"] }

Each feature flag corresponds to a specific minor version of the serialization format. In order to avoid breaking changes, the default serialization format is the oldest one.
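
For example, assuming the feature flags mirror the optional dependency names listed in Cargo.toml (bincode_1_3, bincode_2_rc, postcard_1_0), switching to the postcard-based format would look like this:

[dependencies]
# Assumption: the feature flag is named after the optional dependency "postcard_1_0".
native_model = { version = "0.1", features = ["postcard_1_0"] }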

Custom serialization format

Define a struct with the name you want. This struct must implement the native_model::Encode and native_model::Decode traits.

Full examples:

For other examples, see the default implementations in the crate.
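
As a rough sketch, a JSON codec built on serde_json could look like the following. The exact shapes of the native_model::Encode and native_model::Decode traits are assumptions here (generic over the model type, with an associated Error type); check the default implementations for the authoritative signatures.

use serde::{de::DeserializeOwned, Serialize};

// Hypothetical JSON codec; the trait signatures below are assumptions,
// see the crate's default implementations for the exact shapes.
struct NativeModelJson;

impl<T: Serialize> native_model::Encode<T> for NativeModelJson {
    type Error = serde_json::Error;
    fn encode(obj: &T) -> Result<Vec<u8>, Self::Error> {
        serde_json::to_vec(obj)
    }
}

impl<T: DeserializeOwned> native_model::Decode<T> for NativeModelJson {
    type Error = serde_json::Error;
    fn decode(data: Vec<u8>) -> Result<T, Self::Error> {
        serde_json::from_slice(&data)
    }
}

The codec is then selected per model with the with attribute, e.g. #[native_model(id = 1, version = 1, with = NativeModelJson)].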

Data model

Define your model using the macro native_model.

Attributes:

  • id = u32: The unique identifier of the model.
  • version = u32: The version of the model.
  • with = type: The serialization format that you use for the Encode/Decode implementation. Setup here.
  • from = type: Optional, the previous version of the model.
    • type: The previous version of the model that you use for the From implementation.
  • try_from = (type, error): Optional, the previous version of the model with error handling.
    • type: The previous version of the model that you use for the TryFrom implementation.
    • error: The error type that you use for the TryFrom implementation.
use native_model::native_model;
use serde::{Deserialize, Serialize};

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 1)]
struct DotV1(u32, u32);

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 2, from = DotV1)]
struct DotV2 {
    name: String,
    x: u64,
    y: u64,
}

// Implement the conversion between versions From<DotV1> for DotV2 and From<DotV2> for DotV1.

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 3, try_from = (DotV2, anyhow::Error))]
struct DotV3 {
    name: String,
    cord: Cord,
}

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct Cord {
    x: u64,
    y: u64,
}

// Implement the conversion between versions: TryFrom<DotV2> for DotV3 and TryFrom<DotV3> for DotV2.
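
The conversions referenced in the comments above are ordinary From/TryFrom implementations. A minimal sketch, with a field mapping chosen to match the usage example earlier (the upstream example may differ):

impl From<DotV1> for DotV2 {
    fn from(dot: DotV1) -> Self {
        DotV2 { name: String::new(), x: dot.0 as u64, y: dot.1 as u64 }
    }
}

impl From<DotV2> for DotV1 {
    fn from(dot: DotV2) -> Self {
        // The name is dropped and the coordinates are truncated on downgrade.
        DotV1(dot.x as u32, dot.y as u32)
    }
}

impl TryFrom<DotV2> for DotV3 {
    type Error = anyhow::Error;
    fn try_from(dot: DotV2) -> Result<Self, Self::Error> {
        Ok(DotV3 { name: dot.name, cord: Cord { x: dot.x, y: dot.y } })
    }
}

impl TryFrom<DotV3> for DotV2 {
    type Error = anyhow::Error;
    fn try_from(dot: DotV3) -> Result<Self, Self::Error> {
        Ok(DotV2 { name: dot.name, x: dot.cord.x, y: dot.cord.y })
    }
}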

Status

Early development. Not ready for production.

Concepts

To understand how native_model works, you need to be familiar with the following concepts.

  • Identity (id): The unique identifier of the model. It identifies the model and prevents decoding the data into the wrong Rust type.
  • Version (version): The version of the model. It is used to check the compatibility between two versions of a model.
  • Encode: The process of converting a model into a byte array.
  • Decode: The process of converting a byte array into a model.
  • Downgrade: The process of converting a model into a previous version of the model.
  • Upgrade: The process of converting a model into a newer version of the model.

Under the hood, native_model is a thin wrapper around the serialized data. The id and the version are each encoded as a little_endian::U32, which adds 8 bytes at the beginning of the data.

+------------------+------------------+------------------------------------+
|     ID (4 bytes) | Version (4 bytes)| Data (indeterminate-length bytes)  |
+------------------+------------------+------------------------------------+
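
Based on this layout, reading the header by hand amounts to two little-endian u32 reads. A minimal sketch (real code should go through the crate's decode functions instead):

// Reads the 8-byte native_model header described above and returns
// (id, version, payload).
fn read_header(bytes: &[u8]) -> Option<(u32, u32, &[u8])> {
    let id = u32::from_le_bytes(bytes.get(0..4)?.try_into().ok()?);
    let version = u32::from_le_bytes(bytes.get(4..8)?.try_into().ok()?);
    Some((id, version, &bytes[8..]))
}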

Full example here.

Performance

Native model has been designed to have a minimal and constant overhead, meaning the overhead is the same regardless of the size of the data. Under the hood we use the zerocopy crate to avoid unnecessary copies.

👉 To get the total encode/decode time, add the time of your serialization format.

Summary:

  • Encode: ~20 ns
  • Decode: ~40 ps

data size   encode time (ns)        decode time (ps)
1 B         19.769 ns - 20.154 ns   40.526 ps - 40.617 ps
1 KiB       19.597 ns - 19.971 ns   40.534 ps - 40.633 ps
1 MiB       19.662 ns - 19.910 ns   40.508 ps - 40.632 ps
10 MiB      19.591 ns - 19.980 ns   40.504 ps - 40.605 ps
100 MiB     19.669 ns - 19.867 ns   40.520 ps - 40.644 ps

Benchmark of the native model overhead here.
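
To reproduce numbers of this kind locally, a criterion benchmark along the following lines can be used (criterion is already a dev-dependency; DotV1 is the model defined earlier and is assumed to be in scope; the actual benchmark in the repo may be structured differently). Note that this measures encode end to end, i.e. the native_model overhead plus the configured serialization format.

use criterion::{criterion_group, criterion_main, Criterion};
use std::hint::black_box;

// Benchmarks native_model::encode on a small model (DotV1 from the README above).
fn bench_encode(c: &mut Criterion) {
    let dot = DotV1(1, 2);
    c.bench_function("native_model_encode", |b| {
        b.iter(|| native_model::encode(black_box(&dot)).unwrap())
    });
}

criterion_group!(benches, bench_encode);
criterion_main!(benches);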

native_model's People

Contributors

flrgh, renovate[bot], semantic-release-bot, vincent-herlemont

native_model's Issues

Using Profile-Guided Optimization (PGO) to reduce the encode overhead even more

Hi!

Similarly to vincent-herlemont/native_db#92 I decided to perform PGO benchmarks on native_model. Here are the results.

Test environment

  • Fedora 39
  • Linux kernel 6.6.9
  • AMD Ryzen 9 5900x
  • 48 GiB RAM
  • SSD Samsung 980 Pro 2 TiB
  • Compiler - Rustc 1.75
  • native_model version: the latest for now from the main branch on commit 62b1e7cc35e64bce9feb22d6727a4f66fc9b9660
  • Disabled Turbo boost (for more stable results across benchmark runs)

Benchmark

For benchmark purposes, I use the built-in benchmarks with the cargo bench command. For PGO optimization I use the cargo-pgo tool. The same benchmark suite was used for the PGO training phase, built with cargo pgo bench. The PGO-optimized results were obtained with cargo pgo optimize bench.

All measurements are done multiple times to check reproducibility - the results are stable across runs.

Results

I got the following results:

At least according to the results above, PGO helps achieve better overall performance with native_model. Comparing the ASM of the PGO-optimized and non-PGO-optimized builds could also suggest ways to optimize native_model more aggressively.

Further steps

I can suggest the following action points:

  • Perform more PGO benchmarks on native_model. If they show improvements, add a note to the documentation about the possible performance gains with PGO, so that native_model users are aware of its effects and can decide whether to enable PGO for their native_model-based applications.

Please treat this issue as just a benchmark report rather than a problem report. I created it as an issue only because Discussions are not enabled in this repo.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

cargo
Cargo.toml
  • zerocopy 0.7.32
  • thiserror 1.0
  • anyhow 1.0
  • serde 1.0
  • bincode_1_3 1.3
  • bincode_2_rc 2.0.0-rc.3
  • postcard_1_0 1.0
  • serde_json 1.0
  • criterion 0.5.1
  • skeptic 0.13
  • skeptic 0.13
native_model_macro/Cargo.toml
  • syn 2.0
  • quote 1.0
  • proc-macro2 1.0.78
tests_crate/Cargo.toml
  • serde 1.0
  • bincode 2.0.0-rc.3
  • postcard 1.0
  • anyhow 1.0
github-actions
.github/workflows/build_and_test_release.yml
  • actions/checkout v4@8ade135a41bc03ea155e62e844d188df1ea18608
  • actions-rs/toolchain v1
  • extractions/setup-just v1
  • hustcer/setup-nu v3.8
  • actions/checkout v4@8ade135a41bc03ea155e62e844d188df1ea18608
  • actions/setup-node v4
  • cycjimmy/semantic-release-action v3
.github/workflows/conventional_commits.yml
  • actions/checkout v4@8ade135a41bc03ea155e62e844d188df1ea18608
  • webiny/action-conventional-commits v1.2.0

  • Check this box to trigger a request for Renovate to run again on this repository
