
zig-protobuf's Introduction

zig-protobuf


Welcome!

This is an implementation of Google's Protocol Buffers version 3 in Zig.

Protocol Buffers is a serialization protocol that lets systems, written in any programming language and running on any platform, exchange data reliably.

Protobuf's strength lies in a generic codec paired with user-defined "messages" that define the actual shape of the encoded data.

Messages are usually mapped to a native language's structure/class definitions by a language-specific generator associated with each implementation.

Zig's compile-time evaluation is extremely useful in this context: because the structure of a message has to be known beforehand, the generic codec can leverage information about the message and its nature at compile time. This enables optimizations that are hard to achieve as easily in other languages, as Zig can mix compile-time information with runtime-only data to optimize the encoding and decoding code paths.
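As a small illustration (not the library's actual code, and the Example message is made up), comptime reflection lets a generic codec walk a message's fields with zero runtime cost:

```zig
const std = @import("std");

// Hypothetical message type, for illustration only.
const Example = struct {
    id: u32 = 0,
    name: []const u8 = "",
};

pub fn main() void {
    // This loop is fully unrolled at compile time; the codec can branch
    // on each field's type without any runtime reflection machinery.
    inline for (@typeInfo(Example).Struct.fields) |field| {
        std.debug.print("field: {s}\n", .{field.name});
    }
}
```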

State of the implementation

This repository, so far, only aims at implementing protocol buffers version 3.

The latest version of the zig compiler used for this project is 0.12.0.

This project is currently able to handle all scalar types for encoding, decoding, and generation through the plugin.

How to use

  1. Add protobuf to your build.zig.zon.
    .{
        .name = "my_project",
        .version = "0.0.1",
        .paths = .{""},
        .dependencies = .{
            .protobuf = .{
                .url = "https://github.com/Arwalk/zig-protobuf/archive/<some-commit-sha>.tar.gz",
                .hash = "12ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff",
                // leave the hash as is, the build system will tell you which hash to put here based on your commit
            },
        },
    }
  2. Use the protobuf module
    pub fn build(b: *std.Build) !void {
        const target = b.standardTargetOptions(.{});
        const optimize = b.standardOptimizeOption(.{});

        // first create a build for the dependency
        const protobuf_dep = b.dependency("protobuf", .{
            .target = target,
            .optimize = optimize,
        });

        // ... create your executable with b.addExecutable(...) ...

        // and lastly use the dependency as a module
        exe.root_module.addImport("protobuf", protobuf_dep.module("protobuf"));
    }

Generating .zig files out of .proto definitions

You can do this programmatically as a compilation step for your application. The following snippet shows how to create a zig build gen-proto command for your project.

const protobuf = @import("protobuf");

pub fn build(b: *std.Build) !void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // first create a build for the dependency
    const protobuf_dep = b.dependency("protobuf", .{
        .target = target,
        .optimize = optimize,
    });
    
    ...

    const gen_proto = b.step("gen-proto", "generates zig files from protocol buffer definitions");

    const protoc_step = protobuf.RunProtocStep.create(b, protobuf_dep.builder, target, .{
        .destination_directory = .{
            // out directory for the generated zig files
            .path = "src/proto",
        },
        .source_files = &.{
            "protocol/all.proto",
        },
        .include_directories = &.{},
    });

    gen_proto.dependOn(&protoc_step.step);
}

If you're really bored, you can buy me a coffee on Ko-fi.

zig-protobuf's People

Contributors

arwalk, hendriknielaender, jcalabro, malcolmstill, menduz, nefixestrada, vesim987


zig-protobuf's Issues

Manage errors at decoding

There is currently no check on the validity of decoded values (for instance, a decoded varint may be bigger than what can actually fit in the struct field).

Need to do this for any kind of varint.

Impact on decode_varint (if the value doesn't fit, return an error) and on the transformation done to fit the final structure in pb_decode.
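A hedged sketch of what such a bounds-checked decode could look like (the function name and error values are hypothetical, not the library's API):

```zig
const std = @import("std");

// Hypothetical sketch: decode a varint into T, returning an error instead
// of truncating when the value does not fit or the input is malformed.
fn decodeVarintChecked(comptime T: type, input: []const u8) !T {
    var value: u64 = 0;
    var shift: u32 = 0;
    for (input) |byte| {
        // More than 10 continuation bytes (shift >= 64) can never be valid.
        if (shift >= 64) return error.InvalidInput;
        value |= @as(u64, byte & 0x7F) << @intCast(shift);
        if (byte & 0x80 == 0) {
            // Reject values too large for the destination type.
            if (value > std.math.maxInt(T)) return error.InvalidInput;
            return @intCast(value);
        }
        shift += 7;
    }
    return error.InvalidInput; // ran out of bytes mid-varint
}
```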

build error with zig 0.10.0

./build.zig:24:13: error: no member named 'path' in struct 'std.build.Pkg'
.path = std.build.FileSource{.path = "src/protobuf.zig"}

Manage OneOf encoding

Parent field has a field_type "OneOf"

Make tagged unions, tagged union has a _union_desc_table too
Generate a function that returns the active tag's value + field descriptor
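A rough sketch of the shape this could take (the descriptor layout and names here are purely illustrative, not the generator's actual output):

```zig
// Illustrative only: a oneof modeled as a tagged union carrying its own
// descriptor table, plus a helper returning the active field's number.
pub const SampleOneOf = union(enum) {
    a: u32,
    b: []const u8,

    pub const _union_desc = .{
        .a = 1, // field numbers; real descriptors would carry more info
        .b = 2,
    };

    pub fn activeFieldNumber(self: SampleOneOf) u32 {
        return switch (self) {
            .a => _union_desc.a,
            .b => _union_desc.b,
        };
    }
};
```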

Unable to find module when adding to `build.zig`

Trying to follow the instructions in the README, but zig seems unable to find the module.

I have tried with both zig v0.11.0 and 0.12.0-dev.2587+a1b607acb

❯ zig build --help
thread 3057254 panic: unable to find module 'protobuf'
/Users/stephen/.local/bin/zig/lib/std/debug.zig:434:22: 0x10013fc5b in panicExtra__anon_18444 (build)
    std.builtin.panic(msg, trace, ret_addr);
                     ^
/Users/stephen/.local/bin/zig/lib/std/debug.zig:409:15: 0x10010cb47 in panic__anon_17783 (build)
    panicExtra(null, null, format, args);
              ^
/Users/stephen/.local/bin/zig/lib/std/Build.zig:1786:18: 0x1000dae4f in module (build)
            panic("unable to find module '{s}'", .{name});
                 ^
/Users/stephen/code/learn/zig/build.zig:54:46: 0x10009bc87 in build (build)
                .module = protobuf_dep.module("protobuf"),
                                             ^
/Users/stephen/.local/bin/zig/lib/std/Build.zig:1982:33: 0x10007ce37 in runBuild__anon_8404 (build)
        .Void => build_zig.build(b),
                                ^
/Users/stephen/.local/bin/zig/lib/build_runner.zig:310:29: 0x100078b3b in main (build)
        try builder.runBuild(root);
                            ^
/Users/stephen/.local/bin/zig/lib/std/start.zig:585:37: 0x10007e47b in main (build)
            const result = root.main() catch |err| {
                                    ^
???:?:?: 0x18c0e50df in ??? (???)
???:?:?: 0x56307fffffffffff in ??? (???)
error: the following build command crashed:
/Users/stephen/code/learn/zig/zig-cache/o/fca68244e4a64f1c935cc61c53853d6b/build /Users/stephen/.local/bin/zig/zig /Users/stephen/code/learn/zig /Users/stephen/code/learn/zig/zig-cache /Users/stephen/.cache/zig --seed 0x5b7babad -Z132fe72e70486721 --help

Tags can be deduced at compile time

In append_tag, the encoded tag value can be calculated at compile time, since the wire value and tag value are both known at compile time.
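For reference, a protobuf tag is (field_number << 3) | wire_type, so the comptime computation could be as simple as (a sketch, not the library's code):

```zig
const std = @import("std");

// Sketch: both inputs are comptime-known, so the tag is a pure constant.
fn encodedTag(comptime field_number: u32, comptime wire_type: u3) u32 {
    return (field_number << 3) | wire_type;
}

comptime {
    // field 1 with wire type 0 (varint) encodes as tag byte 8
    std.debug.assert(encodedTag(1, 0) == 8);
}
```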

user error caused panic: integer cast truncated bits

Will look into this more tomorrow, but wanted to share quickly in case it's obvious:

thread 1951688 panic: integer cast truncated bits
/home/tj/.cache/zig/p/1220d226686f6023fae80156c729382295c672d6ffaa746d2932d73f16bd9affc758/src/protobuf.zig:634:77: 0x2dc131 in decode_varint__anon_12288 (a)
        value += (@as(T, input[index] & 0x7F)) << (@as(std.math.Log2Int(T), @intCast(shift)));
                                                                            ^
/home/tj/.cache/zig/p/1220d226686f6023fae80156c729382295c672d6ffaa746d2932d73f16bd9affc758/src/protobuf.zig:729:51: 0x2ca405 in next (a)
            const tag_and_wire = try decode_varint(u32, state.input[state.current_index..]);

Debug prints:

std.math.Log2Int(T): u5 | shift: 0
std.math.Log2Int(T): u5 | shift: 7
std.math.Log2Int(T): u5 | shift: 14
std.math.Log2Int(T): u5 | shift: 21
std.math.Log2Int(T): u5 | shift: 28
std.math.Log2Int(T): u5 | shift: 35 <---- crashes here for obvious reasons

My proto is essentially this:

message m {
  optional uint32 a = 1;
  optional float b = 2;
  message C {
    optional string d = 1;
  }
  optional C c = 3;
}

And I'm sending the following over the wire:

const data = try client_server.encode(allocator);
// data: [_]u8{ 8, 240, 1, 21, 0, 112, 155, 69 }

which looks like it should decode correctly to:

Byte Range | Field Number | Type    | Content
0-3        | 1            | varint  | As Int: 240, As Signed Int: 120
3-8        | 2            | fixed32 | As Int: 1167814656, As Float: 4974

Hopefully I'm not doing something silly. Thanks for the library!

MessageMixins included in every struct: is this redundant?

Take this file as an example:

syntax = "proto3";

package test_demo;
message Foo {
  string name = 1;
  repeated string loved = 2;
  int32 birth = 3;
}
protoc --zig_out=. foo.proto

This will output

// Code generated by protoc-gen-zig
 ///! package test_demo
const std = @import("std");
const Allocator = std.mem.Allocator;
const ArrayList = std.ArrayList;

const protobuf = @import("protobuf");
const ManagedString = protobuf.ManagedString;
const fd = protobuf.fd;

pub const Foo = struct {
    name: ManagedString = .Empty,
    loved: ArrayList(ManagedString),
    birth: i32 = 0,

    pub const _desc_table = .{
        .name = fd(1, .String),
        .loved = fd(2, .{ .List = .String}),
        .birth = fd(3, .{ .Varint = .Simple }),
    };

    pub usingnamespace protobuf.MessageMixins(@This());
};

It seems every struct contains a MessageMixins to provide decode/encode methods.

Since those methods are generic over T, we could just let users call protobuf.pb_encode(struct, allocator) directly; this might be beneficial for compile times?
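A minimal sketch of the free-function shape being proposed (pbEncode and its placeholder body are hypothetical; a real version would consult _desc_table and emit proper wire bytes):

```zig
const std = @import("std");

// Illustration of the proposal: one generic free function usable for any
// message struct, instead of mixing encode/decode into each struct.
fn pbEncode(msg: anytype, allocator: std.mem.Allocator) ![]u8 {
    var out = std.ArrayList(u8).init(allocator);
    errdefer out.deinit();
    // All the type information needed is available at comptime from
    // @TypeOf(msg); this placeholder loop just demonstrates the dispatch.
    inline for (@typeInfo(@TypeOf(msg)).Struct.fields) |field| {
        _ = @field(msg, field.name); // a real encoder would serialize here
        try out.append(0);
    }
    return out.toOwnedSlice();
}
```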

Happy to hear your thoughts, thanks.

Manage required fields through restrictions

Can use a method check_validity(self: T) InvalidErrors!void { ... } to ensure the message adheres to restrictions and options before encoding and after decoding.

Can be a first step for required fields on proto2
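The proposed hook might be shaped like this (the error set and the Person message are made up for illustration):

```zig
// Hypothetical sketch of the proposed validity hook.
const InvalidErrors = error{MissingRequiredField};

const Person = struct {
    name: ?[]const u8 = null,

    // Would run before encoding and after decoding.
    pub fn check_validity(self: Person) InvalidErrors!void {
        if (self.name == null) return error.MissingRequiredField;
    }
};
```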

Error loading library

Hello, same problem here with zig 0.12.0-dev.3161+377ecc6af. It can't find the protobuf module, and it also complains that the project does not have a build.zig.zon file, so zig fetch --save does not work.

Originally posted by @davo417 in #37 (comment)

Incremental decoding / Streaming API

At a low level, the protobuf encoding allows you to decode one field at a time. But usually it is more convenient to have an API where you decode one message at a time.

To do this, most protobuf libraries will require the entire encoded message to be in contiguous memory before decoding it. This is fine in most cases, but sometimes you want to be able to decode messages incrementally. This saves on copying out to intermediate buffers.

Example use case in an embedded system, or a kernel driver:
Data is received into a ringbuffer or FIFO from some hardware peripheral. This buffer could be smaller than the size of an encoded protobuf message, and depending on the head/tail position, some messages may also wrap around from the end to the start of the buffer. Normally it would need to be defragmented/copied to an intermediate buffer before decoding, which takes more memory, or you need to resort to manually parsing one field at a time. In any case some manual logic / state machine is required.

Proposal: a "throw bytes at this function" API, where the library calls you back every time it has fully decoded a message.
The decoding state machine is generated by the library.
Example usage code:

// Only need to call this once. Possibly comptime.
var decode_state = protobuf.pb_decoder_init(.{ 
    .message_type = MyMessageType,
    .allocator = allocator,
    .delimiter_parser = protobuf.VarIntDelimiterParser, // Library provided implementation, or could be a custom implementation
    .decode_callback = message_decoded_callback,
});
// Somewhere else in the codebase: we receive some bytes.
// This is a slice with any number of bytes, i.e. half a message, or three entire messages.
// message_decoded_callback is called whenever a MyMessageType is fully decoded,
// across any number of calls to pb_decode_bytes.
// This function should only use a buffer large enough to decode one field.
// It only calls the allocator when decoding dynamic data like strings or repeated fields.
try protobuf.pb_decode_bytes(&decode_state, bytes_from_somewhere());
fn message_decoded_callback(decoded: *const MyMessageType) void {
    defer decoded.deinit();
    // do stuff with decoded message 
}

I am not familiar enough with zig-protobuf internals to know how hard this is to implement, but I imagine it would be a good chunk of code. From an API design perspective, the hard part is going to be the delimiter parser, which would be some interface to parse the delimiters/headers. Ideally it would be flexible enough that you could implement things like checksums/CRC with it as well.

A streaming encoder would be nice to have as well, but it is not as important in my opinion.

I ended up typing out quite the story here, but it is just a draft/idea; I understand if it is outside the scope of this project.

Start code generator

Take inspiration from nanopb's python generator

Aim for varint support and a non-plugin approach first.

discussion: Managed strings

As of today, the library leaks memory for user-allocated strings.

Tests are not failing because static/const strings are not counted as leaks.

This could be fixed by recursively deallocating strings in the .deinit function, but two problems may arise:

  1. It should be extremely clear for the user that the ownership of string memory is now passed onto the proto message. Plus, the allocator for the string MUST be exactly the same as the one used to deinitialize the message.
  2. Deallocating const strings may result in runtime exceptions

Proposal:

Replace the plain []const u8 with ManagedString. Keep ?ManagedString for optional semantics. ManagedString is a union of OwnedString and ConstString; the former is automatically deallocated, the latter won't be.

const ManagedStringKeys = enum {
  OwnedString,
  ConstString
};

const ManagedString = union(ManagedStringKeys) {
  // memory that needs to be allocated
  OwnedString: []const u8,
  // memory that is static/const
  ConstString: []const u8,
};
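Given the union above, the cleanup rule the proposal implies might be sketched as (deinitManaged is a hypothetical name, not part of the proposal):

```zig
const std = @import("std");

// Sketch of the deallocation rule: OwnedString memory is freed,
// ConstString is left untouched.
fn deinitManaged(s: ManagedString, allocator: std.mem.Allocator) void {
    switch (s) {
        .OwnedString => |bytes| allocator.free(bytes),
        .ConstString => {},
    }
}
```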

Reduce scope for 0.9.0 release

I'm currently unable to find a solution for the OneOf field. Just axe it for a 0.9.0 release and get everything else working.

Maybe I should just leave the idea of comptime optimisation for OneOf and have a simpler process?

Everything else works.
