sharksforarms / deku
Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
License: Apache License 2.0
Defaults the member to the default of the type and skips reading
Test against bitvec `develop` branch
`count` is the only difference and could be added to `BitsReader::read` as an `Option<usize>`
pub struct MyStruct {
ext_len: usize,
#[deku(len_field = "ext_len")]
extensions: Vec<Extension>,
}
Allow something like this and update the `ext_len` before dumping bits to `acc`
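A hand-rolled sketch (plain Rust, no deku) of what such an update-before-write could do, with a plain `Vec<u8>` standing in for both `Vec<Extension>` and the `acc` bit accumulator:

```rust
// Sketch: recompute `ext_len` from the actual vector length before
// serializing, so the two can never disagree. Names are illustrative.
struct MyStruct {
    ext_len: usize,
    extensions: Vec<u8>, // stand-in for Vec<Extension>
}

impl MyStruct {
    fn to_bytes(&mut self) -> Vec<u8> {
        // Update the length field from the data before dumping to the accumulator
        self.ext_len = self.extensions.len();
        let mut acc = Vec::new();
        acc.push(self.ext_len as u8);
        acc.extend_from_slice(&self.extensions);
        acc
    }
}
```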
For example, passing the top-level endian down to its child:
#[deku(endian = "big")]
struct Parent {
child: Child
}
#[deku(id_type = "u16", ctx = "_endian: deku::ctx::Endian")] // will default to system endianness, no way to use ctx endian
enum Child {
Variant
}
I feel like another name would be better suited, possibly matching the proc-macro's names if that's the convention: `DekuRead` / `DekuWrite`
For example:
enum Foo {
#[deku(id = "0..=9")]
A(u8),
#[deku(id = "id if id > 9")]
B(u8, u8),
}
For composability, it would be nice to do something like the following:
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct FieldB {
#[deku(bits = "6")]
data: u8,
}
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct DekuTest {
#[deku(bits = "2")]
field_a: u8,
field_b: FieldB
}
I can't think of a reason why there's both `id_bits` and `bits`; for example, there's `endian` but not `id_endian` (we use `endian` for enums). This would be more consistent with fields/structs.
Also rename `id_type` to `type` and `id` to `value`.
Before:
#[deku(id_type = "u8", id_bits = "5")]
enum Test {
#[deku(id = "0x01")]
VarA,
}
After:
#[deku(type = "u8", bits = "5")]
enum Test {
#[deku(value = "0x01")]
VarA,
}
Instead of specifying arbitrary code in these attributes, only accept a function ident (like serde's `default`) and document the function prototypes for each
Current enum behavior
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8")]
enum Packet {
#[deku(id = "0x00")]
Zero,
#[deku(id = "0x01")]
One,
#[deku(id = "0x02")]
Two,
#[deku(id = "0x03")]
Three,
#[deku(id = "0x04")]
Four,
}
An `inherit` attribute, which would inherit the `id` from the value already assigned:
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8", inherit)]
enum Packet {
Zero = 0x00,
One = 0x01,
Two = 0x02,
Three = 0x03,
Four = 0x04,
}
An `ordered` attribute, which would take the first element and increase the `id` value for each variant after it. Maybe this would increase for every enum variant that didn't have an `id` defined.
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8", ordered = "0x00")]
enum Packet {
Zero,
One,
Two,
Three,
Four,
#[deku(id = "44")]
FourtyFour,
}
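For unit-only enums with explicit discriminants, plain Rust already exposes the value an `inherit` attribute would reuse; a sketch (no deku) of what the generated id accessor could amount to:

```rust
// Sketch: with `inherit`, the derive could reuse the discriminant as the
// wire id. For unit-only enums, `as u8` already yields exactly that value.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Packet {
    Zero = 0x00,
    One = 0x01,
    Two = 0x02,
    Three = 0x03,
    Four = 0x04,
}

impl Packet {
    // What a generated id-writer could call under `inherit`
    fn id(self) -> u8 {
        self as u8
    }
}
```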
A fixed number of elements to be read
Example:
struct Test {
#[deku(count = "2")]
data: Vec<u8>
}
Possibly also rename the `len` attribute to `count_field`?
Currently, when an error happens in the derive macro, it will just panic; we should use `syn::Error::to_compile_error()` instead.
For example, an invalid attribute gives this message:
error: proc-macro derive panicked
--> src\main.rs:3:10
|
3 | #[derive(DekuRead)]
| ^^^^^^^^
|
= help: message: called `Result::unwrap()` on an `Err` value: Error { kind: UnknownField(ErrorUnknownField { name: "id", did_you_mean: None }), locations: ["b"], span: Some(#0 bytes(79..86)) }
A better message could be:
error: unknown deku field attribute `id`
--> src\main.rs:3:10
|
7 | #[deku(id = "")]
| ^^
Allow running a function on the read value
Examples:
struct SomeStruct {
#[deku(map = "|f: u8| f.to_string()")]
field_a: String,
}
fn map_string(f: u8) -> String {
f.to_string()
}
struct SomeStruct {
#[deku(map = "map_string")]
field_a: String,
}
Edit:
Can do something with trait calls like so:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9a711662336e1b0059e40947250b05ff
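Roughly, a `map` attribute would expand to calling the given function on the freshly read value; a plain-Rust sketch (no deku, byte slice instead of the real bit-level reader) of that expansion:

```rust
fn map_string(f: u8) -> String {
    f.to_string()
}

// Sketch of the generated read: parse the raw u8, then run it through
// the `map` function before storing it in the field.
fn read_field_a(rest: &[u8]) -> (String, &[u8]) {
    let raw = rest[0];
    (map_string(raw), &rest[1..])
}
```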
Currently I don't see a way to use ctx as the bytes to parse an enum (instead of reading bits/bytes). The key part is that `category` and `length` need to be read before the `Message`s are parsed. The `category` field is used to decide which struct from the enum is parsed.
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct AsterixPacket {
#[deku(bytes = "1", endian = "big")]
category: u8,
#[deku(bytes = "2", endian = "big")]
length: u16,
#[deku(ctx = "*category")]
messages: Vec<Message>,
}
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_ctx = "category")]
enum Message {
#[deku(id = "48")]
Cat48(Cat48),
}
Dependabot couldn't parse the Cargo.toml found at /Cargo.toml.
The error Dependabot encountered was:
Dependabot::DependencyFileNotParseable
Add a `ctx_default` attribute to allow for containers which can either take a context or default to a fixed one:
#[derive(PartialEq, Debug, DekuRead, DekuWrite)]
#[deku(ctx = "a: u8, b: u8", ctx_default = "1, 2")]
pub struct TopLevelCtxStructDefault {
#[deku(cond = "a == 1")]
pub a: Option<u8>,
#[deku(cond = "b == 1")]
pub b: Option<u8>,
}
#[test]
fn test_ctx_default_struct() {
let expected = samples::TopLevelCtxStructDefault {
a: Some(0xff),
b: None,
};
let test_data = [0xffu8];
// Use default
let ret_read = samples::TopLevelCtxStructDefault::try_from(test_data.as_ref()).unwrap();
assert_eq!(expected, ret_read);
let ret_write: Vec<u8> = ret_read.try_into().unwrap();
assert_eq!(ret_write, test_data);
// Use context
let (rest, ret_read) =
samples::TopLevelCtxStructDefault::read(test_data.bits(), (1, 2)).unwrap();
assert!(rest.is_empty());
assert_eq!(expected, ret_read);
let ret_write = ret_read.write((1, 2)).unwrap();
assert_eq!(test_data.to_vec(), ret_write.into_vec());
}
It would be nice if there was a way to skip over bits or bytes without creating dummy fields, in order to save on space, e.g.:
pub struct SomeStruct {
field_01: u8,
// This field is useless but is needed for proper read/write
unused01: u8,
field_02: u8,
// This field is useless but is needed for proper read/write
unused02: u8,
}
Maybe some `skip_[bytes|bits]` attribute that could be added before and after fields in structs:
#[deku(skip_bytes = "1")]
In the following line:
deku-derive/src/macros/deku_read.rs:175: return Err(DekuError::Parse(format!("Could not match enum variant id = {:?}", variant_id)));
It would be nice to print out the ident name (so the name of the enum) for easier troubleshooting.
I would do it, but I can't for the life of me figure out how to print out the ident. New to proc-macros.
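In the derive, the ident is available on the parsed input (e.g. `input.ident`); converting it with `.to_string()` and interpolating that string into the `quote!`-built error message is enough. A plain-Rust sketch of the resulting runtime message, with the enum name passed in as a string standing in for the interpolated ident:

```rust
// Sketch: the error string the derive could generate once the enum's
// ident (obtained via `ident.to_string()` in the proc macro) is included.
fn variant_match_error(enum_ident: &str, variant_id: u64) -> String {
    format!(
        "Could not match enum variant id = {:?} on enum `{}`",
        variant_id, enum_ident
    )
}
```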
Users can implement custom readers but not custom writers
Merged in #37
struct DekuTest {
field_a: u8,
#[deku(reader = "bytes_to_str(rest)")]
field_b: String,
}
You should not need to implement anything on `String`, as the custom reader handles the parsing.
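A sketch of what a custom reader like `bytes_to_str` might do, using a plain byte slice in place of deku's actual bit-slice-based reader signature, and assuming a NUL-terminated encoding purely for illustration:

```rust
// Sketch: consume bytes up to a NUL terminator, return the parsed
// String plus the unconsumed rest of the input.
fn bytes_to_str(rest: &[u8]) -> (String, &[u8]) {
    let end = rest.iter().position(|&b| b == 0).unwrap_or(rest.len());
    let s = String::from_utf8_lossy(&rest[..end]).into_owned();
    // Skip the terminator itself if one was present
    let consumed = if end < rest.len() { end + 1 } else { end };
    (s, &rest[consumed..])
}
```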
Consider the following code:
use deku::prelude::*;
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
pub struct Packet {
#[deku(bytes = "1")]
length: u8,
// byte len of all of messages is length - 2
messages: Vec<Message>,
}
/// In the real packet, this would be of variable length, so we can't use just the `count` attribute on messages
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
pub struct Message {
#[deku(bytes = "1")]
msg: u8,
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test01() {
let data: Vec<u8> = vec![0x04, 0x01, 0x02];
let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();
assert_eq!(
Packet {
length: 0x04,
messages: vec![Message { msg: 0x01 }, Message { msg: 0x02 }]
},
value
);
}
}
I can use the `count` attribute to give the number of `Vec<T>` elements, but I see no way of telling the max bytes that a Vec can have in its container as a whole. Wondering if this would be a feature, or do I need to do some weird custom write/read implementation with a `read_bytes` field.
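A plain-Rust sketch of the byte-limited read (assuming one-byte messages, as in the example above): consume messages only until the `length - 2` byte budget of the container is exhausted, rather than a fixed element count.

```rust
// Sketch: read messages until the byte budget derived from `length`
// is used up, instead of counting elements.
fn read_messages(rest: &[u8], byte_limit: usize) -> (Vec<u8>, &[u8]) {
    let (body, tail) = rest.split_at(byte_limit);
    // Each Message is one byte here; a real impl would loop a Message reader
    (body.to_vec(), tail)
}

fn read_packet(data: &[u8]) -> (u8, Vec<u8>) {
    let length = data[0];
    // byte len of all of messages is length - 2
    let budget = (length as usize).saturating_sub(2);
    let (messages, _rest) = read_messages(&data[1..], budget);
    (length, messages)
}
```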
Why is the function that converts a type to bytes (`Vec<u8>`) called `to_bytes`, but the function that converts a type to bits (`BitVec<Msb0, u8>`) called `to_bitvec`? Why not `to_bits`?
Lines 251 to 255 in c7e0377
If the `bytes` attribute is used and the index is on a byte boundary, it may be quicker to read from a `&[u8]` slice instead of reading 8*n bits.
I'd like for more benchmarks to be written before so this optimization can be measured
One option could be to feature flag the bits/bytes attributes
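A minimal sketch of the proposed fast path, in plain Rust over a byte slice (the real implementation would sit on bitvec's internals): when the cursor is byte-aligned, index the slice directly; otherwise assemble the value bit by bit.

```rust
// Sketch: `bit_offset` is the current read position in bits.
fn read_u8(data: &[u8], bit_offset: usize) -> u8 {
    if bit_offset % 8 == 0 {
        // Fast path: aligned, one slice index instead of 8 bit reads
        data[bit_offset / 8]
    } else {
        // Slow path: gather 8 bits that may straddle a byte boundary
        let mut out = 0u8;
        for i in 0..8 {
            let pos = bit_offset + i;
            let bit = (data[pos / 8] >> (7 - pos % 8)) & 1;
            out = (out << 1) | bit;
        }
        out
    }
}
```

Benchmarking both paths (e.g. with criterion, as suggested elsewhere in these issues) would be the way to confirm the win before committing to it.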
Hey, I'm trying to write a simple binary parser with deku, and here are two problems I found.
I read your source and found I can pass whatever arguments to it. Sorry about that.
Consider this binary structure:
struct Data {
a: u8,
// This field depends on `Header.flag`
b: Option<u8>
}
struct Bin {
flag: u8,
data: Vec<Data>
}
Because of the lack of context, I can't find any way to parse it except writing a custom reading function manually. By the way, adding context support is a little complicated; I still don't know what the best way would be.
Overall, thanks for your great crate.
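Threading the flag down by hand is one workaround; a plain-Rust sketch (no deku) of what a ctx-passing derive could generate for the `Data`/`Bin` shape above:

```rust
// Sketch: `flag` is passed down as context; `b` is only read when the
// flag says it is present on the wire.
#[derive(Debug, PartialEq)]
struct Data {
    a: u8,
    b: Option<u8>,
}

fn read_data(input: &[u8], flag: u8) -> (Data, &[u8]) {
    let a = input[0];
    let mut rest = &input[1..];
    let b = if flag == 1 {
        let v = rest[0];
        rest = &rest[1..];
        Some(v)
    } else {
        None
    };
    (Data { a, b }, rest)
}
```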
Maybe something like this?
use deku::prelude::*;
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct Packet {
s: String,
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test01() {
// [len | string ]
let data: Vec<u8> = vec![5, 104, 101, 108, 108, 111];
let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();
assert_eq!(
Packet {
s: "hello".to_string(),
},
value
);
}
}
(len would be a u64 in a real example)
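What that length-prefixed string read looks like by hand, as a plain-Rust sketch (one-byte length prefix to match the test data above; a real format might use a u64):

```rust
// Sketch: read a length prefix, then that many bytes as UTF-8.
fn read_string(data: &[u8]) -> (String, &[u8]) {
    let len = data[0] as usize;
    let s = String::from_utf8(data[1..1 + len].to_vec()).expect("valid UTF-8");
    (s, &data[1 + len..])
}
```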
Allows for conditional field reading dependent on the return of a lambda
fn my_condition(input: &[u8], index: usize) -> bool {
// TODO: somehow get access to field_a ?
// if (field_a == 0xAB) {
// return true;
// }
return false;
}
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct DekuTest {
field_a: u8,
#[deku(bits = "7", read_if = "my_condition")]
field_b: Option<u32>,
}
Not sure the best way to give lambda access to the previously parsed fields
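One way to give the condition access to previously parsed fields is to splice it in after those fields' locals are bound; a hand-rolled sketch (plain Rust, whole bytes standing in for the 7-bit read) of what that generated reader would do:

```rust
// Sketch: field_b is only read when the already-parsed field_a matches.
// In the derive, the condition would be expanded where `field_a` is
// already a live local binding.
#[derive(Debug, PartialEq)]
struct DekuTest {
    field_a: u8,
    field_b: Option<u32>,
}

fn read(rest: &[u8]) -> DekuTest {
    let field_a = rest[0];
    let field_b = if field_a == 0xAB {
        // 4 bytes big-endian, standing in for the bit-level read
        Some(u32::from_be_bytes([rest[1], rest[2], rest[3], rest[4]]))
    } else {
        None
    };
    DekuTest { field_a, field_b }
}
```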
Line 235 in c7e0377
Why not `from_bytes(bytes: Bytes)` and `from_bits(bits: Bits)`? Why do I need to care about which bit the byte starts from when I use a function called `from_bytes`?
Readers/writers should have access to: rest, struct variables, final attribute variables (bit size, input_is_le)
The provided function can be run in a function sandbox where the needed variables are passed with documented names: i.e.
let variant_read_func = if variant_reader.is_some() {
fn sandbox_reader(rest:, input_is_le:, field_a:, field_b:) {
quote! { #variant_reader; }
}
sandbox_reader(rest, input_is_le, field_a, field_b);
}
...
Using `criterion`. Bonus if compared to other crates like `nom` for reading.
This will allow us to remove the `unwrap` in `emit_field_update`.
Branch created, but is stale... rebase to master.
I believe it would be a nice feature for child structs and enums to inherit the parent's endian type.
Currently the following code produces the following error:
`deku::DekuRead<deku::ctx::Endian>` is not implemented for B
use deku::prelude::*;
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct Packet {
len: u16,
messages: B,
}
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8")]
enum B {
#[deku(id = "0x00")]
one,
#[deku(id = "0x01")]
two,
#[deku(id = "0x02")]
three,
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test01() {
let data: Vec<u8> = vec![0x04, 0x13, 0x01];
let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();
assert_eq!(
Packet {
len: 0x0413,
messages: B::two,
},
value
);
}
}
In fact, the way of creating a compiling version of this code seems a bit odd, as I only add endian to the `len` field.
use deku::prelude::*;
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct Packet {
#[deku(endian = "big")]
len: u16,
messages: B,
}
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8")]
enum B {
#[deku(id = "0x00")]
one,
#[deku(id = "0x01")]
two,
#[deku(id = "0x02")]
three,
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test01() {
let data: Vec<u8> = vec![0x04, 0x13, 0x01];
let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();
assert_eq!(
Packet {
len: 0x0413,
messages: B::two,
},
value
);
}
}
Document each proc-macro attribute and how they're used
`from_bytes` returns (rest, value), while `try_from` impls should return an Error if not all input is consumed.
It's useful when a protocol has a magic header (e.g. zlib, pyc, jpg) or a field has a limit.
struct Foo {
#[deku(assert_eq("[0xAA, 0xBB, 0xCC]"))]
magic: [u8; 3],
#[deku(assert("a >= 128"))]
a: u8,
}
deku/deku-derive/src/macros/deku_read.rs
Line 79 in 26eccca
This seems like it could be fixed by using `const` functions. `std::mem::size_of()` is `const`, so this can all be evaluated at compile time, e.g.:
const fn bit_size() -> usize {
std::mem::size_of::<$typ>() * 8
}
Gets called when the struct is `.update()`'d, kinda like the `len` attribute but provides a custom impl.
Example:
pub struct Ipv4 {
....
#[deku(update = "calc_checksum(...)")]
pub checksum: u16, // Header checksum
....
}
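A sketch of how such an `update` hook could behave, in plain Rust: `calc_checksum` is the hypothetical name from the example above, and the wrapping byte sum below is a simplified stand-in, not the real IPv4 ones'-complement checksum.

```rust
// Sketch: on update(), recompute the checksum field from the rest of
// the header. The sum here is a simplified stand-in, not RFC 791.
struct Ipv4 {
    header: Vec<u8>,
    checksum: u16,
}

fn calc_checksum(header: &[u8]) -> u16 {
    header.iter().fold(0u16, |acc, &b| acc.wrapping_add(b as u16))
}

impl Ipv4 {
    fn update(&mut self) {
        // What `#[deku(update = "calc_checksum(...)")]` would run
        self.checksum = calc_checksum(&self.header);
    }
}
```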
Since `option_as_tokenstream` uses `Option<String>` as its input, darling will discard the span while parsing (because `String` doesn't have a span). This makes error messages hard to read.
For example:
#[derive(DekuRead)]
struct Foo {
#[deku(cond = "'a' == 2")]
a: u8,
}
error:
error[E0308]: mismatched types
--> src\main.rs:24:10
|
24 | #[derive(DekuRead)]
| ^^^^^^^^ expected `char`, found `u8`
|
= note: this error originates in a derive macro (in Nightly builds, run with -Z macro-backtrace for more info)
Replacing it with `LitStr` gives:
error[E0308]: mismatched types
--> src\main.rs:26:19
|
26 | #[deku(cond = "'a' == 2")]
| ^^^^^^^^^^ expected `char`, found `u8`
Great idea of a library.
I have a protocol that needs conditional parsing of fields in a struct. I see the skip attribute, but would something like a conditional skip be possible?
use deku::prelude::*;
use std::convert::TryFrom;
#[derive(PartialEq, Debug, DekuRead, DekuWrite)]
pub struct DekuTest {
pub field_a: u8,
#[deku(skip, if=(field_a, 1))]
pub field_b: Option<u8>,
#[deku(skip, if=(field_b, Some(1)))]
pub field_c: Option<u8>,
}
fn main() {
let data: Vec<u8> = vec![0x01, 0x02];
let value = DekuTest::from_bytes((data.as_ref(), 0)).unwrap();
println!("{:#?}", value)
}
Instead of having an allocation per field, write should take a mutable bitvec by reference to extend.
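The same idea in plain Rust, with `Vec<u8>` standing in for the bitvec: every writer extends one shared buffer passed by mutable reference, instead of returning a fresh allocation per field.

```rust
// Sketch: writers take the output buffer by `&mut` and extend it,
// so nested fields share a single allocation.
trait Write {
    fn write(&self, out: &mut Vec<u8>);
}

impl Write for u8 {
    fn write(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

struct Pair {
    a: u8,
    b: u8,
}

impl Write for Pair {
    fn write(&self, out: &mut Vec<u8>) {
        // No per-field Vec: both fields append to the caller's buffer
        self.a.write(out);
        self.b.write(out);
    }
}
```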