cbor's Issues

Replace options struct and improve option handling

It would be easier and cleaner to do this in v2.0 instead of v1.4 due to SemVer policy.

RFC 7049bis indicates additional considerations for encoding not mentioned in 7049, so future protocols are likely to use new encoding modes.

One way to handle options would be to use some integer "enums" to specify different aspects of encoding modes. Six or seven of these aspects can combine to specify any current or future mode. Each is an integer, so new values can be made available as needed.

Something like the following (with better names than this rough draft):

  1. Sorting: 0=default, unsorted; 1=RFC 7049 Canonical, 2=Bytewise Lexicographic, 3=reserved, ...
  2. SmallestFormFloats: 0=default, unchanged; 1=float16, 2=float32, 3=float64, 4=reserved, ...
  3. SmallestFormIntegers: 0=default, smallest form is byte; 1=reserved, ...
  4. DupMapKeys: 0=default, unchecked, last one used, continue; 1=reserved, ...
  5. Utf8Problems: 0=default, reject as malformed and stop; 1=reserved, ...

And one or more aspects for each of these:

  • Other error-handling options 7049bis requires protocols to specify (so they can use this library).
  • Tag options like whether or not to encode tag for time values.

So a combination of these options can handle all existing CBOR modes plus some future modes that don't exist yet.
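A minimal sketch of how such integer options might combine into one options value. All names here (Sorting, SmallestFormFloats, Options, Validate) are hypothetical and follow the rough draft above; they are not the library's API.

```go
package main

import "fmt"

// Sorting is a hypothetical option type for map key sorting.
type Sorting int

const (
	SortingNone            Sorting = iota // 0: default, unsorted
	SortingCanonical                      // 1: RFC 7049 Canonical
	SortingBytewiseLexical                // 2: bytewise lexicographic
)

// SmallestFormFloats is a hypothetical option type for float shrinking.
type SmallestFormFloats int

const (
	FloatsUnchanged SmallestFormFloats = iota // 0: default, unchanged
	FloatsFloat16                             // 1: float16
	FloatsFloat32                             // 2: float32
	FloatsFloat64                             // 3: float64
)

// Options combines independent aspects; the zero value selects every default.
type Options struct {
	Sorting Sorting
	Floats  SmallestFormFloats
}

// Validate rejects values that are still reserved in this sketch.
func (o Options) Validate() error {
	if o.Sorting < SortingNone || o.Sorting > SortingBytewiseLexical {
		return fmt.Errorf("reserved Sorting value %d", o.Sorting)
	}
	if o.Floats < FloatsUnchanged || o.Floats > FloatsFloat64 {
		return fmt.Errorf("reserved SmallestFormFloats value %d", o.Floats)
	}
	return nil
}

func main() {
	var defaults Options // zero value: unsorted, floats unchanged
	fmt.Println(defaults.Validate())
	fmt.Println(Options{Sorting: 3}.Validate())
}
```

Because each aspect is its own integer type, a new value can be added to one aspect later without touching the others.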

how to save one struct has pointer and embedded map with cbor

I have two structs like this:

type IdsMapping struct {
	UserIdIndexDict map[float64]float64
	IndexUserIdDict map[float64]float64
	ItemIdIndexDict map[float64]float64
	indexItemIdDict map[float64]float64
}

type LFM struct {
	classCount             int
	iterCount              int
	featureName            string
	labelName              string
	lr                     float64
	lam                    float64
	UserItemRatingMatrix   *mat.Dense
	ModelPfactor           *mat.Dense
	ModelQfactor           *mat.Dense
	RatingDf               *dataframe.DataFrame
	UseridItemidDict       map[float64]map[float64]float64
	UseridSet              *gset.Set
	ItemidSet              *gset.Set
	IdsMapping             *IdsMapping
	UserIndexItemIndexDict map[float64]map[float64]float64
}

I want to use cbor to serialize the LFM struct. How should I do this, given that the struct is complex? Thanks.

Support `encoding.[BinaryMarshaler,BinaryUnmarshaler]`?

Is there any interest in supporting special handling for types that know how to serialize themselves into a binary representation? encoding/json supports calling a given object's encoding.TextMarshaler and encoding.TextUnmarshaler implementations if provided, as does the go-codec library.

For CBOR the binary equivalents probably make more sense. I've started poking at this in a local branch, and while the encoding side was straightforward to add, the decoding side is a bit more involved due to parseInterface.
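The encoding side could be sketched roughly as below, using only the standard library. The `point` type and the `binaryForm` helper are hypothetical names for illustration; an encoder would presumably wrap the returned bytes in a CBOR byte string, which this sketch does not do.

```go
package main

import (
	"encoding"
	"fmt"
)

// point is a hypothetical type that knows how to serialize itself.
type point struct{ X, Y uint8 }

// MarshalBinary implements encoding.BinaryMarshaler.
func (p point) MarshalBinary() ([]byte, error) {
	return []byte{p.X, p.Y}, nil
}

// binaryForm returns v's own binary representation when v implements
// encoding.BinaryMarshaler. The bool result is false when it does not,
// so the caller can fall back to its regular encoding path.
func binaryForm(v interface{}) ([]byte, bool, error) {
	bm, ok := v.(encoding.BinaryMarshaler)
	if !ok {
		return nil, false, nil
	}
	data, err := bm.MarshalBinary()
	return data, true, err
}

func main() {
	data, ok, err := binaryForm(point{X: 1, Y: 2})
	fmt.Println(data, ok, err) // [1 2] true <nil>

	_, ok, _ = binaryForm(42)
	fmt.Println(ok) // false
}
```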

Decoding subnormal float16 numbers produces incorrect values

When float16 numbers are subnormal (exponent 0, significand ≠ 0), decoded values are incorrect.

This problem only affects subnormal float16 numbers during decoding.

Fix this by replacing float16 to float32 conversion function with a new implementation that is verified 100% correct for all 65536 possible conversions.
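For illustration, a float16 to float32 conversion that handles the subnormal case might look like the sketch below. This is not the implementation that was committed; the function name is hypothetical, and the exhaustive check over all 65536 inputs mentioned above is not reproduced here.

```go
package main

import (
	"fmt"
	"math"
)

// float16ToFloat32 converts IEEE 754 binary16 bits to a float32.
func float16ToFloat32(h uint16) float32 {
	sign := uint32(h&0x8000) << 16
	exp := uint32(h>>10) & 0x1f
	frac := uint32(h & 0x03ff)

	switch {
	case exp == 0 && frac == 0:
		// Signed zero.
		return math.Float32frombits(sign)
	case exp == 0:
		// Subnormal: value is frac * 2^-24. Shift the significand left
		// until its implicit leading bit (bit 10) is set, lowering the
		// float32 exponent by one for each shift.
		e := uint32(127 - 15 + 1)
		for frac&0x0400 == 0 {
			frac <<= 1
			e--
		}
		frac &= 0x03ff // drop the now-implicit leading bit
		return math.Float32frombits(sign | e<<23 | frac<<13)
	case exp == 0x1f:
		// Infinity or NaN: keep the payload bits.
		return math.Float32frombits(sign | 0x7f800000 | frac<<13)
	default:
		// Normal number: rebias the exponent from 15 to 127.
		return math.Float32frombits(sign | (exp+112)<<23 | frac<<13)
	}
}

func main() {
	fmt.Println(float16ToFloat32(0x0001)) // smallest positive subnormal, 2^-24
	fmt.Println(float16ToFloat32(0x03ff)) // largest subnormal, 1023 * 2^-24
	fmt.Println(float16ToFloat32(0x3c00)) // 1
}
```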

Encoding struct with null pointer to embedded struct returns error

Encoding struct with null pointer to non-embedded struct works as expected.

Problem only affects null pointer to embedded struct.

What version?
go v1.12.12
cbor v1.2.0 (likely affects older versions too)

What did you do?

	type (
		T1 struct {
			N int
		}
		T2 struct {
			*T1
		}
	)
	v := T2{}
	cborData, err := cbor.Marshal(v, cbor.EncOptions{})

What did you expect to see?

cborData = []byte{0xa0}
err = nil

What did you see instead?

cborData = nil 
err = "cbor: cannot set embedded pointer to unexported struct: *cbor_test.T1"
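For comparison, encoding/json skips a nil embedded pointer and encodes the same value as an empty object, which corresponds to the empty CBOR map (0xa0) expected above. A small standard-library sketch (the helper name is made up for this example):

```go
package main

import (
	"encoding/json"
	"fmt"
)

type (
	T1 struct {
		N int
	}
	T2 struct {
		*T1
	}
)

// encodeNilEmbedded marshals a T2 whose embedded *T1 is nil.
func encodeNilEmbedded() (string, error) {
	b, err := json.Marshal(T2{})
	return string(b), err
}

func main() {
	s, err := encodeNilEmbedded()
	fmt.Println(s, err) // {} <nil>
}
```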

Safe optimization for COSE (RFC 8152) and WebAuthn

It would be nice to be able to decode CBOR maps with integer keys to Go struct, and vice versa.

This feature would speed up and simplify using this library for COSE. It would also speed up WebAuthn since that uses COSE.
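One possible shape for this is a struct tag that carries the integer key, such as the `cbor:"1,keyasint"` form used in the SenML example elsewhere in this list. The sketch below only shows how such tags could be read with reflection; the `header` struct and `intKeys` helper are hypothetical, and no actual CBOR decoding is performed.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// header is a hypothetical COSE-like struct with integer map keys.
type header struct {
	Alg  int    `cbor:"1,keyasint"`
	Kid  []byte `cbor:"4,keyasint"`
	Note string // untagged: not part of the integer-keyed map
}

// intKeys maps each integer key found in a "N,keyasint" tag to the name
// of the struct field it belongs to. v must be a struct value.
func intKeys(v interface{}) map[int64]string {
	keys := make(map[int64]string)
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		parts := strings.Split(f.Tag.Get("cbor"), ",")
		if len(parts) < 2 || parts[1] != "keyasint" {
			continue
		}
		k, err := strconv.ParseInt(parts[0], 10, 64)
		if err != nil {
			continue // tag name is not an integer
		}
		keys[k] = f.Name
	}
	return keys
}

func main() {
	fmt.Println(intKeys(header{}))
}
```

A decoder could build this table once per type and then look up each decoded integer key directly, avoiding an intermediate map[interface{}]interface{}.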

Publish old vs new benchmark deltas for new releases

This would help users of older versions evaluate new releases and help prevent undetected performance regressions.

benchmark        old ns/op     new ns/op     delta
BenchmarkFoo     523           68.6          -86.88%

benchmark        old allocs    new allocs    delta
BenchmarkFoo     3             1             -66.67%

benchmark        old bytes     new bytes     delta
BenchmarkFoo     80            48            -40.00%
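The delta column is the relative change from old to new. A short sketch of the arithmetic, with a hypothetical helper name:

```go
package main

import "fmt"

// delta formats the relative change from oldV to newV as a signed percentage.
func delta(oldV, newV float64) string {
	return fmt.Sprintf("%+.2f%%", (newV-oldV)/oldV*100)
}

func main() {
	fmt.Println(delta(523, 68.6)) // -86.88%
	fmt.Println(delta(3, 1))      // -66.67%
	fmt.Println(delta(80, 48))    // -40.00%
}
```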

Update CBOR comparison charts for v1.3.3 and simplify them

v1.3.3 looks solid, passed 220+ million execs fuzzing with 1000+ corpus files as starting point.

workers: 2, corpus: 1071 (18h46m ago), crashers: 0, restarts: 1/10000, execs: 227582949 (3368/sec), cover: 2011, uptime: 18h46m

The charts don't need 3 bars for each aspect being compared. Just use default build settings for every library compared for every comparison.

I'll post updated speed comparison and size comparison charts as comments to this issue.

Rename milestone v1.4 to v2.0 (tags and 7049bis options)

Timing is good to bump major version since adding CBOR Tags is a big change and there's a need for extended options based on 7049bis. E.g. newer encoding modes and error handling options.

As discussed, renaming v1.4 to v2.0 allows extended option handling to be done in a cleaner and simpler way without dragging along the old options struct.

Improved option handling can be designed with 7049bis and "generic" CBOR library in mind, so it should be more future-proof than the current design.

Comply with RFC 2026 about RFC 7049bis

RFC 2026 says,

Under no circumstances should an Internet-Draft be referenced by any paper, report, or Request-for-Proposal, nor should a vendor claim compliance with an Internet-Draft.

Update README.md:

  1. Remove 7049bis from statements about standards compliance until it is approved.
  2. When mentioning 7049bis for extras beyond compliance, add the word "latest" in front of it. E.g. "Decoding also checks for all required well-formedness errors described in the latest RFC 7049bis, ..."

Split encoding mode booleans into two integer options in v1.x

As discussed in #62 this can be a small release in v1.x before merging in CBOR tags feature.

  1. deprecate but still support boolean encoding modes in v1.4
  2. add SortMode (int) and ShortestFloat (int) as encoding options to replace encoding mode booleans.

The combination of SortMode and ShortestFloat can be used to specify these modes in v1:

  • default
  • Canonical
  • CTAP2
  • Core Deterministic Encoding Rule 2 (7049bis)
  • and modes that don't have a name yet

type SortMode int

const (
	SortNone			SortMode = 0	// no sorting
	SortLengthFirst			SortMode = 1	// RFC 7049 Canonical
	SortBytewiseLexical		SortMode = 2	// RFC 7049bis Bytewise Lexicographic
	SortCanonical			SortMode = SortLengthFirst
	SortCTAP2			SortMode = SortBytewiseLexical	
	SortCoreDeterministic		SortMode = SortBytewiseLexical
)

type ShortestFloat int

const (
	ShortestFloatNone		ShortestFloat = 0	// no change
	ShortestFloat16			ShortestFloat = 1	// float16 as shortest form of float that preserves value
	ShortestFloat32			ShortestFloat = 2	// float32 as shortest form of float that preserves value
	ShortestFloat64			ShortestFloat = 3	// float64 as shortest form of float (this may convert from float32 to float64, etc.)
)

Add 87 tests for CBOR data items that are not well-formed

RFC 7049bis describes three kinds of malformed CBOR data:

  • kind 1 (too much data)
  • kind 2 (too little data)
  • kind 3 (syntax error) and 5 "subkinds" of syntax error

I created 87 unit tests based on kind 2 and kind 3 and v1.3.2 passed all of them. 👍

Kind 1 is only an error when the application (not CBOR library) assumes that the input bytes would span exactly one data item.

RFC 7049bis Appendix G:

Too much data: There are input bytes left that were not consumed. This is only an error if the application assumed that the input bytes would span exactly one data item. Where the application uses the self-delimiting nature of CBOR encoding to permit additional data after the data item, as is for example done in CBOR sequences [I-D.ietf-cbor-sequence], the CBOR decoder can simply indicate what part of the input has not been consumed.
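To make the three kinds concrete, here is a minimal sketch that decodes a single CBOR unsigned integer (major type 0 only) and returns the unconsumed bytes, so the caller decides whether leftover input is an error. The function name and error values are assumptions made for this example and are unrelated to the library's decoder.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

var errSyntax = errors.New("syntax error: reserved or invalid additional information")

// decodeUint decodes one CBOR unsigned integer from the front of data
// and returns the value together with the bytes that were not consumed.
func decodeUint(data []byte) (uint64, []byte, error) {
	if len(data) == 0 {
		return 0, nil, io.ErrUnexpectedEOF // kind 2: too little data
	}
	if data[0]>>5 != 0 {
		return 0, nil, errors.New("not an unsigned integer")
	}
	ai := data[0] & 0x1f // additional information
	data = data[1:]
	if ai < 24 {
		return uint64(ai), data, nil // value stored in the initial byte
	}
	if ai > 27 {
		return 0, nil, errSyntax // kind 3: 28-30 reserved, 31 invalid here
	}
	n := 1 << uint(ai-24) // argument is 1, 2, 4 or 8 bytes long
	if len(data) < n {
		return 0, nil, io.ErrUnexpectedEOF // kind 2: too little data
	}
	var val uint64
	for _, b := range data[:n] {
		val = val<<8 | uint64(b)
	}
	return val, data[n:], nil
}

func main() {
	// 0x18 0x64 encodes 100; the trailing 0x01 is left for the caller,
	// who may treat it as kind 1 or as the next item of a CBOR sequence.
	val, rest, err := decodeUint([]byte{0x18, 0x64, 0x01})
	fmt.Println(val, rest, err) // 100 [1] <nil>
}
```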

Need to review the tests and commit.

Add a Security Policy to README.md

For example:

Security Policy

For v1, security fixes are provided only for the latest released version since the API won't break compatibility.

To report security vulnerabilities, please email [email protected] and allow time for the problem to be resolved before reporting it to the public.

Decoding to not-nil interface returns error

The following code returns error "cbor: cannot unmarshal array into Go value of type interface {}"

s := "hello"
var v interface{} = s
cbor.Unmarshal(data, &v)

Decoder should handle not-nil interface in the same way as nil interface by storing the appropriate Go type in the interface value. This behavior should be made consistent with encoding/json.
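The encoding/json behavior referred to here can be checked with the standard library alone: decoding a JSON array into an interface that currently holds a string simply replaces the stored value. The helper name below is made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeIntoNonNil unmarshals a JSON array into an interface{} that
// already holds a string and returns what the interface holds afterwards.
func decodeIntoNonNil() (interface{}, error) {
	s := "hello"
	var v interface{} = s
	err := json.Unmarshal([]byte(`[1, 2]`), &v)
	return v, err
}

func main() {
	v, err := decodeIntoNonNil()
	fmt.Printf("%T %v %v\n", v, v, err) // []interface {} [1 2] <nil>
}
```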

Deprecate bool encoding modes in EncOptions and provide int SortMode

Encoding modes have different aspects, for example:

  • sorting
  • if (and how) to shrink floats to smallest form that preserves value

The encoding mode bools should've been one int since there will be more encoding modes in the future than anticipated.

However, providing an integer encoding mode is also inflexible because it would only support known encoding modes.

Simply deprecate the bool encoding modes and add an integer SortMode to EncOptions.

When using encoding modes that do not shrink/expand floats, the sort mode basically determines simple encoding modes like Canonical or CTAP2 Canonical.

This way, future encoding modes not yet known/named today can be supported by setting the required combination of options in EncOptions.

This issue is closed by commit 3b78ee0 and is the first half of existing issue #74.

A separate issue can be opened to add a NewEncOptions or EncOptions.New function (specifying a known encoding mode) that returns EncOptions having proper values for SortMode, ShortestFloatMode, etc.

type SortMode int

const (
	SortNone			SortMode = 0	// no sorting
	SortLengthFirst			SortMode = 1	// RFC 7049 Canonical
	SortBytewiseLexical		SortMode = 2	// RFC 7049bis Bytewise Lexicographic
	SortCanonical			SortMode = SortLengthFirst
	SortCTAP2			SortMode = SortBytewiseLexical	
	SortCoreDeterministic		SortMode = SortBytewiseLexical
)

Go struct to CBOR array using `cbor:",toarray"`

This allows:

  • encoding Go struct to CBOR array.
  • decoding CBOR array to Go struct.

This makes encoded data more compact and structs are easier to use.

	type T struct {
		_ struct{} `cbor:",toarray"` 
		A int      `cbor:",omitempty"`
		B string   `cbor:",omitempty"`
	}

Special field "_" is used to specify struct level options, such as "toarray". Any value of type T is encoded as an array of 2 elements. "omitempty" is disabled by "toarray" to ensure that the same number of elements are encoded every time.

If "toarray" is omitted, "omitempty" works just like encoding/json.
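As a rough illustration of how a struct-level option on the "_" field could be detected, the sketch below reads the tag with reflection. The `wantsArray` helper and the `U` type are hypothetical, and this is not necessarily how the library implements it.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// T carries the struct-level "toarray" option on its blank field.
type T struct {
	_ struct{} `cbor:",toarray"`
	A int      `cbor:",omitempty"`
	B string   `cbor:",omitempty"`
}

// U has no struct-level options.
type U struct {
	A int
}

// wantsArray reports whether v is a struct whose "_" field has a cbor
// tag that lists the "toarray" option.
func wantsArray(v interface{}) bool {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return false
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Name != "_" {
			continue
		}
		// The first tag element is the field name; options follow it.
		for _, opt := range strings.Split(f.Tag.Get("cbor"), ",")[1:] {
			if opt == "toarray" {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(wantsArray(T{}), wantsArray(U{})) // true false
}
```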

I came across this at oasisprotocol/oasis-core@ade6a1b. @Yawning has the best commit messages!

Separate CTAP2 "canonical CBOR" and RFC 7049 "Canonical CBOR" encoding modes

CTAP2 "canonical CBOR" != RFC 7049 "Canonical CBOR" when map keys have different data types due to sorting rules. If all map keys have same data type, there's no difference.

When "Canonical CBOR" option is specified, this library has been using sorting rules from "Core Deterministic Encoding Requirements" making it effectively sort like CTAP2.

There should be two canonical modes as options: "Canonical CBOR" and "CTAP2 Canonical CBOR". They only differ in sorting rules involving map keys with mixed types.
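The difference between the two sorting rules shows up as soon as encoded keys of mixed types have different lengths. A small sketch over already-encoded keys (the helper names are made up for this example):

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// sortLengthFirst orders encoded keys as RFC 7049 "Canonical CBOR" does:
// shorter encodings first, then bytewise among equal lengths.
func sortLengthFirst(keys [][]byte) {
	sort.Slice(keys, func(i, j int) bool {
		if len(keys[i]) != len(keys[j]) {
			return len(keys[i]) < len(keys[j])
		}
		return bytes.Compare(keys[i], keys[j]) < 0
	})
}

// sortBytewise orders encoded keys in plain bytewise lexicographic order.
func sortBytewise(keys [][]byte) {
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})
}

func main() {
	// Encoded map keys: 100 (0x18 0x64), -1 (0x20), 10 (0x0a).
	a := [][]byte{{0x18, 0x64}, {0x20}, {0x0a}}
	b := [][]byte{{0x18, 0x64}, {0x20}, {0x0a}}

	sortLengthFirst(a)
	sortBytewise(b)
	fmt.Printf("length-first: %x\n", a) // 10, -1, 100
	fmt.Printf("bytewise:     %x\n", b) // 10, 100, -1
}
```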

NOTE: issue edited based on your comments

Replace example CBOR program in size comparison chart

Program size comparison should use existing 3rd-party program(s), instead of one you wrote solely for comparison.

If you want, I'll give it a shot and post result(s) here. I'll find IoT and security related projects because they value size, safety and reliability.

Let me know.

Improve speed and memory usage

Additional performance improvements for milestone v1.3 that are not already covered by issues #15 and #17.

  • Improve encoding speed, reduce mem, refactor (commit 8ea465d)
  • Improve encoding speed (commit d85552b)
  • Improve encoding speed by caching types (commit 90423eb)
  • Add fast path to encode fixed length struct (commit 05e6b7c)
  • Add Fast path to decode to empty interface (commit 23d2052)

And more if time allows.

Add CBOR encoding/decoding speed comparison chart to README.md

I'll provide a comparison using release versions of CBOR libraries. The input data won't be contrived/biased so it'll be fair and useful.

README says speed isn't a primary design goal and that faster libraries exist, so I think the results will surprise some people.

Unsupported CBOR negative int values should return error

As documented under Limitations section, CBOR negative integers like -18446744073709551616 are unsupported because they cannot fit into Go's int64.

But instead of returning 0 with err=nil, I think cbor.Unmarshal should return an error when trying to decode these values.

Go's encoding/json handles this scenario by returning json.UnmarshalTypeError. So this library should probably return cbor.UnmarshalTypeError.
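The range check itself is small. A CBOR negative integer carries an unsigned argument n and represents -1 - n, so it fits in int64 only when n is at most math.MaxInt64. A sketch with a hypothetical helper name and a plain error in place of cbor.UnmarshalTypeError:

```go
package main

import (
	"fmt"
	"math"
)

// negIntToInt64 converts the argument n of a CBOR negative integer
// (major type 1, value -1 - n) to int64, or reports an overflow.
func negIntToInt64(n uint64) (int64, error) {
	if n > math.MaxInt64 {
		return 0, fmt.Errorf("cbor: negative integer -1-%d overflows int64", n)
	}
	return -1 - int64(n), nil
}

func main() {
	fmt.Println(negIntToInt64(0))              // -1 <nil>
	fmt.Println(negIntToInt64(math.MaxInt64))  // -9223372036854775808 <nil>
	fmt.Println(negIntToInt64(math.MaxUint64)) // value -18446744073709551616 does not fit
}
```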

Improve decoding speed

Spotted several optimizations while working on milestone v1.3.1.

Decoding should be around 4% to 5% faster for COSE and CWT, depending on other changes in the same release.

Add "Core Deterministic Encoding" mode

When user specifies "Core Deterministic Encoding" mode, this library should:

  1. encode floating point values using the shortest form that preserves that value
  2. sort using streamlined rules that produce same results as "CTAP2 Canonical CBOR" sorting of map keys
  3. no change needed for integers since they're already using shortest form for all modes

For example, try to convert float64 to float32, if that works, try to convert float32 to float16.

This mode makes serialized data more compact when it contains floating point numbers.
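The float64 to float32 step can be sketched as a round-trip check: the narrower form is usable only if converting back yields the same value. The helper name is hypothetical, the float32 to float16 step is omitted, and NaN needs separate handling because it never compares equal to itself.

```go
package main

import "fmt"

// fitsFloat32 reports whether f survives a round trip through float32
// unchanged, meaning float32 preserves the value. It returns false for NaN.
func fitsFloat32(f float64) bool {
	return float64(float32(f)) == f
}

func main() {
	fmt.Println(fitsFloat32(1.5))   // true: exactly representable
	fmt.Println(fitsFloat32(0.1))   // false: float32 would lose precision
	fmt.Println(fitsFloat32(1e300)) // false: out of float32 range
}
```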

Reject indefinite length byte string if chunks are indefinite length

Enforce this in data validation.

RFC 7049 2.2.2

   For indefinite-length byte strings, every data item (chunk) between
   the indefinite-length indicator and the "break" MUST be a definite-
   length byte string item; if the parser sees any item type other than
   a byte string before it sees the "break", it is an error.

This also affects indefinite length text string.
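A minimal sketch of the check for byte strings, assuming the input starts at the indefinite-length indicator (0x5f). The function name is made up for this example; the same structure would apply to text strings with major type 3 and indicator 0x7f.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// checkIndefiniteBytes verifies that data begins with an indefinite-length
// byte string whose chunks are all definite-length byte strings, ending
// with a "break" (0xff). Bytes after the break are ignored.
func checkIndefiniteBytes(data []byte) error {
	if len(data) == 0 || data[0] != 0x5f {
		return errors.New("not an indefinite-length byte string")
	}
	data = data[1:]
	for {
		if len(data) == 0 {
			return io.ErrUnexpectedEOF // no "break" seen
		}
		head := data[0]
		data = data[1:]
		if head == 0xff {
			return nil // "break"
		}
		if head>>5 != 2 {
			return errors.New("chunk is not a byte string")
		}
		ai := head & 0x1f
		if ai == 31 {
			return errors.New("chunk is indefinite length")
		}
		if ai > 27 {
			return errors.New("reserved additional information")
		}
		n := uint64(ai) // chunk length for ai < 24
		if ai >= 24 {
			size := 1 << uint(ai-24) // length is stored in 1, 2, 4 or 8 bytes
			if len(data) < size {
				return io.ErrUnexpectedEOF
			}
			n = 0
			for _, b := range data[:size] {
				n = n<<8 | uint64(b)
			}
			data = data[size:]
		}
		if uint64(len(data)) < n {
			return io.ErrUnexpectedEOF
		}
		data = data[n:] // skip the chunk payload
	}
}

func main() {
	ok := []byte{0x5f, 0x41, 0x01, 0x42, 0x02, 0x03, 0xff}
	nested := []byte{0x5f, 0x5f, 0x41, 0x01, 0xff, 0xff}
	fmt.Println(checkIndefiniteBytes(ok))
	fmt.Println(checkIndefiniteBytes(nested))
}
```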

CBOR CWT and SenML decoding examples using "keyasint", "toarray" struct tags

Add this to Usage section today? Last code-related commit has been fuzzing nonstop for days without a single crash and speed is fast with this feature!

How to Decode SenML with fxamacker/cbor v1.3

// RFC 8428 says, "The data is structured as a single array that 
// contains a series of SenML Records that can each contain fields"

// fxamacker/cbor v1.3 has "keyasint" struct tag (ideal for SenML)
type SenMLRecord struct {
	BaseName    string  `cbor:"-2,keyasint,omitempty"`
	BaseTime    float64 `cbor:"-3,keyasint,omitempty"`
	BaseUnit    string  `cbor:"-4,keyasint,omitempty"`
	BaseValue   float64 `cbor:"-5,keyasint,omitempty"`
	BaseSum     float64 `cbor:"-6,keyasint,omitempty"`
	BaseVersion int     `cbor:"-1,keyasint,omitempty"`
	Name        string  `cbor:"0,keyasint,omitempty"`
	Unit        string  `cbor:"1,keyasint,omitempty"`
	Value       float64 `cbor:"2,keyasint,omitempty"`
	ValueS      string  `cbor:"3,keyasint,omitempty"`
	ValueB      bool    `cbor:"4,keyasint,omitempty"`
	ValueD      []byte  `cbor:"8,keyasint,omitempty"`
	Sum         float64 `cbor:"5,keyasint,omitempty"`
	Time        int     `cbor:"6,keyasint,omitempty"`
	UpdateTime  float64 `cbor:"7,keyasint,omitempty"`
}

// When cborData is a []byte containing SenML, 
// it can easily be decoded into a []SenMLRecord.

var v []SenMLRecord
if err := cbor.Unmarshal(cborData, &v); err != nil {
	t.Fatal("Unmarshal:", err)
}

// That's it!  Decoding speed is fast and v contains easy to use SenML Records.
