Comments (2)
Proposal: can we broaden this issue to discuss what variable-length integer encoding in general would be most appropriate?
I was unfamiliar with zigzag encoding and what advantages it would have over the current method, so I searched for "leb128 vs zigzag". I found these two related discussions about LEB128 being used in WASM, both of which raised a number of points that seemed relevant to the topic:
https://news.ycombinator.com/item?id=11263378
A quick summary of what stood out to me:
- zigzag encoding/decoding is allegedly faster for signed integers than plain signed LEB128
- implementations of prefix-based varints are reportedly 2x-3x faster than LEB128
- LEB128 allows locating the start and end of a value with random access; prefix-based varints don't
- In the context of UTF-8, having non-canonical representations has led to security issues.
- Technically LEB128 also lacks canonical encodings for numbers: zero, for example, would typically be encoded as `00`, but technically `80 00`, `80 80 00`, ... `80 80 ... 80 00` also work. Should this be forbidden? (In practice this means disallowing LEB128 to have trailing `80`s followed by `00` for positive values, and similarly `FF`s + `7F` for negative values. Can't think off the top of my head if this applies to zigzag encoding.)
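To make the last point concrete, here is a minimal Python sketch of a signed LEB128 decoder that accepts the padded forms, plus a check that rejects them. The function names `decode_sleb128` and `is_canonical_sleb128` are made up for illustration and are not part of muon; the check assumes `data` holds exactly one well-formed value.

```python
def decode_sleb128(data: bytes) -> int:
    """Decode one signed LEB128 integer from the start of `data`."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:          # high bit clear: this is the last byte
            if byte & 0x40:             # bit 6 of the last byte is the sign bit
                result -= 1 << shift    # sign-extend
            return result
    raise ValueError("truncated LEB128 value")


def is_canonical_sleb128(data: bytes) -> bool:
    """True if `data` is the shortest signed LEB128 form of its value.

    Assumes `data` is exactly one well-formed encoded value.
    """
    if len(data) < 2:
        return True
    last, prev = data[-1], data[-2]
    if last == 0x00 and (prev & 0x40) == 0:
        return False                    # redundant zero padding (e.g. 80 00)
    if last == 0x7F and (prev & 0x40) != 0:
        return False                    # redundant sign extension (e.g. FF 7F)
    return True


# All of these decode to zero, but only the first is the shortest form.
for raw in (b"\x00", b"\x80\x00", b"\x80\x80\x00"):
    print(raw.hex(" "), decode_sleb128(raw), is_canonical_sleb128(raw))
```

Note that a final `00` is not always padding in the signed variant: `C0 00` is the shortest encoding of 64, because `40` on its own would decode as -64. That is why the check looks at bit 6 of the preceding byte rather than only at the last byte.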
I have no strong opinions on any of this, and I do not know what the most relevant criteria are for deciding on a varint encoding for muon anyway. But these points seemed relevant enough to bring up (even if muon sticks to plain LEB128, you can at least say that you looked at alternatives, right?)
PS: for anyone else unfamiliar with zigzag encoding, this blog and this SO question helped me out
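For a quick feel of what zigzag does, here is a small Python sketch. It is only an illustration under my own assumptions: `zigzag_encode`, `zigzag_decode` and `encode_uleb128` are hypothetical helper names, not anything muon defines. The idea is to map signed integers to unsigned ones so that values of small magnitude stay small (0, -1, 1, -2, 2 become 0, 1, 2, 3, 4), and then store the result as an ordinary unsigned varint.

```python
def zigzag_encode(n: int) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return (n << 1) ^ (-1 if n < 0 else 0)


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def encode_uleb128(value: int) -> bytes:
    """Shortest unsigned LEB128 encoding of a non-negative integer."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)     # more bytes follow
        else:
            out.append(byte)            # last byte, high bit clear
            return bytes(out)


for n in (0, -1, 1, -2, 2, -64, 64):
    z = zigzag_encode(n)
    assert zigzag_decode(z) == n
    print(n, z, encode_uleb128(z).hex(" "))
```

As far as I can tell, the zigzag step itself is a one-to-one mapping, so any non-canonical forms would come from the unsigned varint it is stored in (padding with `80 ... 00`), not from zigzag.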
I think I'll give it another try