Comments (10)
What your objective is
Given an already-frozen Nippy serialization held in an existing bytebuffer, I'm looking for a way to parse it and extract some of the inner contents without thawing any of the values (to avoid allocating objects that aren't strictly needed). I would then work with additional bytebuffers that wrap slices of the still-frozen nested values (i.e. without copying the underlying bytes either). For example, given an already-frozen map, I would like to be able to locate and (potentially later) thaw only the value under a known key, if that key exists in the map, as demonstrated here.
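As a rough illustration of the "wrapped slice" idea above, here is a minimal sketch using plain `java.nio.ByteBuffer` (not Agrona, and not any Nippy API). The class and method names are hypothetical; the point is only that a slice shares memory with its parent buffer rather than copying it.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class SliceSketch {
    /** Returns a view of buf[offset, offset + length) backed by the same bytes. */
    static ByteBuffer view(ByteBuffer buf, int offset, int length) {
        ByteBuffer dup = buf.duplicate(); // own position/limit, shared content
        dup.limit(offset + length);
        dup.position(offset);
        return dup.slice();               // position 0, remaining == length
    }
}
```

Because the view shares its content with the parent, a later write to the parent is visible through the view, and creating the view leaves the parent's position and limit untouched.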
How this relates to the current issue re: support for freezing to a user-supplied bytebuffer
I can't speak for the Datalevin project but I believe the overall goals are somewhat similar: a bytebuffer API would allow for memory to be re-used in tight loops and avoid creating unnecessary garbage. I can imagine that the initial scope of this issue for thawing might only require thawing from an entire buffer at a time, but I need something slightly more specific in addition, which is to be able to parse without thawing, so that I can later decide exactly which inner values I would like to thaw, if any. I am not currently looking for support to freeze to a user-specified bytebuffer.
What kind of API/functionality would you ideally want Nippy to expose
An API similar to the `get-len` function in my commit, which, given a `buf` and `offset`, could return the `type` and `length`. Note that I haven't returned the `type` in that implementation yet, but on reflection I've realised that I need it too.
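To make the shape of such an API concrete, here is a minimal sketch in Java. Everything in it is an assumption for illustration: the type ids, the length-prefix layout, and the names `typeAndLen` and `TypeAndLen` are hypothetical and do not reflect Nippy's real codec table or the linked `get-len` implementation.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: reads a value's type and length without thawing it.
final class FrozenScanSketch {
    // HYPOTHETICAL type ids, not Nippy's real ones.
    static final byte ID_LONG = 1;   // fixed 8-byte payload
    static final byte ID_STR_SM = 2; // 1-byte unsigned length prefix
    static final byte ID_STR_LG = 3; // 4-byte length prefix

    static final class TypeAndLen {
        final byte typeId;
        final int headerLen;  // type id byte plus any length prefix
        final int payloadLen; // bytes of still-frozen content

        TypeAndLen(byte typeId, int headerLen, int payloadLen) {
            this.typeId = typeId;
            this.headerLen = headerLen;
            this.payloadLen = payloadLen;
        }

        int totalLen() { return headerLen + payloadLen; }
    }

    /** Uses absolute reads only, so buf's position is never moved. */
    static TypeAndLen typeAndLen(ByteBuffer buf, int offset) {
        byte id = buf.get(offset);
        switch (id) {
            case ID_LONG:
                return new TypeAndLen(id, 1, 8);
            case ID_STR_SM:
                return new TypeAndLen(id, 2, buf.get(offset + 1) & 0xFF);
            case ID_STR_LG:
                return new TypeAndLen(id, 5, buf.getInt(offset + 1));
            default:
                throw new IllegalArgumentException("unknown type id: " + id);
        }
    }
}
```

A caller could skip over a value by advancing `offset` by `totalLen()`, or wrap just the payload in a slice and thaw it later if it turns out to be needed.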
I'm not familiar with [...] Agrona off-hand
The only reason I brought Agrona up was because I used it in the code I linked to. Specifically, Agrona provides an ergonomic API for working with on-heap and off-heap bytebuffers.
Thank you for the fast response 🙂
from nippy.
Hi Jeremy, thanks for the clarifications - that's helpful 👍
I suppose there's quite a lot of overlap here with #147 - much (all?) of this could happily be done in userspace if the codec definitions in nippy.clj were introspectable/exposed somehow.
To be clear, I'd make a distinction between:
1. Nippy's internal schema: mostly just the set of `[byte-id type length]` tuples.
2. The encoding of base types as per `java.io.DataOutput`, and optional compression/encryption.
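For readers unfamiliar with the `java.io.DataOutput` conventions mentioned in the second point, the sketch below shows two of them using only the JDK: `writeLong` emits 8 big-endian bytes, and `writeUTF` emits a 2-byte length followed by modified UTF-8. It illustrates the interface's contract in general and makes no claim about which of these calls Nippy itself uses; the class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, for illustration only.
final class DataOutputSketch {
    static byte[] encode(long n, String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(n); // 8 bytes, big-endian
            out.writeUTF(s);  // 2-byte unsigned length, then modified UTF-8
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }
}
```

Since `java.nio.ByteBuffer` defaults to big-endian byte order, bytes written this way can be read back with absolute calls such as `getLong(0)` and `getShort(8)` without any byte swapping.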
Exposing a public view of the internal schema (1) should in principle be relatively straight-forward.
As I understood it, #147 also concerns itself with (2) - which isn't Nippy specific, and potentially more of an undertaking depending on what the target platform offers.
For your particular use case - how far would it get you if Nippy core included something like a public `nippy/type-ids`, maybe with explicit length in bytes?
Seems that'd allow you to cut out ~90% of your branch code, and not depend on any fragile implementation details?
from nippy.
@refset 👍 Created #151 for next steps on public `nippy/type-ids`.
Leaving this issue open specifically for custom bytebuffer support.
from nippy.
@huahaiy Hi Huahai Yang, thanks for bringing this to my attention - sounds promising!
Would be happy to see a PR for this 👍
from nippy.
Hi 🙂
We have been looking at this for XTDB recently in support of speeding up the ingestion pipeline and reducing unnecessary allocations. Specifically, we want to avoid the current need to thaw documents returned by the 'document-store', which then get immediately re-encoded/frozen into KV bytebuffers for the 'index-store' (backed by RocksDB / LMDB etc.).
Instead the document-store could return a bytebuffer per document and from this XT should be able to construct the necessary KV bytebuffers by simply slicing and merging wrapped buffers (i.e. views with defined offsets and lengths) without any duplication or thawing at all.
In this branch, I have already extracted the necessary Nippy-internal codec information and created a `get-len` function that can satisfy our immediate requirements to avoid any thawing or copying:
https://github.com/refset/xtdb/blob/df210146d1744b14c31fa29e994ac3932c54e8d5/core/src/xtdb/nippy_utils.clj
Note that we use Agrona extensively across XT already.
Do you have any feedback or thoughts on how this approach could perhaps evolve into a PR?
The capability to freeze to bytebuffers would also be useful but is not a current focus.
from nippy.
@refset Hi Jeremy-
I'm not expecting to have significant time this week to dig into this in detail.
And heads-up that I'm not familiar with XTDB or Agrona off-hand.
Would it be possible to try to give a simplified high-level (/ ELI5) explanation of:
- What your objective is
- How this relates to the current issue re: support for freezing to a user-supplied bytebuffer
- What kind of API/functionality would you ideally want Nippy to expose
The easier you can make this for me to follow, the likelier I'll be able to get you a quick response.
Cheers!
from nippy.
I suppose there's quite a lot of overlap here with #147 - much (all?) of this could happily be done in userspace if the codec definitions in `nippy.clj` were introspectable/exposed somehow.
Again, see that branch I mentioned for the ~small sections of `nippy.clj` I needed to copy across so that I could write my own `get-len` function - essentially just the type-id mappings and all the implied lengths (calculated by hand).
from nippy.
For your particular use case- how far would it get you if Nippy core included something like a public nippy/type-ids, maybe with explicit length in bytes?
Seems that'd allow you to cut out ~90% of your branch code, and not depend on any fragile implementation details?
Agreed - I think that would work great 🙂
from nippy.
Just to summarise current status re: support for user-supplied bytebuffers:
- #151 is being worked on separately, which may or may not be useful for some related use cases.
- #140 (support for user-supplied bytebuffers) is still independently interesting.
- Next steps would be for someone to provide a sketch/PoC PR, or ideas re: what the API might look like.
from nippy.
Closing for inactivity as part of issue triage
from nippy.