
Comments (9)

rvagg commented on August 30, 2024

We should enumerate some reasonable use cases for this so we can figure out whether this proposal makes sense for them. It seems to me that there's going to be special-casing no matter how we implement such a thing. This one has the benefit of reusing the "inline CID" pattern, which I think we've agreed needs to be baked into our stack. But there's going to be an additional "is this a 0x2f + identity CID?" check at various points of the stack too, which will break some abstractions.

from specs.

Stebalien commented on August 30, 2024

I take it that's just everything concatenated? Works for me! Also note: utf8(/) is the multicodec for "paths" (:trollface:), so every part of this is multicodec prefixed.

Also related: multiformats/multiformats#55.


mikeal commented on August 30, 2024

> Also note: utf8(/) is the multicodec for "paths" (:trollface:), so every part of this is multicodec prefixed.

Oh that is awesome!


rvagg commented on August 30, 2024

For clarity, can you describe the sections of bytes that end up forming the final CID? I'm not quite clear on how you're getting to the end product. Is it just | multicodec | multihash | utf8(path) |, which would be backward incompatible with the current CID parsers? Or would it be | pathedlink-multicodec | identity | multicodec | multihash | utf8(path) |, so a CID+path wrapped up in a raw+identity CID, which is how I'm interpreting "You could use the identity multicodec to inline the relevant data into a single CID"?


mikeal commented on August 30, 2024

> For clarity, can you describe the sections of bytes that end up forming the final CID?

Sure.

It’s also worth pointing out that the format is essentially just a CID without the preceding 1 (the CIDv1 version byte), followed by the path.

Here’s a fully inline pathed CID.

| CIDv1 | pathed-link-multicodec | identity-multicodec | identity length | link-codec-multicodec | link-hash-multicodec | link-hash-length | link-hash-bytes | utf8(path) |
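As a rough illustration of that layout, here is a minimal Python sketch that assembles the bytes field by field. It assumes the pathed-link multicodec is 0x2f (the "path" code mentioned earlier in the thread; nothing here is a finalized assignment), that every field is an unsigned varint as elsewhere in multiformats, and that the target block is dag-cbor hashed with sha2-256. The names `varint` and `encode_pathed_link` are made up for this example.

```python
import hashlib

CIDV1 = 0x01
PATHED_LINK = 0x2F  # assumption: the "path" multicodec discussed in this thread
IDENTITY = 0x00     # identity multihash code
DAG_CBOR = 0x71     # example codec for the linked block
SHA2_256 = 0x12     # example hash function for the linked block


def varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if n < 0:
        raise ValueError("varint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_pathed_link(link_codec: int, block: bytes, path: str) -> bytes:
    """Build an inline pathed CID for `block`, following the layout above."""
    digest = hashlib.sha256(block).digest()
    # Inner payload: the target CID without its version byte, then the path.
    inner = (
        varint(link_codec)
        + varint(SHA2_256)
        + varint(len(digest))
        + digest
        + path.encode("utf-8")
    )
    # Outer CID: version, pathed-link codec, identity multihash wrapping `inner`.
    return varint(CIDV1) + varint(PATHED_LINK) + varint(IDENTITY) + varint(len(inner)) + inner


link = encode_pathed_link(DAG_CBOR, b"hello", "a/b")
print(link.hex())
```

The first four bytes of the result are the version, the pathed-link codec, the identity code, and the payload length; everything after that is the payload itself.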

I should also note that we’ll need to apply some rules to the path in order to ensure determinism (no leading or trailing slash).
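One possible way to enforce that rule is to reject non-canonical paths before encoding, so that the same target and path always produce the same bytes. This is only a sketch: `check_link_path` is a hypothetical helper, and whether to reject or silently strip slashes (and what to do with an empty path) is not settled in this thread.

```python
def check_link_path(path: str) -> str:
    """Return `path` unchanged if it has no leading or trailing slash.

    Rejecting (rather than stripping) is an assumption made for this sketch;
    it keeps two different inputs from silently mapping to the same link.
    """
    if path.startswith("/") or path.endswith("/"):
        raise ValueError(f"path must not start or end with '/': {path!r}")
    return path


print(check_link_path("a/b"))  # a/b
```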


Stebalien commented on August 30, 2024

Wait, so you're not just concatenating a CID and a path? You're suggesting a new object type, stored as an "inline/identity" CID? I mean, that works, but it seems like just extending the CID format to allow tacking on a path would be cleaner.


mikeal commented on August 30, 2024

The goal here is to add this functionality in a generic way to IPLD (in other words, it should work for links to/from any existing block format) without actually breaking the IPLD Data Model (which extending the feature set of links would do).

This is “just a new block format” specifically for pathed links. That means it has a representation that conforms to the existing IPLD Data Model as it is today without any changes. Since it’s implemented as a block format but is intended to be a link itself, the sane thing to do is to embed it in an identity multihash.

It may seem a little hacky but it’s only 2 extra bytes of identity multihash overhead, which you actually gain back in the block format when compared to encoding the same data in CBOR.
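To make the byte count concrete: the identity multihash header is the identity code (one varint byte) plus the payload length (also a varint). The sketch below, using the hypothetical helpers `varint_len` and `identity_overhead`, shows that the header is 2 bytes only while the payload stays under 128 bytes. Assuming single-byte codec and hash codes and a 32-byte digest, that allows a path of up to 92 bytes before the length field grows to two bytes.

```python
def varint_len(n: int) -> int:
    """Number of bytes needed to encode `n` as an unsigned varint."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


def identity_overhead(payload_len: int) -> int:
    """Bytes added by wrapping a payload in an identity multihash."""
    identity_code = 0x00
    return varint_len(identity_code) + varint_len(payload_len)


# Payload = codec (1) + hash code (1) + hash length (1) + 32-byte digest + path.
fixed = 1 + 1 + 1 + 32
print(identity_overhead(fixed + 92))  # 2
print(identity_overhead(fixed + 93))  # 3
```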

The important thing is that there is an identifier (multicodec) in any link that you can use to identify pathed links. This would allow any IPLD user to add pathed link support to their implementation and have it work across all codecs without changing or breaking the existing data model and it would still produce graphs that contain all the relevant linking information in just the Data Model representation.
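For example, an implementation that wants to support pathed links could check the codec and multihash fields of any CID it encounters, and unwrap the target and path when they match. The sketch below assumes the layout described earlier in this thread, with 0x2f as the pathed-link codec; `read_varint` and `split_pathed_link` are hypothetical names, not part of any existing CID library.

```python
from typing import Optional, Tuple

PATHED_LINK = 0x2F  # assumption: the "path" multicodec discussed in this thread
IDENTITY = 0x00     # identity multihash code


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned varint at `pos`; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def split_pathed_link(cid: bytes) -> Optional[Tuple[bytes, str]]:
    """Return (target CIDv1 bytes, path), or None if `cid` is not a pathed link."""
    version, pos = read_varint(cid, 0)
    codec, pos = read_varint(cid, pos)
    mh_code, pos = read_varint(cid, pos)
    mh_len, pos = read_varint(cid, pos)
    if version != 1 or codec != PATHED_LINK or mh_code != IDENTITY:
        return None  # an ordinary CID; handle it as usual
    payload = cid[pos:pos + mh_len]
    if len(payload) != mh_len:
        raise ValueError("truncated identity multihash")
    # The payload is a version-less CID followed by the UTF-8 path.
    _link_codec, p = read_varint(payload, 0)
    _hash_code, p = read_varint(payload, p)
    hash_len, p = read_varint(payload, p)
    end = p + hash_len
    if end > len(payload):
        raise ValueError("truncated link hash")
    target = bytes([0x01]) + payload[:end]  # re-attach the CIDv1 version byte
    return target, payload[end:].decode("utf-8")
```

A system that does not know about pathed links never reaches the unwrapping step: it sees a well-formed CIDv1 with an unfamiliar codec and an identity multihash.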

In practice, I don’t think there’s much difference between this and “extending the CID format” other than the fact that this is backward compatible with systems that don’t understand pathed links. If you imagine extending the format, you’d end up putting bytes somewhere that say “this is a pathed link,” which we’re effectively doing with CID’s existing codec field. We’re just then eating two bytes for the identity multihash, which we might have avoided had we gone a route that wasn’t backward compatible.


Stebalien commented on August 30, 2024

I guess... My concerns are:

  1. Unless handled "specially", these links will appear to be new blocks and would have to be handled at a higher layer (e.g., ADL). I have to wonder how this would interact with pathing, selectors, etc.

In terms of not breaking things, yeah, I get that. I'm just concerned about this feature having limited use if it lives outside the core data model.


mikeal commented on August 30, 2024

> In terms of not breaking things, yeah, I get that. I'm just concerned about this feature having limited use if it lives outside the core data model.

We sort of have to pick one of these. If it changes the core data model we break everything, including the existing codec definitions, so that ship has sailed.

That said, pretty much everything we’ve built w/ IPLD includes things beyond the data model. IPLD Schemas are the obvious example, and I’m curious to know if there’s a way that we could get pathed links into IPLD Schemas.

