niklasl commented on September 26, 2024

I do agree that that syntax reasonably won't fly. I'm not sure that using @annotate on @type is such a special case though, but let's see where this goes.

I'll reply in sections here, to explain how I've reasoned for context, along with a better workaround and a possible improvement for JSON-LD-Star.

First Workaround Approach

My thought regarding @base and @vocab went that if we need to use a regular link for rdf:type here, I'd like to define it like:

{
  "@context": {
    "@vocab": "http://example.org/ns#",
    "type": {"@id": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "@type": "@vocab"}
  },
  "type": "Item"
}

But that ends up with the same problem in that we cannot use @annotate on the simple string value ("Item"). The partial workaround would be to use @id with a property-scoped @base. But since that would use IRI resolution instead of string concatenation, it'd have to look like this:

{
  "@context": {
    "@vocab": "http://example.org/ns#",
    "type": {
      "@id": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
      "@context": {"@base": "http://example.org/ns#"}
    }
  },
  "type": {
    "@id": "#Item",
    "@annotation": {
      "assertedBy": {"name": "Somebody"}
    }
  }
}

due to:

resolve_iri("http://example.org/ns#", "Item") == "http://example.org/Item"  # Undesired
resolve_iri("http://example.org/ns#", "#Item") == "http://example.org/ns#Item"  # What We Want
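The difference between the two modes can be checked with Python's standard library. This is only a minimal sketch: `urljoin` stands in for the `resolve_iri` pseudo-function above, and `expand_vocab` is a hypothetical helper (not from any JSON-LD library) that mimics the plain string concatenation used for vocabulary-relative terms.

```python
from urllib.parse import urljoin

BASE = "http://example.org/ns#"


def expand_vocab(vocab: str, term: str) -> str:
    """Hypothetical helper: vocabulary-relative expansion is concatenation."""
    return vocab + term


# Base-relative resolution (RFC 3986) replaces the last path segment
# and drops the fragment of the base.
undesired = urljoin(BASE, "Item")
# A fragment-only reference keeps the base path and swaps in the fragment.
wanted = urljoin(BASE, "#Item")
# Vocabulary-relative expansion needs no "#" prefix on the value.
concatenated = expand_vocab(BASE, "Item")

print(undesired)     # http://example.org/Item
print(wanted)        # http://example.org/ns#Item
print(concatenated)  # http://example.org/ns#Item
```

With a vocabulary IRI ending in `/` instead of `#`, both modes would agree for a bare `Item`.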

Requiring users of @annotate in compact JSON-LD to keep track of the specific IRI syntax used and occasionally prepend a # on what used to be simple string values for @type doesn't seem in line with the ergonomics of the compact form. (And that's excluding the cases where keys to be used as values for @type are defined in the context from different vocabularies, which simply would break down here. Only prefixes would still work, and those aren't as palatable (IMHO) for fully compact JSON-LD.)

Further Trouble

Alas, I realized that this extends to any term defined using "@type": "@vocab": such terms cannot be combined with @annotate while preserving their compact forms. So, for example, this:

{
  "@context": {
    "@vocab": "http://example.org/ns#",
    "language": {
      "@type": "@vocab",
      "@context": {"@vocab": "http://example.net/language#"}
    }
  },
  "language": "eng"
}

This would encounter the very same problem if the value of "language" were to be annotated.

(Of course, the workaround would work better with @vocab URIs ending in / rather than # (arguably a better practice). But it would still require a separate term definition for language, along with a change from a string to an @id. And it would still break down if specific keys are defined to be used as values for the term.)

A Better Way

At this point, I realized that I could possibly use @set here to provide the string. (Quite uncommon, if ever used in the wild, but supported by the expansion algorithm.) Interestingly, @set doesn't appear to require the value to be an array, but works with a single value (aside: as does @list).

Thus, the example with language becomes:

{
  "@context": {
    "@vocab": "http://example.org/ns#",
    "language": {
      "@type": "@vocab",
      "@context": {"@vocab": "http://example.net/language#"}
    }
  },
  "language": {
    "@set": "eng",
    "@annotation": {
      "assertedBy": {"name": "Somebody"}
    }
  }
}

Which properly expands to:

_:b0 <http://example.org/ns#language> <http://example.net/language#eng> {| :assertedBy [ :name "Somebody" ] |} .

And finally, this makes the workaround for @type fairly palatable (albeit requiring a separate keyword):

{
  "@context": {
    "@vocab": "http://example.org/ns#",
    "type": {"@id": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "@type": "@vocab"}
  },
  "type": {
    "@set": "Item",
    "@annotation": {
      "assertedBy": {"name": "Somebody"}
    }
  }
}

A Slight Improvement For JSON-LD-Star

At this stage, it might be argued that @type ought to be able to handle the same variations in values as other, regular @vocab-coerced keys do. Specifically, given that the above use of @set is legal, I'd probably propose that, along with @annotate, the spec should allow this form for @type to facilitate them being used in conjunction:

{
  "@context": {
    "@vocab": "http://example.org/ns#"
  },
  "@type": {
    "@set": "Item",
    "@annotation": {
      "assertedBy": {"name": "Somebody"}
    }
  }
}

Had it not already worked for expansion as per the language example above, it might be a stretch to propose it. But as things stand, it's not too far off, I think. It might also solve the @type-scoped context issue when used with @annotate that you mentioned.

Regardless of that proposal though, I agree it is a good idea to include this workaround in the documentation to make it clear how to use @annotate on @type values, as well as for other @type-coerced terms.

(Of course, one thing preventing this from being the practice would be if this use of @set is considered misuse and we would like to forbid it in a future update of JSON-LD. I wouldn't expect that, but it ought to be considered.)

from json-ld-star.

gkellogg commented on September 26, 2024

I think the "@type": {"@type" ...} syntax would add too much complexity for a special case. The best workaround would be to do it on an rdf:type property, as with other property annotations. This leaves out certain other uses of @type (e.g., type-scoped contexts), but there are workarounds for this as well.

I don't quite understand your thoughts on @base and @vocab, as the value of @type or rdf:type would typically be vocabulary-relative. But, I think it's probably worth adding an example/aside in the spec to describe this use case.


gkellogg commented on September 26, 2024

Great analysis, @niklasl. Although the use of @set is quite novel (not unlike the use of @graph at the top-level), it certainly works, and with a suitable @context definition could be aliased to something more palatable.

Interested to hear @pchampin's thoughts.


Peeja commented on September 26, 2024

This feels akin to an issue I just faced in m-ld (or really, in json-rql), where I wanted to transform some data in a way that would have moved a property's name from a property position (where it's @vocab-relative) to an object position (where it would be @base-relative). Specifically, I wanted to find existing values in properties I was about to update, so I could remove the old values first.

const existingData = await state.read<Construct>({
  // Give me all the triples...
  "@construct": { "@id": "?id", "?property": "?value" },
  // ...where...
  "@where": {
    // ...the triple is in the graph, and...
    "@graph": {
      "@id": "?id",
      "?property": "?value",
    },
    // ...the variables `?id` and `?property` are bound to any of the
    // id-property pairs in the data we're trying to insert--that is, it's a
    // subject-property pair where we're trying to write a new value, and will
    // need to delete the old one.
    "@values": subjects.flatMap((subject) =>
      // For every property in the data we're writing for this subject...
      Object.keys(subject)
        // ...except the `@id`...
        .filter((key) => key !== "@id")
        // ...give us a `@values` binding of the subject's `@id` and that
        // property.
        .map((key) => ({
          "?id": { "@id": subject["@id"] },
          "?property": { "@id": key },
        })),
    ),
  },
});

The problem with this code is the way ?property is bound:

"?property": { "@id": key }

If the properties given are all absolute IRIs, that works; but if they're relative, this will resolve them relative to @base, where they should be resolved relative to @vocab.

@gsvarovsky's current solution was for m-ld to support the use of @vocab in place of @id to form a node reference that's resolved relative to @vocab; so:

"?property": { "@vocab": key }

Since @vocab has no current application outside of contexts, that should be unambiguous. If that were applied here, it would read:

{
  "@context": {
    "@vocab": "http://example.org/ns#",
    "type": {"@id": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "@type": "@vocab"}
  },
  "type": {
    "@vocab": "Item",
    "@annotation": {
      "assertedBy": {"name": "Somebody"}
    }
  }
}

It does, unfortunately, require extending JSON-LD itself, and may be a bigger conversation, but @vocab feels more sensible to me than @set here (since it's a vocab term, and not—on its face, at least—a set), and I think this use of @vocab may end up solving a number of issues like this.
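To make the proposed behaviour concrete, here is a small sketch of how a node reference keyed by either "@id" or the proposed "@vocab" might be expanded. Everything here is an assumption for illustration: `expand_reference` is a hypothetical function, the absolute-IRI check is deliberately crude, and rejecting objects that carry both keys is just one possible rule, not something JSON-LD defines.

```python
from urllib.parse import urljoin


def expand_reference(ref: dict, base: str, vocab: str) -> str:
    """Expand a node reference keyed by "@id" (base-relative) or by the
    hypothetical "@vocab" key (vocabulary-relative)."""
    if "@id" in ref and "@vocab" in ref:
        # Assumed rule: the two keys are mutually exclusive.
        raise ValueError('"@id" and "@vocab" cannot be combined')
    if "@vocab" in ref:
        term = ref["@vocab"]
        # Crude absolute-IRI check, for illustration only.
        return term if "://" in term else vocab + term
    # Regular node references resolve against the base IRI.
    return urljoin(base, ref["@id"])


BASE = "http://example.org/data/"
VOCAB = "http://example.org/ns#"

by_id = expand_reference({"@id": "name"}, BASE, VOCAB)
by_vocab = expand_reference({"@vocab": "name"}, BASE, VOCAB)
print(by_id)     # http://example.org/data/name
print(by_vocab)  # http://example.org/ns#name
```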


A counter-argument in this case, though: the term is already defined as "@type": "@vocab" in the context, so maybe this is just redundant. Or maybe this should mean we can leave the @type out of the context?


pchampin commented on September 26, 2024

This creative use of @set does feel hacky, but it is appealing that it already almost works. I write "almost" because currently @annotation cannot be used with @set, and for a good reason: @set may generate several triples, and we wouldn't know which one is meant to be annotated. We could weaken this constraint by saying that @annotation can be used with @set unless its value is an array, but that looks contrived.
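A minimal sketch of what that weakened constraint could look like, assuming a hypothetical helper `split_annotated_set` that is not part of any JSON-LD processor:

```python
from typing import Any


def split_annotated_set(obj: dict) -> tuple:
    """Return (value, annotation) for an object carrying "@set".

    Assumed rule: "@annotation" is only accepted when the "@set" value
    is not an array, so the annotated triple is unambiguous.
    """
    value: Any = obj["@set"]
    annotation = obj.get("@annotation")
    if annotation is not None and isinstance(value, list):
        raise ValueError('"@annotation" requires a single "@set" value')
    return value, annotation


example = {
    "@set": "eng",
    "@annotation": {"assertedBy": {"name": "Somebody"}},
}
value, annotation = split_annotated_set(example)
print(value)  # eng
```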

The use of @vocab in instance data, as a vocab-version of @id, is interesting and seems cleaner, but may have non-trivial consequences for the existing algorithms. For example: we should check that @id and @vocab are never used together in the same node object; and when compacting, how should the algorithm choose between @id and @vocab?


gkellogg commented on September 26, 2024

> The use of @vocab in instance data, as a vocab-version of @id, is interesting and seems cleaner, but may have non-trivial consequences for the existing algorithms. For example: we should check that @id and @vocab are never used together in the same node object; and when compacting, how should the algorithm choose between @id and @vocab?

I think the overloading of keywords in different situations has created issues in the past (e.g., @type as node type and value datatype, and @graph as named-graph introduction vs. @included synonym in a narrow case). Considering how to use something like @vocab in a node object might be useful, but the discussion belongs on the json-ld-syntax repo.

As for @set, it is certainly non-obvious (and devious), but because it is so close, and relates to the rdf-star case, I think it's worth exploring further.


niklasl commented on September 26, 2024

A @vocab-version of @id could be useful in many places (e.g. for representing fairly compact forms equivalent to :term :relationTo :otherTerm in Turtle, without requiring @type: @vocab definitions for each such relation; or in hash-named vocabularies, nicely used in conjunction with @container).

I do fear that it would be quite the overloading though, and more importantly, not easy to read (it might seem to set the @vocab for the current scope, just like in RDFa). That's why I've pondered a @symbol keyword for that, with the big drawback of yet another keyword. So while I agree that this could be useful in some form, and solve much of this annotation issue, the naming is a big issue.

But, as @gkellogg says, this in itself is not an issue for json-ld-star. @Peeja would you mind raising your issue over at https://github.com/w3c/json-ld-syntax, so we can reference that here (and discuss these details over there)?


niklasl commented on September 26, 2024

As for the issue at hand, there is one form for which a "symbol" key wouldn't be a solution, but which this @set "trick" handles: if you really want to preserve a datatype-coerced value form, but still want to annotate it. Otherwise, by using @value, the coercion is not applied, and an editor of that data (in the orthogonal course of annotating) would have to inject the correct @type as well. Seems like an edge case, but important to note nonetheless.

I didn't know about the prohibition of @set for @annotation, but I certainly understand why it must be limited to one value if allowed here. Granted, I am not sure how intuitive and teachable that use might be. (Though I expect this practice of combined triple reification, annotation and assertion to be such an advanced pattern that this special case might not be that hard to grasp for its practitioners. One might argue that that audience wouldn't care about preserving compact forms verbatim anyway, but I'd disagree. The compact form is the raison d'être for JSON-LD, so it's worth being diligent about.)

If @set isn't acceptable, and a new keyword would be needed anyway, I can throw a creative alternative into the mix (which I can see would be a very hard sell, so I do this mainly for perspective): turn the @annotation design inside out (apologies if this has been discussed and dismissed already!). That is, instead of using the @annotation keyword and hanging the annotation off a "real" node, introduce an annotatable "predicate node", expressed using a new @object key whose value is the actual object (a node reference object, a value object or a scalar), with all other keys considered properties of the relation (the "annotations").

(That would be similar in design to regular RDF qualification but without the need for distinct qualified properties (which RDF* is for, for better or worse), and even more close to https://schema.org/Role (but with one special key instead of that design's odd repetition of the predicate). It does fully preclude what I assume is a design goal though; that annotations should in general be ignorable by "casual consumers".)


niklasl commented on September 26, 2024

I've been mulling some more over this case, and I'd like to share my thoughts so far, along with a new concrete proposal.

Motivations

To keep compact JSON-LD predictable and succinct enough even for annotations. Of special concern are values for @type and type-coerced values (e.g. dates provided as plain strings).

Example Data

This example (given in Turtle-star) represents a diff result from two revisions of a description, in a kind of "blame" mode:

prefix : <https://schema.org/>
prefix diff: <http://example.org/diff#>

</item/1> a :CreativeWork {| diff:addedIn <rev1> ; diff:removedIn <rev2> |} ,
    :Book {| diff:addedIn <rev2> |} ;
  :datePublished "2021-12-22"^^:Date {| diff:addedIn <rev2> |} .

(This is a simplified form of a concrete graph diff/blame case we have in the national union catalogue at the National Library of Sweden.)

First Attempt: Brittle Pairings

All of these suffer from a value and annotation coordination problem, which I'd say is brittle. But I present them here to put all options on the table.

A. One way would be to intertwine strings and "pure" annotation objects, by pairing them up ((string, annotation), ...), like:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "datePublished": {"@type": "Date"},
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn"
  },

  "@id": "/item/1",
  "@type": [
    "CreativeWork",
    {"@annotation": { "addedIn": {"@id": "rev1"}, "removedIn": {"@id": "rev2"}}},
    "Book",
    {"@annotation": { "addedIn": {"@id": "rev2"} }}
  ],
  "datePublished": [
    "2021-12-22",
    {"@annotation": { "addedIn": {"@id": "rev2"} }}
  ]
}

That may not be too bad, but it isn't a predictably strict form of JSON (it mixes strings and objects, making for some tricky coding, and e.g. Elasticsearch would have a hard time handling it). Also, it would force annotated single values to be a pair array.
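To illustrate the kind of coding this form demands, here is a rough sketch of a consumer for option A. The function name `pair_annotations` and the rule that an annotation object always refers to the value directly before it are assumptions made for this example.

```python
def pair_annotations(items: list) -> list:
    """Pair each value with the annotation object that directly follows it."""
    pairs = []
    for item in items:
        is_annotation = isinstance(item, dict) and set(item) == {"@annotation"}
        if is_annotation:
            # An annotation must follow a value that is not yet annotated.
            if not pairs or pairs[-1][1] is not None:
                raise ValueError("annotation without a preceding value")
            pairs[-1] = (pairs[-1][0], item["@annotation"])
        else:
            pairs.append((item, None))
    return pairs


types = [
    "CreativeWork",
    {"@annotation": {"addedIn": {"@id": "rev1"}, "removedIn": {"@id": "rev2"}}},
    "Book",
    {"@annotation": {"addedIn": {"@id": "rev2"}}},
]
paired = pair_annotations(types)
for value, annotation in paired:
    print(value, annotation)
```

Every consumer has to repeat this positional bookkeeping, which is the brittleness referred to above.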

B. Define a "magic annotation key" pattern, where regular JSON keys have annotation key counterparts, constructed by appending " @annotation" (space and keyword) to the regular key. Like:

{
  // "@context": ...,

  "@id": "/item/1",
  "@type": ["CreativeWork", "Book"],
  "@type @annotation": [
    { "addedIn": {"@id": "rev1"}, "removedIn": {"@id": "rev2"} },
    { "addedIn": {"@id": "rev2"} }
  ],
  "datePublished": "2021-12-22",
  "datePublished @annotation": { "addedIn": {"@id": "rev2"} }
}

This has no drawbacks for the predictability of the JSON, as it always adds to, and never alters, unannotated compact JSON-LD. It is, however, quite unacceptable, as the "magic key" form is weird, not nice for algorithms, and certainly looks and feels like a hack.
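For comparison, a sketch of how a consumer might reunite values with their "magic key" counterparts in option B. The helper `merge_magic_keys` and its positional pairing rule are assumptions for illustration only.

```python
SUFFIX = " @annotation"


def merge_magic_keys(node: dict) -> dict:
    """Map each regular key to a list of (value, annotation) pairs."""
    merged = {}
    for key, value in node.items():
        if key.endswith(SUFFIX):
            continue  # handled together with its regular key
        values = value if isinstance(value, list) else [value]
        annotations = node.get(key + SUFFIX, [])
        if not isinstance(annotations, list):
            annotations = [annotations]
        # Pad so that unannotated values pair with None.
        annotations = annotations + [None] * (len(values) - len(annotations))
        merged[key] = list(zip(values, annotations))
    return merged


node = {
    "@id": "/item/1",
    "@type": ["CreativeWork", "Book"],
    "@type @annotation": [
        {"addedIn": {"@id": "rev1"}, "removedIn": {"@id": "rev2"}},
        {"addedIn": {"@id": "rev2"}},
    ],
    "datePublished": "2021-12-22",
    "datePublished @annotation": {"addedIn": {"@id": "rev2"}},
}
merged = merge_magic_keys(node)
print(merged["@type"][1])
```

The pairing is purely by array position, so reordering one array without the other silently misattributes annotations.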

C. A structural variant of the "magic annotation key", albeit with more coordination problems, would be to provide a separate "annotation index":

{
  // "@context": ...,

  "@id": "/item/1",
  "@type": ["CreativeWork", "Book"],
  "datePublished": "2021-12-22",
  "@annotation": {
    "@index": {
      "@type": [
        { "addedIn": {"@id": "rev1"}, "removedIn": {"@id": "rev2"}
        },
        { "addedIn": {"@id": "rev2"} }
      ],
      "datePublished": { "addedIn": {"@id": "rev2"} }
    }
  }
}

That is so out of band, though, that it seems ergonomic neither to edit nor to consume.

My personal conclusion is that none of these options will do, not least ergonomically (for a "JSON hacker", e.g. during regular web development, using these forms directly). I would argue that due to the nature of JSON and the compact forms sought after, there's no "room" for annotations syntactically.

But we don't have to give up just yet. Since JSON-LD is able to cater for e.g. language tags (through indexed language containers) and other more advanced forms of RDF (including indexed graph containers!) in a compact and succinct way, let's explore that option too.

A Better Form: Annotation Containers

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {
      "@container": "@annotation",
      "@id": "@type",
      "@value": "type"
    },
    "datePublishedAnnotated": {
      "@container": "@annotation",
      "@id": "datePublished",
      "@value": "value",
      "@type": "Date"
    }
  },

  "@id": "/item/1",
  "typeAnnotated": [
    {
      "type": "CreativeWork",
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    {
      "type": "Book",
      "addedIn": {"@id": "rev2"}
    }
  ],
  "datePublishedAnnotated": {
    "value": "2021-12-22",
    "addedIn": {"@id": "rev2"}
  }
}

In the form above, a new kind of term definition is introduced, called an "annotation container". It is defined by including a "@container": "@annotation" declaration, along with a special use of "@value" to define the nested key used in objects to provide the actual value to be annotated. (That value can be any single value: not just a string, but also an object, commonly a node reference. Arrays would not be allowed unless combined @annotation and @list containers are allowed.) Technically, this key is @protected within an implicit local context. That context is derived from the current context by default, but you can of course define a term-local context here as well, in which this key would then be @protected, since it is crucial.

In the data, the values for a term defined as an annotation container are the actual annotations, except for the special key defined for the term using @value (this is a bit reminiscent in shape of the @object idea I had in the previous comment above, but only used for terms explicitly defined as annotation containers).
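As a rough sketch of the intended reading of such entries (not an implementation of any algorithm), a hypothetical helper `split_container_entry` could separate the annotated object from its annotations, given the key named by "@value" in the term definition:

```python
def split_container_entry(entry: dict, value_key: str) -> tuple:
    """Separate the annotated object from its annotations.

    `value_key` is the key declared via "@value" in the (proposed)
    annotation container term definition.
    """
    if value_key not in entry:
        raise KeyError(f"missing value key {value_key!r}")
    annotations = {k: v for k, v in entry.items() if k != value_key}
    return entry[value_key], annotations


type_annotated = [
    {"type": "CreativeWork", "addedIn": {"@id": "rev1"}, "removedIn": {"@id": "rev2"}},
    {"type": "Book", "addedIn": {"@id": "rev2"}},
]
split = [split_container_entry(entry, "type") for entry in type_annotated]
for obj, annotations in split:
    print(obj, annotations)
```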

This feature would also allow forms like this:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "contribution": {
      "@id": "contributor",
      "@container": ["@annotation", "@set"],
      "@value": "agent"
    }
  },
  "@id": "/item/1",
  "contribution": [
    {
      "roleName": "Author of the introduction",
      "agent": {"@id": "/agent/one"}
    }
  ]
}

Expanding to:

prefix : <https://schema.org/>

</item/1> :contributor </agent/one> {| :roleName "Author of the introduction" |} .

In order for this proposal to work, we would have to accept two things in its design:

  1. That annotation objects are foremost their annotations, and that their value object is given with a special key within them. Personally, I find this not only acceptable, but in certain cases preferable. But it's something to be carefully evaluated through practice.

  2. The means by which the special key for the actual value is defined. We could use @set here instead, as it would almost fully work. But it arguably wouldn't read very well. It may also be that we would need some token for an "unaliased" object key in the value objects. If that would be necessary, I guess a single-valued @set would be required (or, alas, a new key), as @value does not work for objects. But I cannot see any need for such unaliased keys (as this form is exclusively for compacted annotations). The motivation for using @value in annotation container term definitions above is for contextual readability (as JSON-LD keys within term definitions already have contextual meaning that differs from their use in actual data). But the exact choice of key here (i.e. within the definition) is certainly up for further debate if this proposal is to be considered further.

It must also be algorithmically vetted, of course.

Thoughts?


gkellogg commented on September 26, 2024

@niklasl thanks for the detailed analysis. It is reminiscent of the schema.org Role pattern (although without explicit intermediaries), where an intermediate object is used to hold properties on the relationship; something which was also considered for RDF-star. In this case, however, it's an entirely syntactic construct, so doesn't have an impact on the actual data model.

One thing that bothers me, though, is that the @value term definitional component somewhat parallels a scoped context, and perhaps there is a way to combine the concepts, perhaps relying upon the @container: @annotation component of the enclosing term definition. An alternative would be to just give special meaning to an annotation property which was the same as the @id in the term definition (although this would bring its own limitation).

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {
      "@container": "@annotation",
      "@id": "@type"
    },
    "datePublishedAnnotated": {
      "@container": "@annotation",
      "@id": "datePublished",
      "@type": "Date"
    }
  },
  "@id": "/item/1",
  "typeAnnotated": [{
    "@type": "CreativeWork",
    "addedIn": {"@id": "rev1"},
    "removedIn": {"@id": "rev2"}
  }, {
    "@type": "Book",
    "addedIn": {"@id": "rev2"}
  }],
  "datePublishedAnnotated": {
    "datePublished": "2021-12-22",
    "addedIn": {"@id": "rev2"}
  }
}

If the syntactic representation of the form of the @id in the term definition were retained, then another form could be used within the annotation, but that feels like a hack, too.

Do you suggest this as an addition to the existing annotation mechanism, or as a replacement?


niklasl commented on September 26, 2024

Thanks @gkellogg for your further consideration.

Your alternative is very interesting, and may be useful for many cases (it does exactly mirror the schema:Role design). In certain cases it could be troublesome though, as it does "block" the term used from further use on the annotation itself. For example, the annotation itself may need to be described with the same property (say, for annotated dc:date properties), or it may need to be typed (using @type) with some special subclass of rdf:Statement. Example:

</item/1>
  a :Nothing {| a :Fallacy ; :source <#issuecomment-x> |} ;
  :date "1900" {| :date "2021" ; :source <#issuecomment-x> |} .

I wouldn't necessarily rule this alternative out just yet, but it needs to be carefully considered whether this shortcoming is more than an edge case, since it might crop up as a real issue, perhaps later on in the wild, when it would be hard, if at all possible, to address it.

Thus it seems valuable to be able to control what the "special key for the object" is to be.

We know that @value won't work here in the actual data, as it provides the lexical string for a literal, and we're dealing with compact coerced strings or types here. Also, my suggestion did make quite a "creative" use of it in a context definition, which as you pointed out was suspiciously like scoped contexts (it would probably be implemented internally using those). So we need another key, and a more cohesive design.

I did suggest this as an addition, for use with compact forms. The existing proposed form for annotations in JSON-LD does work for the expanded form, and for node references in compact form (unless an @id-coerced form is used). It doesn't work fully for compact JSON-LD though, which is the issue at hand: it is not possible to annotate compact strings without using expanded forms for them.

With that said, I will present a replacement form at the end of this comment. Anyone reading this can feel free to jump to that "Alternative 4" below if this gets long-winded (again, I present my alternatives and reasoning here for full comprehension).

Alternative 2: Bare and Scoped Object Keys

Going back to @set, as we examined earlier in this comment thread, it would work when it is not an array. It might work here if properly restricted.

If we define a "bare" form, to be used within annotation containers to provide the object itself, we can then make use of scoped contexts to select our own, contextually comprehensible key for the object, depending on the term for which the annotation container is defined.

Going down that route, scoped context on annotation containers would likely be the common pattern, to define a contextually relevant name for the annotated value itself (instead of a "bare" key). Perhaps the preferred choice would be the repeated-property-pattern reminiscent of schema:Role, as in your alternative. But some designs may prefer other forms, like the above, or to mimic the qualified relation pattern, as in my previous contribution example. (I find that to be of special interest, as this might provide a sort of transition from that pattern to RDF-star annotations, FWIW.)

Ruling out @value (this and @id work alongside @annotation for expanded forms only), here are some candidates for "a key which is not a term":

  1. @set would work if it is, within annotation containers, limited to a single value.
  2. @nest doesn't read too poorly, but it clashes with its regular use for nesting properties, not values.
  3. @annotation could, within annotation containers, carry the value...

The following examples use @set, but you could substitute any of the other options. I didn't find those more palatable though (they too seem more or less like hacks when used like this).

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {"@container": "@annotation", "@id": "@type"},
    "datePublishedAnnotated": {
      "@container": "@annotation",
      "@id": "datePublished",
      "@type": "Date"
    }
  },

  "@id": "/item/1",
  "typeAnnotated": [
    {
      "@set": "CreativeWork",
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    {
      "@set": "Book",
      "addedIn": {"@id": "rev2"}
    }
  ],
  "datePublishedAnnotated": {
    "@set": "2021-12-22",
    "addedIn": {"@id": "rev2"}
  }
}

To achieve a desired form, use a scoped context to alias the bare key accordingly (here per my first example; you could use this to get the schema:Role form, or the "qualified relation" pattern, as well):

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {
      "@container": "@annotation",
      "@id": "@type",
      "@context": {"type": "@set"}
    },
    "datePublishedAnnotated": {
      "@container": "@annotation",
      "@id": "datePublished",
      "@type": "Date",
      "@context": {"value": "@set"}
    }
  },

  "@id": "/item/1",
  "typeAnnotated": [
    {
      "type": "CreativeWork",
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    {
      "type": "Book",
      "addedIn": {"@id": "rev2"}
    }
  ],
  "datePublishedAnnotated": {
    "value": "2021-12-22",
    "addedIn": {"@id": "rev2"}
  }
}

Alternative 3a: Value-Indexed Annotation Containers

An alternative would be if annotation containers were indexed by their values:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {
      "@container": "@annotation",
      "@id": "@type"
    },
    "datePublishedAnnotated": {
      "@container": "@annotation",
      "@id": "datePublished",
      "@type": "Date"
    }
  },

  "@id": "/item/1",
  "typeAnnotated": {
    "CreativeWork": {
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    "Book": {
      "addedIn": {"@id": "rev2"}
    }
  },
  "datePublishedAnnotated": {
    "2021-12-22": {
      "addedIn": {"@id": "rev2"}
    }
  }
}

Quite intriguingly, this annotation index form would actually syntactically enforce the "set nature" of RDF. We know that the same statement simply cannot be repeated; it would state the same fact (and RDF does not count those, since there are no multisets in RDF). This is mirrored by the fact that JSON objects cannot have their keys repeated (that just overwrites them).
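The overwrite behaviour is easy to observe with Python's standard `json` module, which by default keeps only the last value for a repeated key. The sketch below also shows that a parser can be made to reject duplicates instead; `reject_duplicates` is a hypothetical hook written for this example.

```python
import json

doc = """
{
  "Book": {"addedIn": {"@id": "rev1"}},
  "Book": {"addedIn": {"@id": "rev2"}}
}
"""

# Default behaviour: the last occurrence of a repeated key wins, silently.
parsed = json.loads(doc)
print(parsed)  # {'Book': {'addedIn': {'@id': 'rev2'}}}


def reject_duplicates(pairs):
    """Hypothetical hook that turns repeated keys into an error."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate key in JSON object")
    return dict(pairs)


try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
print(duplicate_detected)  # True
```

So the "set nature" is enforced in the parsed data, but an author who repeats a key gets no warning unless the tooling checks for it.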

Alternative 3b: Turning Index Items Into Pairs

A variant of this @index approach would be to have, by default, "destructured" annotation indexes, putting the object used as key into an explicit @index. This could of course be aliased using a scoped context to define which key is to be used for the index. (That would allow all the variants of the bare form above; the only difference would be the use of @index for the bare key.)

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {
      "@id": "@type",
      "@container": "@annotation"
    },
    "datePublishedAnnotated": {
      "@id": "datePublished",
      "@container": "@annotation",
      "@type": "Date"
    }
  },

  "@id": "/item/1",
  "typeAnnotated": [
    {
      "@index": "CreativeWork",
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    {
      "@index": "Book",
      "addedIn": {"@id": "rev2"}
    }
  ],
  "datePublishedAnnotated": {
    "@index": "2021-12-22",
    "addedIn": {"@id": "rev2"}
  }
}

Then, getting to the previous form above, where values are index keys, would be done by explicitly declaring: "@container": ["@annotation", "@index"] on the annotation containers.
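The two shapes (3a and 3b) carry the same information, which a short sketch can make explicit. The function names `pairs_to_index` and `index_to_pairs` are hypothetical, and treating a repeated index value as an error is an assumption.

```python
def pairs_to_index(entries) -> dict:
    """Turn entries with an explicit "@index" (3b) into a value-keyed map (3a)."""
    if isinstance(entries, dict):
        entries = [entries]  # a single annotated value
    index = {}
    for entry in entries:
        key = entry["@index"]
        if key in index:
            raise ValueError(f"value {key!r} is annotated twice")
        index[key] = {k: v for k, v in entry.items() if k != "@index"}
    return index


def index_to_pairs(index: dict) -> list:
    """Turn a value-keyed map (3a) back into entries with "@index" (3b)."""
    return [{"@index": key, **annotations} for key, annotations in index.items()]


type_annotated = [
    {"@index": "CreativeWork", "addedIn": {"@id": "rev1"}, "removedIn": {"@id": "rev2"}},
    {"@index": "Book", "addedIn": {"@id": "rev2"}},
]
as_index = pairs_to_index(type_annotated)
round_tripped = index_to_pairs(as_index)
print(sorted(as_index))
```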

Expanded Annotated Type

As this issue started out to be specifically about type (and not all "strings with special meaning"), there is one glaring omission which we need to consider: the case of expanded annotated types. Given that JSON-LD does not expand @type to http://www.w3.org/1999/02/22-rdf-syntax-ns#type, it would be prudent to define a form for it when annotated. Since annotation containers are for compact forms, we could only build on parts of them to get what we need here. It needn't be so hard if some variant of the "bare" key from above could be paired with the annotation itself, like:

{
  "@id": "/item/1",
  "@type": [
    {
      "@index": "CreativeWork",
      "@annotation": {
        "http://example.org/diff#addedIn": {"@id": "rev1"},
        "http://example.org/diff#removedIn": {"@id": "rev2"}
      }
    }
  ]
}

However, now it does seem that we have ended up in a somewhat complicated situation, with varying designs for expanded and compacted forms. There is another form (alluded to before) which now presents itself.

Alternative 4: Annotation Objects

This is a replacement for the currently defined @annotation mechanism. It is completely separate from the above alternatives (in an attempt to solve their challenges in a better, more uniform manner).

The key concept here is the introduction of annotation objects. These are objects representing the annotation of the arc itself, having one special key, @annotated, to provide the object of the arc. The presence of this key is what determines the nature of the object to be an annotation object. (Similarly to how the presence of @value, @list or @graph determines the nature of value, list or graph objects, respectively.)

(They are structurally similar to the form of annotation containers, but do not require context definitions, since they represent the default form whenever annotations are present on an arc.)

(It is also quite similar to the schema:Role design but has this specific key to carry the object, rather than repeating the predicate. It even harkens back to the olden days of rdf:value in early RDF designs for reification, but that's an aside. Here it certainly represents an annotated arc as defined in RDF-star, and not specific RDF vocabularies.)

1. "Bare" Form

Including an annotated exampleOfWork link to show how this would be the regular form for annotations:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "datePublished": {"@id": "datePublished", "@type": "Date"}
  },

  "@id": "/item/1",
  "@type": [
    {
      "@annotated": "CreativeWork",
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    {
      "@annotated": "Book",
      "addedIn": {"@id": "rev2"}
    }
  ],
  "datePublished": {
    "@annotated": "2021-12-22",
    "addedIn": {"@id": "rev2"}
  },
  "exampleOfWork": {
    "@annotated": {"@id": "/work/1"},
    "addedIn": {"@id": "rev3"}
  }
}

(Note that annotation objects must be allowed as values for @type, provided that the @annotated value is an acceptable @type value.)
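As a rough illustration of how a consumer could treat this form, the following Python sketch detects annotation objects by the presence of `@annotated` and recovers the unannotated data. The helper names are hypothetical, and this is not a description of any processing algorithm in the spec.

```python
def is_annotation_object(value):
    """An object is an annotation object if it has an @annotated key."""
    return isinstance(value, dict) and "@annotated" in value


def strip_annotations(value):
    """Recursively replace annotation objects by their @annotated value."""
    if is_annotation_object(value):
        return strip_annotations(value["@annotated"])
    if isinstance(value, dict):
        # Leave any @context untouched; only data is rewritten.
        return {
            k: (v if k == "@context" else strip_annotations(v))
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [strip_annotations(v) for v in value]
    return value
```

Running `strip_annotations` on the node above gives back plain `@type`, `datePublished` and `exampleOfWork` values, which is the shape existing consumers already expect.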

2. Using Scoped Contexts

A mechanism would be required for these terms to be selected for compaction, over the bare form, based on whether their objects are annotation objects. Here this is defined using a new @type: @annotated declaration on term definitions in the context.

This is for my original example:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {
      "@id": "@type",
      "@type": "@annotated",
      "@context": {"type": "@annotated"}
    },
    "datePublishedAnnotated": {
      "@id": "datePublished",
      "@type": "@annotated",
      "@context": {"value": {"@id": "@annotated", "@type": "Date"}}
    },
    "exampleOfWorkRef": {
      "@id": "exampleOfWork",
      "@type": "@annotated",
      "@context": {"ref": {"@id": "@annotated", "@type": "@id"}}
    }
  },

  "@id": "/item/1",
  "typeAnnotated": [
    {
      "type": "CreativeWork",
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    {
      "type": "Book",
      "addedIn": {"@id": "rev2"}
    }
  ],
  "datePublishedAnnotated": {
    "value": "2021-12-22",
    "addedIn": {"@id": "rev2"}
  },
  "exampleOfWorkRef": {
    "ref": "/work/1",
    "addedIn": {"@id": "rev3"}
  }
}

(This example is easily adapted to follow the schema:Role pattern as well of course, by changing the name of the @annotated alias.)

In general it would certainly make sense to keep @annotated as is though, or to define a global uniform alias and not scope the aliases. This is just to illustrate how it is possible to fully control the form by basing it on the existing JSON-LD 1.1 mechanism.

Here is the "qualified relation" example from earlier, where it makes sense to use a scoped term:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "contribution": {
      "@id": "contributor",
      "@type": "@annotated",
      "@context": {"agent": "@annotated"}
    }
  },
  "@id": "/item/1",
  "contribution": [
    {
      "roleName": "Author of the introduction",
      "agent": {"@id": "/agent/one"}
    }
  ]
}

3. Solving the Expanded Annotated @type

Simply following the common form of annotation objects:

{
  "@id": "/item/1",
  "@type": [
    {
      "@annotated": "https://schema.org/CreativeWork",
      "http://example.org/diff#addedIn": {"@id": "rev1"},
      "http://example.org/diff#removedIn": {"@id": "rev2"}
    }
  ]
}
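To show what this expanded form is meant to say in RDF-star terms, here is a small Python sketch that reads annotated `@type` entries as annotations on an `rdf:type` triple. It is an illustration only: `annotated_type_statements` is a hypothetical helper, annotation values are assumed to be node references of the form `{"@id": ...}` as in the example, and relative IRIs such as "rev1" are not resolved.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotated_type_statements(node):
    """Yield (quoted_triple, predicate, object_id) for annotated @type values."""
    subject = node["@id"]
    for entry in node.get("@type", []):
        if not (isinstance(entry, dict) and "@annotated" in entry):
            continue  # plain type strings carry no annotations
        quoted = (subject, RDF_TYPE, entry["@annotated"])
        for predicate, value in entry.items():
            if predicate == "@annotated":
                continue
            # Assumption: every annotation value is a node reference.
            yield quoted, predicate, value["@id"]
```

For the node above this produces two statements, both having the triple (/item/1, rdf:type, schema:CreativeWork) as their subject.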

4. Used With @nest

To get close to the "old" annotation form, use the existing JSON-LD 1.1 @nest mechanism (here with object as a global alias for @annotated):

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "object": "@annotated",
    "annotation": "@nest",
    "addedIn": {"@id": "diff:addedIn", "@nest": "annotation"},
    "removedIn": {"@id": "diff:removedIn", "@nest": "annotation"},
    "datePublished": {"@id": "datePublished", "@type": "Date"}
  },

  "@id": "/item/1",
  "@type": [
    {
      "object": "CreativeWork",
      "annotation": {
        "addedIn": {"@id": "rev1"},
        "removedIn": {"@id": "rev2"}
      }
    },
    {
      "object": "Book",
      "annotation": {
        "addedIn": {"@id": "rev2"}
      }
    }
  ],
  "datePublished": {
    "object": "2021-12-22",
    "annotation": {
      "addedIn": {"@id": "rev2"}
    }
  },
  "exampleOfWork": {
    "object": {"@id": "/work/1"},
    "annotation": {
      "addedIn": {"@id": "rev3"}
    }
  }
}

5. Index By Annotated

Just using the existing JSON-LD 1.1 mechanism:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeByAnnotated": {
      "@id": "@type",
      "@container": "@index",
      "@index": "@annotated"
    },
    "datePublishedByAnnotated": {
      "@id": "datePublished",
      "@container": "@index",
      "@index": "@annotated"
    },
    "exampleOfWorkByAnnotated": {
      "@id": "exampleOfWork",
      "@container": "@index",
      "@index": "ref"
    },
    "ref": {"@id": "@annotated", "@type": "@id"}
  },

  "@id": "/item/1",
  "typeByAnnotated": {
    "CreativeWork": {
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    "Book": {
      "addedIn": {"@id": "rev2"}
    }
  },
  "datePublishedByAnnotated": {
    "2021-12-22": {
      "addedIn": {"@id": "rev2"}
    }
  },
  "exampleOfWorkByAnnotated": {
    "/work/1": {
      "addedIn": {"@id": "rev3"}
    }
  }
}
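Going the other way, from such an index back to a list of annotation objects, could look like the Python sketch below. It assumes the index key is simply re-attached under the indexing property (here `@annotated`, or an alias such as `ref`); whether a keyword is acceptable as an `@index` value is left open here, and `index_to_annotation_objects` is a hypothetical name.

```python
def index_to_annotation_objects(index_map, index_key="@annotated"):
    """Turn {k: {...annotations}} into [{index_key: k, ...annotations}]."""
    return [
        {index_key: key, **annotations}
        for key, annotations in index_map.items()
    ]
```

With `index_key="ref"` the same helper covers the `exampleOfWorkByAnnotated` case, where the key would then be type-coerced by the `ref` term definition.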

Annotation Objects Summary

Apart from consolidating expanded and compacted forms a bit better, these annotation objects differ from the @annotation design in one crucial way: they do not tamper with the node reference design in JSON-LD. Before the introduction of the @annotation keyword, users of JSON-LD could regularly merge objects with the same @id in flattened form, as well as count on JSON objects containing only an @id key to be node references (and not one or two keys, the other being @annotation), and on the rest being regular node objects containing describing properties.

Furthermore, while the presence of @annotated does determine the meaning of an annotation object, this is common practice in JSON-LD (as mentioned above, the presence of e.g. @graph or @value determines the meaning of those objects).

There is an intricacy left to iron out regarding whether or not @annotated "inherits" the type coercion of the term it is a value for, and if so whether it can "override" that (see the examples using object and ref to get a sense of that intricacy).

Alternatives So Far

So far we have:

  1. The schema:Role form.
  2. The base form with scoped aliases (possibly using a restricted @set).
  3. The @index alternative in two variants, one completely restricted to "string value as key", and one more expansive to allow for all the variants given above.
  4. The @annotated form, which is a replacement to the current @annotation form altogether, and which leverages existing JSON-LD mechanisms fairly cohesively to handle most variants in a uniform manner.

I cannot fully assess what the cost would be to implement any of these in terms of expansion and compaction. To me it seems that alternative 4 has a certain simplicity to it and I hope we can pursue it further.

We ought to throw even more use cases at them as well of course (at the National Library we have the diff case, some uses of qualified relations, and some cases of basic source tracking which can be explored further if we get a tentative form working here).

from json-ld-star.

niklasl commented on September 26, 2024

Clarification: I meant of course to redefine annotation objects in "Alternative 4" above (originally I named that differently, but attempted to unify things, and alas failed to clarify that part).


niklasl commented on September 26, 2024

Having gone through these alternatives some more, it seems to me that "Alternative 3a: Value-Indexed Annotation Containers" is also worthy of close scrutiny. That alternative caters solely for the strings that are impossible to use the existing @annotation form for, i.e. @type and specifically type-coerced strings (and plain strings, for those who eschew @value in compact form). It is mostly unobtrusive, controlled through context, and solves this issue, bar for what to use for @type in expanded form (perhaps @index + @annotation could work there, as also exemplified above).

(This being so, of course, provided that it is ultimately acceptable for the existing @annotation form to now require checking for a key count of 1, being @id, or 2, being that and @annotation, to conclude that an object is a node reference. It might be, considering that it is unobtrusive in other respects.)
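The check in question is small, as this Python sketch suggests (`is_node_reference` is a hypothetical helper written for this comment, not taken from any implementation):

```python
def is_node_reference(obj, allow_annotation=True):
    """True if obj has only @id, or only @id plus @annotation when allowed."""
    if not isinstance(obj, dict) or "@id" not in obj:
        return False
    extra = set(obj) - {"@id"}
    if allow_annotation:
        extra -= {"@annotation"}
    return not extra
```

With `allow_annotation=False` this is the pre-@annotation rule (exactly one key, @id); the default reflects the relaxed rule the current design requires.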

I base this evaluation on the same premise that use-cases for RDF-star annotations appear to have in general: that usage, i.e. data access, should go unaltered even with the presence of annotations. As we've seen above it is alas impossible to fully reach that, but this form at least only tackles the parts that cannot be identical, in JSON, in annotated form. And for published data (or data indexed in JSON-databases or search engines), it would be possible to duplicate just these parts (in a denormalization step outside of the specs of course) to make it fully queryable as if unannotated. Like:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "diff": "http://example.org/diff#",
    "addedIn": "diff:addedIn",
    "removedIn": "diff:removedIn",
    "typeAnnotated": {
      "@container": "@annotation",
      "@id": "@type"
    },
    "datePublished": {"@type": "Date"},
    "datePublishedAnnotated": {
      "@container": "@annotation",
      "@id": "datePublished",
      "@type": "Date"
    }
  },

  "@id": "/item/1",
  "@type": ["CreativeWork", "Book"],
  "typeAnnotated": {
    "CreativeWork": {
      "addedIn": {"@id": "rev1"},
      "removedIn": {"@id": "rev2"}
    },
    "Book": {
      "addedIn": {"@id": "rev2"}
    }
  },
  "datePublished": "2021-12-22",
  "datePublishedAnnotated": {
    "2021-12-22": {
      "addedIn": {"@id": "rev2"}
    }
  }
}

Due to the compact form of the annotation index (where the values are keys), even this denormalized form is fairly readable should one need to read it (and again, this is a form intended for indexing and possibly "casual consumption"; not a canonical form for storage). Also, it has the added benefit of "securing" against the confusion that may otherwise arise from multiple occurrences of the same object with different annotations, which is not logically different (and should, I presume, be merged during flattening).
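Such a denormalization step could be as simple as the Python sketch below. It is an out-of-spec illustration under assumptions: `denormalize` and its `pairs` argument (mapping each annotated term to its plain counterpart) are hypothetical, and any existing plain value is overwritten rather than merged.

```python
def denormalize(node, pairs):
    """Copy the keys of each annotation index into its plain counterpart.

    pairs maps an annotated term to a plain term,
    e.g. {"typeAnnotated": "@type"}.
    """
    result = dict(node)
    for annotated_term, plain_term in pairs.items():
        index = node.get(annotated_term)
        if not index:
            continue
        values = list(index)  # the index keys are the annotated values
        # Simplification: overwrite any existing plain value.
        result[plain_term] = values[0] if len(values) == 1 else values
    return result
```

Calling it with `{"typeAnnotated": "@type", "datePublishedAnnotated": "datePublished"}` on a node carrying only the annotated terms reproduces the duplicated `@type` and `datePublished` entries shown above.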


gkellogg commented on September 26, 2024

In certain cases it could be troublesome though, as it does "block" the term used for further use on the annotation itself. E.g. if the annotation itself is to be described with the same property (say for annotated dc:date properties, or for that matter, if the annotation needs to be typed (using @type) with some special subclass of rdf:Statement).

Scoped contexts might address this, if a property-scoped context were invoked on the annotation property, that same term could be interpreted differently within the annotation. Of course, other forms of the term IRI could also be used.

Alternative 3a

Value-Indexed Annotation Containers seems fairly intuitive and doesn't unnecessarily overload other JSON-LD concepts. But, we'd need to consider what @container: @annotation looks like for general use cases.

But, good points on the complexity of the expanded form.

Alternative 4

Note that we've already introduced a definition of annotation objects, although your definition seems to depend on the object containing @annotated, rather than being the value of @annotation. This obviously helps with the @type issue.

Summary

Having gone through these alternatives some more, it seems to me that "Alternative 3a: Value-Indexed Annotation Containers" is also worthy of close scrutiny.

I agree, I think it comes down to these two different directions. At this stage, we shouldn't be overly biased towards one or the other, IMHO. Probably 3a is the shortest delta from our current direction.

My probably not properly considered summary:

  • Alternative 3a: rely on @container: @annotation in a context definition, and use the map-form to provide the specific annotation values. May complicate the expanded form.
  • Alternative 4: abandon the current annotation object design in favor of having the value of any property (or @type) be an object containing the @annotated key.

I'm hoping to hear @pchampin weigh in.


pchampin commented on September 26, 2024

Sorry for the late reaction. Loads of things to catch up on after an offline holiday break!
Thanks a lot @niklasl for this extensive analysis.
I totally agree with @gkellogg's latest comment. And I must say I find the idea of "indexed annotation container" very elegant!

A comment on what @niklasl wrote:

I base this evaluation on the same premise that use-cases for RDF-star annotations appear to have in general: that usage, i.e. data access, should go unaltered even with the presence of annotations.

In fact, the current design of @annotation aims exactly at that. E.g. unannotated:

{ 
    "@context": "https://schema.org/",
    "name": "Alice",
    "knows": {
        "name": "Bob"
    }
}

annotated:

{ 
    "@context": "https://schema.org/",
    "name": "Alice",
    "knows": {
        "name": "Bob",
        "@annotation": {
            "#assertedBy": "#charlie"
        }
    }
}

For this reason, I tend to prefer Alternative 3a to Alternative 4.
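A quick way to see the difference from a consumer's point of view is the Python sketch below. The `alternative_4` document is my own hypothetical rendering of the same data as an object carrying `@annotated`; it does not appear in this thread in that exact shape.

```python
unannotated = {"name": "Alice", "knows": {"name": "Bob"}}

annotated = {
    "name": "Alice",
    "knows": {"name": "Bob", "@annotation": {"#assertedBy": "#charlie"}},
}

# Hypothetical Alternative 4 rendering of the same data.
alternative_4 = {
    "name": "Alice",
    "knows": {"@annotated": {"name": "Bob"}, "#assertedBy": "#charlie"},
}


def known_name(doc):
    """Plain data access, written without any knowledge of annotations."""
    return doc["knows"]["name"]
```

`known_name` works unchanged on the first two documents, while on `alternative_4` it raises a `KeyError` because the name has moved one level down.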
