Understanding inheritance in the schema about eml HOT 5 CLOSED

ropensci commented on August 17, 2024

Understanding inheritance in the schema

from eml.

Comments (5)

mbjones commented on August 17, 2024

Yes, they are both defined centrally so that they can be reused across the schemas. For example, entityGroup is defined in eml-entity.xsd, and referenceGroup is defined in eml-resource.xsd. In XML Schema, an xs:group is a named model containing a set of elements that are meant to be inserted in multiple places, so the group definition provides a common way to define that set of elements once. It is similar to "include" directives in many languages. See http://stackoverflow.com/questions/12431031/difference-between-group-and-sequence-in-xml-schema
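
To make the "define once, insert in many places" idea concrete, here is a minimal sketch in Python (standard library only). The schema fragment is hypothetical: the group name `ExampleGroup`, its member elements, and the two types that reference it are made up for illustration and are not copied from eml-entity.xsd or eml-resource.xsd. It also assumes unprefixed `ref` values, which real multi-file schemas usually do not use.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
NS = {"xs": XS}

# Hypothetical schema fragment (NOT taken from the EML schemas).
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="ExampleGroup">
    <xs:sequence>
      <xs:element name="entityName" type="xs:string"/>
      <xs:element name="entityDescription" type="xs:string"/>
    </xs:sequence>
  </xs:group>
  <xs:complexType name="TableType">
    <xs:sequence>
      <xs:group ref="ExampleGroup"/>
      <xs:element name="numberOfRecords" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ImageType">
    <xs:sequence>
      <xs:group ref="ExampleGroup"/>
      <xs:element name="resolution" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

def expanded_elements(schema, type_name):
    """List the child element names of a type, with group references expanded."""
    # Top-level xs:group elements are the definitions; nested ones are references.
    groups = {g.get("name"): g for g in schema.findall("xs:group", NS)}
    ctype = schema.find(f"xs:complexType[@name='{type_name}']", NS)
    names = []
    for child in ctype.find("xs:sequence", NS):
        if child.tag == f"{{{XS}}}group":
            # Substitute the elements of the referenced group at this position.
            group = groups[child.get("ref")]
            names += [e.get("name") for e in group.iter(f"{{{XS}}}element")]
        elif child.tag == f"{{{XS}}}element":
            names.append(child.get("name"))
    return names

schema = ET.fromstring(XSD)
print(expanded_elements(schema, "TableType"))
print(expanded_elements(schema, "ImageType"))
```

Both types end up with the two elements from the group followed by their own element, even though the group's content is written only once.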

cboettig commented on August 17, 2024

@mbjones Thanks! Somewhat related: can you explain the thinking behind attributes? I see that some elements get no attributes, some get just id, and some get id, system, and scope (and maybe there are others I haven't seen).

What determines this pattern? And what exactly are system and scope for?

mbjones commented on August 17, 2024

scope defines the space within which the identifier should be interpreted. It can be either 'document', which means the id has no meaning outside of the current document, or 'system', in which case the id should be interpreted as being from and within the namespace of the system identified by the system attribute. system defines the URI (hopefully, or at least the name) of the system from which the identifier was drawn and within which it should be unique. For example, for DOIs, you might set it to 'http://doi.org'. If scope is set to 'document', then the identifier can be used within the document as a pointer, but has no interpretation outside of the document in some larger system. This can be used, for example, to put an id on a <creator> element that is referenced later within the same document.
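
As a rough illustration of that rule, here is a small Python sketch that reads the three attributes and reports the namespace in which an id should be interpreted. The function name `describe_identifier` is hypothetical, the sample elements are stripped-down fragments rather than valid EML, and treating a missing scope as 'document' is an assumption of this sketch, not something stated in the comment above.

```python
import xml.etree.ElementTree as ET

def describe_identifier(element):
    """Return (id, system) where system is None for document-scoped ids."""
    ident = element.get("id")
    if ident is None:
        return None  # element carries no identifier at all
    # Assumption for this sketch: an absent scope is treated as 'document'.
    scope = element.get("scope", "document")
    if scope == "system":
        # The id is meaningful within the system named by the system attribute.
        return (ident, element.get("system"))
    # Document scope: the id is only a pointer inside this document.
    return (ident, None)

# Simplified fragments for illustration only (not valid EML documents).
creator = ET.fromstring('<creator id="person-1" scope="document"/>')
dataset = ET.fromstring(
    '<dataset id="doi:10.6788/xyxyx" scope="system" system="http://doi.org"/>'
)

print(describe_identifier(creator))
print(describe_identifier(dataset))
```

The first call yields an id with no external namespace; the second pairs the id with the system it was drawn from.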

In general, id, system, and scope should be present on any element that allows 'id', and we allowed 'id' on any element that had a major potential for reuse or external reference. We discussed allowing id on every element, but then you end up with pathological cases like <fieldDelimiter id="doi:10.6788/xyxyx">,</fieldDelimiter>, which we wanted to avoid because it makes parsing the EML very difficult (and starts to look an awful lot like full-blown RDF). As it is, many EML parsing applications fail to properly parse and interpret references fields, so we didn't want to make it harder. Whether this was the right decision is certainly debatable.
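
For readers wondering what interpreting a references field involves, here is a minimal Python sketch, assuming a simplified, namespace-free fragment in which a `references` child holds the id of another element in the same document. The sample document, the surname in it, and the function name `resolve_references` are made up for illustration; a real parser would also need to handle namespaces and system-scoped ids.

```python
import xml.etree.ElementTree as ET

# Simplified fragment for illustration only (not a valid EML document).
DOC = """
<dataset>
  <creator id="person-1">
    <individualName><surName>Smith</surName></individualName>
  </creator>
  <contact>
    <references>person-1</references>
  </contact>
</dataset>
"""

def resolve_references(root):
    """Return (referring_tag, target_element) pairs for each references child."""
    # Index every element that carries an id attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    resolved = []
    for parent in root.iter():
        ref = parent.find("references")  # direct child only
        if ref is None:
            continue
        key = (ref.text or "").strip()
        if key not in by_id:
            raise KeyError(f"<{parent.tag}> references unknown id {key!r}")
        resolved.append((parent.tag, by_id[key]))
    return resolved

root = ET.fromstring(DOC)
for tag, target in resolve_references(root):
    print(tag, "->", target.tag, target.findtext("individualName/surName"))
```

The lookup table is only practical because ids are limited to a subset of elements; allowing them everywhere would make the index and the set of possible targets much larger.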

@cboettig Can you give examples of where the id attribute is used without system and scope?

cboettig commented on August 17, 2024

Great, that makes sense. It also makes sense that the scope and system attributes should be available whenever you have an id, so I think they are just missing from the documentation occasionally, e.g.
http://knb.ecoinformatics.org/software/eml/eml-2.1.1/eml-attribute.html#AttributeListType

Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/

mbjones commented on August 17, 2024

They are not missing from the documentation, as the documentation is automatically generated from the schema. I checked the schema, and they are indeed missing in several places. Now I'm questioning whether we had a reason for that (e.g., those ids were only meant to have document scope), or whether it was a mistake. It would require some sleuthing through old email list archives to determine if it was intentional. It's probably not a high priority for me unless it causes problems and needs revision in the spec.
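
A check like the one described could be sketched in Python as follows. This is an assumption-laden illustration, not the procedure actually used: the schema fragment and its type names are hypothetical, and the scan only looks at xs:attribute declarations written directly inside each complex type, so attributes contributed through attribute groups or base types would be missed.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment (NOT taken from the EML schemas).
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="CompleteType">
    <xs:attribute name="id" type="xs:string"/>
    <xs:attribute name="system" type="xs:string"/>
    <xs:attribute name="scope" type="xs:string"/>
  </xs:complexType>
  <xs:complexType name="IdOnlyType">
    <xs:attribute name="id" type="xs:string"/>
  </xs:complexType>
  <xs:complexType name="NoAttributesType"/>
</xs:schema>
"""

def types_with_bare_id(schema):
    """Map each complex type that declares id to the companions it lacks."""
    report = {}
    for ctype in schema.iter(f"{{{XS}}}complexType"):
        declared = {a.get("name") for a in ctype.iter(f"{{{XS}}}attribute")}
        if "id" not in declared:
            continue  # types without id are not of interest here
        missing = sorted({"system", "scope"} - declared)
        if missing:
            report[ctype.get("name")] = missing
    return report

schema = ET.fromstring(XSD)
print(types_with_bare_id(schema))
```

Running the same scan over each real .xsd file would give a list of candidate types to compare against the mailing list history.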
