WMO Core Metadata Profile 2
- View drafts: https://wmo-im.github.io/wcmp2
Home Page: https://wmo-im.github.io/wcmp2
The schema reference in section 5.3 seems to be wrong.
Change descriptions like:
"A WCMP record has a properties.description property, which is a free-text summary description of the dataset."
to
"The description property is a free text summary of the dataset."
A clear definition of what is a requirement and what is a recommendation (or even a permission) is currently missing from the document, although both are widely used. Clear and distinct terminology simplifies uptake of the document and reduces misinterpretation.
As part of the next version of WCMP, we need to identify user stories that capture goals and acceptance criteria. The user stories will capture what a user does/needs as part of their work, and will help clarify user needs as we continue to assess metadata standards.
Is this the unique id of the dataset or metadata record?
"A WCMP record utilizes the OARec id property to provide a unique identifier to a dataset. A record identifier is essential for querying and identifying records within the GDC."
This is becoming very overloaded, but given there is no other proper location, I assume that e.g. links to WIGOS representations in OSCAR would go here as well? How would these be easily identified?
Summary and Purpose
Clause-6 Overview
GitHub Branch
https://github.com/wmo-im/wcmp2/tree/Clause-6
It looks like this section describes best practices for representing granularity in 7.1.9 Themes; it doesn't provide information about the structure of the standard. Should it be moved into 7.1.9?
@wmo-im/tt-wismd as discussed, this issue to track comments and change requests for 2.0.0-alpha2.
The document is attached here for access: wcmp2.pdf
The section on granularity feels incomplete and, in general, needs more explanation.
Are we conflating the concepts of scope and granularity? I think scope of a WCMP2 record is a dataset and the scope of a WMDR record can be station, instrument or observation(s). Whereas, granularity is how a dataset is represented and the level of detail about a dataset in the WCMP2 record.
Make this field optional or add guidance about how to use when the record is first created and doesn't have an update yet.
A WCMP record has a properties.recordUpdated property, which describes the date that the record was last updated.
"recordUpdated": "2022-06-12T18:52:39Z"
Requirement 12 | /req/core/record_update_date |
A | A WCMP record SHALL provide a properties.recordUpdated property. |
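To make the requirement above concrete, here is a minimal sketch of how a validator might test it. The function name check_record_updated is hypothetical and is not taken from pywcmp or the WCMP2 test suite; the sketch only assumes the record is a parsed JSON object shaped like the example above.

```python
from datetime import datetime


def check_record_updated(record: dict) -> bool:
    """Return True if properties.recordUpdated exists and parses as an ISO 8601 value.

    Hypothetical helper for illustration only.
    """
    value = record.get("properties", {}).get("recordUpdated")
    if not isinstance(value, str):
        return False
    try:
        # Older Python versions do not accept a trailing "Z" in fromisoformat(),
        # so it is rewritten as an explicit UTC offset first.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return False
    return True
```

A record that has just been created and has no update yet would fail this check, which is the case the comment above asks to have guidance for.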
@wmo-im/tt-wismd to assess and evaluate options for the future of WCMP against established criteria of requirements, e.g.:
As TT-WISMD, we need to put forth next steps in realizing discovery in alignment with WIS 2.0 and current state.
Is this integrated in the temporal extent/data policy elements or would it be possible to identify this in an easier approach? Or out of scope?
@wmo-im/tt-wismd @wmo-im/tt-wigosmd @joergklausen @steingod @thineshsornalingam
Consider a WCMP2 record describing, say, surface weather observations from a network of stations/facilities. In WCMP2, we are able to enumerate 1..n links, i.e:
(NOTE: relation types are examples/for brainstorming)
{
"rel": "http://www.wmo.int/def/rel/wmdr/1.0/ObservingFacility",
"type": "application/xml",
"title": "Station report for Thessaloniki (Greece)",
"href": "https://oscar.wmo.int/surface/rest/api/wmd/download/0-20008-0-THE"
}
In the above, a user can discover a dataset collection and determine from the link relation (rel) the associated station(s) providing the dataset. Now imagine an observing network with 800 stations.
WCMP2 requires an approach whereby a data provider can provide a facility set as a single link in order to scale.
{
"rel": "http://www.wmo.int/def/rel/wmdr/1.0/FacilitySet",
"type": "application/xml",
"title": "Station report for facility set 123",
"href": "TBD"
}
Questions:
Aside: while the above examples use the OSCAR/Surface API, in theory the value of the href is opaque here, so long as a WMDR facility set is provided in the response.
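As a rough illustration of how a client could use the link relation to find the associated facilities, the sketch below groups link hrefs by rel. The helper name links_by_rel is hypothetical, and the relation URIs are the brainstorming examples from above, not agreed values.

```python
from collections import defaultdict


def links_by_rel(record: dict) -> dict:
    """Group the href of every link in a record by its rel value (illustrative helper)."""
    grouped = defaultdict(list)
    for link in record.get("links", []):
        grouped[link.get("rel")].append(link.get("href"))
    return dict(grouped)


# Example relation type from the discussion above (for brainstorming only).
FACILITY_REL = "http://www.wmo.int/def/rel/wmdr/1.0/ObservingFacility"

record = {
    "links": [
        {
            "rel": FACILITY_REL,
            "type": "application/xml",
            "title": "Station report for Thessaloniki (Greece)",
            "href": "https://oscar.wmo.int/surface/rest/api/wmd/download/0-20008-0-THE",
        }
    ]
}

facility_hrefs = links_by_rel(record).get(FACILITY_REL, [])
```

With one link per station, facility_hrefs grows to 800 entries for the network described above; a facility set relation would keep it at one.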
Looking forward to this important discussion.
In https://github.com/wmo-im/wcmp2/blob/main/schema/wcmpRecordGeoJSON.yaml#L132, a colon is needed after required, so it should read required:
Verify that there are two types of "type".
The current OARec record model contains the following updates (per opengeospatial/ogcapi-records#168):
properties.extent has been removed from the model as follows:
- spatial has been removed in favour of using the native GeoJSON geometry object
- temporal has been removed in favour of a time object (at the same level as the geometry object)
Example snippet:
"geometry": {
"type": "Polygon",
"coordinates": [
[ [ -142.012, 28.034 ],
[ -142.012, 82.4567 ],
[ -52.0876, 82.4567 ],
[ -52.0876, 28.034 ],
[ -142.012, 28.034 ]
]
]
},
"time": "2021-10-30"
Options:
resolution property)
properties.extent object as specific to WCMP2
properties.extent object, using only temporal
properties.extent object as OPTIONAL, for those wishing to provide additional extents, with the baseline being geometry as WGS84 and time as Gregorian for uniform/unified search/present/evaluate workflow.
@wmo-im/tt-wismd looking forward to inputs and recommendations.
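For context on what is lost or kept when properties.extent.spatial is dropped, the following sketch derives a bounding box from the native GeoJSON geometry in the example snippet above. The function name bbox_from_polygon is hypothetical, and the sketch assumes a Polygon whose first ring is the exterior ring, as in GeoJSON.

```python
def bbox_from_polygon(geometry: dict) -> tuple:
    """Return (min_lon, min_lat, max_lon, max_lat) for a GeoJSON Polygon.

    Illustrative only: uses the exterior ring and ignores antimeridian crossing.
    """
    if geometry.get("type") != "Polygon":
        raise ValueError("this sketch only handles Polygon geometries")
    exterior_ring = geometry["coordinates"][0]
    lons = [position[0] for position in exterior_ring]
    lats = [position[1] for position in exterior_ring]
    return (min(lons), min(lats), max(lons), max(lats))


geometry = {
    "type": "Polygon",
    "coordinates": [
        [
            [-142.012, 28.034],
            [-142.012, 82.4567],
            [-52.0876, 82.4567],
            [-52.0876, 28.034],
            [-142.012, 28.034],
        ]
    ],
}

bbox = bbox_from_polygon(geometry)
```

This suggests a search client can still obtain a simple spatial extent from geometry alone, which is relevant when weighing the options listed above.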
As discussed at TT-WISMD 2022-07-13, WCMP2 provides an informative section on granularity. We need normative text in the form of recommendations for clause 7.
@wmo-im/tt-wismd
WCMP2 providers have a role property that should be defined by a controlled vocabulary in the WCMP2 codelists.
cc @steingod
Currently, WMO topic hierarchy is expressed as a theme/concept as follows:
{
"concepts": [
{
"id": "wis2/CAN/eccc-msc/data/core/weather/surface-based-observations"
}
],
"scheme": "https://github.com/wmo-im/wis2-topic-hierarchy"
}
As discussed during TT-WISMD 2022-12-16, given 1..n concepts are allowed, the model potentially allows for 1..n topics to be applied to a dataset. We need to apply one topic to a dataset.
Should we strongly define the topic hierarchy with properties.wmo:topicHierarchy (single string value)? This would eliminate any ambiguity, and remove the need to define the scheme (which would be implied). Example:
"properties": {
"wmo:topicHierarchy": "wis2/CAN/eccc-msc/data/core/weather/surface-based-observations"
}
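The difference between the two representations can be sketched from a consumer's point of view. The helper topic_of is hypothetical, and it assumes the theme object shown earlier lives in a properties.themes array; with the theme/concept form the consumer has to filter by scheme and reject anything other than exactly one concept, while the dedicated property is a direct lookup.

```python
TOPIC_SCHEME = "https://github.com/wmo-im/wis2-topic-hierarchy"


def topic_of(record: dict) -> str:
    """Return the single topic of a record (illustrative helper, not part of WCMP2)."""
    properties = record.get("properties", {})

    # Proposed form: one unambiguous string value.
    if "wmo:topicHierarchy" in properties:
        return properties["wmo:topicHierarchy"]

    # Current form: collect concepts from themes that use the topic hierarchy scheme.
    topics = [
        concept["id"]
        for theme in properties.get("themes", [])
        if theme.get("scheme") == TOPIC_SCHEME
        for concept in theme.get("concepts", [])
    ]
    if len(topics) != 1:
        raise ValueError(f"expected exactly one topic, found {len(topics)}")
    return topics[0]
```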
cc @wmo-im/tt-wismd
Move XML examples to after the tables of requirements and recommendations.
example:
A WCMP record’s properties.title property is a human-readable name for a given dataset.
Requirement 5 | /req/core/title |
A | A WCMP record SHALL provide a properties.title property. |
"title": "Surface weather observations"
@wmo-im/tt-wismd : while implementing the WCMP2 ETS in pywcmp, the following was flagged/noticed:
A WCMP2 record's id is defined by using some of the same constructs as properties.wmo:topicHierarchy. Example:
identifier:
"id": "urn:x-wmo:md:can:eccc-msc:observations.swob"
topic hierarchy:
"wmo:topicHierarchy": "origin/a/wis2/can/eccc-msc/data/core/weather"
Requirements 2C and 2D specify validating part of the identifier against the topic hierarchy for the country and centre-id components/values.
Does a WCMP2 record's identifier (id) need to have a format/convention, or can/should it be, say, a GUID?
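To show what validating the identifier against the topic hierarchy involves, here is a minimal sketch. The function id_matches_topic is hypothetical and is not the pywcmp implementation; the component positions are inferred only from the two example values above (country and centre-id follow urn:x-wmo:md in the identifier, and follow wis2 in the topic).

```python
def id_matches_topic(identifier: str, topic: str) -> bool:
    """Check that the country and centre-id in an id agree with the topic hierarchy.

    Illustrative only; positions are assumptions based on the examples in this issue.
    """
    id_parts = identifier.split(":")
    if len(id_parts) < 6 or id_parts[:3] != ["urn", "x-wmo", "md"]:
        return False

    topic_parts = topic.split("/")
    try:
        start = topic_parts.index("wis2")
    except ValueError:
        return False
    country_and_centre = topic_parts[start + 1:start + 3]
    if len(country_and_centre) != 2:
        return False

    # Compare case-insensitively, since both "CAN" and "can" appear in examples.
    return [p.lower() for p in id_parts[3:5]] == [p.lower() for p in country_and_centre]
```

A GUID would always fail a check of this kind, so adopting opaque identifiers would mean dropping or reworking the requirements that tie id to the topic hierarchy.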
Doesn't it make sense to make an explicit recommendation to use HTTPS? In our systems we disable harvesting of information from services that are straight HTTP for security reasons.
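If such a recommendation were adopted, a harvester could flag offending links along the following lines. This is a sketch only; the helper name plain_http_links is hypothetical, and it deliberately flags only the http scheme so that other protocols (for example mqtts) are left alone.

```python
from urllib.parse import urlsplit


def plain_http_links(record: dict) -> list:
    """Return the href of every link that uses unencrypted HTTP (illustrative helper)."""
    return [
        link["href"]
        for link in record.get("links", [])
        if urlsplit(link.get("href", "")).scheme.lower() == "http"
    ]
```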
Clarify that recordCreated should be a single property holding the date of the last change (i.e., it is not repeatable and cannot be used to document a history of changes).
cc @steingod
@wmo-im/tt-wismd : we should develop a number of examples to help users navigate WCMP 2. Can each TT member produce 2 examples.
Example records should be placed in https://github.com/wmo-im/wcmp2/tree/main/examples
Summary and Purpose
Clause-8
GitHub Branch
https://github.com/wmo-im/wcmp2/tree/Clause-8
The mandatory Conformance Class for WCMP is:
• "WMO Core Metadata Profile Core": This conformance class inherits from OGC API - Records - Part 1: Core: Requirements Class: Record Core, which defines the requirements for a catalogue record. The requirements specified in the Requirements Class "Record Core" are mandatory for all implementations of WCMP. The requirements are specified in more detail in Chapter 7 and in Annex A.2.
Provide a view of the standard in tables such that each property in a property object is listed including the conditionality and type of content allowed.
For example:
Property | Type | Requirement | Description |
---|---|---|---|
id | string | required | A unique identifier to the dataset |
type | string (code) | required | A fixed value denoting the record as a GeoJSON Feature |
conformsTo | string (URL) | required | The version of WCMP that the record conforms to |
geometry | object | required | Geospatial location associated with the dataset, in a geographic coordinate reference system |
time | object | required | Temporal extent associated with a dataset |
additionalExtents | object | optional | |
links | object | required | Online linkages to data retrieval or additional resources associated with the dataset |
Property | Type | Requirement | Description |
---|---|---|---|
geometry | object | required | Geospatial location associated with the dataset, in a geographic coordinate reference system |
type | string (code) | required | point, polygon,...? |
coordinates | array (numbers) | required | |
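One benefit of a tabular view is that the conditionality column translates directly into a check. The sketch below uses only the top-level rows marked required in the example table above; the names REQUIRED_TOP_LEVEL and missing_required are hypothetical, and the list reflects this proposal rather than the final standard.

```python
# Top-level properties marked "required" in the example table (proposal only).
REQUIRED_TOP_LEVEL = ("id", "type", "conformsTo", "geometry", "time", "links")


def missing_required(record: dict) -> list:
    """Return the required top-level property names that are absent from a record."""
    return [name for name in REQUIRED_TOP_LEVEL if name not in record]
```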
Change all references of WCMP to WCMP2? Or use WCMP version 2 in the introduction and keep WCMP as the short name throughout the document?
I think three types of contact should be required: a point of contact for the distribution of the dataset, a metadata contact for the creation of the record, and the originator of the dataset. In my experience from working at an archive, however, the originator is often no longer "in charge" of the dataset and was often included without contact information.
Reference
https://github.com/wmo-im/wcmp2/blob/main/standard/requirements/core/REQ_providers.adoc
current REQ. B: "A WCMP record properties.providers property SHALL provide at least TWO providers (as multiple provider objects or a single provider object with multiple roles) based on the metadata point of contact and the originator of the data. Providers are defined as either a URI or inline."
The OGC MetOceanDWG is working on a discussion paper on PubSub in OGC API standards. AsyncAPI is a key component of the recommendations of the discussion paper. It is hoped that the discussion paper will result in efforts for an OGC API - PubSub standard.
AsyncAPI is a PubSub complement to OpenAPI and allows for abstraction of PubSub protocols (MQTT, AMQP, etc.). AsyncAPI abstracts the MQTT term "topic" as "channel", in support of multiple PubSub protocols. See https://www.asyncapi.com/docs/concepts/channel for more information.
In WCMP2, we provide link objects with an optional wmo:topic property to be able to convey the MQTT topic that is related to the dataset.
To align with AsyncAPI as well as the directions in OGC future plans and broader interoperability, propose to change as follows:
Current:
{
"rel": "OASIS:MQTT",
"href": "mqtts://example.org",
"wmo:topic": "cache/a/wis2/can/eccc-msc/data/core/weather/observations/surface-land/landFixed",
"type": "application/json",
"title": "Data notifications"
}
Proposed:
{
"rel": "OASIS:MQTT",
"href": "mqtts://example.org",
"channel": "cache/a/wis2/can/eccc-msc/data/core/weather/observations/surface-land/landFixed",
"type": "application/json",
"title": "Data notifications"
}
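For existing records, the proposed change amounts to renaming one key in each link object. A minimal sketch of that migration follows; the function name to_channel is hypothetical, and it assumes links are plain JSON objects as in the examples above.

```python
def to_channel(link: dict) -> dict:
    """Return a copy of a link object with "wmo:topic" renamed to "channel".

    Illustrative only: key order is preserved and the input is not modified.
    """
    return {
        ("channel" if key == "wmo:topic" else key): value
        for key, value in link.items()
    }
```

Links without a wmo:topic property pass through unchanged, so the function can be applied to every link in a record.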
@jsieland @josusky @gaubert @Amienshxq @solson-nws comments?
It would be useful to the reader if there was an explanation on what a "property" is. Sentences like this may be confusing for the reader: "A WCMP record SHALL provide a 'properties.providers' property".
WCMP2 (via OARec) provides both keywords and themes/concepts properties. These are both used as (optional) catalogue queryables. The general idea is that keywords provides a list of free-form terms/tags, whereas themes/concepts provides terms from controlled vocabularies.
It may be challenging for users to know which to use for which purpose, so we should make clear which to use and when/for what purpose.
Summary and Purpose
As part of our efforts on the next version of WCMP (#101), we need to perform an analysis and inventory of metadata and search standards used in various portals/websites in the earth science domain.
Proposal
@wmo-im/tt-wismd members to provide examples of search portals/website in their respective countries (and/or other countries if they are involved/aware).
Reason
Having a better idea of the use of search and metadata standards will help in the definition of future WCMP as well as search standards for WIS 2.0.
Informal feedback from KNMI (2022-08-08):
Should a WCMP record not have room for some sort of quality indication? Maybe you have discussed this already at length, but it would make it easier, for instance, to distinguish between quality-controlled datasets from NMHCs and real-time observation data or third-party data, or between different climate datasets.
Past/relevant data quality packages:
Examples of above (at least for schema bundling, which triggers validation):
Aligning with WIS 2 Manual Guide (https://wmo-im.github.io/wis2-guide/guide/wis2-guide-DRAFT.html#_why_are_datasets_so_important)
Summary and Purpose
Clause-10
GitHub Branch
https://github.com/wmo-im/wcmp2/tree/Clause-10
Summary and Purpose
Clause-0
GitHub Branch
https://github.com/wmo-im/wcmp2/tree/Clause-0
I think we should remove services in explanatory text since this standard is focused on describing data as defined in the WMO data policy (Resolution 1). Alternatively, we can rephrase, e.g., "data and their services".
--
In order to provide discovery metadata of value, it is important to clarify the granularity levels of which providers are to provide describing their data/services. Articulating the level of granularity will reduce catalogue "pollution" and bring the user closer to the data via their search criteria.
7.1.6. Properties Type
WCMP records provide descriptive information about different resource types, such as dataset or services.
Summary and Purpose
Clause-7
GitHub Branch
https://github.com/wmo-im/wcmp2/tree/Clause-7
The document links to https://www.w3.org/TR/dwbp/, but would it make sense to provide a clear mapping between section 8.2 there, which recommends the following descriptive metadata fields:
and the current WCMP 2.0 model? The elements above are easier to understand for most, as they map to clearly defined features of the discovery metadata. This model is sometimes hard to identify in the current draft document. In combination with links in Table 2 in section 7, this would improve guidance to users.
Informal feedback from KNMI (2022-08-08):
7.1.12 Digital Object Identifier => I understand what is meant, but it may be better to avoid "DOI" in this case, as you want to leave room for other persistent identifiers, such as Handle or ARK. So I would suggest calling this "Persistent Identifier" or "Resolvable Persistent Identifier", and then mentioning DOI as one of the examples.
We need a GitHub Action to build the pdf and docx representations on every push to the repository in standard/**
The resulting files should be in:
cc @amilan17
While the dataset is the critical path for WCMP2, some users will create child documents, and will require the ability to relate parent/child documents. One option is to consider link relations (up, related, collection, item, etc.) to associate records.
cc @steingod
A WCMP2 release will create the following artifacts: