
wcmp2's Introduction

wcmp2's People

Contributors

amienshxq, amilan17, antje-s, d1mach, gaubert, josusky, jsieland, solson-nws, tomkralidis


wcmp2's Issues

simplify descriptions of conformance classes

Change descriptions like:
"A WCMP record has a properties.description property, which is a free-text summary description of the dataset."
to
"The description property is a free text summary of the dataset."

Clarification of terminology used

A clear definition of what is a requirement and what is a recommendation (or even permission) in the document is currently missing although both are widely used. A clear and distinct clarification of terminology used simplifies uptake of the document and reduces misinterpretations.

Identifier question

Is this the unique id of the dataset or metadata record?

"A WCMP record utilizes the OARec id property to provide a unique identifier to a dataset. A record identifier is essential for querying and identifying records within the GDC."

links element

This is becoming very overloaded, but given there is no other proper location, I assume that e.g. links to WIGOS representations in OSCAR would go here as well? How would these be easily identified?

question on 7.1.2 Granularity section

It looks like this section describes best practices for representing granularity in 7.1.9 Themes; it doesn't provide information about the structure of the standard. Should it be moved into 7.1.9?

Granularity section

  1. The section on granularity feels incomplete and, in general, needs more explanation.

  2. Are we conflating the concepts of scope and granularity? I think scope of a WCMP2 record is a dataset and the scope of a WMDR record can be station, instrument or observation(s). Whereas, granularity is how a dataset is represented and the level of detail about a dataset in the WCMP2 record.

Consider making record update date optional

Make this field optional, or add guidance about how to use it when the record is first created and doesn't have an update yet.

---

7.1.16. Record Update Date

A WCMP record has a properties.recordUpdated property, which describes the date that the record was last updated.

"recordUpdated": "2022-06-12T18:52:39Z"
Requirement 12: /req/core/record_update_date
A: A WCMP record SHALL provide a properties.recordUpdated property.
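
One possible form of the requested guidance is to set recordUpdated to the creation time when a record is first written, and to overwrite it on every later edit. The sketch below illustrates that idea only; the function name `stamp_record` is hypothetical and the draft does not prescribe this behaviour.

```python
from datetime import datetime, timezone


def stamp_record(record, now=None):
    """Set properties.recordUpdated to a UTC timestamp with a 'Z' suffix.

    Hypothetical helper: calling it at creation time means a brand-new
    record already carries a valid recordUpdated value.
    """
    # Normalise to UTC so the trailing 'Z' is always accurate.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    properties = record.setdefault("properties", {})
    properties["recordUpdated"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return record


new_record = stamp_record({"id": "example-id", "type": "Feature"})
```

If the field were made optional instead, a helper like this would simply not be called at creation time.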

review and provide recommendation on future of WCMP

Summary and Purpose

WCMP Status

  • WCMP has been in existence since the creation of WIS (~2012)
  • WCMP provides:
    • a profile based on ISO 19115/19139 (i.e. not a new model or schema)
    • an abstract test suite to help determine compliance
    • a dedicated set of custom codelists (some extending ISO 19115, some new/dedicated to WMO)
  • KPIs have been implemented in support of WCMP quality assessment for GISCs

WIS

  • WIS 1.0 has been operational since ~2012
  • WIS 2.0 principles (summary as part of initial WIS 2.0 search pilot project)
    • lower the barrier to entry
    • FAIR data principles
    • Web architecture/hypermedia
    • webby/of the web
    • search engine friendly
    • the browser is the search engine
    • schema.org

Current landscape

  • XML is slowly being replaced by JSON (of the web, tight coupling with programming language)
  • mass market integration is a priority
  • cross pollination with other information communities (extending the reach)
  • Service-oriented architecture (SOA) -> Resource-oriented architecture (ROA)
  • other organizations such as OGC are undergoing efforts to move to modernized approaches (e.g. OGC API)

Proposal

@wmo-im/tt-wismd to assess and evaluate options for the future of WCMP against established criteria of requirements, e.g.:

  • do nothing, use existing WCMP 1.3
  • move to ISO 19115-3
  • consider other metadata standards (DCAT2, OGC API - Records, etc.)

Criteria

  • WIS 2.0 alignment
  • content
  • migration
  • backwards compatibility
  • change management
  • interoperability
  • cross pollination with other communities
  • timelines for WMO endorsement

Reason

As TT-WISMD, we need to put forth next steps in realizing discovery in alignment with WIS 2.0 and current state.

cc @6a6d74 @joergklausen

WIGOS metadata integration: referencing a set of stations in a WCMP2 record based on a network of stations

@wmo-im/tt-wismd @wmo-im/tt-wigosmd @joergklausen @steingod @thineshsornalingam

Consider a WCMP2 record describing, say, surface weather observations from a network of stations/facilities. In WCMP2, we are able to enumerate 1..n links, i.e:

(NOTE: relation types are examples/for brainstorming)

{
    "rel": "http://www.wmo.int/def/rel/wmdr/1.0/ObservingFacility",
    "type": "application/xml",
    "title": "Station report for Thessaloniki (Greece)",
    "href": "https://oscar.wmo.int/surface/rest/api/wmd/download/0-20008-0-THE"
}

In the above, a user can discover a dataset collection and determine from the link relation (rel) the associated station(s) providing the dataset. Now imagine an observing network with 800 stations.

WCMP2 requires an approach whereby a data provider can provide a facility set as a single link in order to scale.

{
    "rel": "http://www.wmo.int/def/rel/wmdr/1.0/FacilitySet",
    "type": "application/xml",
    "title": "Station report for facility set 123",
    "href": "TBD"
}

Questions:

  1. what do folks think about this approach? This would require data providers to define such sets in OSCAR/Surface
  2. can the OSCAR/Surface API provide a link to a facility set?

Aside: while the above examples use the OSCAR/Surface API, in theory the value of the href is opaque here, so long as a WMDR facility set is provided in the response.
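
To make the scaling argument concrete, here is a rough Python sketch of collapsing many per-station links into one facility set link. The two relation types are the brainstorming examples from this issue; the function name and the facility set URL are hypothetical placeholders, since the real href is still TBD.

```python
OBSERVING_FACILITY = "http://www.wmo.int/def/rel/wmdr/1.0/ObservingFacility"
FACILITY_SET = "http://www.wmo.int/def/rel/wmdr/1.0/FacilitySet"


def collapse_facility_links(links, set_href, set_title):
    """Replace all per-station links with a single facility set link."""
    others = [link for link in links if link.get("rel") != OBSERVING_FACILITY]
    if len(others) == len(links):
        return list(links)  # no station links present, nothing to collapse
    others.append({
        "rel": FACILITY_SET,
        "type": "application/xml",
        "title": set_title,
        "href": set_href,  # placeholder: the real value is TBD
    })
    return others


# 800 station links plus one unrelated link
station_links = [
    {"rel": OBSERVING_FACILITY, "href": f"https://example.org/station/{i}"}
    for i in range(800)
]
links = station_links + [{"rel": "license", "href": "https://example.org/license"}]
collapsed = collapse_facility_links(
    links, "https://example.org/facility-set/123", "Station report for facility set 123"
)
```

The record then carries two links instead of 801, and resolving the individual stations is left to whatever serves the facility set.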

Looking forward to this important discussion.

Type question

Verify that there are two types of "type".

  1. 7.1.1. Validation
    • "type" with value of "Feature"
  2. 7.1.5. Properties Type
    • "properties.type" with value of "Dataset" or "Service"
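
A small Python sketch of where the two values sit in a record, assuming the structure described in the two sections above. The helper name `record_types` is illustrative only.

```python
record = {
    "type": "Feature",        # GeoJSON type of the record itself
    "properties": {
        "type": "Dataset",    # kind of resource the record describes
    },
}


def record_types(rec):
    """Return (GeoJSON type, resource type) for a record."""
    return rec.get("type"), rec.get("properties", {}).get("type")


geojson_type, resource_type = record_types(record)
```

The two values live at different levels of the document, so they do not collide even though they share the key name "type".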

clarify extents against OARec record model

The current OARec record model contains the following updates (per opengeospatial/ogcapi-records#168):

  • properties.extent has been removed from the model as follows:
    • spatial has been removed in lieu of using the native GeoJSON geometry object
      • can be any geometry
      • coordinate reference system is fixed to WGS84
      • fixed/implied to a SINGLE geometry object
    • temporal has been removed in lieu of a time object (at the same level as the geometry object)
      • temporal reference system is fixed to Gregorian
      • there is no longer a resolution property

Example snippet:

    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [  [ -142.012, 28.034 ],
               [ -142.012, 82.4567 ],
               [ -52.0876, 82.4567 ],
               [ -52.0876, 28.034 ],
               [ -142.012, 28.034 ]
            ]
        ]
    },
    "time": "2021-10-30"
  • pros
    • smaller overall payload
    • less duplication
    • uses core GeoJSON geometry primitives
    • others?
  • cons
    • no ability to emit multiple geometries or times
    • reference systems are fixed/constrained/less flexible
    • representing a bbox in a GeoJSON geometry would be a Polygon (a bit more encoding)
    • others?
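
The "bit more encoding" needed for a bbox can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the vertex order simply mirrors the example snippet above.

```python
def bbox_to_polygon(minx, miny, maxx, maxy):
    """Encode a bounding box as a GeoJSON Polygon with one closed ring."""
    ring = [
        [minx, miny],
        [minx, maxy],
        [maxx, maxy],
        [maxx, miny],
        [minx, miny],  # a ring repeats its first position to close
    ]
    return {"type": "Polygon", "coordinates": [ring]}


geometry = bbox_to_polygon(-142.012, 28.034, -52.0876, 82.4567)
```

Four bbox numbers become five coordinate pairs, which is the extra payload mentioned in the cons list.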

Options:

  1. cover the most common use case and implement the OGC API - Records update (this option would still warrant the addition of a resolution property)
  2. keep the properties.extent object as specific to WCMP2
  3. keep the properties.extent object, using only temporal
  4. keep the properties.extent object as OPTIONAL, for those wishing to provide additional extents, with the baseline being geometry as WGS84 and time as Gregorian for uniform/unified search/present/evaluate workflow.
  5. other options?

@wmo-im/tt-wismd looking forward to inputs and recommendations.

clarify topic hierarchy

Currently, WMO topic hierarchy is expressed as a theme/concept as follows:

{
  "concepts": [
    {
      "id": "wis2/CAN/eccc-msc/data/core/weather/surface-based-observations"
    }
  ],
  "scheme": "https://github.com/wmo-im/wis2-topic-hierarchy"
}

As discussed during TT-WISMD 2022-12-16, given 1..n concepts are allowed, the model potentially allows for 1..n topics to be applied to a dataset. We need to apply one topic to a dataset.

Should we strongly define the topic hierarchy with properties.wmo:topicHierarchy (single string value)? This would eliminate any ambiguity, and remove the need to define the scheme (which would be implied). Example:

"properties": {
  
  "wmo:topicHierarchy": "wis2/CAN/eccc-msc/data/core/weather/surface-based-observations"

}
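
The ambiguity can be illustrated with a short Python sketch that reads the topic from either form. It assumes theme objects are stored in a `properties.themes` list, which is not shown in this issue, and the function name `get_topic` is hypothetical.

```python
SCHEME = "https://github.com/wmo-im/wis2-topic-hierarchy"


def get_topic(record):
    """Return the single topic of a record, or raise if it is ambiguous."""
    properties = record.get("properties", {})
    # Proposed form: one string, no scheme needed.
    if "wmo:topicHierarchy" in properties:
        return properties["wmo:topicHierarchy"]
    # Current form: collect every concept under the topic hierarchy scheme.
    topics = [
        concept["id"]
        for theme in properties.get("themes", [])
        if theme.get("scheme") == SCHEME
        for concept in theme.get("concepts", [])
    ]
    if len(topics) != 1:
        raise ValueError(f"expected exactly one topic, found {len(topics)}")
    return topics[0]
```

With the single string property, the length check disappears entirely because the model itself cannot hold more than one topic.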

cc @wmo-im/tt-wismd

Change order of content in property descriptions

Move XML examples to after the tables of requirements and recommendations. 

example:

7.1.7. Title

A WCMP record’s properties.title property is a human-readable name for a given dataset.

Requirement 5: /req/core/title
A: A WCMP record SHALL provide a properties.title property.
"title": "Surface weather observations"

validate identifier and topic hierarchy similarities

@wmo-im/tt-wismd : while implementing the WCMP2 ETS in pywcmp, the following was flagged/noticed:

A WCMP2 record's id is defined by using some of the same constructs of properties.wmo:topicHierarchy. Example:

identifier:

"id": "urn:x-wmo:md:can:eccc-msc:observations.swob"

topic hierarchy:

"wmo:topicHierarchy": "origin/a/wis2/can/eccc-msc/data/core/weather"

Requirements 2C and 2D specify validating part of the identifier against the topic hierarchy for the country and centre-id components/values.

Does a WCMP2's identifier (id) need to have a format/convention, or can/should it be, say, a GUID?
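
For reference, the comparison can be sketched in Python as below. The token positions are inferred only from the two examples in this issue (country and centre-id as the fourth and fifth tokens of both strings); they are an assumption, not the normative rule, and this is not the pywcmp implementation.

```python
def id_matches_topic(identifier, topic):
    """Check that country and centre-id agree between id and topic hierarchy."""
    id_parts = identifier.split(":")    # urn, x-wmo, md, country, centre-id, local id
    topic_parts = topic.split("/")      # origin, a, wis2, country, centre-id, ...
    if len(id_parts) < 5 or len(topic_parts) < 5:
        return False
    return id_parts[3:5] == topic_parts[3:5]


ok = id_matches_topic(
    "urn:x-wmo:md:can:eccc-msc:observations.swob",
    "origin/a/wis2/can/eccc-msc/data/core/weather",
)
```

A GUID-style identifier would fail this check by construction, so adopting GUIDs would mean dropping the cross-validation.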

Concerning section 5.5 Use of HTTPS

Doesn't it make sense to make an explicit recommendation to use HTTPS? In our systems we disable harvesting of information from services that are straight HTTP for security reasons.
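
A harvester-side check along these lines could look like the following Python sketch. It assumes links sit in a top-level `links` array with an `href` each; the function name is illustrative.

```python
from urllib.parse import urlparse


def insecure_links(record):
    """Return the hrefs of links that use plain HTTP."""
    return [
        link["href"]
        for link in record.get("links", [])
        if urlparse(link.get("href", "")).scheme == "http"
    ]


record = {
    "links": [
        {"href": "https://example.org/data"},
        {"href": "http://example.org/legacy"},
        {"href": "mqtts://example.org"},
    ]
}
flagged = insecure_links(record)
```

Only the plain `http` link is flagged; other schemes such as `mqtts` are left alone in this sketch.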

Remove "Core" from the mandatory conformance class name?

The mandatory Conformance Class for WCMP is:
• "WMO Core Metadata Profile Core": This conformance class inherits from OGC API — Records — Part 1: Core: Requirements Class: Record Core which defines the requirements for a catalogue record. The requirements specified in the Requirements Class “Record Core" are mandatory for all implementations of WMCP. The requirements are specified in Chapter 7 and in Annex A.2 in more detail.

table view of wcmp2

Provide a view of the standard in tables such that each property in a property object is listed including the conditionality and type of content allowed.
For example:

Property | Type | Requirement | Description
id | string | required | A unique identifier to the dataset
type | string (code) | required | A fixed value denoting the record as a GeoJSON Feature
conformsTo | string (URL) | required | The version of WCMP that the record conforms to
geometry | object | required | Geospatial location associated with the dataset, in a geographic coordinate reference system
time | object | required | Temporal extent associated with a dataset
additionalExtents | object | optional |
links | object | required | Online linkages to data retrieval or additional resources associated with the dataset

geometry object

Property | Type | Requirement | Description
geometry | object | required | Geospatial location associated with the dataset, in a geographic coordinate reference system
type | string (code) | required | point, polygon,...?
coordinates | array (numbers) | required |
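
A table view like this also translates directly into a simple check. The Python sketch below takes the required top-level properties from the example table above; the table is itself only a proposal, and the function name is hypothetical.

```python
# Top-level properties marked "required" in the example table.
REQUIRED_PROPERTIES = ("id", "type", "conformsTo", "geometry", "time", "links")


def missing_required(record):
    """List the required top-level properties absent from a record."""
    return [name for name in REQUIRED_PROPERTIES if name not in record]


record = {"id": "example-id", "type": "Feature", "links": []}
missing = missing_required(record)
```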

WCMP vs WCMP2

Change all references of WCMP to WCMP2? Or use WCMP version 2 in the introduction and keep WCMP as the short name throughout the document?

require 3 types of providers

I think three types of contacts should be required: a point of contact for the distribution of the dataset, a metadata contact for the creation of the record, and the originator of the dataset. However, in my experience from working at an archive, the originator is often no longer "in charge" of the dataset and was often included without contact information.

Reference

https://github.com/wmo-im/wcmp2/blob/main/standard/requirements/core/REQ_providers.adoc

current REQ. B: "A WCMP record properties.providers property SHALL provide at least TWO providers (as multiple provider objects or a single provider object with multiple roles) based on the metadata point of contact and the originator of the data. Providers are defined as either a URI or inline."
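
As a rough illustration of how the current requirement could be tested, here is a Python sketch that counts distinct roles across inline providers. The shape of a provider object (a `roles` list of strings) and the role names are assumptions made for this example; providers given as URIs are ignored here.

```python
def provider_roles(record):
    """Collect the distinct roles declared by inline provider objects."""
    roles = set()
    for provider in record.get("properties", {}).get("providers", []):
        if isinstance(provider, dict):  # URI-only providers carry no roles here
            roles.update(provider.get("roles", []))
    return roles


def has_min_roles(record, minimum=2):
    """True when at least `minimum` distinct roles are covered."""
    return len(provider_roles(record)) >= minimum


# One provider object with two roles satisfies the current wording.
record = {
    "properties": {
        "providers": [{"name": "Example Centre", "roles": ["pointOfContact", "originator"]}]
    }
}
```

Requiring three types of contacts would amount to calling the check with `minimum=3`, or checking for specific role names.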

change link wmo:topic to channel

The OGC MetOceanDWG is working on a discussion paper on PubSub in OGC API standards. AsyncAPI is a key component of the recommendations of the discussion paper. It is hoped that the discussion paper will result in efforts for an OGC API - PubSub standard.

AsyncAPI is a PubSub complement to OpenAPI and allows for abstraction of PubSub protocols (MQTT, AMQP, etc.). AsyncAPI abstracts the MQTT term "topic" as "channel", in support of multiple PubSub protocols. See https://www.asyncapi.com/docs/concepts/channel for more information.

In WCMP2, we provide link objects with an optional wmo:topic property to be able to convey the MQTT topic that is related to the dataset.

To align with AsyncAPI as well as the directions in OGC future plans and broader interoperability, propose to change as follows:

Current:

        {
            "rel": "OASIS:MQTT",
            "href": "mqtts://example.org",
            "wmo:topic": "cache/a/wis2/can/eccc-msc/data/core/weather/observations/surface-land/landFixed",
            "type": "application/json",
            "title": "Data notifications"
        }

Proposed:

        {
            "rel": "OASIS:MQTT",
            "href": "mqtts://example.org",
            "channel": "cache/a/wis2/can/eccc-msc/data/core/weather/observations/surface-land/landFixed",
            "type": "application/json",
            "title": "Data notifications"
        }
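
For existing records, the change amounts to a key rename on each link object. A minimal Python sketch, with a hypothetical function name:

```python
def migrate_link(link):
    """Return a copy of a link object with wmo:topic renamed to channel."""
    migrated = dict(link)  # leave the caller's object untouched
    if "wmo:topic" in migrated:
        migrated["channel"] = migrated.pop("wmo:topic")
    return migrated


current = {
    "rel": "OASIS:MQTT",
    "href": "mqtts://example.org",
    "wmo:topic": "cache/a/wis2/can/eccc-msc/data/core/weather/observations/surface-land/landFixed",
    "type": "application/json",
    "title": "Data notifications",
}
proposed = migrate_link(current)
```

Links without a `wmo:topic` property pass through unchanged.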

@jsieland @josusky @gaubert @Amienshxq @solson-nws comments?

"property" definition

It would be useful to the reader if there was an explanation on what a "property" is. Sentences like this may be confusing for the reader: "A WCMP record SHALL provide a 'properties.providers' property".

clarify keywords and themes/concepts

WCMP2 (via OARec) provides both keywords and themes/concepts properties. These are both used as (optional) catalogue queryables. The general idea is that keywords provides a list of free form terms/tags, whereas themes/concepts provide terms from controlled vocabularies.

It may be challenging for users to know which to use for which purpose, so we should make clear which to use and when/for what purpose.
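
The distinction can be pictured with a short Python sketch that separates free-form terms from controlled ones. It assumes both live under `properties` and that themes follow the concepts/scheme shape used for the topic hierarchy; the keyword values and the function name are made up for illustration.

```python
def split_terms(properties):
    """Return (free-form keywords, controlled (scheme, id) pairs)."""
    free_form = list(properties.get("keywords", []))
    controlled = [
        (theme.get("scheme"), concept["id"])
        for theme in properties.get("themes", [])
        for concept in theme.get("concepts", [])
    ]
    return free_form, controlled


properties = {
    "keywords": ["surface", "weather", "observations"],
    "themes": [
        {
            "concepts": [{"id": "wis2/CAN/eccc-msc/data/core/weather/surface-based-observations"}],
            "scheme": "https://github.com/wmo-im/wis2-topic-hierarchy",
        }
    ],
}
free_form, controlled = split_terms(properties)
```

Controlled terms always carry their scheme, which is what lets a catalogue match them exactly; keywords are plain strings with no such context.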

review discovery metadata and search standards

Summary and Purpose
As part of our efforts on the next version of WCMP (#101), we need to perform an analysis and inventory of metadata and search standards used in various portals/websites in the earth science domain.

Proposal
@wmo-im/tt-wismd members to provide examples of search portals/websites in their respective countries (and/or other countries if they are involved/aware).

Reason
Having a better idea of the use of search and metadata standards will help in the definition of future WCMP as well as search standards for WIS 2.0.

cc @efucile @amilan17 @joergklausen

add recommendations on data quality

Informal feedback from KNMI (2022-08-08):

Should a WCMP record not have room for some sort of quality indication? Maybe you have discussed this already at length, but it would make it easier, for instance, to distinguish between quality-controlled datasets from NMHC’s and real-time observation data or 3rd party data, or between different climate datasets.

Past/relevant data quality packages:

create all in one schema bundle and document validation

  • create GitHub Actions workflow that:
    • on pull request: validates schema and all examples
    • on push: all steps in pull request, and then auto builds all-in-one bundle, pushing to repository

Examples of above (at least for schema bundling, which triggers validation):

Remove references to services?

I think we should remove services in explanatory text since this standard is focused on describing data as defined in the WMO data policy (Resolution 1). Alternatively, we can rephrase, e.g. "data and their services".

-- 

6.1.3. Granularity

In order to provide discovery metadata of value, it is important to clarify the granularity levels of which providers are to provide describing their data/services. Articulating the level of granularity will reduce catalogue "pollution" and bring the user closer to the data via their search criteria.

7.1.6. Properties Type
WCMP records provide descriptive information about different resource types, such as dataset or services.

How https://www.w3.org/TR/dwbp/ is used in the specification

The document links to https://www.w3.org/TR/dwbp/, but would it make sense to make a clear mapping between section 8.2 there which recommends the following descriptive metadata fields:

  • The title and a description of the dataset.
  • The keywords describing the dataset.
  • The date of publication of the dataset.
  • The entity responsible (publisher) for making the dataset available.
  • The contact point for the dataset.
  • The spatial coverage of the dataset.
  • The temporal period that the dataset covers.
  • The date of last modification of the dataset.
  • The themes/categories covered by a dataset.
  • The title and a description of the distribution.
  • The date of publication of the distribution.
  • The media type of the distribution.

and the current WCMP 2.0 model? The elements above are easier to understand for most, as they map to clearly defined features of the discovery metadata. This model is sometimes hard to identify in the current draft document. In combination with links in Table 2 in section 7, this would improve guidance to users.

make DOI recommendation more abstract

Informal feedback from KNMI (2022-08-08):

7.1.12 Digital Object Identifier => I understand what is meant, but it may be better to avoid “DOI” in this case, as you want to leave room for other persistent identifiers, such as Handle or ARK. So I would suggest calling this “Persistent Identifier” or “Resolvable Persistent Identifier”, and then mentioning DOI as one of the examples.

clarify parent child relationships

While the dataset is the critical path for WCMP2, some users will create child documents, and will require the ability to relate parent/child documents. One option is to consider link relations (up, related, collection, item, etc.) to associate records.
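
As one way to picture the link relation option, the Python sketch below reads a child record's parent from an `up` link. The choice of `up` for this purpose, the example URL and the function name are assumptions for illustration.

```python
def parent_hrefs(record):
    """Return the hrefs of links that point at a parent record."""
    return [
        link["href"]
        for link in record.get("links", [])
        if link.get("rel") == "up"
    ]


child = {
    "id": "child-record",
    "links": [
        {"rel": "up", "href": "https://example.org/records/parent-record"},
        {"rel": "related", "href": "https://example.org/records/sibling-record"},
    ],
}
parents = parent_hrefs(child)
```

The same pattern would work in the other direction, with a parent listing its children through `item` links.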

cc @steingod
