inspire-eu-rdf / inspire-rdf-guidelines

INSPIRE data in RDF

Home Page: http://inspire-eu-rdf.github.io/inspire-rdf-guidelines/

inspire-rdf-guidelines's Introduction

The INSPIRE initiative aims at improving the sharing of spatial data and services between public authorities in Europe and in particular between the Member States and the European Institutions. INSPIRE addresses the interoperability of geospatial data sets and services for the exchange of data related to one of the 34 spatial data themes defined in the INSPIRE Directive. It does so through the creation of application schemas (using UML) and geospatial encoding mechanisms (using GML, GeoTIFF and other formats).

At the same time, e-Government applications and tools are starting to use the Linked Data paradigm, based on Semantic Web languages and technologies such as the Resource Description Framework (RDF). If INSPIRE data were available as Linked Data, these e-Government applications and tools could easily link to it. INSPIRE Linked Data has the potential to unlock new applications and services, not only for e-Government. The information contained in previously separate data stacks could easily be combined to derive new, useful information.

However, while the methodology for publishing INSPIRE data as GML, together with the corresponding schemas, is well defined, a methodology for publishing INSPIRE data as Linked Data still needs to be defined. The methodology needs to cover:

the creation of RDF vocabularies representing the INSPIRE data models and
the transformation of INSPIRE data into RDF

In a first step, the ARE3NA activity of the European Commission Joint Research Centre performed a study that identified recommendations for INSPIRE in RDF as well as a number of issues that have to be resolved in order to specify a common methodology.

In a second step, draft guidelines for representing data from INSPIRE in RDF and recommendations to solve open issues have been developed by ARE3NA. This repository contains the draft guidelines.

Draft RDF vocabularies for (selected) INSPIRE spatial object types based on the draft guidelines are made available in another repository.

The current draft guidelines and draft vocabularies have been tested and reviewed mainly through the execution of two pilot projects and need wider stakeholder review.

All identified issues or comments are documented in the issue tracker. The issue tracker provides an environment where we can discuss each issue and solve it together. If you are interested, please review the issues and provide feedback or create a new issue. To create or comment on an issue you will need a GitHub account. If you are an expert in the field of Linked Data or the Web and are interested in INSPIRE, your input is highly appreciated.

inspire-rdf-guidelines's Issues

RDF types for base types from ISO 19103 - types that map directly to XML Schema types

Description

INSPIRE application schemas use a number of classes from ISO/TS 19103:2005 as base types. In order to encode INSPIRE data in RDF, a mapping to an RDF implementation must be found for each of the ISO 19103 types that is used in INSPIRE.

These are (according to the Regulation 1089/2010, Annex I, Section 1(1)): Any, Angle, Area, Boolean, CharacterString, Date, DateTime, Decimal, Distance, Integer, Length, Measure, Number, Probability, Real, RecordType, Sign, UnitOfMeasure, Velocity and Volume.

For the types Boolean, CharacterString, Date, DateTime, Decimal, Integer, Number, and Real, we propose a direct mapping to the respective XML Schema types (xs:boolean, xs:string, xs:date, xs:dateTime, xs:decimal, xs:integer, and xs:double for both Number and Real).
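
As a minimal sketch of what such a direct mapping could look like in a vocabulary and in instance data, the following Turtle declares three datatype properties with XML Schema ranges and uses them on one resource. The ex: namespace and all property names are hypothetical and not part of any INSPIRE vocabulary; xsd: is the conventional Turtle prefix for the namespace written as xs: above.

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical properties whose UML value types are
# CharacterString, Date and Integer respectively.
ex:label         a owl:DatatypeProperty ; rdfs:range xsd:string .
ex:validFrom     a owl:DatatypeProperty ; rdfs:range xsd:date .
ex:numberOfUnits a owl:DatatypeProperty ; rdfs:range xsd:integer .

# Instance data using the mapped XML Schema types.
# A plain string literal is an xsd:string; a bare integer is an xsd:integer.
ex:object1
    ex:label         "Example object" ;
    ex:validFrom     "2016-01-08"^^xsd:date ;
    ex:numberOfUnits 12 .
```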

NOTE: the ISO 19103 types 'Sign' and 'RecordType' are only used in the INSPIRE types DirectedLink and Coverage. It appears sufficient to find mappings for these types once the respective topics (network and coverage models) are implemented.

NOTE: Mappings for the remaining types are covered in issues #10 and #11.

Discussion Item

Any objection to our proposal?

URIs for INSPIRE - 303 URIs vs. hash URIs

Description

INSPIRE application schemas model real-world phenomena as features. A feature provides a particular view upon a real-world phenomenon. Within the Linked Data domain, it is important to identify both the real-world phenomenon itself and representations that describe it.

The W3C note "Cool URIs for the Semantic Web" recommends the use of 303 URIs or Hash URIs to identify resources – the actual thing / real-world phenomenon as well as representations that describe it. Both solutions have their advantages and disadvantages. They can even be combined.

When transforming INSPIRE data into RDF, clear guidance is needed for the creation and assignment of URIs for spatial objects and the real-world phenomena they represent.
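
To make the two styles concrete, the sketch below shows how a hypothetical road could be identified with a hash URI and with a 303 URI. The example.org URIs and the ex:Road class are assumptions for illustration only; the HTTP behaviour is described in comments because it cannot be expressed in Turtle itself.

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Hash URI: the URI with the fragment identifies the real-world road,
# the URI without the fragment identifies the document describing it.
# A client dereferencing the hash URI retrieves that document directly.
<http://example.org/doc/road/42#this> a ex:Road ;
    foaf:isPrimaryTopicOf <http://example.org/doc/road/42> .

# 303 URI: the real-world road has its own URI without a fragment.
# A GET on http://example.org/id/road/43 is answered with
# "303 See Other" and a Location header pointing to the document.
<http://example.org/id/road/43> a ex:Road ;
    foaf:isPrimaryTopicOf <http://example.org/doc/road/43> .
```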

Discussion Item

If you have been involved in the development of a system that publishes RDF data: with which particular URI style (303 URIs and/or Hash URIs, or even an entirely different approach) does it publish data, and what were the reasons for choosing that URI style? In other words, can you give recommendations on when/why a specific URI style is preferable?

Reuse of common Linked Data Vocabularies

Description

For several feature attributes and classes in INSPIRE application schemas, commonly used properties and classes from existing RDF vocabularies should be reused. Whenever the semantics of such a property match those of a feature attribute, the existing property should be used instead of defining a new one. The same applies to classes. This requires review to ensure that the use of items from other vocabularies is appropriate.

Example: prov:generatedAtTime / prov:invalidatedAtTime from PROV-O are good candidates for beginLifespanVersion / endLifespanVersion from the INSPIRE application schemas.
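
A minimal sketch of this kind of direct reuse, assuming a hypothetical versioned spatial object under example.org; the dates are invented for illustration:

```turtle
@prefix ex:   <http://example.org/data/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# One version of a spatial object. The PROV-O properties stand in for
# beginLifespanVersion and endLifespanVersion.
ex:parcel-123-v1
    prov:generatedAtTime   "2010-03-01T00:00:00Z"^^xsd:dateTime ;
    prov:invalidatedAtTime "2014-06-15T00:00:00Z"^^xsd:dateTime .
```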

We should identify vocabularies that are suited to represent specific elements of INSPIRE application schemas in RDF.

NOTE1: The following vocabularies are potential candidates: FOAF, SKOS, DCAT, VoID, ORG, CUBE, SSN, PROV-O, ISA core

NOTE2: This issue is different to the issues RDF types for base types from ISO 19103 (#9, #10, #11) and RDF types and properties for INSPIRE foundation schemas (#12): here we are concerned about finding suitable matches for types and properties that are defined by the INSPIRE schemas themselves, not by the schemas that are imported by the INSPIRE schemas.

Also, it is not entirely clear if direct reuse of external vocabularies is preferable to an explicit alignment, for example via subClassOf or equivalentClass axioms.
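
The alignment alternative could be sketched as follows. The ex: namespace, the ex:RelatedParty class and its alignment with foaf:Agent are hypothetical and serve only to show the shape of the axioms:

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Keep an INSPIRE-specific property and align it with the external one,
# instead of using prov:generatedAtTime directly in the data.
ex:beginLifespanVersion a owl:DatatypeProperty ;
    rdfs:subPropertyOf prov:generatedAtTime .

# The same pattern for classes, here with a subclass axiom.
ex:RelatedParty a owl:Class ;
    rdfs:subClassOf foaf:Agent .
```

With this pattern, replacing an external vocabulary later only requires changing the alignment axioms, not the published instance data.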

NOTE3: Ideally, the chosen methodology would be flexible regarding which particular vocabulary to reuse for the representation of a given INSPIRE application schema element (class or property), especially when considering the future evolution of INSPIRE RDF schema. In the future, it may be desirable to replace certain external vocabularies with more appropriate ones. The methodology should support this.

Voidable properties

Description

The voidable concept used in INSPIRE is in some ways interesting as it allows explicitly stating that something, for example the name of a road, is not known (no value, but reason 'unknown' is provided) and distinguishes this from stating that a road is known to have no name (no value and no specific reason). I.e., the INSPIRE application schemas, although generally based on the closed-world assumption, support unknown facts.

In RDF not stating a property is equivalent to setting the property to nil in the GML encoding.
RDF has no proper mechanism (that we are aware of) to explicitly state that the name of a road is unknown (to some person(s) or organizations).

NOTE: RDF adheres to the open world assumption which takes into account that, in this example, someone else may have a name for the road.

Thus, without a natural way to express such facts in an INSPIRE RDF representation, the RDF representation will state that no road name is known. While this is a loss of information, it probably is not essential for most applications.

The aim (of developing guidelines for representing INSPIRE data in RDF) should be to provide the means to share all the (positive) information a data provider has. If a certain piece of information is not available at a data provider, chances are that they simply don’t know it. The case where the absence of data is explicitly known should be quite rare.

In summary, we propose to not add a schema conversion rule for <<voidable>>, because there does not seem to be a need for it.
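
The consequence of omitting a conversion rule can be sketched with two hypothetical roads (the ex: namespace, ex:Road and ex:name are assumptions for illustration):

```turtle
@prefix ex: <http://example.org/vocab#> .

# The data provider knows the name: the property is stated.
<http://example.org/id/road/1> a ex:Road ;
    ex:name "Main Street"@en .

# Name unknown to the provider, or road known to have no name:
# in both cases the property is simply omitted, so the two cases
# can no longer be told apart in the RDF representation.
<http://example.org/id/road/2> a ex:Road .
```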

NOTE: if it turned out that cardinality restrictions have to be represented (see issue #18), then all voidable properties would have to include the default minimum cardinality 0.

URI Scheme for RDF namespaces – support in conceptual model

Description

Issues #2 and #3 are concerned with a URI scheme for the namespace of INSPIRE RDF schemas. The INSPIRE conceptual model might not contain all the information needed to automatically create an RDF namespace (with the selected URI scheme) for a given INSPIRE application schema.

Discussion Item

If information is missing in the conceptual model, should new tagged values be introduced to convey it?

Purpose of AddressLocator.level

I do not see the need for the AddressLocator.level attribute (in the original INSPIRE specification).

The level is explained as follows in the INSPIRE data spec:

The locator “level” attribute classifies the level of detail expressed by this locator. The locator level will allow a better understanding and a comparison between addresses and address locators from different countries and regions. For example: in The Netherlands an address number identifies a dwelling or business unit inside a building, while in many other countries an address number is assigned to a building.
The locator level could also express that the locator identifies a dedicated postal delivery point like e.g. a P.O. Box

and its values are: access level, postal delivery point, site level and unit level

However, I see the same level of granularity (or finer) in LocatorDesignator.type and LocatorName.type, which are properties of the AddressLocator.

Can you provide an example that shows the use and need of AddressLocator.level?

Evaluation of existing ontologies – rules for cases where multiple candidate ontologies exist

Description

In #5 we are looking for criteria / methods / rules to decide if an existing ontology is appropriate for use in INSPIRE RDF schemas.

Discussion Item

It might happen that more than one vocabulary is suitable to represent elements of the INSPIRE conceptual schemas. How could we resolve such a situation? Would we need to agree on one particular ontology from the list, or could we use multiple? Do rules / best practices exist to handle such cases?

Including information for mapping classes and properties to common definitions in the conceptual model

Description

Issues #14 and #16 are about the mapping of INSPIRE model elements (classes and properties) to common definitions – either from the INSPIRE model itself or from external vocabularies.

We need to determine which information should be included in the conceptual model to support this mapping.

Properties to link into INSPIRE RDF resources

To increase the use of INSPIRE RDF resources as an authoritative source, we need to be able to link 'into' the INSPIRE resources.

At the address level, there is a property from the ISA Core Location Vocabulary that enables you to link any resource to the ISA representation of an address.

Currently there is no property to link any resource to, e.g., a Land Parcel. It would make sense to have a generic property and subproperties to link into specific INSPIRE themes.

For example, a property relatedToParcel would allow a real estate agent to link an online offer for a piece of land they have for sale to the official data about the parcel.
In turn, this would increase the authority level of the real estate agent, because they refer to an authoritative data source.
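
A rough sketch of such linking properties follows. Only the name relatedToParcel comes from the suggestion above; the ex: namespace, the generic property ex:relatedToSpatialObject and all instance URIs are hypothetical:

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Generic property for linking any resource to an INSPIRE spatial object.
ex:relatedToSpatialObject a owl:ObjectProperty ;
    rdfs:comment "Links any resource to an INSPIRE spatial object."@en .

# Theme-specific subproperty for cadastral parcels.
ex:relatedToParcel a owl:ObjectProperty ;
    rdfs:subPropertyOf ex:relatedToSpatialObject .

# A real estate offer pointing to the official parcel data.
<http://agent.example.com/offers/17>
    ex:relatedToParcel <http://example.org/id/parcel/123> .
```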

RDF types for base types from ISO 19103 - ‘Any’ type

Description

INSPIRE application schemas use a number of classes from ISO/TS 19103:2005 as base types. As outlined in issue #9, a mapping to an RDF implementation must be found for each of the ISO 19103 types that is used in INSPIRE so that INSPIRE data can be encoded in RDF.

We propose to map the ISO 19103 ‘Any’ type to rdfs:Class.

Discussion Item

Any objection to our proposal?

Range of code lists

§ 9.4.9, "Vocabulary reference", at https://inspire-eu-rdf.github.io/inspire-rdf-guidelines/#ref_cr_prop_vocabulary_reference states:

INSPIRE code lists are represented as SKOS concept schemes, and their codes as SKOS concepts (see here). Consequently, the range of a property with a code list as value type can only be specified using the generic class skos:Concept.

IMHO the assertion "the range of a property with a code list as value type can only be specified using the generic class skos:Concept" is incorrect. It would be perfectly acceptable to:

Define a subclass of skos:Concept to represent the subset of skos:Concept belonging to a particular list; e.g. my:Natura2000DesignationValue_Concept
Add a formal definition to this subclass to define it as the set of all skos:Concept that have a skos:inScheme property pointing to a specified ConceptScheme, e.g. (pseudo-code) my:Natura2000DesignationValue_Concept owl:equivalentTo [ skos:inScheme value <http://inspire.ec.europa.eu/codelist/Natura2000DesignationValue> ]
Use that subclass as a range for the properties
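
In valid Turtle, the pseudo-code above could be written with owl:equivalentClass and an owl:hasValue restriction, as in the sketch below. The namespace bound to my: and the property my:designation are assumptions for illustration:

```turtle
@prefix my:   <http://example.org/my#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# All concepts that are in the Natura2000DesignationValue scheme.
my:Natura2000DesignationValue_Concept a owl:Class ;
    rdfs:subClassOf skos:Concept ;
    owl:equivalentClass [
        a owl:Restriction ;
        owl:onProperty skos:inScheme ;
        owl:hasValue <http://inspire.ec.europa.eu/codelist/Natura2000DesignationValue>
    ] .

# Hypothetical property that uses the subclass as its range.
my:designation a owl:ObjectProperty ;
    rdfs:range my:Natura2000DesignationValue_Concept .
```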

I actually do that quite often in my ontologies.

Property re-use - Rules for identifying properties with identical semantics

Description

In UML an attribute belongs to the class that defines it. Likewise, an association role is a property that also belongs to a specific class. See the following figure (ignore the arrows for now).

In RDF, a property can be described in terms of the classes (of resource(s)) to which it applies but it can also be described independently of any class.

Apparently there is a mismatch between UML and RDF: in UML a property belongs to its class while in RDF a property can be used by / in the domain of multiple classes. In the figure, the properties highlighted with arrows would be candidates for global definition within an RDF schema, to be reused by multiple classes.

When transforming INSPIRE data models to RDF vocabularies, we need to have clear guidance on how to transform class properties.

A straightforward solution would be to transform each property to its own RDF property definition, through augmentation with the class name that the property belongs to in UML. However, if multiple properties with the same name and the same semantics exist in a schema this would lead to repetition that would clearly be undesired and not in the spirit of RDF. In this case, there should be a way to identify which properties can be reused, i.e. mapped to a common property definition. Reuse can happen within a vocabulary and between multiple vocabularies. Reuse of property definitions from external vocabularies may also be an option.
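
The two options can be sketched for a "label" attribute that occurs in two classes. The ex: namespace is hypothetical, and the class-prefixed naming pattern is only one possible way to augment a property name with its class name:

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Option 1: one RDF property per UML class, named after the class.
ex:CadastralParcel.label a owl:DatatypeProperty ;
    rdfs:domain ex:CadastralParcel .
ex:CadastralZoning.label a owl:DatatypeProperty ;
    rdfs:domain ex:CadastralZoning .

# Option 2: a single shared property without a domain,
# reused by both classes if the semantics are identical.
ex:label a owl:DatatypeProperty .
```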

Discussion Item

Which rules exist for identifying properties with identical semantics?

Aspects to consider for these rules:

If they cover all model situations
Whether conflicting situations can arise, and how they can be handled
Which of these rules can be automated

RDF representation of union data types

Description

A <<union>> class represents a choice between a set of different properties and their values. Unions are used in INSPIRE application schemas and need to be represented in INSPIRE RDF schemas as well.

The schema conversion rule for union data types in ISO 19150-2 is insufficient as it does not handle cases where values are a mix of object or datatypes, or the same value type is used by more than one option. The conversion rule focuses on the value types used by the properties of a union, completely ignoring the fact that the properties themselves can carry meaning and therefore must not be discarded.
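
To illustrate the problem, consider a hypothetical union LocationChoice with the options byName (a CharacterString) and byPoint (a point geometry). The sketch below shows one conceivable encoding that keeps the option names as properties of an intermediate class. It is not an approach proposed in the guidelines, and all ex: names are invented for illustration:

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# The union becomes a class; each option stays a named property,
# so a mix of datatype and object values is unproblematic.
ex:LocationChoice a owl:Class .

ex:LocationChoice.byName a owl:DatatypeProperty ;
    rdfs:domain ex:LocationChoice ;
    rdfs:range  xsd:string .

ex:LocationChoice.byPoint a owl:ObjectProperty ;
    rdfs:domain ex:LocationChoice ;
    rdfs:range  ex:Point .

ex:location a owl:ObjectProperty ;
    rdfs:range ex:LocationChoice .

# Instance data: the chosen option remains visible.
ex:site1 ex:location [
    a ex:LocationChoice ;
    ex:LocationChoice.byName "Harbour"
] .
```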

An approach is needed to represent union data types in INSPIRE RDF schemas.

RDF guidelines and data specifications

We are currently working on a cadastral dataset.

The RDF guidelines state:

"This document specifies an experimental encoding rule for representing spatial data sets in INSPIRE as RDF. The use of RDF is optional and does not supersede or replace the requirements regarding encoding specified in Clause 9 of the Data Specifications. This optional encoding is intended to support the e-government and open data community in Europe, which is increasingly looking at RDF to represent data."

Does this mean that the data still has to comply with the data specifications? If so, a lot of data in the dataset will be lost because of the compliance demanded by the specification, whereas RDF could be the solution for capturing the richer data in the INSPIRE dataset.

Relationship between real-world phenomenon and feature document

Description

Let us assume that the representation of an INSPIRE feature in INSPIRE RDF schemas was split in two resources: 1) containing information about the real-world phenomenon, and 2) containing information about the feature document itself (feature metadata, see issue #22).

When two persistent URIs are assigned, one for the INSPIRE feature document and one for the real-world phenomenon, it would be useful to link the two resources. It has been proposed to use rdfs:isDefinedBy and foaf:primaryTopic:

:realworldobject rdfs:isDefinedBy :featuredocument.
:featuredocument foaf:primaryTopic :realworldobject.

The use of rdfs:isDefinedBy follows the convention used in "Cool URIs for the Semantic Web". Using foaf:primaryTopic for the inverse statement appears to be common in the Linked Data world; a side effect is that :featuredocument is inferred to be a foaf:Document, but this should be appropriate.

Whether or not this proposal is appropriate and sufficient needs to be determined. In general, guidance is needed on how a relationship between two resources, one for the real-world phenomenon and one for the feature document, should be established.

Relationship between resources identifying the same real-world phenomenon

Description

In general, but even within the INSPIRE SDI itself, several digital abstractions may and generally will exist for the same real-world phenomenon. The same organisation may manage multiple datasets on different scales. For example, mapping agencies and road authorities both will usually manage data about the road network in separate datasets.

It is important to understand that there is no requirement that only a single URI is used for the real-world phenomenon - it is perfectly fine to use different URIs. Of course, it would be preferable if only a single URI were used consistently for the same real-world phenomenon, but it would be an organisational challenge to implement the mechanisms and processes for this. It would require significant work in the INSPIRE Member States to establish the necessary governance and infrastructure.

When multiple URIs for the same real-world phenomenon exist and this is known, we propose that this fact is declared using owl:sameAs.

NOTE1: the distinction between the real-world phenomenon and the feature document (for further details, see issue #22) is essential here, because the owl:sameAs would only make the real-world phenomena the same, but not the feature documents.
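
A minimal sketch of such a declaration, assuming two hypothetical publishers with their own URI spaces:

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Two organisations identify the same real-world road
# with their own URIs; the equivalence is declared explicitly.
<http://mapping-agency.example.org/id/road/A1>
    owl:sameAs <http://road-authority.example.org/id/road/0001> .

# The two feature documents remain distinct resources and are
# deliberately NOT linked with owl:sameAs:
#   <http://mapping-agency.example.org/doc/road/A1>
#   <http://road-authority.example.org/doc/road/0001>
```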

NOTE2: regardless of whether a distinction is made between the real-world phenomenon and the feature document: if a feature type has a thematic identifier, then it can be used to determine whether two objects of this feature type describe the same real-world object.

Example of spatial object encoded using INSPIRE RDF vocabularies

The guidelines contain a number of examples, but not an example of a full spatial object / feature encoded using the INSPIRE RDF vocabularies. Such an example should be added when actual INSPIRE RDF data is available.

Representation of aggregation and composition relationships

Description

ISO 19150-2 specifies an approach to represent UML aggregation and composition relationships in RDF. There, a UML association role that is part of an aggregation or composition association is represented in RDF via an annotation property (iso19150-2:aggregationType with a value of “partOfSharedAggregation” or “partOfCompositeAggregation”).
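
As a rough sketch, such an annotation might look like the following. The namespace bound to iso19150-2: is a placeholder rather than the official one, and the ex: property is hypothetical:

```turtle
@prefix ex:         <http://example.org/vocab#> .
@prefix owl:        <http://www.w3.org/2002/07/owl#> .
# Placeholder namespace, NOT the official ISO 19150-2 namespace.
@prefix iso19150-2: <http://example.org/placeholder/iso19150-2#> .

iso19150-2:aggregationType a owl:AnnotationProperty .

# Hypothetical association role that is part of a composition.
ex:Building.parts a owl:ObjectProperty ;
    iso19150-2:aggregationType "partOfCompositeAggregation" .
```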

We believe that the distinction of aggregation and composition adds no value when publishing RDF data, especially when considering the open world assumption.

Property cardinality

Description

In an application schema, cardinality specifies the number of instances of a property that may be associated with a single instance of a feature type. The following figure shows how cardinality is represented in a UML class diagram.

In RDF, OWL restrictions can be used to specify the cardinality of a property.
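
For illustration, a cardinality of exactly 1 and an upper bound of 1 could be expressed as shown below. The ex: namespace and the property names are hypothetical:

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

ex:CadastralParcel a owl:Class ;
    rdfs:subClassOf [
        # UML multiplicity 1: exactly one value
        a owl:Restriction ;
        owl:onProperty ex:label ;
        owl:cardinality "1"^^xsd:nonNegativeInteger
    ] , [
        # UML multiplicity 0..1: at most one value
        a owl:Restriction ;
        owl:onProperty ex:zoning ;
        owl:maxCardinality "1"^^xsd:nonNegativeInteger
    ] .
```

Note that under the open world assumption these axioms drive inference rather than validation: a missing value does not make the data inconsistent.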

We are not aware of use cases involving INSPIRE data represented in RDF that would require the validation of cardinality constraints. Therefore we propose to not encode the cardinality of properties in INSPIRE RDF schemas.

Metadata on resource level

Chapter 10, "Encoding metadata", describes that a dataset should have dataset-level metadata using GeoDCAT. However, we have a cadastral dataset that contains mutations at resource level. The history of mutations is in the data. How should we encode this?

Real-world phenomenon vs feature document as subject of a property

Description

Most feature attributes and roles represent properties of the real-world phenomenon. According to the General Feature Model, it is common practice to model both

properties that describe the real-world phenomenon and
properties that describe the feature ("spatial object" in INSPIRE terminology) document, i.e. which are feature metadata,

as feature properties.

In INSPIRE, most properties are properties that describe the real-world phenomenon.
However, there are exceptions:

Properties that represent life-cycle information (in particular, the beginLifespanVersion and endLifespanVersion attributes) are marked with the stereotype <<lifeCycleInfo>>.
Properties that have a value type from ISO 19115 are often feature metadata. However, this is not always the case, in particular for CI types. An example is ProtectedSite.legalFoundationDocument with value type CI_Citation.
There are also some properties that require a closer review to identify them as feature metadata.

Examples are CadastralZoning.estimatedAccuracy with value type Length or CadastralZoning.originalMapScaleDenominator with value type Integer. These properties are not properties of the real-world phenomenon, but of the feature.

From the perspective of RDF vocabularies there is no distinction between the two types of properties. Nevertheless, the distinction has an impact on how instances are represented in RDF, because in Linked Data and the Semantic Web it is important to be clear about the subjects. In this case we would have two subjects: the real-world phenomenon and the feature.
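
A sketch of what the two subjects could look like for a cadastral zoning follows. All URIs, the ex: namespace and the values are hypothetical; which subject carries the rdf:type statement is deliberately left open:

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Subject 1: the real-world phenomenon and its properties.
<http://example.org/id/zoning/7>
    ex:label "Zone 7" ;
    foaf:isPrimaryTopicOf <http://example.org/doc/zoning/7> .

# Subject 2: the feature (document) and its metadata.
<http://example.org/doc/zoning/7>
    ex:originalMapScaleDenominator 5000 ;
    ex:beginLifespanVersion "2012-05-01T00:00:00Z"^^xsd:dateTime .
```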

Discussion Item

Should the RDF encoding make a distinction between properties that describe the real-world phenomenon and properties that contain feature metadata? Are there specific advantages / disadvantages of (not) doing so?

If a distinction should be made in the encoding:

Can you identify one or more approaches for what the encoding should look like? For example: maybe one or more stereotypes could be introduced to identify metadata properties.
Does the ThematicIdentifier need to be represented when considering that it is basically implemented by persistent URIs of the real-world phenomenon?

NOTE1: the Best Practices deliverable of the W3C/OGC working group may provide useful input to this discussion.

NOTE2: the reports from the study on RDF and PIDs for INSPIRE (prepared as part of the ARE3NA activity), especially the documents D.EC.1.1 and D.EC.2.1, provide additional background information for this issue.

INSPIRE spatial object type classification

Description

It is not clear which of the subjects would be typed using the feature type classification. I.e., which of the two following statements is correct:

:featuredocument rdf:type cp:CadastralParcel
:realworldobject rdf:type cp:CadastralParcel

Discussion Item

Should the real-world phenomenon, the feature document or both be typed with an INSPIRE spatial object type classification?

Upper ontologies for INSPIRE spatial object types

Description

Spatial object / feature types of INSPIRE application schemas are instances of the ISO 19109 GF_FeatureType meta-class. In the GML encoding, this relationship is expressed by all INSPIRE spatial object / feature types being in the substitution group of gml:AbstractFeature. The same is true for the GML encoding of feature types from other application schemas.

This approach supports working with any feature type, for example writing rules that must be supported by any feature, specifying service interfaces that manage any feature, and encoding data sets with any feature as member.

The question is whether a similar approach can be usefully applied to INSPIRE RDF schemas and what it should look like.

Some further background information:

ISO 19150-2 specifies in subclause 7.5 that RDF representations of application schema feature types should be sub-classes of gfm:AnyFeature – which is an element of the General Feature Model vocabulary.
The OGC standard GeoSPARQL specifies geo:Feature, which is similar to gfm:AnyFeature.
The Linked Data community has defined – or may currently be in the process of defining - additional vocabularies to represent location information, which may also contain a concept that represents “any feature”.
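
As an illustration of the pattern, an INSPIRE class could be declared a subclass of the GeoSPARQL feature class. The namespace bound to cp: is hypothetical:

```turtle
@prefix cp:   <http://example.org/inspire/cp#> .
@prefix geo:  <http://www.opengis.net/ont/geosparql#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Every cadastral parcel is also "any feature" in the GeoSPARQL sense,
# so generic rules and queries written against geo:Feature apply to it.
cp:CadastralParcel a owl:Class ;
    rdfs:subClassOf geo:Feature .
```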

NOTE: the General Feature Model vocabulary has not been published yet (as of 8 January 2016), but a draft version is available at https://github.com/ISO-TC211/GOM/blob/master/isotc211_GOM_harmonizedOntology/19109/2005/iso19109GeneralFeatureModel.owl. We assume that once finalised, the vocabularies will be published on the ISO/TC 211 website as the base URI of the vocabulary suggests.
There is also a new version of ISO 19109 (ISO 19109:2015, published in December 2015) for which no vocabulary is available yet. However, as INSPIRE uses ISO 19109:2005 this should not be an issue.

Discussion Item

Which vocabularies – in addition to the ones that were already mentioned - define classes that implement the semantics of ISO 19109 GF_FeatureType? These vocabularies would be candidates for upper ontologies of INSPIRE RDF schema.

Range for properties with code list typed value

Description

It is good practice to manage code lists separately from an application schema. In INSPIRE, code lists are managed in the INSPIRE code list register and other registries. The INSPIRE code list register supports representations in different formats (including RDF/XML) through content negotiation.

We propose that the RDF/XML representation of INSPIRE code lists shall model these code lists using skos:ConceptScheme and code list values using skos:Concept. This includes extensions to INSPIRE code lists.
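
Shown in Turtle rather than RDF/XML for readability, the proposed modelling could look roughly like this. The code list URI, the value and the labels are placeholders under example.org, not entries of the INSPIRE code list register:

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# The code list itself is a concept scheme.
<http://example.org/codelist/ExampleCodeList>
    a skos:ConceptScheme ;
    skos:prefLabel "Example code list"@en .

# Each code list value is a concept in that scheme.
<http://example.org/codelist/ExampleCodeList/exampleValue>
    a skos:Concept ;
    skos:inScheme <http://example.org/codelist/ExampleCodeList> ;
    skos:prefLabel "Example value"@en .
```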

NOTE 1: Extensions of INSPIRE code lists can be published / mirrored in the INSPIRE registry. However, it is questionable if this is feasible for dynamic or very large lists.

NOTE 2: Guidelines for setting up registers for additional code list values and for registering additional values in these registers are currently being discussed in the INSPIRE working group on registers and registries.

Discussion Item

Any comments on the proposal?

Encoding of geographical names

Description

The INSPIRE data type "GeographicalName" supports the provision of multiple spellings for a name, a link to an audio file for pronunciation, and more. Applications often need only the name in one spelling, potentially with an indication of the language. In such cases, "GeographicalName" can be mapped to a simple rdfs:Literal.

On the other hand, if complex information is available for a geographical name, instead of just a simple label, then encoding the name as an individual resource can be useful:

The SKOS properties prefLabel and altLabel can be used to provide labels for the name in multiple languages, while also distinguishing preferred from alternative labels.
Comparison of geographical names: if the "name" properties of two spatial objects point to the same URI, then both spatial objects have the same geographical name. Resource equality can be asserted through a simple comparison of resource identifiers (the URIs). This appears to be a use case in the hydrology domain (see definition and description of property "geographicalName" in the spatial object type "HydroObject").
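
The two encodings can be contrasted in a short sketch. The ex: namespace, its properties and the instance URIs are hypothetical:

```turtle
@prefix ex:   <http://example.org/vocab#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Option 1: the name as a simple language-tagged literal.
<http://example.org/id/river/1> ex:name "Donau"@de .

# Option 2: the name as a resource of its own. Two spatial objects
# pointing to the same name URI share the same geographical name.
<http://example.org/id/river/2>
    ex:geographicalName <http://example.org/id/name/danube> .
<http://example.org/id/river/3>
    ex:geographicalName <http://example.org/id/name/danube> .

<http://example.org/id/name/danube>
    skos:prefLabel "Danube"@en , "Donau"@de ;
    skos:altLabel  "Danube River"@en .
```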

Discussion Item

Should a property with geographical name as value type be encoded with a literal or with a class as range? Are there suitable alternatives?

GeographicPosition.default is context-sensitive

ad:GeographicPosition.default is defined as follows:

ad:GeographicPosition.default
        a                owl:DatatypeProperty ;
        rdfs:comment     "NOTE As a member state may provide several positions of an address, there is a need to identify the commonly used (main) position. Preferrably, the default position should be the one with best accuracy."@en ;
        rdfs:domain      ad:GeographicPosition ;
        rdfs:range       xsd:boolean ;
        skos:definition  "Specifies whether or not this position should be considered as the default."@en .

A more idiomatic RDF way of modelling this would be:

ad:Address.defaultPosition a  owl:ObjectProperty ;
    rdfs:subPropertyOf  ad:Address.position ;
    skos:definition  "The default Position of a characteristic point ..." .

The first approach does not allow GeographicPositions to be reused, since the default attribute is bound to a specific context. The second approach does allow reuse.

Representation of OCL constraints

Description

The ISO/DIS 19150-2 schema conversion rule maps constraints from the UML model. It does not specify how exactly – so there is room for interpretation. Including OCL in the RDF vocabulary is questionable.

We propose to include only the documentation of an OCL constraint.

However, there is one exception: OCL constraints in subtypes that restrict the type of a property inherited from a supertype. This requires further consideration.

Necessity for Address and AddressRepresentation?

A topic I have wondered about for a while, and which was briefly touched on in another discussion:

Is there a need for both Address and AddressRepresentation in the RDF model? Can they not be merged?

By using only a (subclass of) locn:Address that contains both the simple locn properties and the specialised INSPIRE attributes, users unfamiliar with INSPIRE would not be confused as to why the structured data of an address is stored in a different class.

Property re-use - INSPIRE property dictionary

Description

Issue #14 is concerned with rules for identifying properties of INSPIRE classes that have identical semantics. The goal is to re-use property definitions in INSPIRE RDF vocabularies where appropriate and to avoid duplicate definitions.

To support this, we may want to create a common INSPIRE property dictionary, which unambiguously specifies all properties used within the INSPIRE data models. It would provide a unique place to define properties, together with any additional information like mapping to properties from external vocabularies.

Aspects to consider for such a dictionary:

How it would handle definitions from external vocabularies (e.g. if an INSPIRE property would be mapped to a property from an external vocabulary)
When it could actually be introduced
If there should only be an implicit link between properties of the INSPIRE model and dictionary entries or if the model should contain explicit link information

RDF types and properties for INSPIRE foundation schemas

Description

INSPIRE application schemas make use of a number of external foundation schemas, such as EarthResourceML, GeoScienceML, and the standards from ISO TC 211. No sufficiently mature and tested RDF vocabularies exist for these types, which is a problem for any attempt to represent INSPIRE data in RDF at this time.

Without more specific RDF classes that can represent these types, the fall-back for properties using these types in INSPIRE RDF schemas would be rdfs:Class or owl:Class. We propose that in such cases, rdfs:Class should be used as the “fall-back”.

The lack of proven RDF vocabularies for concepts from INSPIRE foundation schemas is a key obstacle for using RDF for INSPIRE data.

It should be noted that some INSPIRE application schemas use types from foundation schemas that are not covered by the rules for application schemas. An example is the use of GM_Boundary in the Environmental Monitoring Facilities application schema. RDF representations may not be necessary for these types if they would be fixed in the INSPIRE schemas to comply with the rules for application schemas.

Also note that some of the foundation schemas like sensors and coverages will be deliverables of the Spatial Data on the Web Working Group of W3C/OGC (https://www.w3.org/2015/spatial/wiki/Main_Page with the deliverables OWL Time Ontology, Semantic Sensor Network Ontology, Coverages in Linked Data). These results should be used, once available, and parallel work should be avoided in the meantime.

Discussion Item

Which vocabularies exist that could be used to implement types from the external foundation schemas in RDF?

Use of skos:definition and rdfs:comment

The guidelines currently state that skos:definition should be used for definitions and rdfs:comment for additional descriptive text such as notes.

Although I agree with this statement, I see a conflict with actual practice: all ISA/W3C vocabularies that I am familiar with in a government context seem to use only rdfs:comment as the predicate to define their terms.

Some ranges are too restricted

For example:

ad:AddressComponent.alternativeIdentifier
        a                owl:DatatypeProperty ;
        rdfs:domain      ad:AddressComponent ;
        rdfs:range       xsd:string ;
        skos:definition  "External, thematic identifier of the address component spatial object, which enables interoperability with existing legacy systems or applications."@en .

Since multiple systems may be working with address components, it is likely that several different identifiers are in use. It must of course be possible to recognize which identifier belongs to which application. Currently, this can be done by subtyping this property for each application.

Instead, I would suggest changing the range to rdfs:Literal, so that datatypes can also be used to make this distinction and numerical identifiers are allowed as well.
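As a rough illustration of this suggestion, the property could be declared and used as follows. The `ad:` namespace URI, the `ex:` datatypes and the identifier values are all hypothetical.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ad:   <http://inspire.ec.europa.eu/ont/ad#> .   # assumed namespace URI
@prefix ex:   <http://example.org/> .                    # hypothetical

# Relaxed range: any literal instead of xsd:string only
ad:AddressComponent.alternativeIdentifier
        a            owl:DatatypeProperty ;
        rdfs:domain  ad:AddressComponent ;
        rdfs:range   rdfs:Literal .

# Hypothetical datatypes, one per application that issues identifiers
ex:cadastreId      a  rdfs:Datatype .
ex:postalSystemId  a  rdfs:Datatype .

ex:Name1  ad:AddressComponent.alternativeIdentifier
              "BE-12345"^^ex:cadastreId ,     # the datatype tells which application owns the identifier
              "48213"^^ex:postalSystemId ,
              48213 .                         # a plain numerical identifier (xsd:integer)
```

The datatype then carries the "which application" information that would otherwise need one subproperty per application.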

URI Scheme for RDF namespaces – management aspects

Description

In #2 we propose the following URI scheme for INSPIRE RDF vocabularies: http://inspire.ec.europa.eu/ont/{app-schema-code}. The following management-related aspects are open for discussion:

Should the namespace end with a slash ("/") or hash ("#")?
We have excluded version information (see, for example, http://xmlns.com/foaf/spec/#sec-evolution for a reasoning to exclude version information in the namespace URI).

Discussion Item

Which management aspects must be considered for the selected URI scheme?

Are there good reasons to include a version number in the RDF namespace?
Should a “hash namespace” or a “slash namespace” be used?
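To make the hash versus slash question concrete, the sketch below shows how the same term would expand under each option. It assumes `ad` as the app-schema-code; the prefixes `adh:` and `ads:` are hypothetical and exist only to show both variants side by side.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Option 1: hash namespace
@prefix adh: <http://inspire.ec.europa.eu/ont/ad#> .
# Option 2: slash namespace
@prefix ads: <http://inspire.ec.europa.eu/ont/ad/> .

adh:Address  a  owl:Class .   # expands to <http://inspire.ec.europa.eu/ont/ad#Address>
ads:Address  a  owl:Class .   # expands to <http://inspire.ec.europa.eu/ont/ad/Address>
```

With a hash namespace the fragment is not sent to the server, so a client always retrieves the whole vocabulary document; with a slash namespace each term URI is requested individually and can be served or redirected on its own.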

Evaluation of existing ontologies – rules to decide if an ontology is appropriate for INSPIRE schemas

Description

It is a general principle to reuse existing ontologies, if appropriate, and avoid creating new definitions.

Several of the identified issues are concerned with identifying vocabularies that can be used to represent elements of the INSPIRE conceptual schemas, i.e. classes and properties defined by and used in the INSPIRE application schemas.

NOTE: http://www.w3.org/TR/ld-bp/#VOCABULARIES and http://w3c.github.io/dwbp/bp.html#dataVocabularies provide starting points for the evaluation of vocabularies.

Discussion Item

Which criteria / methods / rules should be applied to decide if an existing ontology is appropriate for use in INSPIRE RDF schemas?

The following list is a start, but needs more work:

Maturity: Are these vocabularies complete and stable? Do they have known issues?
Governance: Are these vocabularies governed by a standards body or does another active governance mechanism exist?
Use: Are there existing implementations? What is the uptake in the community?
Compatibility: How close are the concepts in the ontology to the relevant INSPIRE concepts?
...

Address: Names can be simplified

I believe the address ontology to be overly complex with regard to the names. As an example:

1. Starting from an ad:Address, we follow ad:Address.component to find an ad:ThoroughfareName.
2. From this, we follow ad:ThoroughfareName.name to get an ad:ThoroughfareNameValue.
3. From this, we follow ad:ThoroughfareNameValue.name to get an ad:GeographicalName.
4. This GeographicalName can, according to the specs, be an rdfs:Literal or a complex type.
5. In case a complex type is used, the skos:prefLabel contains the value.

I believe steps 3, 4 and 5 can be merged together without loss of expressiveness.

ex:Name1 a ad:ThoroughfareName ;
  ad:ThoroughfareName.transportLink ex:Link1, ex:Link2 ;
  ad:ThoroughfareName.name ex:NameValue1 .

ex:NameValue1 a ad:ThoroughfareNameValue ;
  ad:ThoroughfareName.namePart ex:Part1, ex:Part2, ex:Part3 ; # from ad:ThoroughfareNameValue.nameParts
  skos:prefLabel "rue de la Paix"@fr ; # from GeographicalName
  skos:altLabel "rue dl Paix"@fr .

ex:Part1 a ad:PartOfName ; # Could also be done using properties on 
  ad:PartOfName.part "rue" ;
  ad:PartOfName.type <http://inspire.ec.europa.eu/codelist/PartTypeValue/type> .

Note that step 2 must remain separate. This is because the names (NameValues) might consist of different parts in different languages:

Dutch: "Boudewijnlaan" - 1 part
French: "Boulevard Baudouin" - 2 parts

Description

Additional work on, or relevant to, INSPIRE Linked Data has been or is being carried out since the issues were identified, including:

GeoKnow project [link1, link2]
Smart Open Data (SmOD) project
W3C/OGC Spatial Data on the Web Working Group

This work will be considered as input and guidance when identifying the proper resolution to each issue.

Discussion Item

Do you know of further results that would be relevant to this effort? If so, please add a comment to this issue with:

a description of the additional results
links to further information - each link should be accompanied by a brief description of what is being linked to and why it is relevant

RDF types for base types from ISO 19103 - measure types

Description

For the measure types from ISO 19103 (Angle, Area, Distance, Length, Measure, Probability, Velocity and Volume), a single mapping is most likely sufficient and could be used for all Measure data types from ISO 19103.

We also need a mapping for the type UnitOfMeasure.

NOTE: The model of ISO 19103 classes is sometimes rather elaborate. GML shows that an actual implementation of these types does not necessarily need to include all information defined for them. A mapping from the information items supported by a particular implementation schema (such as GML) to the conceptual schema (in this case ISO 19103) is also feasible. Due to restrictions of the implementation technology, the mapping may not even be 100% complete.

NOTE: the following lists may provide useful starting points for the required mappings

Mapping ‘Measure’:
Mapping ‘UnitOfMeasure’

Discussion Item

Do you have a suggestion for how the measure types from ISO 19103 can be mapped in an INSPIRE RDF implementation?

Split up bundled comments

As an example:

ad:ThoroughfareNameValue.nameParts
        a                owl:ObjectProperty ;
        rdfs:comment     "NOTE 1 This is a definition which is consistent with that adopted by the UPU\r\n\r\nNOTE 2 A subdivision of a thoroughfare name into semantic parts could improve parsing (e.g. of abbreviated or misspelled names) and for sorting of address data for example for postal delivery purposes. It could also improve the creation of alphabetically sorted street gazetteers. \r\n\r\nNOTE 3 The data type requires that each part of the subdivided thoroughfare name is qualified with information on the semantics e.g. if it is a thoroughfare type (e.g., Rua, Place, Calle, Street), a prefix (e.g., da, de la, del), a qualifier (e.g., Unterer, Little) or if it is the core of the name, which would normally be used for sorting or indexing. \r\n\r\nNOTE 4 In some countries or regions and for some thoroughfare names it is not feasible or it does not add value to subdivide the thoroughfare name into parts.\r\n\r\nEXAMPLE In France the thoroughfare name \"Avenue de la Poste\" could be subdivided into these parts: \"Avenue\" + \"de la\" + \"Poste\"."@en ;

The comment in this case consists of 4 different notes (one of which I would call an editorial note) and an example.

I suggest splitting up the note into different assertions, each with an appropriate predicate.
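One possible split is sketched below, using SKOS documentation properties. The choice of predicates is only an assumption: NOTE 1 is treated here as the editorial note, the remaining notes use the generic skos:note, and the `ad:` namespace URI is assumed.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ad:   <http://inspire.ec.europa.eu/ont/ad#> .   # assumed namespace URI

ad:ThoroughfareNameValue.nameParts
        a                   owl:ObjectProperty ;
        # NOTE 1, treated as an editorial note (assumption)
        skos:editorialNote  "This is a definition which is consistent with that adopted by the UPU"@en ;
        # NOTE 2, NOTE 3 and NOTE 4 as separate general notes
        skos:note           "A subdivision of a thoroughfare name into semantic parts could improve parsing (e.g. of abbreviated or misspelled names) and for sorting of address data for example for postal delivery purposes. It could also improve the creation of alphabetically sorted street gazetteers."@en ,
                            "The data type requires that each part of the subdivided thoroughfare name is qualified with information on the semantics e.g. if it is a thoroughfare type (e.g., Rua, Place, Calle, Street), a prefix (e.g., da, de la, del), a qualifier (e.g., Unterer, Little) or if it is the core of the name, which would normally be used for sorting or indexing."@en ,
                            "In some countries or regions and for some thoroughfare names it is not feasible or it does not add value to subdivide the thoroughfare name into parts."@en ;
        # EXAMPLE as its own assertion
        skos:example        "In France the thoroughfare name \"Avenue de la Poste\" could be subdivided into these parts: \"Avenue\" + \"de la\" + \"Poste\"."@en .
```

Each note then becomes an individually addressable assertion, and the "\r\n" separators and "NOTE n" prefixes are no longer needed inside the literal.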

Incorrect use of locn:geographicName?

Section 9.3.1 of the draft guidelines states:

It is required that properties with INSPIRE GeographicalName as value type are aligned with the ISA Programme Location Core Vocabulary property "geographicName".

This is demonstrated in, for example:

ad:AddressRepresentation.postName
        a                   owl:ObjectProperty ;
        rdfs:domain         ad:AddressRepresentation ;
        rdfs:range          ad:GeographicalName ;
        rdfs:subPropertyOf  locn:geographicName ;

However, locn:geographicName is defined as "[...] a proper noun applied to a spatial object."

This seems to imply that AddressRepresentation can be seen as a spatial object. This contradicts its definition: "Representation of an address spatial object for...". It is a representation of a spatial object, not an actual spatial object.

Context of this remark: the Flemish government is finishing a similar exercise (OSLO²) to represent their data models in RDF while complying with INSPIRE. In this exercise we saw incompatibilities between INSPIRE and locn.

GeographicalName subtypes skos:Concept

GeographicalName is currently defined as:

gn:GeographicalName  a   owl:Class ;
        rdfs:subClassOf  skos:Concept ;
        skos:definition  "Proper noun applied to a real world entity."@en .

I believe this subclassing is done to have compatibility with locn:geographicName (see the last sentence in the comment).

locn:geographicName a rdf:Property ;
    rdfs:label "geographic name"@en ;
    rdfs:comment """
A geographic name is a proper noun applied to a spatial object. Taking the example used in the relevant INSPIRE data specification (page 18), the following are all valid geographic names for the Greek capital:
...
For INSPIRE-conformant data, provide the metadata for the geographic name using a skos:Concept as a datatype."""@en ;

However, RDF defines the term "datatype" as the datatype of a literal, of which xsd:string is an example. As such, I always assumed geographicName would be used as follows:

ex:myAdminUnitName ad:name "Brussels"^^ex:brusselsDT.

ex:brusselsDT a rdfs:Datatype ;
  rdfs:subClassOf rdfs:Literal ;
  gn:pronunciation "brʌs(ə)lz".

I am not sure if this is what the Core Location Vocabulary originally intended, but it would be a more unified approach, since there is no mix-up of GeographicalName as literal and as non-literal.

SHACL and INSPIRE

We were wondering how we could validate whether data complies with the INSPIRE RDF guidelines, and how we could detect flaws. Since everything is in RDF, it should (in theory) be possible to validate the data using SHACL shapes (https://www.w3.org/TR/shacl/). Shapes would detect violations of the guidelines, which would be very helpful for people who have converted their data to INSPIRE RDF.
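A minimal sketch of what such shapes could look like. It checks two constraints that appear in the vocabulary excerpts quoted in other issues (the xsd:string range of alternativeIdentifier and the gn:GeographicalName range of postName). The `ad:` and `gn:` namespace URIs and the `ex:` shape names are assumptions, and these shapes are not part of the guidelines.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ad:  <http://inspire.ec.europa.eu/ont/ad#> .   # assumed namespace URI
@prefix gn:  <http://inspire.ec.europa.eu/ont/gn#> .   # assumed namespace URI
@prefix ex:  <http://example.org/shapes/> .            # hypothetical shape names

# Every alternativeIdentifier of an address component must be an xsd:string literal
ex:AddressComponentShape
        a               sh:NodeShape ;
        sh:targetClass  ad:AddressComponent ;
        sh:property     [
            sh:path      ad:AddressComponent.alternativeIdentifier ;
            sh:datatype  xsd:string ;
            sh:message   "alternativeIdentifier must be an xsd:string literal."@en
        ] .

# Every postName value must be an instance of gn:GeographicalName
ex:AddressRepresentationShape
        a               sh:NodeShape ;
        sh:targetClass  ad:AddressRepresentation ;
        sh:property     [
            sh:path   ad:AddressRepresentation.postName ;
            sh:class  gn:GeographicalName
        ] .
```

Any SHACL processor could run these shapes against converted data and report the focus nodes that violate them.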

owl:imports URIs are broken in examples

E.g. this example

<http://inspire.ec.europa.eu/ont/cp> owl:imports <https://github.com/inspire-eu-rdf/inspire-rdf-vocabularies/blob/master/ad/ad.ttl> , ...

would never work, because https://github.com/inspire-eu-rdf/inspire-rdf-vocabularies/blob/master/ad/ad.ttl resolves to a GitHub HTML page and not RDF.

A proper solution would of course be to find a more persistent URI, e.g. on purl.org or europa.eu or w3.org, and redirect to files on GitHub behind the scenes.
But if we want to continue linking directly to GitHub, it should be the "raw" file URI: https://raw.githubusercontent.com/inspire-eu-rdf/inspire-rdf-vocabularies/master/ad/ad.ttl
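As a sketch, the statement with the "raw" URI would look like this. Only the ad import is shown; the other imports elided in the original example are left out.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# The raw URI returns the Turtle file itself rather than a GitHub HTML page
<http://inspire.ec.europa.eu/ont/cp>
        a            owl:Ontology ;
        owl:imports  <https://raw.githubusercontent.com/inspire-eu-rdf/inspire-rdf-vocabularies/master/ad/ad.ttl> .
```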

URI Scheme for RDF namespaces – selecting the URI scheme

Description

RDF vocabularies representing INSPIRE application schemas need to have unique namespaces.
INSPIRE application schemas already contain tagged values (called "targetNamespace") with namespace URIs to use in the GML/XML Schema representation. An example of such a namespace is http://inspire.ec.europa.eu/schemas/el-bas/3.0 for the “ElevationBaseTypes” application schema. These namespaces are created based on a URI scheme that uses codes to identify the application schema.

We propose the following URI scheme for INSPIRE RDF vocabularies (which is similar but not equal to the URI scheme of INSPIRE GML Schemas): http://inspire.ec.europa.eu/ont/{app-schema-code}

NOTE 1: Re-using the XML Schema namespace URIs for the INSPIRE RDF vocabularies has been considered. However, such an approach can lead to URI resolution conflicts (when trying to access an XML Schema and RDF vocabulary at the exact same URI) and has therefore been rejected.

NOTE 2: ISO 19150-2 specifies a pattern for namespaces of RDF vocabularies using a combination of a base URI and the schema package name (e.g. http://def.isotc211.org/iso19107/2003/geometryroot#). While this pattern can be used to automatically derive an RDF namespace from the application schema package name, the proposed URI scheme leads to namespaces that:

Are shorter
Are more aligned with the already established namespaces for GML/XML Schema
Are independent of the package structure within the application schema
Can convey the version of the application schema that the vocabulary represents, but can also omit it

Discussion Item

Any comments on the proposed URI scheme?

AddressComponents are features?

ad:AddressComponent and its subclasses (ad:AdminUnitName, ad:AddressAreaName, ...) are subclasses of gsp:Feature.

I find the definition of gsp:Feature itself lacking, in that it simply refers to ISO 19107, a specification that must be purchased. Either way, there are two major interpretations of what a Feature is: a geographical feature, or a broader term.

In case the first is meant, why is something that represents a name a geographical entity? That does not make sense.

If the broader interpretation, which includes non-geographical features, is followed, a feature is simply a "record". One can then ask whether every RDF class is not in fact a feature, which would render this addition useless.

Representation of association classes

Description

Association classes are not supported by the schema conversion rules in ISO/DIS 19150-2, but are required for INSPIRE application schemas.

The GML 3.3 standard defines a pattern to transform an association class into equivalent UML model elements (consisting of UML classes and associations only). A similar pattern is suggested in a note of the Semantic Web Best Practices and Deployment Working Group.

We propose that, in general, the approach taken in GML 3.3 should also be used for deriving INSPIRE RDF schemas.

However, each of the rare cases where association classes are used in the INSPIRE conceptual schemas should be reviewed to determine if an optimized mapping to RDF exists.
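The general idea of turning an association class into an intermediate class can be sketched in RDF as follows. This is only an illustration of the intermediate-class pattern, not a reproduction of the GML 3.3 rules; all `ex:` names are hypothetical and not taken from any INSPIRE schema.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical model

# UML: association class ex:Contract on the association ex:Supplier -- ex:Customer
ex:Supplier  a  owl:Class .
ex:Customer  a  owl:Class .

# The association class becomes an ordinary class in the middle ...
ex:Contract  a  owl:Class .

# ... reached from one end and pointing to the other end
ex:Supplier.contract  a            owl:ObjectProperty ;
                      rdfs:domain  ex:Supplier ;
                      rdfs:range   ex:Contract .

ex:Contract.customer  a            owl:ObjectProperty ;
                      rdfs:domain  ex:Contract ;
                      rdfs:range   ex:Customer .

# Attributes of the association class stay on the intermediate class
ex:Contract.startDate  a            owl:DatatypeProperty ;
                       rdfs:domain  ex:Contract ;
                       rdfs:range   xsd:date .
```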
