adiwg / mdjson-schemas
JSON schemas, examples, and templates for ADIwg metadata standards
Home Page: http://www.adiwg.org/projects/
License: GNU Lesser General Public License v3.0
Here is a first pass at a JSON structure for data dictionary info. The comment period is open!
"dataDictionary": {
"dictionaryInfo": {
"citation": {
"title": "required",
"date": [
{
"date(required)": "0000-00-00",
"dateType": "required"
}
],
"edition": "",
"responsibleParty": [
{
"contactId": "required",
"role": "required"
}
],
"onlineResource": [
{
"uri": "http://thisisanexample.com",
"protocol": "",
"name": "",
"description": "",
"function": ""
}
]
},
"resourceType": "required",
"description": "required",
"language": ""
},
"domain": [
{
"domainId": "required",
"commonName": "",
"codeName": "required",
"description": "required",
"member": [
{
"name": "required",
"value": "required",
"definition": "required"
}
]
}
],
"entity": [
{
"entityId": "",
"commonName": "",
"codeName": "required",
"definition": "required",
"primaryKeyAttributeCodeName": [""],
"index": [
{
"codeName": "required",
"allowDuplicates(required)": false,
"attributeCodeName": ["required"]
}
],
"attribute": [
{
"commonName": "",
"codeName": "required",
"definition": "required",
"dataType": "required",
"allowNull": true,
"units": "",
"domainId": "",
"minValue": "",
"maxValue": ""
}
],
"foreignKey": [
{
"localAttributeCodeName": ["required"],
"referencedEntityCodeName": "required",
"referencedAttributeCodeName": ["required"]
}
]
}
]
}
Most reference systems (which I assume define both the datum and projection) will fit into a RS_Identifier block. RS_Identifier is a MD_Identifier with added attributes codeSpace and version. After reading and re-reading the definitions I can see no reason why we need these additional attributes. I think they can be handled in the authority block (a citation) of RS_Identifier.
codeSpace => name or identifier of the person or organization responsible for namespace; looks like authority.title
version => version identifier for the namespace; looks like authority.edition
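The mapping proposed above can be sketched as a small function. This is only an illustration: the function name and the input key names (`identifier`, `codeSpace`, `version`, `authority`) are assumptions made for the example, not part of the mdJson schema.

```python
def rs_identifier_to_md_identifier(rs_identifier):
    """Fold codeSpace and version into the authority citation (hypothetical helper)."""
    # Copy any existing authority citation so the input is not modified.
    authority = dict(rs_identifier.get("authority", {}))
    # codeSpace looks like authority.title; only fill it when no title is present.
    if "codeSpace" in rs_identifier and not authority.get("title"):
        authority["title"] = rs_identifier["codeSpace"]
    # version looks like authority.edition; only fill it when no edition is present.
    if "version" in rs_identifier and not authority.get("edition"):
        authority["edition"] = rs_identifier["version"]
    md_identifier = {"identifier": rs_identifier["identifier"]}
    if authority:
        md_identifier["authority"] = authority
    return md_identifier
```

An existing authority title or edition wins over codeSpace or version here; whether that is the right precedence is open for discussion.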
"spatialReferenceSystem": {
"name": [" "],
"epsgNumber": [0],
"wkt": [" "],
"customReferenceSystem": {
"name": " ",
"url": " "
},
"customReferenceSystem": {
"all-the-necessary-parameters": " "
}
}
Vote the options you think we should support, or add others.
resourceTimePeriod was added to the schema, but no example(s) have been committed. See #23.
Missing ISO translations and/or examples for Resource Type, Initiative Type, and Association Type.
We can improve coverage by testing with the strict option. This will prevent errors when additionalProperties is set to true in the schemas. The first item in the example array should pass strict validation.
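One way to approximate a strict run without depending on a particular validator is to tighten a copy of the schema before validating. The sketch below is an assumption about how that could be done, not a description of the validator's own strict option; `make_strict` and `_tighten` are hypothetical names.

```python
import copy


def make_strict(schema):
    """Return a deep copy of a JSON Schema with additionalProperties forced to False."""
    strict = copy.deepcopy(schema)
    _tighten(strict)
    return strict


def _tighten(node):
    # Walk the schema tree; any dict that declares "properties" is treated as an object schema.
    if isinstance(node, dict):
        if "properties" in node:
            node["additionalProperties"] = False
        for value in node.values():
            _tighten(value)
    elif isinstance(node, list):
        for item in node:
            _tighten(item)
```

A known limitation of this sketch: a metadata property that is itself literally named "properties" would be misread as an object schema.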
I found this section difficult to use. As a user you need to fully understand citation in order to utilize this section as one might intend. Citation forces an unnatural order of specifying a citation, who is responsible, and oh by the way I have a web site. What you really want as a user is to specify the document, what it is, describe it, supply the web link, then add a citation for the organization responsible (and their website). I’m not even quite clear how to hook this information in. I think we need to reorganize this section to a business orientation not an ISO orientation:
AdditionalDocumentation
Name:
Type:
Description:
Website:
Document date:
Provider
Name
Role
"authority" is a short version of CI_Citation. I use it many places a full citation is not needed. There are no changes from mdJson 1.0 other than those made to responsibleParty.
Discussion:
Changes:
{
"title": "",
"date": [
{
"date": "0000-00-00",
"dateType": ""
}
],
"responsibleParty": [
{
"see": "responsibility 2.0"
}
],
"onlineResource": [
{
"see": "onlineResource"
}
]
}
see ISO XML example authority -3.xml
"addressType": {
"physical": true,
"mailing": true
}
The following changes were made to the JSON template and full example to be specific about what link type is allowed by the JSON validator. Citations embedded in keyword and taxonomy were also restructured to be consistent with how citation is employed elsewhere in the ADIwg JSON. Details ...
We have the cardinality reversed for resource identifier. In ISO an identifier can have multiple authorities. Schema 0.4.0 allows multiple identifiers per authority.
Now:
{
"contactId": "",
"role": "",
"resourceIdentifier": [
{
"identifierName": "",
"identifier": ""
}
]
}
Should be:
{
"contactId": "",
"role": "",
"resourceIdentifier": {
"identifierName": "",
"identifier": ""
}
}
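If existing records need to move from the "Now" form to the "Should be" form, one possible migration is to emit one entry per identifier. This is a sketch under that assumption; `split_resource_identifiers` is a hypothetical helper, not part of mdTranslator.

```python
def split_resource_identifiers(party):
    """Expand one entry holding an identifier array into entries holding one identifier each."""
    # Everything except the identifier array is repeated on each new entry.
    base = {key: value for key, value in party.items() if key != "resourceIdentifier"}
    return [
        dict(base, resourceIdentifier=dict(identifier))
        for identifier in party.get("resourceIdentifier", [])
    ]
```

An entry with an empty or missing identifier array yields an empty list, so callers would need to decide separately what to do with such entries.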
"resourceTimePeriod": {
"description":"Period of resource",
"beginPosition":"1977-12-14",
"endPosition":"2007-08-13"
}
[ ] change uri to url to match ISO requirement
[ ] drop doi - people can make the url a resolvable doi if that is the preferred method of accessing the resource
[ ] no change - specify doi as URN; ISO writer will publish doi as MD_Identifier
[ ] no change
"assignedId": [
{
"contactId": "1",
"role": "primaryLCC",
"resourceIdentifier": [
{
"identifierName": "projectId",
"identifier": "ALCC2010-05"
}
]
}
]
“constraint” is now an array and has undergone major changes from mdJson 1.0. The new elements added repeat for legal (issue #56) and security (issue #57) constraints and are not redefined in those issues.
Changes:
{
"constraint": [
{
"type": "use",
"useLimitation": [""],
"scope":{
"see": "scope 2.0"
},
"graphic": [
{
"see": "graphicOverview 2.0"
}
],
"reference": [
{
"see": "citation 2.0"
}
],
"releasability": {
"addressee": [
{
"see": "responsibility 2.0"
}
],
"statement": "",
"disseminationConstraint": ["MD_RestrictionCode"]
},
"responsibleParty": [
{
"see": "responsibility 2.0"
}
],
"legal": {
"see": "legalConstraint 2.0"
},
"security": {
"see": "securityConstraint 2.0"
}
}
]
}
see XML example useConstraint -3.xml
In our last Tuesday meeting, Chris suggested we carry 'additionalDocumentation' as 'gmd:MD_AggregateInformation'. It sounded like a great idea, but it has a few hiccups.
The best way to do this might be to ...
Options:
[ ] add association type code
[ ] don't bother with additional documentation in 19115-2
I just received an advance copy of a study conducted at U. Idaho regarding handling of TK research projects. Among other things the study lays out the requirements for metadata using the ISO standard. An important issue for them is locale (PT_Locale). This goes in the MD_Metadata or MI_Metadata section. Unfortunately we missed it, probably because it is also missing from the ISO documentation, but it is present in the XSD.
<gmd:locale>
<gmd:PT_Locale>
<gmd:languageCode>
<gmd:LanguageCode codeList="" codeListValue=""></gmd:LanguageCode>
</gmd:languageCode>
<gmd:country>
<gmd:Country codeList="" codeListValue=""></gmd:Country>
</gmd:country>
<gmd:characterEncoding>
<gmd:MD_CharacterSetCode codeList="" codeListValue=""></gmd:MD_CharacterSetCode>
</gmd:characterEncoding>
</gmd:PT_Locale>
</gmd:locale>
<gmd:locale></gmd:locale>
Locale is an array. Unfortunately it overlaps language and characterSet. In -1 language and characterSet were completely replaced by defaultLocale (PT_Locale [0..1]) and otherLocale (PT_Locale [0..*]) - same in MD_DataIdentification for -1. Making this change will break the backwards compatibility of mdJson v1.
Should we break it now rather than waiting to support -1? One other difference, gmd:language is a gco:CharacterString, gmd:languageCode is a codeList. Both the LanguageCode and CountryCode may be a pain to add to mdCodes. I'll look into it.
This will support "default" roles for formats that do not support responsibleParty, e.g. sbJson.
Here are some suggested new definitions:
Changes:
"releasability": {
"addressee": [
"see responsibility 2.0"
],
"statement": "",
"disseminationConstraint": [
"MD_RestrictionCode"
]
},
see XML example useConstraint -3.xml
CI_Responsibility was formerly CI_ResponsibleParty. The schema described here is the mdJson 2.0 schema and not part of ISO. It makes use of data maintained as individual and organization contacts saved in the mdJson contacts array.
The new ISO -3 schema centers on “role” rather than the contact.
Changes to the mdJson:
{
"role": "CI_RoleCode",
"roleExtent": [
{
"see": "extent 2.0"
}
],
"party": [
{
"contactId": "",
"organizationMembers": [
"individual contact ID"
]
}
]
}
no ISO example
Add resourceType to:
[{
"resourceType": "",
"citation": {}
}]
I have gone about as far as I can with this class diagram for now. I decided to focus on the JSON input, the part that users of the translator will interact with, rather than the internal JSON schemas. I did however, cross reference with the schemas to gather data type, multiplicity and optionality. I'm thinking this might be useful as part of our documentation set for users of the translator. The internal schemas are different enough that I decided not to pursue that as a goal.
This diagram is based on the data template. I did not address extended classes for project metadata since decisions on the JSON input are still in flex. These can be added once decisions solidify.
Things got a little messy with the geospatial part and may be worth a conversation.
Any thoughts on implementing basic gridded/raster (MD_ContentInformation) support in mdJSON & 19115-2 writer? Translated output would be something like this:
<gmd:contentInfo>
<gmi:MI_CoverageDescription>
<gmd:attributeDescription>
<gco:RecordType>Grid Cell</gco:RecordType>
</gmd:attributeDescription>
<gmd:contentType>
<gmd:MD_CoverageContentTypeCode codeList="http://www.ngdc.noaa.gov/metadata/published/xsd/schema/resources/Codelist/gmxCodelists.xml#MD_CoverageContentTypeCode" codeListValue="physicalMeasurement">physicalMeasurement</gmd:MD_CoverageContentTypeCode>
</gmd:contentType>
<gmd:dimension>
<gmd:MD_Band>
<gmd:descriptor>
<gco:CharacterString>DATATYPE IN CELL</gco:CharacterString>
</gmd:descriptor>
<gmd:maxValue>
<gco:Real>99</gco:Real>
</gmd:maxValue>
<gmd:minValue>
<gco:Real>99</gco:Real>
</gmd:minValue>
<gmd:units>
<gml:BaseUnit gml:id="gridCellID">
<gml:identifier codeSpace="local">UNITS</gml:identifier>
<gml:unitsSystem nilReason="missing"/>
</gml:BaseUnit>
</gmd:units>
</gmd:MD_Band>
</gmd:dimension>
<!-- optional -->
<gmi:rangeElementDescription>
<gmi:MI_RangeElementDescription>
<gmi:name>
<gco:CharacterString>Empty Grid Cell</gco:CharacterString>
</gmi:name>
<gmi:definition>
<gco:CharacterString>
Representation of grid cell with no measurement value.
</gco:CharacterString>
</gmi:definition>
<gmi:rangeElement>
<gco:Record>EMPTY CELL VALUE</gco:Record>
</gmi:rangeElement>
</gmi:MI_RangeElementDescription>
</gmi:rangeElementDescription>
</gmi:MI_CoverageDescription>
</gmd:contentInfo>
I've pushed the templates along with the schema and examples files to this repo. See the commit history.
As you can see, I made some changes to the project template. Some of them were from the last discussion we had. I also renamed the non-valid JSON files; they should validate as JavaScript.
However, that leads to a question about the handling of the geometry objects in the templates. Repeating the geometry property in the geographicElement object is not valid and therefore I cannot validate it with the JSON schema. I think it would be better to just show valid feature objects for each type of feature so that we can validate these templates.
Right now, for testing purposes, "additionalProperties" = false for most schemas. This will need to be set to true for the 1.0 release to allow users the option of extending mdJSON locally. Any additionalProperties should be ignored by the validator.
Create mdjson-schemas bower package to make it easier to use schemas in front-end applications.
I'm working through the ISO 19115-2 writer and don't remember what we decided to do with gmd:dataSetURI. Its ISO definition is "Uniformed Resource Identifier (URI) of the dataset to which the metadata applies", but it sits in metadata identification ahead of the resource metadata, causing some misunderstanding.
(in -1 dataSetURI is replaced by a citation block for the metadata record - seemingly more misunderstanding)
To carry our old 'editLink' forward we put a 'metadataUri' attribute in the JSON metadataInfo section, but have no place for it in -2.
So, did we decide to:
"graphicOverview" has added one new element in ISO -1.
Changes:
{
"fileName": "",
"fileDescription": "",
"fileType": "",
"fileConstraint": [
{
"see": "constraint 2.0"
}
],
"fileUri": [
{
"see": "onlineResource 2.0"
}
]
}
XML example: browseGraphic -3.xml
Add an 'alias' element to the entity and attribute sections of mdJson.
Add to entity ...
"entityId": "",
"commonName": "",
"codeName": "",
"alias": [""],
"definition": "",
"primaryKeyAttributeCodeName": [""],
"index": [
{
"codeName": "",
"allowDuplicates": false,
"attributeCodeName": [""]
}
],
"attribute": []
Add to attribute ...
"commonName": "",
"codeName": "",
"alias": [""],
"definition": "",
"dataType": "",
"required": true,
"units": "",
"domainId": "",
"minValue": "",
"maxValue": ""
Is everyone okay with above proposal?
We need to define the minimum required properties for a valid ADIwg mdJSON file.
Note: ADIwg required/recommended/optional fields are a separate issue (although I imagine ADIwg required will equate to the schema minimums).
Currently, the json-schema requires:
Our template, full_example, and mdTranslator have the file URI named "fileUri". The schema example has the URI named "uri".
Template:
"graphicOverview": [
{
"fileName": "",
"fileDescription": "",
"fileType": "",
"fileUri": "http://thisisanexample.com"
}
],
I'm not sure how this is being handled in the schema validator. Also, I changed the schema example in my sub-module but don't want to push it up from there.
Should we update these to be more in line with 19115-1?
Right now those elements are fairly useless for creating metadata relationships. We can move towards the -1 implementation, make them actually useful, and retain backwards compatibility with -2.
On our afternoon call yesterday Josh and I discovered the JSON schema and mdTranslator are out of sync regarding the 'geometry' and 'geometryCollection' options of GeoJSON.
GeoJSON supports 'geometry', 'geometryCollection', 'feature', and 'featureCollection'. The difference between geometry and feature objects is that features can carry supplemental information such as id, name, description, and other things we decided to collect. Geometry blocks carry only the un-described geometry.
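If only feature objects were supported, a bare geometry could still be accepted by wrapping it in a Feature with empty properties. The sketch below illustrates that idea; `as_feature` and `GEOMETRY_TYPES` are hypothetical names, and the type strings are the standard GeoJSON geometry types.

```python
GEOMETRY_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
}


def as_feature(geojson, properties=None):
    """Wrap a bare GeoJSON geometry in a Feature; return Features unchanged."""
    kind = geojson.get("type")
    if kind in ("Feature", "FeatureCollection"):
        return geojson
    if kind in GEOMETRY_TYPES:
        # A Feature carries the geometry plus a properties object for the description.
        return {"type": "Feature", "geometry": geojson, "properties": properties or {}}
    raise ValueError("not a GeoJSON geometry or feature: %r" % (kind,))
```

This keeps a single code path downstream while still allowing un-described geometries as input.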
ISO does support un-described geometries. The question is should we?
Pros:
Changes:
{
"citation": [
{
"title": "",
"alternateTitle": [""],
"date": [
{
"see": "date 2.0"
}
],
"edition": "",
"responsibleParty": [
{
"see": "responsibility 2.0"
}
],
"presentationForm": [""],
"identifier": [
{
"see": "identifier 2.0"
}
],
"series": {
"see": "series 2.0"
},
"otherCitationDetails": [""],
"onlineResource": [
{
"see": "onlineResource 2.0"
}
],
"graphic": [
{
"see": "graphic 2.0"
}
]
}
]
}
see XML example citation -3.xml
Is there a reason we left this out, or was it just an oversight?
Like:
{
"id": "schema.json#",
"$schema": "http://json-schema.org/draft-04/schema#",
"version": "0.5.2",
"description": "schema for ADIwg JSON metadata",
...
}
This is about how we encode identifiers in JSON to support MD_Identifier in CI_Citation and EX_GeographicDescription. We decided to place identifiers in JSON as an attribute of responsible party, which implies that ‘contact’ is equivalent to the ‘citation for the issuing authority’. I have been building skimpy authority citations from only the organizationName of the contact block, since that is all I can use from contact that fits into a citation. This always creates a minimal authority citation with a full citedResponsibleParty, almost the opposite of what we need: a good citation with an optional citedResponsibleParty.
Also, if I ask the 'class_identifier' to build the MD_Identifier from responsibleParty I will always get the skimpy authority citation with a full citedResponsibleParty, even though authority is optional in ISO. CitedResponsibleParty is also optional and I can't shut that one down either. Not going to be funny when we build separate extents for geometries with supplemental information regarding identifiers.
And we also have our previously discussed problem of repeating the long citedResponsibleParty section in metadata record's citation.
I think we need to revisit the idea of having the resourceIdentifier block embedded in responsibleParty. I could even push the idea farther and lobby for an authority array ahead of the metadata block, similar to the contacts array. When we are coding identifiers for multi-point, line, polygon geometries the authorities will become highly reused. We could then just provide the authorityID, roleCode, contactID (optional), identifierName, and identifier with each geometry.
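To make the authority-array idea concrete, here is a rough sketch of how per-geometry identifier entries could be resolved against a shared list of authorities. The layout is hypothetical: the key names `authorityId` and `citation` on the authority entries, and the function name, are assumptions for illustration only.

```python
def resolve_identifiers(entries, authorities):
    """Attach the shared authority citation to each identifier entry (hypothetical layout)."""
    # Index the authority array once so each geometry only needs to carry an ID.
    by_id = {authority["authorityId"]: authority["citation"] for authority in authorities}
    resolved = []
    for entry in entries:
        item = {
            "identifierName": entry["identifierName"],
            "identifier": entry["identifier"],
            "role": entry["role"],
            # A KeyError here flags an entry that references an unknown authority.
            "authority": by_id[entry["authorityId"]],
        }
        if "contactId" in entry:  # contactId is optional in the proposal
            item["contactId"] = entry["contactId"]
        resolved.append(item)
    return resolved
```

Each geometry then repeats only a few short fields, while the full citation is written once in the authority array.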
The new MD_Identifier combines the MD_Identifier and RS_Identifier (i.e. used in imageQuality and other places in -2). I have combined the two so only a single form is supported in mdJson 2.0.
Changes:
{
"identifier": "",
"namespace": "",
"version": "",
"description": "",
"authority": {
"see": "citation 2.0"
}
}
see XML example citation -3.xml
CI_Series was part of ISO 19115-2 but not included in mdJson 1.0. I have added it to mdJson 2.0.
Changes:
"series": {
"seriesName": "",
"seriesIssue": "",
"issuePage": ""
},
see XML example citation -3.xml
Here's a suggested revision to the distribution schema. It brings in additional attributes and supports a more logical structure by linking the distribution options (online/offline) with a format declaration.
The revised structure is supported in both -2 and -1 with the following exception:
The following element is not supported in -2:
In -2 mediumType is selected from a codeList (which ADIwg extended); in -1 it is citation. I did not make it a citation in this revision, the code can be pushed into a minimal citation for -1.
{
"distributionInfo": [
{
"description": "",
"distributor": [
{
"distributorContact": {
"contactId": "",
"role": ""
},
"orderProcess": [
{
"fees": "",
"plannedAvailabilityDateTime": "0000-00-00",
"orderingInstructions": "",
"turnaround": ""
}
],
"transferOptions": [
{
"distributionFormat": [
{
"formatName": "",
"version": "",
"compressionMethod": ""
}
],
"transferSize": 0.0,
"transferSizeUnits": "",
"online": [
{
"uri": "http://thisisanexample.com",
"protocol": "",
"name": "",
"description": "",
"function": ""
}
],
"offline": [
{
"mediumType": "",
"mediumCapacity": 0.0,
"mediumCapacityUnits": "",
"mediumFormat": "",
"mediumNote": ""
}
]
}
]
}
]
}
]
}
Should require one of ["beginPosition", "endPosition"], not both.
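Reading this as "at least one of the two must be present, and both together remain valid", the rule can be written as an `anyOf` over two `required` lists, which is valid JSON Schema draft-04. The sketch below holds that fragment as a Python dict and evaluates it with a tiny hand-written check, so no validator library is assumed; the names `TIME_PERIOD_RULE` and `satisfies_time_period_rule` are hypothetical.

```python
# Schema fragment: at least one of beginPosition / endPosition must be present.
TIME_PERIOD_RULE = {
    "anyOf": [
        {"required": ["beginPosition"]},
        {"required": ["endPosition"]},
    ]
}


def satisfies_time_period_rule(period, rule=TIME_PERIOD_RULE):
    """Evaluate only the anyOf/required part of the fragment against a time period object."""
    return any(
        all(key in period for key in branch["required"])
        for branch in rule["anyOf"]
    )
```

Like `required` in JSON Schema, this checks only that a key is present, not that its value is a usable date.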
Changes:
{
"contact": [
{
"contactId": "",
"isOrganization": false,
"name": "",
"positionName": "",
"memberOfOrganization": [
"organization contact ID"
],
"logoGraphic": [
{
"see": "graphic 2.0"
}
],
"phone": [
{
"see": "phone 2.0"
}
],
"address": [
{
"see": "address 2.0"
}
],
"electronicMailAddress": [""],
"onlineResource": [
{
"see": "onlineResource 2.0"
}
],
"hoursOfService": [""],
"contactInstructions": "",
"contactType": ""
}
]
}
No ISO XML example
Changes:
"onlineResource": [
{
"uri": "http://thisisanexample.com",
"protocol": "",
"name": "",
"description": "",
"function": ""
}
],
see XML example responsibleParty -3.xml
We have a difference between implementation and schema example.
In the template, dataDictionary is an object.
In the reader, it is handled as an object.
In example.json, it is an array.
In ISO 19115-1, it is an array (most likely). MD_Metadata can have many MD_ContentInformation, which can be of subtype MD_FeatureCatalogue, which can have many FC_FeatureCatalogue.
So, do we want multiple data dictionaries? If so, we need to patch a few items.
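If multiple data dictionaries are allowed, a reader could stay tolerant of older files by normalizing the value before use. This is a minimal sketch of that approach, not the mdTranslator reader; `as_dictionary_list` is a hypothetical name.

```python
def as_dictionary_list(metadata):
    """Return dataDictionary as a list, whether the file stores an object or an array."""
    value = metadata.get("dataDictionary")
    if value is None:
        return []
    if isinstance(value, list):
        return value
    # Older template form: a single object, wrapped so callers always iterate.
    return [value]
```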
At this point all metadata records are written in utf8. I hard-coded the writer to set the appropriate ISO attribute with this value. Since it is always 'utf8' and we have no present capability to output the metadata in a different character encoding, do we really need to specify the character set as a variable?
Part 2: Do we need to add the ability to write in multiple character sets?
Part 3: Do we want to stub this feature into version 0.5.0 or wait until a future version?
Need to write definitions (descriptions) for dataDictionary.
For resourceInfo>status, the translation text for ISO should reference MD_ProgressCode, which is a codelist. Unless this is a -1 change.