mdjson-schemas's People

Contributors

chris-macdermaid, dependabot[bot], hmaier-fws, jlblcc, jwaspin, mikegiddens, stansmith907, timothypage

mdjson-schemas's Issues

Extend schema for data dictionary.

Here is a first pass at a JSON structure for data dictionary info. The comment period is open!

    "dataDictionary": {
        "dictionaryInfo": {
            "citation": {
                "title": "required",
                "date": [
                    {
                        "date(required)": "0000-00-00",
                        "dateType": "required"
                    }
                ],
                "edition": "",
                "responsibleParty": [
                    {
                        "contactId": "required",
                        "role": "required"
                    }
                ],
                "onlineResource": [
                    {
                        "uri": "http://thisisanexample.com",
                        "protocol": "",
                        "name": "",
                        "description": "",
                        "function": ""
                    }
                ]
            },
            "resourceType": "required",
            "description": "required",
            "language": ""
        },
        "domain": [
            {
                "domainId": "required",
                "commonName": "",
                "codeName": "required",
                "description": "required",
                "member": [
                    {
                        "name": "required",
                        "value": "required",
                        "definition": "required"
                    }
                ]
            }
        ],
        "entity": [
            {
                "entityId": "",
                "commonName": "",
                "codeName": "required",
                "definition": "required",
                "primaryKeyAttributeCodeName": [""],
                "index": [
                    {
                        "codeName": "required",
                        "allowDuplicates(required)": false,
                        "attributeCodeName": ["required"]
                    }
                ],
                "attribute": [
                    {
                        "commonName": "",
                        "codeName": "required",
                        "definition": "required",
                        "dataType": "required",
                        "allowNull": true,
                        "units": "",
                        "domainId": "",
                        "minValue": "",
                        "maxValue": ""
                    }
                ],
                "foreignKey": [
                    {
                        "localAttributeCodeName": ["required"],
                        "referencedEntityCodeName": "required",
                        "referencedAttributeCodeName": ["required"]
                    }
                ]
            }
        ]
    }
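
To help with the comment period, here is a minimal Python sketch of the cross-references this structure implies: an attribute's "domainId" should match a "domainId" in "domain", and primary key and foreign key code names should match attributes that exist. The function name and the message wording are my own assumptions, not part of the proposal.

```python
def check_dictionary_references(data_dictionary):
    """Return a list of problems found in the dictionary's cross-references."""
    problems = []
    domain_ids = {d.get("domainId") for d in data_dictionary.get("domain", [])}
    entities = data_dictionary.get("entity", [])
    # Map each entity codeName to the set of its attribute codeNames.
    entity_attrs = {
        e.get("codeName"): {a.get("codeName") for a in e.get("attribute", [])}
        for e in entities
    }
    for entity in entities:
        name = entity.get("codeName")
        own = entity_attrs[name]
        for attr in entity.get("attribute", []):
            domain_id = attr.get("domainId")
            # An empty domainId means the attribute has no domain.
            if domain_id and domain_id not in domain_ids:
                problems.append(
                    f"{name}.{attr.get('codeName')}: unknown domainId {domain_id!r}"
                )
        for key_name in entity.get("primaryKeyAttributeCodeName", []):
            if key_name and key_name not in own:
                problems.append(f"{name}: primary key attribute {key_name!r} not defined")
        for fk in entity.get("foreignKey", []):
            for local in fk.get("localAttributeCodeName", []):
                if local not in own:
                    problems.append(f"{name}: foreign key attribute {local!r} not defined")
            target = fk.get("referencedEntityCodeName")
            if target not in entity_attrs:
                problems.append(f"{name}: foreign key references unknown entity {target!r}")
                continue
            for ref in fk.get("referencedAttributeCodeName", []):
                if ref not in entity_attrs[target]:
                    problems.append(f"{name}: {target}.{ref} is not a defined attribute")
    return problems
```

A JSON schema can enforce the required fields, but it cannot easily express these lookups, so a check like this would probably live in the translator rather than the schema.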

Handling of datum and projection in ADIwg mdJSON and ISO

Most reference systems (which I assume define both the datum and projection) will fit into a RS_Identifier block. RS_Identifier is a MD_Identifier with added attributes codeSpace and version. After reading and re-reading the definitions I can see no reason why we need these additional attributes. I think they can be handled in the authority block (a citation) of RS_Identifier.

codeSpace => name or identifier of the person or organization responsible for namespace; looks like authority.title
version => version identifier for the namespace; looks like authority.edition

"spatialReferenceSystem": {
    "name": [" "],
    "epsgNumber": [0],
    "wkt": [" "],
    "customReferenceSystem": {
        "name": " ",
        "url": " "
    },
    "customReferenceSystem": {
        "all-the-necessary-parameters": " "
    }
}
  • name => a list of recognized reference system names that will be placed in the RS_Identifier>code
  • epsgNumber => the epsg number only. The number will be formatted in the RS_Identifier>code as urn:ogc:def:crs:EPSG::number
  • wkt => the wkt string will be prefaced in the RS_Identifier>code with WKT:string
  • customReferenceSystem => not supported
  • customReferenceSystem => provide link to external ISO 19111 block through NOAA docucomp or other service. Information will be placed in tag attributes of referenceSystemInfo as we are currently doing.
  • customReferenceSystem => collect all reference system parameters but not present in ISO. Could be used in FGDC construction.

Vote the options you think we should support, or add others.
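
For reference while voting, the first three options could be turned into RS_Identifier>code strings roughly as follows. This is only a sketch in Python: the function name is hypothetical, and the two custom options are left out because their handling is still undecided.

```python
def rs_identifier_codes(spatial_reference_system):
    """Build the RS_Identifier code strings for name, epsgNumber, and wkt."""
    codes = []
    # Recognized names are passed through unchanged.
    for name in spatial_reference_system.get("name", []):
        codes.append(name)
    # EPSG numbers are wrapped in the OGC URN form described above.
    for number in spatial_reference_system.get("epsgNumber", []):
        codes.append(f"urn:ogc:def:crs:EPSG::{number}")
    # WKT strings are prefaced with "WKT:".
    for wkt in spatial_reference_system.get("wkt", []):
        codes.append(f"WKT:{wkt}")
    return codes
```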

Strict testing

We can improve coverage by testing with the strict option. This will prevent errors when additionalProperties is set to true in the schemas. The first item in the example array should pass strict validation.
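
If the validator in use has no strict option of its own, one way to approximate it is to tighten a copy of each schema before testing. The Python sketch below is an assumption about how that could be done, not a description of the current test setup; the helper names are hypothetical.

```python
import copy


def make_strict(schema):
    """Return a copy of the schema with additionalProperties forced to false."""
    strict = copy.deepcopy(schema)  # leave the published schema untouched
    _tighten(strict)
    return strict


def _tighten(node):
    # Walk the schema and close every object definition that lists properties.
    # Simplification: a property literally named "properties" would confuse this walk.
    if isinstance(node, dict):
        if "properties" in node:
            node["additionalProperties"] = False
        for value in node.values():
            _tighten(value)
    elif isinstance(node, list):
        for item in node:
            _tighten(item)
```

Validating the example files against the tightened copy would catch misspelled property names that the released, open schemas would silently accept.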

AdditionalDocumentation

I found this section difficult to use. As a user you need to fully understand citation in order to utilize this section as one might intend. Citation forces an unnatural order of specifying a citation, who is responsible, and oh by the way I have a web site. What you really want as a user is to specify the document, what it is, describe it, supply the web link, then add a citation for the organization responsible (and their website). I’m not even quite clear how to hook this information in. I think we need to reorganize this section to a business orientation not an ISO orientation:
AdditionalDocumentation
  Name:
  Type:
  Description:
  Website:
  Document date:
  Provider
    Name
    Role

authority 2.0

"authority" is a short version of CI_Citation. I use it in many places where a full citation is not needed. There are no changes from mdJson 1.0 other than those made to responsibleParty.

Discussion:

  1. Should we keep "authority" or just open every opportunity to the full CI_Citation?

Changes:

  1. "responsibleParty" has changed to support the division of individual and organization contacts. See "responsibility 2.0" (issue #52).
{
   "title": "",
   "date": [
      {
         "date": "0000-00-00",
         "dateType": ""
      }
   ],
   "responsibleParty": [
      {
         "see": "responsibility 2.0"
      }
   ],
   "onlineResource": [
      {
         "see": "onlineResource"
      }
   ]
}

see ISO XML example authority -3.xml

Handling of Links (uri, url, urn)

The following changes were made to the JSON Template and full example to be specific about which link type is allowed by the JSON validator. Citations embedded in keyword and taxonomy were also restructured to be consistent with how citation is employed elsewhere in the ADIwg JSON. Details ...

  • onlineResource>url -> uri (ISO requires URL but the XSD accepts any string)
  • metadataInfo>metadataURL -> metadataUri
  • graphicOverview>fileLink -> fileUri
  • verticalElement>verticalCRSLink -> verticalCRSUri
  • keyword>thesaurus>thesaurusLink (removed becomes onlineResource of citation)
  • keyword>thesaurus>citation{} (removed level, all attributes belonging to citation were moved up directly under thesaurus - which is a citation)
  • keyword>thesaurus>onlineResource (added onlineResource to thesaurus - a citation - to be consistent with -1 and citation elsewhere in JSON)
  • taxonomy>classificationSystem>onlineResource (added onlineResource to classificationSystem - a citation - to be consistent with -1 and how citation is elsewhere in JSON)
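
For anyone with older JSON files, the four plain renames above could be applied with a small script. This Python sketch is an assumption of mine rather than an existing tool. It renames the keys wherever they occur, so "url" is changed everywhere and not only under onlineResource, and it does not attempt the thesaurus restructuring.

```python
# Old key -> new key, taken from the rename list above.
LINK_RENAMES = {
    "url": "uri",
    "metadataURL": "metadataUri",
    "fileLink": "fileUri",
    "verticalCRSLink": "verticalCRSUri",
}


def rename_link_keys(node):
    """Return a copy of a parsed JSON value with the old link keys renamed."""
    if isinstance(node, dict):
        return {
            LINK_RENAMES.get(key, key): rename_link_keys(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rename_link_keys(item) for item in node]
    return node  # strings, numbers, booleans, and null are returned as-is
```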

resourceIdentifier cardinality problem

We have the cardinality reversed for resource identifier. In ISO an identifier can have multiple authorities. Schema 0.4.0 allows multiple identifiers per authority.

Now:

   {
      "contactId": "",
      "role": "",
      "resourceIdentifier": [
         {
            "identifierName": "",
            "identifier": ""
         }
      ]
   }

Should be:

   {
      "contactId": "",
      "role": "",
      "resourceIdentifier": {
         "identifierName": "",
         "identifier": ""
      }
   }

Add resourceTimePeriod

                      "resourceTimePeriod": {
                          "description":"Period of resource",
                          "beginPosition":"1977-12-14",
                          "endPosition":"2007-08-13"
                        }

Handling of DOIs

Places where doi occurs:

  • onlineResource{}
  • additionalIdentifier{}
  • and may be placed in "identifier" of resourceIdentifier{}
onlineResource{} occurs in
  • contact[]
  • resourceInfo>citation (but not displayed in 19115-2)
  • resourceInfo>resourceIdentifier (MD_Identifier under citation in ISO)
  • resourceInfo>extent>geographicElement>properties>assignedID[]
  • distribution>online
  • associatedResource>resourceCitation
  • associatedResource>resourceIdentifier (MD_Identifier under citation in ISO)
  • associatedResource>metadataCitation
  • additionalDocumentation
additionalIdentifier{} occurs in
  • resourceInfo>citation
  • resourceInfo>keywords (as citation)
  • resourceInfo>taxonomy (as citation)
  • resourceInfo>dataQualityInfo>lineage>source>citation
  • associatedResource>resourceCitation
  • associatedResource>metadataCitation
  • additionalDocumentation (as citation)
identifier of MD_Identifier occurs in
  • resourceInfo>resourceIdentifier
  • resourceInfo>extent>geographicElement>properties>assignedId[]
  • associatedResource>resourceIdentifier

Rules:

  • onlineResource in 19115-1 and -2 is a URL
  • onlineResource describes 'other' resources
  • MD_Identifier specifies identifiers by which the resource is known
  • additionalIdentifiers are URNs (e.g., urn:isbn:123654 or urn:doi:10/1256.321475266)

Recommendations:

for onlineResource ...

[ ] change uri to url to match ISO requirement
[ ] drop doi - people can make the url a resolvable doi if that is the preferred method of accessing the resource

for additionalIdentifier ...

[ ] no change - specify doi as URN; ISO writer will publish doi as MD_Identifier

for resourceIdentifier ...

[ ] no change
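
To make the two recommended forms concrete, here is a small Python sketch that normalizes a DOI and writes it either as a URN (for additionalIdentifier) or as a resolvable URL (for onlineResource). The function names and the list of accepted prefixes are assumptions, and https://doi.org/ is used as the resolver.

```python
# Prefixes people commonly put in front of a DOI (an assumed, incomplete list).
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "urn:doi:",
    "doi:",
)


def bare_doi(value):
    """Strip a known prefix and return just the DOI name."""
    text = value.strip()
    lowered = text.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            return text[len(prefix):]
    return text


def doi_to_urn(value):
    """Form suitable for additionalIdentifier."""
    return "urn:doi:" + bare_doi(value)


def doi_to_url(value):
    """Resolvable form suitable for onlineResource."""
    return "https://doi.org/" + bare_doi(value)
```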

Use contactRef for feature assignedId

     "assignedId": [
          {
            "contactId": "1",
            "role": "primaryLCC",
            "resourceIdentifier": [
              {
                "identifierName": "projectId",
                "identifier": "ALCC2010-05"
              }
            ]
          }
     ]

constraint 2.0

“constraint” is now an array and has undergone major changes from mdJson 1.0. The new elements added repeat for legal (issue #56) and security (issue #57) constraints and are not redefined in those issues.

Changes:

  1. "type" is an ADIwg enumeration of type string. Valid values are [ use | legal | security ]. If type = "legal" the "legalConstraint" object must be present and filled in. If type = "security" the "securityConstraint" object must be present and filled in.
  2. "scope" is a new object ("constraintApplicationScope" in ISO) of type MD_Scope. Definition: "spatial and/or temporal extent and or level of the application of the constraint restrictions." I have limited the extent to either description or temporal extents; and only time period within temporal extents. See issue #100.
  3. “graphic” is a new array of type MD_BrowseGraphic. Definition “graphic/symbol indicating the constraint”. This will use the mdJson browseGraphic 2.0 (issue #45).
  4. “reference” is a new array of type CI_Citation. Definition: “citation for the limitation or constraint. Example Copyright statement.” I used the abbreviated form of citation we created for authority (issue #54).
  5. “releasability” is a new object of type MD_Releasability (issue #51). Definition: “information concerning the parties to whom the resource can or cannot be released.” The following elements are supported...
    • "addressee" is a new array of CI_Responsibility (issue #52). Definition: "party to which the release statement applies".
    • "statement" is a new scalar of type character. Definition: "release statement".
    • "disseminationConstraint" is a new array of type MD_RestrictionCode. Definition: "component in determining releasability."
  6. “responsibleParty” is a new array of type CI_Responsibility (issue #52). Definition: “party(s) responsible for the resource constraint”.
  7. "legal" is an object of type "legalConstraint". Required if "type" = "legal".
  8. "security" is an object of type "securityConstraint". Required if "type" = "security".
{
   "constraint": [
      {
         "type": "use",
         "useLimitation": [""],
         "scope":{
            "see": "scope 2.0"
         },
         "graphic": [
            {
               "see": "graphicOverview 2.0"
            }
         ],
         "reference": [
            {
               "see": "citation 2.0"
            }
         ],
         "releasability": {
            "addressee": [
               {
                  "see": "responsibility 2.0"
               }
            ],
            "statement": "",
            "disseminationConstraint": ["MD_RestrictionCode"]
         },
         "responsibleParty": [
            {
               "see": "responsibility 2.0"
            }
         ],
         "legal": {
            "see": "legalConstraint 2.0"
         },
         "security": {
            "see": "securityConstraint 2.0"
         }
      }
   ]
}

see XML example useConstraint -3.xml
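
The conditional rules in changes 1, 7, and 8 could be checked along these lines. This is a Python sketch only: the function name is hypothetical, and it assumes the object keys are "legal" and "security" as in the JSON above.

```python
VALID_TYPES = {"use", "legal", "security"}


def check_constraint(constraint):
    """Return a list of problems for one item of the "constraint" array."""
    problems = []
    ctype = constraint.get("type")
    if ctype not in VALID_TYPES:
        problems.append(f"type must be one of {sorted(VALID_TYPES)}, got {ctype!r}")
    # "legal" and "security" each require their matching object to be filled in.
    for key in ("legal", "security"):
        if ctype == key and not constraint.get(key):
            problems.append(f'type "{key}" requires a filled-in "{key}" object')
    return problems
```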

Handling 'Additional Documentation' as 'Aggregate Information'

In our last Tuesday meeting, Chris suggested we carry 'additionalDocumentation' as 'gmd:MD_AggregateInformation'. It sounded like a great idea, but it has a few hiccups.

  • Besides citation, aggregateInformation requires 2 additional parameters not supplied in our JSON; 'association type' and 'initiative type'.
  • Association type is required, and there is currently no suitable value in this codelist that we could use; it would need to be extended. 'Initiative type' also has no suitable values but is not required, so that is not a problem.

The best way to do this might be to ...

  • extend the association type codelist to support a new code (inserting 'other' or 'supporting') rather than modify the adiwgJson - which would not be clean for 19115-1;
  • have the 19115-2 additionalDocumentation writer automatically insert the missing association type.

Options:
[ ] add association type code
[ ] don't bother with additional documentation in 19115-2

Add PT_Locale to meet requirements of Traditional Knowledge (TK)

I just received an advance copy of a study conducted at U. Idaho regarding handling of TK research projects. Among other things the study lays out the requirements for metadata using the ISO standard. An important issue for them is locale (PT_Locale). This goes in the MD_Metadata or MI_Metadata section. Unfortunately we missed it, probably because it is also missing from the ISO documentation, but it is present in the XSD.

   <gmd:locale>
       <gmd:PT_Locale>
            <gmd:languageCode>
                <gmd:LanguageCode codeList="" codeListValue=""></gmd:LanguageCode>
            </gmd:languageCode>
            <gmd:country>
                <gmd:Country codeList="" codeListValue=""></gmd:Country>
            </gmd:country>
            <gmd:characterEncoding>
                <gmd:MD_CharacterSetCode codeList="" codeListValue=""></gmd:MD_CharacterSetCode>
            </gmd:characterEncoding>
        </gmd:PT_Locale>
   </gmd:locale>
   <gmd:locale></gmd:locale>

Locale is an array. Unfortunately it overlaps language and characterSet. In -1 language and characterSet were completely replaced by defaultLocale (PT_Locale [0..1]) and otherLocale (PT_Locale [0..*]) - same in MD_DataIdentification for -1. Making this change will break the backwards compatibility of mdJson v1.

Should we break it now rather than waiting to support -1? One other difference: gmd:language is a gco:CharacterString, while gmd:languageCode is a codeList. Both the LanguageCode and CountryCode may be a pain to add to mdCodes. I'll look into it.

Definitions that need work

Here are some suggested new definitions:

  • additionalDocumentation: "other documents related to, but not defining, the resource such as factsheets, data catalog pages, award documents, proposals, and informational websites"
  • associatedResource: "other resources which are directly related to the central resource such as parent or child datasets or projects."
  • associationType: "identifies how the associated resource is related to the central resource such as 'is a component of', 'larger work citation', 'sub-project', etc."
  • initiativeType: "Identifies type of initiative under which the resource was produced - the activity that resulted in the resource."
  • graphicOverview: "provides a link to images, maps, flow charts, data models, etc. that visually help to understand the resource."
  • graphicOverview.fileName: "name of the file that contains a graphicOverview for the resource." Note: the current definition references browseGraphic which is the ISO name rather than the mdJson name graphicOverview.
  • graphicOverview.fileDescription: Note: change browseGraphic to graphicOverview
  • graphicOverview.fileType: "the format in which the illustration is encoded such as GIF, JPEG, PBM, PS, TIFF."
  • graphicOverview.fileURI: Note: change browseGraphic to graphicOverview
  • dataQuality: "describes the data quality, lineage, and/or processing steps that were applied to the whole or part of the data resource"
  • dataQuality.lineage: "procedural (non-quantitative) data quality information about the portion of the data resource identified in the dataScope"
  • dataQuality.lineage.statement: "a general statement of the actions taken to verify, transform, repair, and integrate the data described in the dataScope"
  • dataQuality.lineage.processStep: "a brief statement describing an individual, non-trivial process or methodology step taken in development of the resource data described in the dataScope"
  • dataQuality.lineage.source: "provides information about the source data used in creating the data specified in the dataScope"
  • dataQuality.lineage.source.description: "a brief description about the source dataset used in creating the data specified in the scope"
  • dataQuality.lineage.source.citation: "a citation providing information about the source dataset including a link or other access instructions"
  • dataQuality.lineage.source.processStep: "a description of a non-trivial event or transformation taken to prepare the source data for use as the data specified in the scope"
  • resourceSpecificUsage.specificUsage: "a brief description about how the resource is being used"
  • resourceSpecificUsage.userDeterminedLimitations: "a brief description of applications determined by the user for which the resource is not suitable"
  • resourceSpecificUsage.userContactInfo: "identification of persons and/or organizations and their roles that are using this resource"
  • distributorInfo.distributionTransferOptions: "technical means and media by which a resource is obtained from the distributor"
  • extent.geographicElement: "an array of objects each describing a geographic boundary or location comprising all or portion of the resource"
  • extent.temporalElement: "an array of objects each describing a temporal boundary or location comprising all or portion of the resource"
  • extent.verticalElement: "an array of objects each describing a vertical boundary or location comprising all or portion of the resource"
  • extent.geographicElement.featureScope: "a brief statement describing the portion of the extent covered by this geographic element"
  • taxonomy.voucher.specimen: "a word or phrase describing the type of specimen collected. e.g. 'herbarium specimens', 'blood samples', 'photographs', 'individuals'".
  • extent.temporalElement: "Temporal context for the extent".
  • temporalElement.timeInstant.id: "a user provided unique identifier for the timeInstant".
  • temporalElement.timeInstant: "a dateTime associated with an ID and description".
  • temporalElement.timeInstant.description: "a brief description providing relevant information about the date and time".
  • temporalElement.timePeriod: "object identifies a period of time relevant to the extent".
  • temporalElement.timePeriod.id: "a user provided unique identifier for the timePeriod".
  • temporalElement.timePeriod.description: "a brief description providing relevant information about the datetime period".

releasability 2.0

Changes:

  1. MD_Releasability is a new class. Definition: “information about resource release constraints.”
  2. “addressee” is an array of type CI_Responsibility. Definition: “party to which the release statement applies.” Issue #52.
  3. “statement” is a scalar text field. Definition: “release statement.”
  4. “disseminationConstraint” is an array of MD_RestrictionCode. Definition: “component in determining releasability”.
  5. The "releasability" record must have at least one "addressee" or "statement".
      "releasability": {
         "addressee": [
            "see responsibility 2.0"
         ],
         "statement": "",
         "disseminationConstraint": [
            "MD_RestrictionCode"
         ]
      },

see XML example useConstraint -3.xml
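
Rule 5 is easy to miss in a plain "required" list because it accepts either of two elements. A minimal Python sketch of the check, with a hypothetical function name, and treating a blank statement as absent (my assumption):

```python
def releasability_is_valid(releasability):
    """True when the record has at least one addressee or a non-blank statement."""
    has_addressee = bool(releasability.get("addressee"))
    has_statement = bool((releasability.get("statement") or "").strip())
    return has_addressee or has_statement
```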

responsibility 2.0

CI_Responsibility was formerly CI_ResponsibleParty. The schema described here is the mdJson 2.0 schema and not part of ISO. It makes use of data maintained as individual and organization contacts saved in the mdJson contacts array.

The new ISO -3 schema centers on “role” rather than the contact.

Changes to the mdJson:

  1. "roleExtent" is a new array of EX_Extent objects added to define the extent for the role. ISO allows for full geographic, vertical, and temporal extents, but I have limited mdJson to the elements "description" and "temporalExtent [ ]" ("timeInstant", "timePeriod"). However, "geographicExtent [ ]" and "verticalExtent [ ]" could be included with no changes to the software; it is ready to receive these.
  2. “party” is a new array added to identify all the individual and/or organization contacts associated with the role.
    • “partyType” is a new enumerated type added (individual or organization) to identify the type of party being described.
    • “organizationMembers” is a new array of individual contactId(s) identifying individuals that have an association with an organization contact. Valid only for party with partyType = “organization”.
{
   "role": "CI_RoleCode",
   "roleExtent": [
      {
         "see": "extent 2.0"
      }
   ],
   "party": [
      {
         "contactId": "",
         "organizationMembers": [
            "individual contact ID"
         ]
      }
   ]
}

no ISO example

Add resourceType, initiativeType

  • Remove schema{ } > metadata{ } > metadataInfo{ } > metadataScope[ ]
  • Add schema{ } > metadata{ } > associatedResource[ ] > object{ } > initiativeType[ ]

Add resourceType to:

  • schema{ } > metadata{ } > associatedResource[ ]
  • schema{ } > metadata{ } > additionalDocumentation[ ]
  • schema{ } > metadata{ } > resourceInfo{ }

  • Restructure schema{ } > metadata{ } > associatedResource[ ] like:
[{
  "resourceType": "",
  "citation": {}
}]

JSON Input Class Diagram

I have gone about as far as I can with this class diagram for now. I decided to focus on the JSON input, the part that users of the translator will interact with, rather than the internal JSON schemas. I did however, cross reference with the schemas to gather data type, multiplicity and optionality. I'm thinking this might be useful as part of our documentation set for users of the translator. The internal schemas are different enough that I decided not to pursue that as a goal.

This diagram is based on the data template. I did not address extended classes for project metadata since decisions on the JSON input are still in flux. These can be added once decisions solidify.

Things got a little messy with the geospatial part and may be worth a conversation.

[Figure: ADIwg JSON class diagram]

Basic support for gridded data

Any thoughts on implementing basic gridded/raster (MD_ContentInformation) support in mdJSON & 19115-2 writer? Translated output would be something like this:

<gmd:contentInfo>
    <gmi:MI_CoverageDescription>
        <gmd:attributeDescription>
            <gco:RecordType>Grid Cell</gco:RecordType>
        </gmd:attributeDescription>
        <gmd:contentType>
            <gmd:MD_CoverageContentTypeCode codeList="http://www.ngdc.noaa.gov/metadata/published/xsd/schema/resources/Codelist/gmxCodelists.xml#MD_CoverageContentTypeCode" codeListValue="physicalMeasurement">physicalMeasurement</gmd:MD_CoverageContentTypeCode>
        </gmd:contentType>
        <gmd:dimension>
            <gmd:MD_Band>
                <gmd:descriptor>
                    <gco:CharacterString>DATATYPE IN CELL</gco:CharacterString>
                </gmd:descriptor>
                <gmd:maxValue>
                    <gco:Real>99</gco:Real>
                </gmd:maxValue>
                <gmd:minValue>
                    <gco:Real>99</gco:Real>
                </gmd:minValue>
                <gmd:units>
                    <gml:BaseUnit gml:id="gridCellID">
                        <gml:identifier codeSpace="local">UNITS</gml:identifier>
                        <gml:unitsSystem nilReason="missing"/>
                    </gml:BaseUnit>
                </gmd:units>
            </gmd:MD_Band>
        </gmd:dimension>
        <!--  optional  -->
        <gmi:rangeElementDescription>
            <gmi:MI_RangeElementDescription>
                <gmi:name>
                    <gco:CharacterString>Empty Grid Cell</gco:CharacterString>
                </gmi:name>
                <gmi:definition>
                    <gco:CharacterString>
Representation of grid cell with no measurement value.
</gco:CharacterString>
                </gmi:definition>
                <gmi:rangeElement>
                    <gco:Record>EMPTY CELL VALUE</gco:Record>
                </gmi:rangeElement>
            </gmi:MI_RangeElementDescription>
        </gmi:rangeElementDescription>
    </gmi:MI_CoverageDescription>
</gmd:contentInfo>

Changes to the Metadata Templates

@dwalt, @stansmith907

I've pushed the templates along with the schema and examples files to this repo. See the commit history.

As you can see, I made some changes to the project template. Some of them came from the last discussion we had. I also renamed the non-valid JSON files; they should validate as JavaScript.

However, that leads to a question about the handling of the geometry objects in the templates. Repeating the geometry property in the geographicElement object is not valid and therefore I cannot validate it with the JSON schema. I think it would be better to just show valid feature objects for each type of feature so that we can validate these templates.

Set "additionalProperties" to true.

Right now, for testing purposes, "additionalProperties" = false for most schemas. This will need to be set to true for the 1.0 release to allow users the option of extending mdJSON locally. Any additionalProperties should be ignored by the validator.

What happened to dataSetURI?

I'm working through the ISO 19115-2 writer and don't remember what we decided to do with gmd:dataSetURI. Its ISO definition is "Uniformed Resource Identifier (URI) of the dataset to which the metadata applies", but it sits in metadata identification ahead of the resource metadata, causing some misunderstanding.

(in -1 dataSetURI is replaced by a citation block for the metadata record - seemingly more misunderstanding)

To carry our old 'editLink' forward we put a 'metadataUri' attribute in the JSON metadataInfo section, but have no place for it in -2.

So, did we decide to:

  • Use the first onlineResource in the resource citation as the dataSetURI?
  • Or should we add an attribute to the JSON resourceInfo section?
  • Or assume the ISO definition is wrong and slip in our metadataUri?
  • Or drop support of the field?

graphicOverview 2.0

"graphicOverview" has added one new element in ISO -1.

Changes:

  1. “fileConstraint” object was added to graphicOverview to identify legal, security, and other constraint requirements associated with use and distribution of the graphic.
  2. “fileURI” for graphicOverview is now a full onlineResource (issue #50) rather than a simple URI string.
{
   "fileName": "",
   "fileDescription": "",
   "fileType": "",
   "fileConstraint": [
      {
         "see": "constraint 2.0"
      }
   ],
   "fileUri": [
      {
         "see": "onlineResource 2.0"
      }
   ]
}

XML example: browseGraphic -3.xml

Add alias to data dictionary entity section

Add an 'alias' element to the entity and attribute sections of mdJson.

Add to entity ...

        "entityId": "",
        "commonName": "",
        "codeName": "",
        "alias": [""],
        "definition": "",
        "primaryKeyAttributeCodeName": [""],
        "index": [
          {
            "codeName": "",
            "allowDuplicates": false,
            "attributeCodeName": [""]
          }
        ],
        "attribute": []

Add to attribute ...

            "commonName": "",
            "codeName": "",
            "alias": [""],
            "definition": "",
            "dataType": "",
            "required": true,
            "units": "",
            "domainId": "",
            "minValue": "",
            "maxValue": ""

Is everyone okay with above proposal?

Define minimum required properties

We need to define the minimum required properties for a valid ADIwg mdJSON file.

Note: ADIwg required/recommended/optional fields are a separate issue (although I imagine ADIwg required will equate to the schema minimums).

Currently, the json-schema requires:

  • schema{ } > version{ }
  • schema{ } > contact[1]
  • schema{ } > metadata{ } > resourceInfo{ }
    • schema{ } > metadata{ } > resourceInfo{ } > citation{ }
    • schema{ } > metadata{ } > resourceInfo{ } > abstract
    • schema{ } > metadata{ } > resourceInfo{ } > status
    • schema{ } > metadata{ } > resourceInfo{ } > language[ ]
    • schema{ } > metadata{ } > resourceInfo{ } > pointOfContact[1]
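
As a starting point for the discussion, the current minimums can be written as a list of paths and checked without a schema validator. The Python sketch below makes a few assumptions: the function name is hypothetical, the parsed file is taken to be the schema{ } object, and an empty string, array, or object is counted as missing.

```python
# Paths below the schema{ } root that the json-schema currently requires.
REQUIRED_PATHS = [
    ("version",),
    ("contact",),
    ("metadata", "resourceInfo", "citation"),
    ("metadata", "resourceInfo", "abstract"),
    ("metadata", "resourceInfo", "status"),
    ("metadata", "resourceInfo", "language"),
    ("metadata", "resourceInfo", "pointOfContact"),
]


def missing_required(doc):
    """Return the required paths that are absent or empty in a parsed mdJSON file."""
    missing = []
    for path in REQUIRED_PATHS:
        node = doc
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        # Treat empty strings, arrays, and objects the same as absent values.
        if node is None or node == "" or node == [] or node == {}:
            missing.append(" > ".join(path))
    return missing
```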

Graphic Overview - browseGraphic uri name

Our template, full_example, and mdTranslator have the file URI named "fileUri". The schema example has the URI named "uri".

Template:

      "graphicOverview": [
        {
          "fileName": "",
          "fileDescription": "",
          "fileType": "",
          "fileUri": "http://thisisanexample.com"
        }
      ],

I'm not sure how this is being handled in the schema validator. Also, I changed the schema example in my sub-module but don't want to push it up from there.

Upgrading metadataIdentifier and parentMetadataIdentifier

Should we update these to be more in line with 19115-1?

  • metadataIdentifier is a MD_Identifier
  • parentMetadataIdentifier is a Citation

Right now those elements are fairly useless for creating metadata relationships. We can move towards the -1 implementation, make them actually useful, and retain backwards compatibility with -2.

GeoJSON unidentified geometry objects

On our afternoon call yesterday Josh and I discovered the JSON schema and mdTranslator are out of sync regarding the 'geometry' and 'geometryCollection' options of GeoJSON.

GeoJSON supports 'geometry', 'geometryCollection', 'feature', and 'featureCollection'. The difference between geometry and feature objects is that features allow support of supplemental information such as id, name, description, and other things we decided to collect. Geometry blocks carry only the un-described geometry.

ISO does support un-described geometries. The question is, should we?

Pros:

  • Descriptive fields are not required by ISO
  • Descriptive fields didn't really exist in FGDC
  • Third party generated GeoJSON could be dropped into the ADIwg JSON
  • We would support the full GeoJSON standard

Cons:

  • Adds two more complex options for users (and/or online editing tools) that capture the same geometry information as feature.
  • Users may gravitate to the word 'geometry' more than 'feature', causing loss of metadata that would otherwise be obtainable.
  • Users are not forced to provide supplemental information for features; it can be omitted if unknown.
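
One middle path, if we keep 'feature' as the only stored form, is to wrap third-party geometries on the way in. The Python sketch below does that; `as_feature` is a hypothetical helper and says nothing about how mdTranslator currently reads GeoJSON.

```python
# Geometry object types defined by GeoJSON, including GeometryCollection.
GEOMETRY_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
}


def as_feature(geojson):
    """Wrap a bare geometry in a Feature with empty properties; pass anything else through."""
    if geojson.get("type") in GEOMETRY_TYPES:
        return {"type": "Feature", "geometry": geojson, "properties": {}}
    return geojson


point = {"type": "Point", "coordinates": [-150.0, 61.2]}
feature = as_feature(point)
```

The wrapped feature has no id, name, or description, so the metadata loss listed under the cons still applies; the wrapper only removes the need for two extra schema branches.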

citation 2.0

Changes:

  1. “date” changed from type date to datetime. See issue #116.
  2. “editionDate” is/was in citation but not supported in mdJson 1.0. I did not see a need to include it in mdJson 2.0 since we have the above date array.
  3. “identifier” combined the RS_Identifier and MD_Identifier forms. For details see the identifier 2.0 (issue #49).
  4. “series” is an object of type CI_Series which is/was in citation but not supported in mdJson 1.0. Definition: “Information about the series, or aggregate resource, to which a resource belongs”. I added the attribute to mdJson 2.0 and described the block in the series 2.0 (issue #53).
  5. “otherCitationDetails” is an array of type string. It was a scalar in -2 CI_Citation but was not supported in mdJson 1.0. Definition: “other information required to complete the citation that is not recorded elsewhere.” I added the attribute to mdJson 2.0.
  6. "ISBN" is a scalar text field. We did not include this in mdJson 1.0 or 2.0. It can be handled in identifier if necessary.
  7. "ISSN" is a scalar text field. We did not include this in mdJson 1.0 or 2.0. It can be handled in identifier if necessary.
  8. “graphic” is an array of type MD_BrowseGraphic and was added to citation. It is described in "browseGraphic 2.0" (issue #45).
  9. "alternateTitle" is an array of type string. This was a scalar in mdJson 1.0 even though "alternateTitle" was an array in -2.
{
   "citation": [
      {
         "title": "",
         "alternateTitle": [""],
         "date": [
            {
               "see": "date 2.0"
            }
         ],
         "edition": "",
         "responsibleParty": [
            {
               "see": "responsibility 2.0"
            }
         ],
         "presentationForm": [""],
         "identifier": [
            {
               "see": "identifier 2.0"
            }
         ],
         "series": {
            "see": "series 2.0"
         },
         "otherCitationDetails": [""],
         "onlineResource": [
            {
               "see": "onlineResource 2.0"
            }
         ],
         "graphic": [
            {
               "see": "graphic 2.0"
            }
         ]
      }
   ]
}

see XML example citation -3.xml

Add version keyword to main schema

Like:

{
  "id": "schema.json#",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "version": "0.5.2",
  "description": "schema for ADIwg JSON metadata",
  ...
}
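
With the keyword in place, tools could read the schema version directly from the file. The Python sketch below assumes a plain dotted version string like the one in the example; `schema_version` is a hypothetical helper and `SCHEMA_TEXT` is a trimmed copy of the header, not the full schema.

```python
import json

# Trimmed stand-in for the main schema header (the real file has more keys).
SCHEMA_TEXT = """
{
  "id": "schema.json#",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "version": "0.5.2",
  "description": "schema for ADIwg JSON metadata"
}
"""


def schema_version(schema):
    """Return the version as a tuple of ints, or None when the keyword is absent."""
    version = schema.get("version")
    if version is None:
        return None
    return tuple(int(part) for part in version.split("."))


schema = json.loads(SCHEMA_TEXT)
```

Tuples compare element by element, so `schema_version(schema) >= (0, 5, 0)` works as a simple minimum-version check.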

Encoding identifiers in the schema

This is about how we encode identifiers in JSON to support MD_Identifier in CI_Citation and EX_GeographicDescription. We made the decision to place identifiers in JSON as an attribute of responsible party, which implied that ‘contact’ is equivalent to the ‘citation for the issuing authority’. I have been building skimpy citations for the authority from only the organizationName of the contact block; that’s all I can use from contact that fits into citation. This always creates a minimal authority citation with a full citedResponsibleParty. That is almost the opposite of what we need: a good citation with an optional citedResponsibleParty.

Also, if I ask the 'class_identifier' to build the MD_Identifier from responsibleParty I will always get the skimpy authority citation with a full citedResponsibleParty, even though authority is optional in ISO. CitedResponsibleParty is also optional and I can't shut that one down either. Not going to be funny when we build separate extents for geometries with supplemental information regarding identifiers.

And we also have our previously discussed problem of repeating the long citedResponsibleParty section in metadata record's citation.

I think we need to revisit the idea of having the resourceIdentifier block embedded in responsibleParty. I could even push the idea farther and lobby for an authority array ahead of the metadata block, similar to the contacts array. When we are coding identifiers for multi-point, line, polygon geometries the authorities will become highly reused. We could then just provide the authorityID, roleCode, contactID (optional), identifierName, and identifier with each geometry.
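
To make the authority-array idea concrete, here is a Python sketch of how a writer might expand one geometry identifier against a shared authority list. Everything about the layout is an assumption for discussion: the key names (`authorityId`, `citation`, `roleCode`, `identifierName`), the helper `resolve_identifier`, and the sample values are all hypothetical.

```python
def resolve_identifier(entry, authorities):
    """Expand one geometry identifier entry using the shared authority array."""
    # Index the shared authorities by their (hypothetical) "authorityId" key.
    by_id = {authority["authorityId"]: authority for authority in authorities}
    if entry["authorityId"] not in by_id:
        raise KeyError("unknown authorityId: " + entry["authorityId"])
    resolved = {
        "identifier": entry["identifier"],
        "identifierName": entry.get("identifierName", ""),
        "roleCode": entry.get("roleCode", ""),
        "authority": by_id[entry["authorityId"]]["citation"],
    }
    # contactId is optional in the proposal, so copy it only when present.
    if "contactId" in entry:
        resolved["contactId"] = entry["contactId"]
    return resolved


# Sample data: one authority defined once, referenced by id from a geometry.
authorities = [{"authorityId": "auth1", "citation": {"title": "Example authority"}}]
entry = {"authorityId": "auth1", "roleCode": "custodian",
         "identifierName": "site", "identifier": "A-001"}
```

The authority citation is written once and reused by every geometry that names it, and the citedResponsibleParty appears only when a contactId is actually supplied.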

identifier 2.0

The new MD_Identifier combines the MD_Identifier and RS_Identifier (i.e. used in imageQuality and other places in -2). I have combined the two so only a single form is supported in mdJson 2.0.

Changes:

  1. “identifier” (or “code” in ISO) is a scalar of type character string. Definition: “alphanumeric value identifying an instance in the namespace.”
  2. “namespace” (or “codeSpace” in ISO) is a scalar of type character string. Definition: “identifier or namespace in which the code is valid.”
  3. “version” is a scalar of type character string. Definition: “version identifier for the namespace”.
  4. “description” is a scalar of type character string. Definition: “natural language description of the meaning of the code value.” We can use this attribute to replace the “type” attribute in mdJson 1.0.
  5. “authority” is an abbreviated citation object. Definition: “the person or party responsible for maintenance of that namespace.” I limited the number of citation attributes in mdJson 2.0 (as we did in 1.0). Issue #54.
  6. The "type" element of mdJson 1.0 was dropped. It was internal only to mdJson and had no output in -2 or -1. "description" is adequate to handle "type" if necessary.
{
   "identifier": "",
   "namespace": "",
   "version": "",
   "description": "",
   "authority": {
      "see": "citation 2.0"
   }
}

see XML example citation -3.xml

series 2.0

CI_Series was part of ISO 19115-2 but not included in mdJson 1.0. I have added it to mdJson 2.0.

Changes:

  1. “seriesName” is a scalar of type character string. Definition: “name of the series, or aggregate resource, of which the resource is a part.”
  2. “seriesIssue” is a scalar of type character string. Definition: “information identifying the issue of the series.”
  3. “issuePage” (or “page” in ISO) is a scalar of type character string. Definition: “details on which pages of the publication the article was published.”
            "series": {
               "seriesName": "",
               "seriesIssue": "",
               "issuePage": ""
            },

see XML example citation -3.xml

Revision to the distributionInfo schema

Here's a suggested revision to the distribution schema. It brings in additional attributes and supports a more logical structure by linking the distribution options (online/offline) with a format declaration.

The revised structure is supported in both -2 and -1 with the following exception:

  • -2 supports only one instance of 'offline'

The following element is not supported in -2:

  • description

In -2 mediumType is selected from a codeList (which ADIwg extended); in -1 it is a citation. I did not make it a citation in this revision; the code can be pushed into a minimal citation for -1.

{
    "distributionInfo": [
        {
            "description": "",
            "distributor": [
                {
                    "distributorContact": {
                        "contactId": "",
                        "role": ""
                    },
                    "orderProcess": [
                        {
                            "fees": "",
                            "plannedAvailabilityDateTime": "0000-00-00",
                            "orderingInstructions": "",
                            "turnaround": ""
                        }
                    ],
                    "transferOptions": [
                        {
                            "distributionFormat": [
                                {
                                    "formatName": "",
                                    "version": "",
                                    "compressionMethod": ""
                                }
                            ],
                            "transferSize": 0.0,
                            "transferSizeUnits": "",
                            "online": [
                                {
                                    "uri": "http://thisisanexample.com",
                                    "protocol": "",
                                    "name": "",
                                    "description": "",
                                    "function": ""
                                }
                            ],
                            "offline": [
                                {
                                    "mediumType": "",
                                    "mediumCapacity": 0.0,
                                    "mediumCapacityUnits": "",
                                    "mediumFormat": "",
                                    "mediumNote": ""
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
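
The two -2 limitations noted above (one 'offline' per transfer option, no 'description') could be reported before writing. The Python sketch below walks the proposed structure and lists them; `iso2_limitations` is a hypothetical helper and its message wording is made up for illustration.

```python
def iso2_limitations(distribution_info):
    """List parts of a distributionInfo array that a -2 writer could not carry over."""
    notes = []
    for i, dist in enumerate(distribution_info):
        # 'description' has no home in -2.
        if dist.get("description"):
            notes.append(f"distributionInfo[{i}]: description is not supported in -2")
        for j, distributor in enumerate(dist.get("distributor", [])):
            for k, option in enumerate(distributor.get("transferOptions", [])):
                # -2 supports only one instance of 'offline'.
                count = len(option.get("offline", []))
                if count > 1:
                    notes.append(
                        f"distributionInfo[{i}].distributor[{j}].transferOptions[{k}]: "
                        f"{count} offline entries, -2 supports only one"
                    )
    return notes
```

An empty list means the block can be written to both -2 and -1 without dropping anything covered by these two rules.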

contact 2.0

Changes:

  1. "contactId" is a required element of type string. It was added by ADIwg to support referencing contacts from elsewhere in the metadata record.
  2. “isOrganization” is a required element of type Boolean. It was added by ADIwg to specify whether the contact is for an individual or organization. The default value is false.
  3. “name” is a required element of type string. It replaces both “organizationName” and “individualName” in mdJson 1.0 since only one name is allowed for a contact now.
  4. “positionName” will only be valid for individual contacts.
  5. “memberOfOrganization” array was added to hold organization “contactId”s that this individual has some relationship to. Will only be valid for individual contacts.
  6. “logoGraphic” array was added for organizations to provide access to their logos. logoGraphic uses the BrowseGraphic object, which has changed from the previous ISO format. Will only be valid for organization contacts. Issue #45.
  7. “address” is now an array of addresses. Address format has not changed. Issue #115.
  8. “onlineResource” does not include ISO attributes “applicationProfile” and “protocolRequest” for contact. Issue #50.
  9. “phone” is the new name for “phonebook”. Issue #114.
  10. “hoursOfService” was added as an array of character strings. note: This field is also supported in FGDC.
  11. “contactType” was added as a user defined contact type.
  12. "electronicMailAddress" is an array of strings. It was moved to the base of 'contact' from 'address'.
{
   "contact": [
      {
         "contactId": "",
         "isOrganization": false,
         "name": "",
         "positionName": "",
         "memberOfOrganization": [
            "organization contact ID"
         ],
         "logoGraphic": [
            {
               "see": "graphic 2.0"
            }
         ],
         "phone": [
            {
               "see": "phone 2.0"
            }
         ],
         "address": [
            {
               "see": "address 2.0"
            }
         ],
         "electronicMailAddress": [""],
         "onlineResource": [
            {
               "see": "onlineResource 2.0"
            }
         ],
         "hoursOfService": [""],
         "contactInstructions": "",
         "contactType": ""
      }
   ]
}

No ISO XML example
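
For existing records, the contact changes above amount to a small mapping. The Python sketch below shows one possible upgrade path. The 1.0 layout it reads is an assumption (a single `address` object holding `electronicMailAddress`, and a `phoneBook` array, with that exact casing assumed), and `upgrade_contact` is a hypothetical helper, not part of mdTranslator.

```python
def upgrade_contact(old):
    """Map a 1.0-style contact to the 2.0 layout described in the change list."""
    individual = old.get("individualName", "")
    organization = old.get("organizationName", "")
    # Assumption: a contact without an individual name is an organization.
    is_org = not individual
    # Copy the address so the caller's record is not modified.
    address = dict(old.get("address", {}))
    # electronicMailAddress moves from 'address' to the base of 'contact'.
    emails = address.pop("electronicMailAddress", [])
    new = {
        "contactId": old["contactId"],
        "isOrganization": is_org,
        "name": organization if is_org else individual,
        "phone": old.get("phoneBook", []),
        # 'address' becomes an array of addresses.
        "address": [address] if address else [],
        "electronicMailAddress": emails,
    }
    # positionName is only valid for individual contacts.
    if not is_org and "positionName" in old:
        new["positionName"] = old["positionName"]
    return new
```

When a 1.0 contact carries both names, this keeps the individual and drops the organization name; linking the two through “memberOfOrganization” would need a separate organization contact with its own contactId.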

onlineResource 2.0

Changes:

  1. “protocolRequest” was added to onlineResource but not used in mdJson 2.0. Definition: “request used to access the resource depending on the protocol (to be used mainly for POST requests).” No changes from mdJson 1.0.
            "onlineResource": [
                {
                    "uri": "http://thisisanexample.com",
                    "protocol": "",
                    "name": "",
                    "description": "",
                    "function": ""
                }
            ],

see XML example responsibleParty -3.xml

Is dataDictionary an array?

We have a difference between implementation and schema example.

In the template, dataDictionary is an object.
In the reader, it is handled as an object.
In example.json, it is an array.
In ISO 19115-1, it is an array (most likely). MD_Metadata can have many MD_ContentInformation, which can be of subtype MD_FeatureCatalogue, which can have many FC_FeatureCatalogue.

So, do we want multiple data dictionaries? If so, we need to patch a few items.
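
If we do move to an array, the reader could accept both shapes during the transition. The Python sketch below normalizes either form to a list; `data_dictionaries` is a hypothetical helper, and it takes whichever object holds the dataDictionary key.

```python
def data_dictionaries(parent):
    """Return dataDictionary as a list, whether the record holds an object, an array, or nothing."""
    value = parent.get("dataDictionary")
    if value is None:
        return []
    if isinstance(value, list):
        return value
    # A single object (the current template form) becomes a one-item list.
    return [value]
```

Downstream code can then loop over the result without caring which form the record used.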

metadataCharacterSet - do we need it?

At this point all metadata records are written in utf8. I hard coded the writer to set the appropriate ISO attribute with this value. Since it is always 'utf8' and we have no present capability to output the metadata in a different character encoding system, do we really need to specify the character set as a variable?

Part 2: Do we need to add the ability to write in multiple character sets?

Part 3: Do we want to stub this feature into version 0.5.0 or wait until a future version?
