GSQ Vocabularies

Introduction

The Geological Survey of Queensland (GSQ) publishes vocabularies - a way to describe things and the relationships between them.

A vocabulary is a set of agreed terms:

  • In GSQ, a vocabulary defines the terms used to describe and represent things in the domain of science and data management.
  • Vocabularies align information within a business area or across systems.
  • Vocabularies can be very complex (with thousands of terms) or very simple (describing one or two concepts only).

Read Why Vocabularies? and more subjects in the Vocabularies Wiki.

Vocabulary - how it all hangs together

Fig. 1: Vocabulary context diagram

  1. We use tools such as Vocbench or Excel to create the vocabulary using SKOS (Simple Knowledge Organization System). See also the SKOS Primer for the basics.
  2. The native format for a vocabulary is a TTL (Turtle) file. This file contains RDF triples - subject > predicate > object statements.
  3. We use Github (where you are now) to store and manage versions of vocabulary TTL files. Github also provides workflow functionality to approve vocabularies. Read the Github getting started guide.
  4. We import the TTL files into GraphDB to create a triple store. GraphDB lets us query the triples.
  5. VocPrez presents our vocabs on the web for people and computers to read. VocPrez pulls the triples from GraphDB to create a cache of the vocabularies.
  6. CKAN drop-down form fields pull their values from VocPrez. This ensures that the attributes used to describe a dataset come from the controlled vocabulary.
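
To make the idea of a triple in step 2 concrete, here is a minimal sketch in Python that holds a few SKOS statements as subject, predicate, object tuples and prints them in N-Triples form. The example.org URIs and the labels are hypothetical and are not taken from a GSQ vocabulary; only the SKOS namespace and property names are real. A real workflow would more likely use an RDF library than hand-built tuples.

```python
# SKOS namespace (real); the example.org URIs below are hypothetical.
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/vocab/"

# Each triple is (subject, predicate, object).
# An object is either a URI string or a (text, language) literal.
triples = [
    (EX + "rock", SKOS + "prefLabel", ("Rock", "en")),
    (EX + "granite", SKOS + "prefLabel", ("Granite", "en")),
    (EX + "granite", SKOS + "broader", EX + "rock"),
]


def to_ntriples(triple):
    """Format one triple as a single N-Triples line."""
    subject, predicate, obj = triple
    if isinstance(obj, tuple):
        text, lang = obj
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        obj_str = f'"{escaped}"@{lang}'
    else:
        obj_str = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_str} ."


def broader_of(concept, data):
    """Return the concepts that `concept` points to via skos:broader."""
    return [o for s, p, o in data if s == concept and p == SKOS + "broader"]


if __name__ == "__main__":
    for t in triples:
        print(to_ntriples(t))
```

Calling `broader_of(EX + "granite", triples)` walks the same statements a triple store would query, which is the kind of lookup GraphDB performs at scale in step 4.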

How to create a vocabulary

Fig. 2: Vocabulary build and pull process

  1. Select the vocabulary editor of your choice.
  2. Create the vocabulary using SKOS (Simple Knowledge Organization System). See also the SKOS Primer for the basics. NOTE: Always check first whether an international or national vocabulary already exists (see below for links).
    a. Use Vocbench to create the vocabulary.
    b. Use the Excel template to create the vocabulary - download Excel SKOS Vocabulary Builder.
    c. Edit the vocab TTL file in Visual Studio Code. Use the extension Language Support for RDF related language syntax for formatting support.
  3. Export the vocabulary to a TTL file. If using Vocbench, it is easier to export the TTL from the Build repository in GraphDB. Follow the instructions here.
  4. Validate the TTL file using the online Skosify tool. Tick the checkboxes "Keep skos:related relationships within the same hierarchy" and "Include skos:narrower relations in output".
  5. Import the TTL file into a development branch in Github. Name your branch dev-yourGithubusername. See how-to instructions here.
  6. When you're ready to publish your vocabulary into Test, submit a Pull Request to the TEST branch. See how-to instructions here.
  7. A member of the Data Integrity Team will review your vocabulary and either Approve or Request Changes. See how-to instructions here.
  8. If the pull request is approved, the vocab will now be in the TEST branch.
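
The branch naming convention in step 5 can be captured in a small helper. This is a sketch only: the function name `dev_branch_name` is hypothetical, and the username pattern is an assumption about GitHub's rules (letters, digits and single hyphens, with no leading or trailing hyphen) rather than something this repository defines.

```python
import re

# Assumption: GitHub usernames are letters, digits and single hyphens,
# with no leading or trailing hyphen.
_USERNAME = re.compile(r"^[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*$")


def dev_branch_name(github_username):
    """Return the development branch name for a GitHub username."""
    if not _USERNAME.match(github_username):
        raise ValueError(f"not a valid GitHub username: {github_username!r}")
    return f"dev-{github_username}"


if __name__ == "__main__":
    print(dev_branch_name("octocat"))
```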

How to publish a vocabulary to VocPrez Test

  1. Import the vocabulary TTL file into the Core Repository in the Test GraphDB at https://test.graphdb.gsq.digital using the instructions here.
  2. Restart the Test VocPrez to refresh the VocPrez cache (we will automate this step).
  3. The vocabulary is now published in the Test VocPrez at https://test.vocabs.gsq.digital
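
The import in step 1 could also be scripted. GraphDB exposes the RDF4J REST API, where a POST to `/repositories/<id>/statements` with a `text/turtle` body adds the statements to a repository. The sketch below only builds such a request and does not send it. The repository ID `core` is a guess based on the name "Core Repository", the function name is hypothetical, and authentication is not shown; the linked instructions may describe a different route, such as the Workbench import page.

```python
import urllib.request
from urllib.parse import quote


def build_import_request(graphdb_url, repository_id, turtle_bytes):
    """Build (but do not send) a POST that adds Turtle data to a repository."""
    base = graphdb_url.rstrip("/")
    repo = quote(repository_id, safe="")
    return urllib.request.Request(
        f"{base}/repositories/{repo}/statements",
        data=turtle_bytes,
        method="POST",
        headers={"Content-Type": "text/turtle"},
    )


if __name__ == "__main__":
    # Hypothetical repository ID and sample data, for illustration only.
    sample = (
        b"<http://example.org/a> "
        b"<http://www.w3.org/2004/02/skos/core#prefLabel> \"A\"@en .\n"
    )
    request = build_import_request("https://test.graphdb.gsq.digital", "core", sample)
    print(request.get_method(), request.full_url)
    # To actually send it (needs network access and any required credentials):
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

Keeping request construction separate from sending makes it possible to inspect the target URL before anything is written to the triple store.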

How to review and validate a vocabulary

See the instructions at Vocabulary Review Process

How to publish a vocabulary to VocPrez Production

  1. Follow the PID URI Allocations process detailed on the Linked Data Working Group webpage.
  2. Submit a Pull Request from the DEV branch in Github to the MASTER branch.
  3. A member of the Data Integrity Team will review your vocabulary and either Approve or Request Changes.
  4. Import the vocabulary TTL file into the Core Repository in the Production GraphDB at https://graphdb.gsq.digital using the instructions here.
  5. Restart the Production VocPrez to refresh the VocPrez cache (we will automate this step).
  6. The vocabulary is now published in the Production VocPrez at https://vocabs.gsq.digital. Please note that the vocab will not display in VocPrez until the URI registration at http://linked.data.gov.au is approved.

See also

Files

  • ontologies/*.ttl - background ontologies needed for vocab inferencing
  • gsq-*.ttl - GSQ vocab files
  • vocabs_load.py - a Python script to load a GraphDB instance with the background ontologies and GSQ vocab files
  • scripts/ - Python scripts to dump/load a GraphDB instance with these vocab files
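
As an illustration of how the file patterns above could be used, the following sketch lists the Turtle files in a checkout of this repository. It is not the content of vocabs_load.py; the function name is hypothetical, and placing the background ontologies before the GSQ vocab files is an assumption about a sensible load order.

```python
from pathlib import Path


def files_in_load_order(repo_root):
    """Return background ontologies first, then GSQ vocabulary files."""
    root = Path(repo_root)
    ontologies = sorted(root.glob("ontologies/*.ttl"))
    vocabularies = sorted(root.glob("gsq-*.ttl"))
    return ontologies + vocabularies


if __name__ == "__main__":
    for path in files_in_load_order("."):
        print(path)
```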

License

This code repository's content is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, the deed of which is stored in this repository here: LICENSE.

Contacts

Vocabularies owner:
Mark Gordon
Geological Survey of Queensland
Department of Natural Resources, Mines and Energy
Brisbane, QLD, Australia
[email protected]

Technical contact:
Vance Kelly
Geological Survey of Queensland
Department of Natural Resources, Mines and Energy
Brisbane, QLD, Australia
[email protected]

Author:
David Crosswell
Enterprise Architect
Cross-Lateral Enterprises
https://crosslateral.com.au
