
extended-data-model's Introduction

extended-data-model


extended-data-model's Issues

Search by dataset name

Add datasetName to the available search fields

Context:
Dataset name appears to have been available previously in the Advanced Search for Biocache Hub, but the search was not run against the field of the same name. It seems the field was eventually removed from Advanced Search instead of fixing which field the search used.

Q: Was there a technical obstacle to supporting search by dataset name?

Source Biocollect data sets

The initial analysis in #23 already includes some sample datasets with event information.

This activity will expand on the initial work to source exemplar datasets from Biocollect.

Download service implementation for event data

Exploration is required and should include:

  • Investigate use of Spark SQL to support the download service
  • Investigate the connector between Spark and Elastic (Elastic SQL) for reading Elasticsearch from Spark to produce exports. See this

There are four potential types of download we could support, each with a different implementation complexity.

a) Single dataset download
These would be full exports of the event datasets with our interpretation (taxonomy etc).
These could be pre-generated using pipelines (similar to DwCA export pipeline) and copied to S3 or FS.
These would satisfy the EcoCommons people.
Complexity: LOW

b) Multiple dataset download
Similar to the above, but with the ability to package multiple complete datasets (a zip of zips).
Complexity: MEDIUM
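
To illustrate the "zip of zips" idea, here is a minimal sketch using only the Python standard library. It assumes each dataset export already exists as a zip archive (here built in memory by the hypothetical helper `make_archive`); the function names and dataset IDs are illustrative and not part of any existing service.

```python
import io
import zipfile


def make_archive(files: dict) -> bytes:
    """Build a small in-memory zip, standing in for a pre-generated export."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def bundle_archives(archives: dict) -> bytes:
    """Package complete dataset archives into one outer zip (a zip of zips)."""
    buffer = io.BytesIO()
    # The inner archives are already compressed, so they are stored unchanged.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as outer:
        for dataset_id in sorted(archives):
            outer.writestr(f"{dataset_id}.zip", archives[dataset_id])
    return buffer.getvalue()
```

Storing the inner archives without recompression keeps packaging cheap, which matters if the per-dataset exports are large.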

c) Query based cross dataset download
This would be the sort of download we are familiar with for occurrence data, but I question whether it is a good idea for event data, where the datasets are all quite different.
If AVRO based, then events need (globally) unique eventIDs, which is something we don't have at the moment.
Complexity: HIGH
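
One possible way to obtain globally unique eventIDs is to derive them from the dataset identifier plus the dataset-scoped eventID. The sketch below is an assumption, not an agreed design: the namespace URL and the function name `global_event_id` are hypothetical.

```python
import json
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
EVENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/events")


def global_event_id(dataset_id: str, event_id: str) -> str:
    """Derive a stable, globally unique ID from a dataset-scoped eventID."""
    # JSON-encoding the pair avoids collisions such as ("a|b", "c") vs ("a", "b|c").
    name = json.dumps([dataset_id, event_id])
    return str(uuid.uuid5(EVENT_NAMESPACE, name))
```

Because the ID is derived deterministically, re-processing a dataset yields the same identifiers, and the same eventID in two datasets yields different ones.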

d) Sites by species download
Elasticsearch-based, using facets
Complexity: MEDIUM
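
A sites-by-species export could be sketched as a nested terms aggregation whose buckets are pivoted into a matrix. This is an illustration only: the index field names `locationID` and `scientificName` are assumptions about the event index, and the functions below just build the request body and reshape a response without calling Elasticsearch.

```python
def sites_by_species_request(size: int = 100) -> dict:
    """Build an Elasticsearch request body: species counts nested under sites."""
    return {
        "size": 0,  # only aggregation buckets are needed, not the hits
        "aggs": {
            "sites": {
                "terms": {"field": "locationID", "size": size},
                "aggs": {
                    "species": {
                        "terms": {"field": "scientificName", "size": size}
                    }
                },
            }
        },
    }


def pivot_sites_by_species(response: dict) -> dict:
    """Turn the nested buckets into {site: {species: record count}}."""
    matrix = {}
    for site in response["aggregations"]["sites"]["buckets"]:
        matrix[site["key"]] = {
            bucket["key"]: bucket["doc_count"]
            for bucket in site["species"]["buckets"]
        }
    return matrix
```

A terms aggregation only returns the top `size` buckets, so a complete export would likely need a paged approach such as a composite aggregation.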

General UI feedback

  1. Filters at the top look similar to tabs.
    The tab design should be more traditional to make it clear what they are.

  2. There should be some text indicating that no filters are applied when displaying all records.
    "Currently viewing all datasets. Apply some filters to narrow down the results."

  3. A basic inline tutorial would be ideal; the second option is to produce documentation that explains basic navigation, starting from a dataset.

Screen Shot 2022-08-25 at 4.58.34 pm.png

  1. Pagination controls are too small and are difficult to spot
    Screen Shot 2022-08-25 at 5 29 51 pm

  2. Map view should use similar layers as Biocache and BIE
    Screen Shot 2022-08-25 at 5 33 56 pm

Generate DOIs for downloads

  • Add metadata to DOI similar to metadata available for Occurrence Download DOIs

Question: Are we going to implement a fallback mechanism similar to Biocache's for when the DOI service is unavailable? If not, how are we going to manage errors?
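
One possible answer is to let the download complete without a DOI and record the failure so a DOI can be minted later. The sketch below is hypothetical: `finish_download`, the record layout, and the `mint_doi` callable are assumptions for illustration and do not describe how Biocache behaves.

```python
from typing import Callable


def finish_download(metadata: dict, mint_doi: Callable[[dict], str]) -> dict:
    """Complete a download, falling back to 'no DOI' if minting fails."""
    record = {"metadata": metadata, "doi": None, "doi_error": None}
    try:
        record["doi"] = mint_doi(metadata)
    except Exception as exc:  # real code would catch narrower error types
        # Keep the error so the DOI can be retried or reported to the user.
        record["doi_error"] = str(exc)
    return record
```

With this shape, the user still receives the download, and the stored error makes it possible to retry minting once the DOI service recovers.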

Search by all event fields - Event search tab

  1. Add additional search option at the top of event search tab.
    The search will perform a partial match across event ID, parent event ID, field number and dataset name.
    The rationale is that different datasets use different fields for the same information; for example, one dataset uses event ID while another uses field number.

  2. For consistency with other elements of the user interface, rename dataset names to dataset / survey names.

Note: labels on UI must use i18n conventions.

See below for details:

Screen Shot 2022-06-06 at 3 22 47 pm
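
The partial match across several event fields could be expressed as an Elasticsearch `bool` query with one `wildcard` clause per field. This is a sketch under assumptions: the index field names are taken to match the Darwin Core terms, the fields are assumed to be keyword-typed, and `case_insensitive` assumes an Elasticsearch version that supports that option.

```python
# Assumed index field names, based on the Darwin Core terms.
SEARCH_FIELDS = ["eventID", "parentEventID", "fieldNumber", "datasetName"]


def escape_wildcards(term: str) -> str:
    """Escape characters that have special meaning in a wildcard pattern."""
    # The backslash must be escaped first so later escapes are not doubled.
    for char in ("\\", "*", "?"):
        term = term.replace(char, "\\" + char)
    return term


def event_search_query(term: str) -> dict:
    """Build a query matching `term` as a substring of any search field."""
    pattern = f"*{escape_wildcards(term)}*"
    clauses = [
        {"wildcard": {field: {"value": pattern, "case_insensitive": True}}}
        for field in SEARCH_FIELDS
    ]
    return {"query": {"bool": {"should": clauses, "minimum_should_match": 1}}}
```

Leading wildcards can be slow on large indexes, so an n-gram analysed field may be worth evaluating as an alternative.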

Add event fields to facets

Event fields need to be added to facets on biocache hubs

ParentEventId and field number are fields available in the index but not in the UI

Screen Shot 2022-02-08 at 2 48 40 pm

Stand up UI components on an AWS VM

Required, in order

  • Copy GBIF react-components to ALA repo
  • Add an ala-graphql-api package
    • Add ES backed Event resource
  • Add Event and EventSearch react-components
  • Add event-ui package using the Event and EventSearch components
  • Deploy on AWS

Optional

  • Add graphql SOLR backed Occurrence resource and react-components to search and view
  • Add graphql downloads-service resource and react-components to submit requests
  • Add graphql Collections resource and react-components to search and view
  • Add graphql BIE resource and react-components to search and view
  • Add graphql Images resource and react-components to view
  • Add graphql Lists resource and react-components to view

Pagination broken

Pagination broken on site tab (Devonport Tasmania insect dataset)

Steps to reproduce

Go to Datasets, scroll down to the "Catches of numerous insect species in Rothamsted 160W light trap at Devonport, Tasmania, 1992-2019" dataset and click Add to filter.
Go to Sites tab
Click on next page ">" button

Expected
Next page will be displayed

Actual
A "No site data available" message is displayed despite there being 3 pages of site information.

Facets and record view changes for event data

This issue takes an opinionated approach to moving some fields around in the Biocache Hub to make it more intuitive for users to find event information.

Facets

  1. Add Event section to facets just below Occurrence

Screen Shot 2022-03-22 at 9 32 42 am

For the filters configuration:

  1. Create new event section.

  2. Move Month, Year, Date Precision, Year (by decade) and Event ID from Occurrence Section to new Event section

  3. Move Dataset name from Attribution to new Event section

  4. Rename "Dataset name" to "Dataset / Survey name"

  5. Add Parent Event ID and Field number fields to facets

  6. Proposed order of fields:

Left                  | Right
Dataset / Survey name | Month
Parent Event ID       | Year
Field number          | Date Precision
Event ID              | Year (by decade)

Screen Shot 2022-03-22 at 9 56 22 am

Record view page

  1. Move Field number and Dataset / Survey name from Dataset section to Event section
  2. Add Event ID and Parent Event ID
  3. The proposed order of fields in the Event section is:
  • Dataset / Survey name
  • Parent Event ID
  • Field number
  • Event ID
  • Existing list of event fields

Screen Shot 2022-03-22 at 9 37 24 am

Screen Shot 2022-03-22 at 9 36 14 am

Source exemplar datasets

Seedbank is a good source of datasets with attached measurements. This dataset is also a good example showing that we can't assume an occurrence is always the last element at the bottom of the hierarchy.

BIE integration

  • View list of events for this taxon
  • View map of events for this taxon

Use a Vocabulary Service

Data currently being mapped to Event DwCA, such as the Reef ... dataset, includes a sampling method with typical values of 1 or 2, which lack any meaning unless context information about the method is added.

We need the ability to define resolvable URIs in the dataset metadata with context information about the sampling method (any other field or term?).

We need to make use of a vocabulary service as the target for the proposed URIs above and actively maintain those vocabularies.

Prospective services are https://vocabs.ardc.edu.au/ or the vocabulary service managed by GBIF (tbc).

Search by Event Id

Add an "Event search" tab option to biocache search to allow searching by Event ID or Field number

Screen Shot 2022-02-08 at 12.04.26 pm.png

  • Should Event search be a copy of catalogue search?
  • Which fields should the search cover: Event ID, Field number, Parent Event ID?

Interpret EstablishmentMeans as a String, not a JSON

Originally reported by @nielsklazenga

https://biocache-ws-databox.ala.org.au/ws/occurrences/3afe9be0-f4fd-41d7-b6df-2a5117e08757
The value of processed->occurrence->establishmentMeans is a JSON string: 'establishmentMeans: "{"concept": "vagrant", "lineage": ["vagrant"]}"'

Our current EstablishmentMean model has two fields, concept and lineage, but establishmentMeans in the latest Darwin Core (DwC) is defined as a controlled-value string.
Refer to: https://dwc.tdwg.org/em/

The same issue occurs on our production Biocache.
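
The intended interpretation can be sketched as follows: when the value is a JSON object with a `concept` field, emit only the concept as a plain string. This is an illustration in Python, not the pipelines code (which is Java); the function name is hypothetical.

```python
import json


def establishment_means_as_string(value):
    """Return establishmentMeans as a plain string instead of a JSON object."""
    if isinstance(value, str):
        try:
            parsed = json.loads(value)
        except ValueError:
            # Not JSON, so it is already a plain controlled-value string.
            return value
    else:
        parsed = value
    if isinstance(parsed, dict):
        # Keep only the concept; the lineage is dropped from the output.
        return parsed.get("concept")
    return value
```

For the record reported above, the JSON value would be reduced to the string `vagrant`.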

Todo:
Working on
https://github.com/gbif/pipelines/blob/dev/livingatlas/pipelines/src/main/java/au/org/ala/pipelines/transforms/ALABasicTransform.java#L112

Link to: AtlasOfLivingAustralia/la-pipelines#525
