hackathon-2020's People

Contributors

briri, froggypaule, hmpf, leucoryx, tommiksa

hackathon-2020's Issues

maDMP import to DS Wizard

Goal: Be able to import maDMPs to Data Stewardship Wizard as questionnaires

We will work on this once we are ready and happy with #7. This task requires more planning and will probably result in a PoC rather than an in-DSW implementation.

Plan

  • Explore ways to import an maDMP as a “starter” questionnaire
  • Create a prototype of "importer" feature/service
  • Import provided examples of maDMPs (#2)
  • Import using export from DMPonline (and possibly other tools)

Repository for maDMPs (exposing maDMPs)

maDMPs can be used to exchange information between two systems, but they can also be published.

We need a repository for maDMPs that replaces existing PDF-based repositories.

The repository should allow restricting the visibility of specific parts of the maDMP. For example, information on Costs is not publicly available, so nobody should be allowed to see it.

Users should be able to search for relevant maDMPs and the system should display a list of relevant maDMPs, e.g.

  • maDMPs that have the same contact person
  • maDMPs using the same data
  • maDMPs using the same repository
  • maDMPs following the same metadata standard
  • maDMPs modified after a certain date
  • maDMPs with ethical issues, requiring closed access, with embargo, etc.
  • ….
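The visibility restriction and a few of the searches listed above could be sketched as plain filters over maDMP JSON documents. This is only an illustration: the field names (`contact.mbox`, `dataset[].distribution[].host.url`, `modified`, `cost`) follow my reading of the RDA DMP Common Standard and should be checked against the specification, and the function names and the example record are hypothetical.

```python
from datetime import datetime


def public_view(madmp, hidden_fields=("cost",)):
    """Return a copy of the maDMP without the restricted top-level fields."""
    visible = {k: v for k, v in madmp["dmp"].items() if k not in hidden_fields}
    return {"dmp": visible}


def same_contact(madmp, mbox):
    """True if the plan's contact person has the given e-mail address."""
    return madmp["dmp"].get("contact", {}).get("mbox") == mbox


def uses_host(madmp, host_url):
    """True if any dataset distribution is hosted at the given repository URL."""
    for dataset in madmp["dmp"].get("dataset", []):
        for distribution in dataset.get("distribution", []):
            if distribution.get("host", {}).get("url") == host_url:
                return True
    return False


def modified_after(madmp, iso_date):
    """True if the plan was modified after the given ISO 8601 date."""
    modified = datetime.fromisoformat(madmp["dmp"]["modified"])
    return modified > datetime.fromisoformat(iso_date)


# Hypothetical example record, not taken from any real plan.
example = {
    "dmp": {
        "title": "Example plan",
        "modified": "2020-05-20T10:00:00",
        "contact": {"name": "A. Researcher", "mbox": "a.researcher@example.org"},
        "cost": [{"title": "Storage", "value": 1000, "currency_code": "EUR"}],
        "dataset": [
            {
                "title": "Survey data",
                "distribution": [{"host": {"url": "https://repository.example.org"}}],
            }
        ],
    }
}
```

A real repository would push these filters into its index or database; the point here is only that each search criterion corresponds to a well-defined path in the maDMP.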

Have EasyDMP support the data types of RDA DMP Common Standard

EasyDMP is based around a question/answer format, and each question has a type, for instance single-choice, multiple-choice, yes/no, date range, URL, or a typed value fetched from an external API via the EEStore. A set of questions is collected into a "section", and sections are collected into a "template". The answers per plan are stored as JSON and are perfectly machine-readable, but since there is no ontology or vocabulary, they are not machine-actionable. In addition, a free-text version is generated for use as an attachment to funding applications.

It'd be handy to have types for: quota (number plus type, say), email address, cost (currency, value), keywords, language, locally stored controlled vocabulary, etc. There are many possibilities.

This should be relatively easy for a newcomer to EasyDMP.
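To make the idea of new answer types concrete, here is a minimal sketch of a "cost" type and an "email address" type. It does not use EasyDMP's actual question classes; `Cost`, `validate_email` and `to_answer` are hypothetical names, and the e-mail check is deliberately loose.

```python
import re
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Cost:
    """A cost answer: a three-letter currency code plus a non-negative value."""
    currency: str
    value: Decimal

    def __post_init__(self):
        if not re.fullmatch(r"[A-Z]{3}", self.currency):
            raise ValueError(f"not a three-letter currency code: {self.currency!r}")
        if self.value < 0:
            raise ValueError("cost must not be negative")


def validate_email(text):
    """Very loose shape check: something@something.something."""
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", text):
        raise ValueError(f"not an email address: {text!r}")
    return text


def to_answer(cost):
    """Serialize a Cost to a JSON-friendly dict (the value is kept as a string)."""
    return {"currency": cost.currency, "value": str(cost.value)}
```

Storing the value as a string avoids floating-point rounding when the answer is written to JSON.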

Application of maDMP scheme at the institution RDM workflow

Since the main funding agencies now require Data Management Plans (DMPs) in grant applications, the Institute for Systems and Computer Engineering, Technology and Science (INESC TEC)* has started to help researchers with DMP creation. The plans are created using a collaborative method between the data steward and the researchers that includes several kinds of activity, such as interviews and analysis of publications, of data related to the project, and of DMP examples existing in their scientific domain. After the preliminary work by the data steward, the first version of the plan is created and presented to the researchers for refinement, correction, and ultimately completion of a final version. This method simplifies DMP creation, can be applied in different domains, and produces more detailed DMPs, but it still requires improvements.
Thus, during the Hackathon, our group will focus on analysing the existing DMPs created at INESC TEC against the maDMP scheme. We will compare our method with the maDMP concept to identify what we need to change (add, delete, edit) in our DMPs to make them machine-actionable. In other words, we will identify requirements for improving the INESC TEC RDM workflow and making our DMP method conform to the maDMP concept.

  • I proposed our institution, INESC TEC, but INESC TEC can be substituted by another institution

Add dataset support to EasyDMP

RDA DMP Common Standard has a notion of datasets, one of which is obligatory, but multiple are supported. EasyDMP is oriented around a "plan" made from answers to typed questions, which is converted to a free text form, with no notion of datasets.

This is a complex redesign/refactor job, which I will use as a background/fallback task, and I don't expect any help :)

Integrate project data management process into maDMP

Many projects also have data management workflows that are defined either in some 'standard' (like Common Workflow Language) or in a less standard but perhaps community-wide manner. It would be good to understand how we can incorporate this element of data management into the plan.

An example of a data management workflow could be: read in a data file, transform data objects into physical quantities, perform some analysis of those quantities, and output the quantities to a new file.

One option, I think, is that if the workflow has a DOI, then it can be referenced in the plan, but it's not clear to me where and maybe also how. The workflow itself may take some time to complete.

OpenAIRE Research Graph and maDMPs

It would be great to investigate exchange of information between the OpenAIRE Research Graph and maDMPs.

For example, the research graph has information on projects and the data produced in each project. We could use this information to generate an maDMP that is later submitted to a funder. This can also work in the opposite direction: an maDMP created using a DMP tool already contains information on data that is reused for a research project and data that was generated. This can be uploaded to the knowledge graph.

Example of a result from the KG: http://api.openaire.eu/search/datasets?projectID=777541

KG: https://www.openaire.eu/blogs/the-openaire-research-graph
https://zenodo.org/communities/openaire-research-graph?page=1&size=20
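As a rough sketch of the first direction (graph to maDMP), the snippet below builds the dataset query URL shown in the example above and turns a project title plus a list of dataset titles into a minimal maDMP skeleton. It makes no network calls and does not parse the API response; `datasets_query_url` and `madmp_skeleton` are hypothetical helpers, and the maDMP field names are assumptions based on the RDA DMP Common Standard.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in this issue.
OPENAIRE_DATASETS = "http://api.openaire.eu/search/datasets"


def datasets_query_url(project_id):
    """Build the dataset search URL for one project."""
    return f"{OPENAIRE_DATASETS}?{urlencode({'projectID': project_id})}"


def madmp_skeleton(project_title, dataset_titles):
    """Create a minimal maDMP-like structure from project and dataset titles."""
    return {
        "dmp": {
            "title": f"DMP for {project_title}",
            "project": [{"title": project_title}],
            "dataset": [{"title": title} for title in dataset_titles],
        }
    }
```

The missing piece, extracting titles and identifiers from the API response, depends on the response format and is left out here on purpose.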

Controlled vocabularies for maDMPs

The current specification of the standard has many fields defined as Strings, because there was no standard vocabulary to be used by the whole community. For example, Dataset\type can be set to any String value. This is because there are different vocabularies within the RDM community. For example, DataCite and COAR define vocabularies of types of datasets.

There is a need for a group that would analyse the maDMP specification to identify fields for which establishing a common vocabulary is both needed and possible. This can result in developments after the hackathon - maybe an RDA WG should be established for that purpose? Dataset\type is just an example of the alignment needed.
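Once a vocabulary is agreed on, checking a plan against it is simple. The sketch below reports dataset types that are not in a given vocabulary; the vocabulary used here is a made-up subset for illustration only (not the DataCite or COAR list), and `unknown_dataset_types` is a hypothetical helper.

```python
# Illustrative subset only; a real check would load the agreed vocabulary.
ILLUSTRATIVE_TYPES = {"dataset", "software", "image", "text"}


def unknown_dataset_types(madmp, vocabulary=ILLUSTRATIVE_TYPES):
    """Return the dataset types in the plan that are not in the vocabulary."""
    found = set()
    for dataset in madmp["dmp"].get("dataset", []):
        dataset_type = dataset.get("type")
        if dataset_type is not None and dataset_type.lower() not in vocabulary:
            found.add(dataset_type)
    return sorted(found)
```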

Integrating our University instance of the DMPonline platform with other systems

Integrating our CRIS/RIMS (Converis) with our institutional DMP platform, to make things more streamlined for researchers and admins alike.

Implementing the publishing of (selected) UCT DMPs through DataCite, who already mint the UCT-branded DOIs for the scholarly outputs published on UCT's data repository ZivaHub, running on Figshare for Institutions.

maDMP export from DS Wizard

Goal: Be able to export maDMPs in RDF from Data Stewardship Wizard

Plan

  • Content part
    • Find mapping using existing questions in common KM
    • Add relevant questions if necessary
    • Examples of maDMPs in DS Wizard
  • Programming part
    • Create export template to JSON
    • Consider possibilities of using DCSO for RDF export
      • Allow exporting RDF in various formats using conversion (ttl, n3, rdf/xml, etc.)
  • Additional (nice-to-have):
    • Consider making “abstract” KM as starter with only maDMP-related questions
    • Use document submission feature to send maDMP to other service

Import/Export maDMPs from Argos/OpenDMP

Work on extending and refining the baseline mechanism of OpenDMP software for importing and exporting maDMPs.
Validate the alignment with the models of other tools as those are represented in the current maDMP specification.

Domain specific extension for maDMPs

I think in the domain ABC we need to include more information on Security and Privacy. I have a collection of DMPs and would like to define an extension to the standard by defining additional fields to reflect the needs of domain ABC.

[this is an example]

New version of the DMP Common Standard Ontology

As part of the ongoing effort to have different formats to represent the DMP Common Standard, I'm looking for help in creating a new version of the DCSO.

There are four main points of action:

1. Use some means (SHACL, ShEx or some other option) to represent the constraints in the DMP Common Standard.

2. Integrate the DCAT and DublinCore ontologies into the existing DCSO, thus reusing classes (and properties) as opposed to the current practice of redefining classes.

3. Decide how to represent the custom controlled vocabularies required for some of the existing fields, in a way that allows for validation (e.g., the usage of iso-3166-1-alpha2 in the geo_location property in the Host class).

4. Provide the DCSO with a PURL, thus solving the current namespace issue: the current namespace is unsuitable for long-term preservation and reuse.
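To illustrate the kind of constraints that points 1 and 3 are about, here is a plain-Python sketch of two checks on the JSON serialization: at least one dataset must be present, and `geo_location` must look like a two-letter country code. This is not SHACL or ShEx, only a stand-in for what such shapes would express; `check_madmp` is a hypothetical name, and the check tests only the shape of the code, not membership in the ISO 3166-1 list.

```python
import re

# Shape of an ISO 3166-1 alpha-2 code: exactly two uppercase letters.
ALPHA2_SHAPE = re.compile(r"[A-Z]{2}")


def check_madmp(madmp):
    """Return a list of human-readable constraint violations."""
    problems = []
    datasets = madmp.get("dmp", {}).get("dataset", [])
    if not datasets:
        problems.append("dmp.dataset: at least one dataset is required")
    for i, dataset in enumerate(datasets):
        for j, distribution in enumerate(dataset.get("distribution", [])):
            code = distribution.get("host", {}).get("geo_location")
            if code is not None and not ALPHA2_SHAPE.fullmatch(code):
                problems.append(
                    f"dmp.dataset[{i}].distribution[{j}].host.geo_location: "
                    f"{code!r} is not a two-letter country code"
                )
    return problems
```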

Edit:
I'm currently solving issue 4, following the advice of robertgiessmann. Thanks!

Edit2:
Issue 4 solved. https://w3id.org/dcso Thanks.

Mapping of maDMPs to funder templates

We already have some examples [1] [2] [3] of how an maDMP can be mapped to the Science Europe or Horizon 2020 DMP templates. There is a need for a group that would review the existing mappings and come up with a single mapping. Small extensions to the maDMP standard may also be needed. A perfect outcome for the hackathon would be a common mapping for one of the popular research funders. New mappings, e.g. to NSF, are also welcome!

[1] https://doi.org/10.5281/zenodo.3727720
[2] https://doi.org/10.5281/zenodo.3727714
[3] https://doi.org/10.5281/zenodo.3727724

A collection of DMPs and maDMPs can also be found here:
https://zenodo.org/communities/tuw-dmps-ds-2020/
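A mapping of this kind can be written down as a table from template sections to maDMP field paths. The sketch below is only an illustration: the section names and the chosen paths are hypothetical and are not taken from the Science Europe or Horizon 2020 templates or from the mappings cited above.

```python
def collect(node, path):
    """Collect all values found at a list of keys, fanning out over lists."""
    if not path:
        return [node]
    if isinstance(node, list):
        return [value for item in node for value in collect(item, path)]
    if isinstance(node, dict) and path[0] in node:
        return collect(node[path[0]], path[1:])
    return []


# Hypothetical mapping: template section -> maDMP field paths.
SECTION_TO_PATHS = {
    "Data description": ["dmp.dataset.title", "dmp.dataset.type"],
    "Costs": ["dmp.cost.title"],
}


def fill_template(madmp, mapping=SECTION_TO_PATHS):
    """Return, per template section, the values found in the maDMP."""
    return {
        section: [value for path in paths for value in collect(madmp, path.split("."))]
        for section, paths in mapping.items()
    }
```

Keeping the mapping as data rather than code would make it easier to review competing mappings side by side and merge them into one.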

Export maDMP from tool X into tool Y

I would like to export information from tool X, that I maintain, into the tool Y. I think that this could help us in ABC. Let's see how much information we can exchange.

(This is just an example)

Export maDMP from RDMO

The Research Data Management Organiser (RDMO) is a tool to organize information about data management and create DMPs. Our GitHub organization is https://github.com/rdmorganizer.

Internally, RDMO already uses a vocabulary to abstract our questionnaire(s) from the user input. At the hackday, we want to map this internal vocabulary (we call it domain) to maDMP and create an export functionality. The core team of RDMO (@jochenklar, @triole, @leucoryx) will participate and we would be happy if other people join us.

An import of maDMP data into RDMO will be a follow-up project.
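The export could, in its simplest form, be a lookup table from internal attributes to maDMP paths. The sketch below assumes that the user input is available as a flat dictionary keyed by attribute; the attribute names, the target paths and the helper names are all hypothetical and do not reflect RDMO's actual domain or API.

```python
def set_path(target, path, value):
    """Set a value in a nested dict, creating intermediate dicts as needed."""
    keys = path.split(".")
    node = target
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Hypothetical mapping: internal attribute -> maDMP path.
ATTRIBUTE_TO_MADMP = {
    "project/title": "dmp.title",
    "project/contact/name": "dmp.contact.name",
    "project/contact/email": "dmp.contact.mbox",
}


def export_madmp(values, mapping=ATTRIBUTE_TO_MADMP):
    """Build an maDMP-like dict from flat attribute values; skip unmapped ones."""
    madmp = {}
    for attribute, value in values.items():
        path = mapping.get(attribute)
        if path is not None:
            set_path(madmp, path, value)
    return madmp
```

Repeated structures such as datasets would need list handling on top of this, which is left out here.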

Integration of maDMPs with Repositories/InvenioRDM

From @iliremavriqi @mb-wali @freelion93

  • Specifying the reusable metadata that can be filled automatically in both directions (InvenioRDM <=> maDMP)
    • Comparing data models of InvenioRDM and maDMP
    • Finding all possible fields that can be automatically reused or filled in (e.g. project name, cost, DOI, etc.)
    • Define an API call on both platforms (InvenioRDM / maDMP-tool) for gathering information
  • Fetching existing datasets, projects from InvenioRDM
    • Implement an interface for gathering datasets and projects for a specific user
    • Archiving of the final versions of maDMP in InvenioRDM
  • Determine the user groups, rights, storage management interface
    • Create required schemas, tables
    • Implement the archive option for maDMP
