rda-dmp-common / hackathon-2020

RDA hackathon on maDMPs

License: The Unlicense

hackathon-2020's People

leucoryx briri froggypaule hmpf kuba112404 ord-hackathon

hackathon-2020's Issues

Joint publication about results of the hackathon

It would be great if we could write a publication together to present our results for sharing and referencing...

maDMP import to DS Wizard

Goal: Be able to import maDMPs to Data Stewardship Wizard as questionnaires

We will work on this once we are ready and happy with #7. This task requires more planning and will probably result in a PoC rather than an in-DSW implementation.

Plan

- Explore possibilities for importing an maDMP as a “starter” questionnaire
- Create a prototype of an "importer" feature/service
- Import provided examples of maDMPs (#2)
- Import using export from DMPonline (and possibly other tools)

Repository for maDMPs (exposing maDMPs)

maDMPs can be used to exchange information between two systems, but they can also be published.

We need a repository for maDMPs that replaces existing PDF-based repositories.

The repository should allow restricting the visibility of specific parts of an maDMP. For example, information on costs is not publicly available; nobody should be allowed to see it.
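A minimal sketch of how such a restriction could work, assuming maDMPs are stored as JSON with a top-level `dmp` object and a `cost` list as in the RDA DMP Common Standard. The `RESTRICTED_FIELDS` policy and the `public_view` helper are hypothetical names used only for illustration.

```python
import copy

# Hypothetical policy: top-level maDMP fields that are hidden from public view.
RESTRICTED_FIELDS = {"cost"}


def public_view(madmp, restricted=RESTRICTED_FIELDS):
    """Return a copy of the maDMP with the restricted fields removed."""
    view = copy.deepcopy(madmp)  # never modify the stored plan
    for field in restricted:
        view["dmp"].pop(field, None)
    return view


example = {
    "dmp": {
        "title": "Example plan",
        "cost": [{"title": "Storage", "value": 1000, "currency_code": "EUR"}],
    }
}

print(public_view(example))
```

A real repository would probably need per-field and per-user rules rather than one global set, but the idea of serving a filtered copy stays the same.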

Users should be able to search for relevant maDMPs and the system should display a list of relevant maDMPs, e.g.

maDMPs that have the same contact person
maDMPs using the same data
maDMPs using the same repository
maDMPs following the same metadata standard
maDMPs modified after certain date
maDMPs with ethical issues, requiring closed access, with embargo, etc…
….
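Some of the searches listed above could be sketched as simple filters over maDMP JSON documents. This assumes the field names of the RDA DMP Common Standard (`contact.mbox`, `dataset[].distribution[].host.title`, `modified`); the helper names and the sample data are made up for illustration.

```python
from datetime import datetime


def same_contact(madmp, mbox):
    """True if the plan's contact person has the given e-mail address."""
    return madmp["dmp"].get("contact", {}).get("mbox") == mbox


def uses_host(madmp, host_title):
    """True if any distribution of any dataset is hosted at the given repository."""
    for dataset in madmp["dmp"].get("dataset", []):
        for distribution in dataset.get("distribution", []):
            if distribution.get("host", {}).get("title") == host_title:
                return True
    return False


def modified_after(madmp, iso_date):
    """True if the plan was modified after the given ISO date."""
    modified = madmp["dmp"].get("modified")
    if modified is None:
        return False
    return datetime.fromisoformat(modified) > datetime.fromisoformat(iso_date)


def search(madmps, *predicates):
    """Return the plans that satisfy all predicates."""
    return [m for m in madmps if all(p(m) for p in predicates)]


# Made-up sample plans.
plans = [
    {"dmp": {"title": "A", "modified": "2020-05-01T10:00:00",
             "contact": {"mbox": "jane@example.org"},
             "dataset": [{"distribution": [{"host": {"title": "Zenodo"}}]}]}},
    {"dmp": {"title": "B", "modified": "2019-03-01T10:00:00",
             "contact": {"mbox": "john@example.org"},
             "dataset": []}},
]

hits = search(plans,
              lambda m: uses_host(m, "Zenodo"),
              lambda m: modified_after(m, "2020-01-01"))
print([m["dmp"]["title"] for m in hits])
```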

Have EasyDMP support the data types of RDA DMP Common Standard

EasyDMP is based around a question/answer format, and each question has a type, for instance single-choice, multiple-choice, yes/no, date range, url, or typed value fetched from an external API via the EEStore. A set of questions is collected into a "section", and sections are collected into a "template". The answers per plan are stored as JSON and are perfectly machine-readable, but since there is no ontology or vocabulary they are not machine-actionable. In addition, a free-text version is generated for use as an attachment to funding applications.

It'd be handy to have types for: quota (number plus type, say), email address, cost (currency, value), keywords, language, locally stored controlled vocabulary, etc. There are many possibilities.
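To make the idea concrete, here is a rough sketch of what a few of these types could look like as plain Python value classes. This is not EasyDMP code: the class names, the three-letter currency check and the very loose e-mail pattern are all assumptions.

```python
import re
from dataclasses import dataclass

# Deliberately loose pattern; real e-mail validation is more involved.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_email(text):
    """Rough check that the text looks like an e-mail address."""
    return EMAIL_RE.fullmatch(text) is not None


@dataclass(frozen=True)
class Cost:
    """Cost answer: currency (assumed three-letter code) plus value."""
    currency: str
    value: float

    def __post_init__(self):
        if not re.fullmatch(r"[A-Z]{3}", self.currency):
            raise ValueError(f"not a three-letter currency code: {self.currency!r}")
        if self.value < 0:
            raise ValueError("cost must not be negative")


@dataclass(frozen=True)
class Quota:
    """Quota answer: a number plus its type, e.g. 500 and 'GB'."""
    amount: float
    unit: str


print(Cost("EUR", 1200.0), Quota(500, "GB"), is_email("jane@example.org"))
```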

This should be relatively easy for a newcomer to EasyDMP.

Application of maDMP scheme at the institution RDM workflow

Since the main funding agencies now require Data Management Plans (DMPs) in grant applications, the Institute for Systems and Computer Engineering, Technology and Science (INESC TEC)* has started to help researchers with DMP creation. The plans are created using a collaborative method involving a data steward and the researchers. It includes several kinds of activity, such as interviews and analysis of the publications, of the data related to the project, and of DMP examples existing in their scientific domain. After the preliminary work by the data steward, the first version of the plan is created and presented to the researchers for refinement, corrections, and ultimately completion of a final version. This method simplifies DMP creation, can be applied in different domains, and creates DMPs with more details, but it requires improvements.
Thus, during the Hackathon, the work of our group will focus on analysing the existing DMPs created at INESC TEC against the maDMP scheme. We will juxtapose our method with the maDMP concept to identify what we need to change (add, delete, edit) in our DMPs to make them machine-actionable. In other words, we will identify requirements to improve the INESC TEC RDM workflow and make our DMP method conform to the maDMP concept.

* I proposed our institution, INESC TEC, but it can be substituted by another institution.

Add dataset support to EasyDMP

RDA DMP Common Standard has a notion of datasets, one of which is obligatory, but multiple are supported. EasyDMP is oriented around a "plan" made from answers to typed questions, which is converted to a free text form, with no notion of datasets.

This is a complex redesign/refactor job, which I will use as a background/fallback task, and I don't expect any help :)

Integrate project data management process into maDMP

Many projects also have data management workflows that are defined either in some 'standard' (like Common Workflow Language) or in a less standard, but perhaps community-wide, manner. It would be good to understand how we can incorporate this element of data management into the plan.

An example of a data management workflow could be: read in a data file, transform data objects into physical quantities, perform some analysis of those quantities, and output the quantities to a new file.

One option, I guess, is that if the workflow has a DOI then it can be referenced in the plan, but it's not clear to me where, or how. The workflow itself may take some time to complete.

New maDMP examples

Here is a collection of DMPs and maDMPs:
https://zenodo.org/communities/tuw-dmps-ds-2020/

Let's review them and create some examples that all of us can use for testing our implementation of the standard.

OpenAIRE Research Graph and maDMPs

It would be great to investigate exchange of information between the OpenAIRE Research Graph and maDMPs.

For example, the research graph has information on projects and the data produced in each project. We could use this information to generate an maDMP that is later submitted to a funder. This can also work in the opposite direction: an maDMP created using a DMP tool already contains information on data that is reused for a research project and data that was generated. This can be uploaded to the knowledge graph.

Example of a result from the KG: http://api.openaire.eu/search/datasets?projectID=777541
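The same query can be built for any project ID. The sketch below only constructs the URL from the example above and does not call the API; the `datasets_query_url` helper is a made-up name.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in this issue.
OPENAIRE_DATASETS = "http://api.openaire.eu/search/datasets"


def datasets_query_url(project_id):
    """Build the dataset search URL for one project."""
    return f"{OPENAIRE_DATASETS}?{urlencode({'projectID': project_id})}"


print(datasets_query_url("777541"))
```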

KG: https://www.openaire.eu/blogs/the-openaire-research-graph
https://zenodo.org/communities/openaire-research-graph?page=1&size=20

maDMP in Linked Data Pipeline as part of Research Data Connectome.

More information to follow...

Controlled vocabularies for maDMPs

The current specification of the standard has many fields defined as Strings, because there was no standard vocabulary to be used by the whole community. For example, Dataset\type can be set to any String value. This is because there are different vocabularies within the RDM community. For example, DataCite and COAR both define vocabularies of dataset types.
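As an illustration of what a controlled vocabulary would buy us, the sketch below checks a dataset's `type` against the values of DataCite's `resourceTypeGeneral` as I recall them from schema 4.3; please verify the list against the DataCite schema before relying on it. The helper name is hypothetical.

```python
# resourceTypeGeneral values recalled from DataCite schema 4.3 (to be verified).
DATACITE_RESOURCE_TYPES = {
    "Audiovisual", "Collection", "DataPaper", "Dataset", "Event", "Image",
    "InteractiveResource", "Model", "PhysicalObject", "Service", "Software",
    "Sound", "Text", "Workflow", "Other",
}


def has_controlled_type(dataset, vocabulary=DATACITE_RESOURCE_TYPES):
    """True if the dataset's type is a term from the given vocabulary."""
    return dataset.get("type") in vocabulary


print(has_controlled_type({"type": "Dataset"}))         # True
print(has_controlled_type({"type": "my spreadsheet"}))  # False
```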

There is a need for a group that would analyse the maDMP specification to identify the fields for which a common vocabulary is both needed and feasible. This could lead to further developments after the hackathon; maybe an RDA WG should be established for that purpose? Dataset\type is just one example of the alignment needed.

Integrating our University instance of the DMPonline platform with other systems

Integrating our CRIS/RIMS (Converis) with our institutional DMP platform, to make things more streamlined for researchers and admins alike.

Implementing the publishing of (selected) UCT DMPs through DataCite, which already mints the UCT-branded DOIs for the scholarly outputs published on UCT's data repository ZivaHub, running on Figshare for Institutions.

maDMP export from DS Wizard

Goal: Be able to export maDMPs in RDF from Data Stewardship Wizard

Plan

Content part
- Find mapping using existing questions in common KM
- Add relevant questions if necessary
- Examples of maDMPs in DS Wizard
Programming part
- Create export template to JSON
- Consider possibilities of using DCSO for RDF export
  - Allow export RDF in various formats using conversion (ttl, n3, rdf/xml, etc.)
Additional (nice-to-have):
- Consider making “abstract” KM as starter with only maDMP-related questions
- Use document submission feature to send maDMP to other service
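The "mapping" and "export template to JSON" steps of the plan could be sketched roughly as follows. This is not how DS Wizard templates work internally; the question IDs, the `MAPPING` table and `export_madmp` are invented for illustration, and only the target field names (`dmp.title`, `dmp.contact.name`, `dmp.contact.mbox`) come from the RDA DMP Common Standard.

```python
import json

# Hypothetical mapping from questionnaire question IDs to maDMP JSON paths.
MAPPING = {
    "q_project_title": ("dmp", "title"),
    "q_contact_name": ("dmp", "contact", "name"),
    "q_contact_email": ("dmp", "contact", "mbox"),
}


def export_madmp(answers):
    """Build a nested maDMP dict from flat questionnaire answers."""
    madmp = {}
    for question_id, path in MAPPING.items():
        if question_id not in answers:
            continue  # unanswered questions are simply left out
        node = madmp
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = answers[question_id]
    return madmp


answers = {"q_project_title": "Example plan", "q_contact_email": "jane@example.org"}
print(json.dumps(export_madmp(answers), indent=2))
```

An RDF export could then start from this JSON, for example by mapping it to DCSO terms, which is the part that still needs investigation.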

Import/Export maDMPs from Argos/OpenDMP

Work on extending and refining the baseline mechanism of OpenDMP software for importing and exporting maDMPs.
Validate the alignment with the models of other tools, as those are represented in the current maDMP specification.

Domain specific extension for maDMPs

I think in the domain ABC we need to include more information on Security and Privacy. I have a collection of DMPs and would like to define an extension to the standard by defining additional fields to reflect the needs of domain ABC.

[this is an example]

New version of the DMP Common Standard Ontology

As part of the ongoing effort to have different formats to represent the DMP Common Standard, I'm looking for help in creating a new version of the DCSO.

There are four main points of action:

1. Use some means (SHACL, ShEx or some other option) to represent the constraints in the DMP Common Standard.

2. Integrate the DCAT and DublinCore ontologies into the existing DCSO, thus reusing classes (and properties) as opposed to the current practice of redefining classes.

3. Decide how to represent the custom controlled vocabularies required for some of the existing fields, in a way that allows for validation (e.g., the usage of iso-3166-1-alpha2 in the geo_location property in the Host class).

4. Provide the DCSO with a PURL, thus solving the current namespace issue, which makes the ontology unsuitable for long-term preservation and reuse.
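To illustrate the kind of check point 3 is after, here is a plain-Python stand-in (not SHACL or ShEx) that validates `geo_location` on a host against a small, incomplete sample of ISO 3166-1 alpha-2 codes. The function name and the sample set are assumptions; a real solution would express this constraint in the ontology tooling itself.

```python
# Small, incomplete sample of ISO 3166-1 alpha-2 codes, for illustration only.
KNOWN_COUNTRY_CODES = {"AT", "CZ", "DE", "GR", "NO", "PT", "ZA"}


def check_geo_location(host, codes=KNOWN_COUNTRY_CODES):
    """Return a list of validation errors for the host's geo_location."""
    errors = []
    value = host.get("geo_location")
    if value is not None and value not in codes:
        errors.append(f"geo_location {value!r} is not a known ISO 3166-1 alpha-2 code")
    return errors


print(check_geo_location({"title": "Example repository", "geo_location": "AT"}))
print(check_geo_location({"title": "Example repository", "geo_location": "Austria"}))
```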

Edit:
I'm currently solving issue 4, following the advice of robertgiessmann. Thanks!

Edit2:
Issue 4 solved. https://w3id.org/dcso Thanks.

Export/Import maDMP from Figshare

First step: extract an maDMP from a Figshare repository using the Figshare API (https://docs.figshare.com/)

Second step: test importing an maDMP into Figshare

Mapping of maDMPs to funder templates

We already have some examples [1] [2] [3] of how maDMPs can be mapped to Science Europe or Horizon 2020 DMP templates. There is a need for a group that would review the existing mappings and come up with a single mapping. Small extensions to the maDMP standard may also be needed. A perfect outcome for the hackathon would be a common mapping for one of the popular research funders. New mappings, e.g. to NSF, are also welcome!

[1] https://doi.org/10.5281/zenodo.3727720
[2] https://doi.org/10.5281/zenodo.3727714
[3] https://doi.org/10.5281/zenodo.3727724

A collection of DMPs and maDMPs can also be found here:
https://zenodo.org/communities/tuw-dmps-ds-2020/

Export maDMP from tool X into tool Y

I would like to export information from tool X, that I maintain, into the tool Y. I think that this could help us in ABC. Let's see how much information we can exchange.

(This is just an example)

Link a DSW DMP with a FIP

The FIP (FAIR Implementation Profile, which is the output of a filled in FAIR Matrix questionnaire, found at https://fair-matrix.ds-wizard.org) can be seen as the DNA of a DMP. I would like to investigate how to logically and technically link the two.

Export maDMP from RDMO

The Research Data Management Organiser (RDMO) is a tool to organize information about data management and create DMPs. Our GitHub organisation is https://github.com/rdmorganizer.

Internally, RDMO already uses a vocabulary to abstract our questionnaire(s) from the user input. At the hackday, we want to map this internal vocabulary (we call it domain) to maDMP and create an export functionality. The core team of RDMO (@jochenklar, @triole, @leucoryx) will participate and we would be happy if other people join us.

An import of maDMP data into RDMO will be a follow-up project.

Integration of maDMPs with Repositories/InvenioRDM

From @iliremavriqi @mb-wali @freelion93

Specifying the reusable metadata that can be filled automatically in both directions (InvenioRDM <=> maDMP)
- Comparing data models of InvenioRDM and maDMP
- Finding all possible fields that can be automatically reused-filled (e.g. project name, cost, DOI, etc)
- Define an API call on both platforms (InvenioRDM / maDMP-tool) for gathering information
Fetching existing datasets, projects from InvenioRDM
- Implement an interface for the possibility to gather datasets, projects for specific user
- Archiving of the final versions of maDMP in InvenioRDM
Determine the user groups, rights, storage management interface
- Create required schemas, tables,
- Implement the archive option for maDMP

Resolve all existing issues

I would like to resolve all the issues that exist in the DMP Common Standard repository :)
