dmp-scripts

Introduction

Full instructions on how to query the DMPonline API can be found at: https://github.com/DMPRoadmap/roadmap/wiki/API-documentation

Install required python modules

pip install -r requirements.txt

General info on API usage

These scripts require a .env file that stores login information and URLs. Use the .envexample file as a template to create your own.

SweCris has an open API key (available at: https://www.vr.se/english/swecris/swecris-api.html).

DMPonline administrators need to enter their login and API key in the .env file to authenticate with the DMPonline API.
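
As a rough illustration of how credentials could be read from such a file, here is a minimal sketch using only the standard library. The key names DMPONLINE_EMAIL and DMPONLINE_API_KEY are hypothetical; the real names are whatever the .envexample file defines, and the scripts themselves may load the file differently.

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    settings = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def require(settings, *keys):
    """Return the values for the given keys, failing early if any is missing."""
    missing = [key for key in keys if not settings.get(key)]
    if missing:
        raise KeyError(f"Missing entries in .env: {', '.join(missing)}")
    return [settings[key] for key in keys]


if __name__ == "__main__":
    # Hypothetical key names; use the ones listed in .envexample.
    email, api_key = require(load_env(), "DMPONLINE_EMAIL", "DMPONLINE_API_KEY")
    print(f"Loaded credentials for {email}")
```

Checking for missing entries up front gives a clearer error than a failed authentication request later on.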

Query DMPonline about existing templates

The script dmponline_templates.py queries DMPonline about existing templates. This is useful for identifying specific template IDs.

At the moment the script performs a simple query to DMPonline using the login info from the .env file. It downloads all accessible templates, prints them, and stores them as a single JSON file in a subfolder, Templates.

Currently, the script only returns the first page (up to 100 entries).
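
One way the first-page limitation could be lifted is a simple paging loop. The sketch below is not part of the script: fetch_page is a hypothetical callable that returns the list of entries for one page, and the page and page-size parameters it would pass to the API should be checked against the API documentation.

```python
def fetch_all_pages(fetch_page, per_page=100, max_pages=1000):
    """Collect entries from every page until a short or empty page is seen.

    fetch_page(page, per_page) is a hypothetical callable that returns the
    list of entries for the given 1-based page number.
    """
    items = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page, per_page)
        items.extend(batch)
        # A page with fewer entries than requested is the last one.
        if len(batch) < per_page:
            break
    return items


if __name__ == "__main__":
    # Stand-in for a real API call: 250 fake templates served page by page.
    templates = [{"id": n} for n in range(250)]

    def fake_fetch(page, per_page):
        start = (page - 1) * per_page
        return templates[start:start + per_page]

    print(len(fetch_all_pages(fake_fetch)))
```

The max_pages cap is only a safety net so that a misbehaving endpoint cannot cause an endless loop.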

Download a specific DMP from DMPonline

The scripts dmponline2_file_v0.py and dmponline2_file_v1.py both look up and download a specified DMP and store it as a JSON file. The v0 script accesses the DMPonline API V0, while the v1 script accesses the API V1. For details see: https://github.com/DMPRoadmap/roadmap/wiki/API-documentation

The main difference is that API V0 relays all the information stored in the DMP, while API V1 provides the metadata on the DMP in a format compliant with the RDA v1.0 schema.

Downloaded plans are stored in a subfolder, Downloaded_plans.

Both scripts require the ID of the specific plan as input. This is the 6 digits at the end of the DMP URL (e.g. dmponline.dcc.ac.uk/plans/123456).
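
If only the plan URL is at hand, the ID can be extracted programmatically. This is a small sketch, not something the scripts do themselves; the helper name plan_id_from_url is hypothetical, and the pattern accepts any number of digits rather than exactly six.

```python
import re


def plan_id_from_url(url):
    """Return the numeric plan ID at the end of a DMPonline plan URL."""
    # Accept an optional trailing slash after the digits.
    match = re.search(r"/plans/(\d+)/?$", url.strip())
    if match is None:
        raise ValueError(f"No plan ID found in: {url!r}")
    return match.group(1)


if __name__ == "__main__":
    print(plan_id_from_url("dmponline.dcc.ac.uk/plans/123456"))
```

The returned string can be passed directly as the -i argument.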

Example call: python3 dmponline2_file_v0.py -i 123456

Create a single DMP in DMPonline using data from SweCris

The script swecris_to_dmponline.py fetches data from SweCris for a given project/financed activity and generates a basic DMP that can be uploaded to DMPonline.

The script takes the following as input:

  • Grantid, e.g. -i 2023-xxxxx (required, this is the grant id from the funder)
  • Funder e.g. -f vr (required, necessary to locate data in SweCris)
  • Language e.g. -l eng (optional, default is eng)
  • Name e.g. -n "Albert Einstein" (required, used to create the user)
  • Email e.g. -e [email protected] (required, used to create the user)
  • Orcid e.g. -0 0009-1234-5678-1234 (optional, currently disabled)
  • Templatenumber e.g. -t 439 (required, this can be found by running a template check with dmponline_templates.py)

Example call: python3 swecris_to_dmponline.py -i 2012-12345 -f vr -n "Albert Einstein" -e [email protected] -t 439
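
For readers who want to see how such a command line could be declared, here is a minimal argparse sketch covering the options listed above. It is an illustration, not the script's actual parser: the long option names (--grantid, --funder and so on) are hypothetical, and the ORCID option is left out because it is described as currently disabled.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented short options."""
    parser = argparse.ArgumentParser(
        description="Create a basic DMP in DMPonline from SweCris data."
    )
    # Long option names are assumptions added for readability.
    parser.add_argument("-i", "--grantid", required=True,
                        help="grant ID from the funder, e.g. 2023-xxxxx")
    parser.add_argument("-f", "--funder", required=True,
                        help="funder, e.g. vr (needed to locate data in SweCris)")
    parser.add_argument("-l", "--language", default="eng",
                        help="language of the plan (default: eng)")
    parser.add_argument("-n", "--name", required=True,
                        help="name used to create the user")
    parser.add_argument("-e", "--email", required=True,
                        help="e-mail address used to create the user")
    parser.add_argument("-t", "--template", required=True, type=int,
                        help="template number, e.g. 439")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Marking the required options with required=True lets argparse report a missing argument before any request is sent to SweCris.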

The script first tries to access the SweCris database to find the correct project. If the project is found, it asks the user whether to create a DMP.

If yes, then:

The script restructures the data from SweCris into a JSON file compatible with the RDA JSON schema 1.0 (see: https://github.com/RDA-DMP-Common/RDA-DMP-Common-Standard).

The schema consists of the following sections (syntax, followed by a description):

  • "dmp:" main container/dictionary to which additional containers are added. Subheadings include:
      • "schema:" cannot be changed. Default is 1.0.
      • "title:" fetched from SweCris. This is the title of the research project.
      • "description:" fetched from SweCris. This is the abstract for the research project.
      • "language:" default eng. Can be changed?
      • "created:" added by DMPonline. Anything written here will be overwritten with a timestamp from the system.
      • "modified:" added by DMPonline. Anything written here will be overwritten with a timestamp from the system.
      • "ethical_issues_exist:" default unknown.
      • "dmp_id:" container created by DMPonline. Subheadings include:
          • "type:" default url.
          • "identifier:" the direct URL to the plan, e.g. "https://dmponline.dcc.ac.uk/api/v1/plans/123456". The beginning of the URL can be replaced with an institutional domain address (e.g. https://dmp.kth.se/).
      • "contact:" container for the contact/owner of the plan. Subheadings include:
          • "name:" fetched from script params. DMPonline will change this if the email exists in its system.
          • "mbox:" e-mail address from script params. DMPonline checks this internally to fetch additional data.
          • "affiliation:" container with two subheadings:
              • "name:" institutional name, fetched from .env.
              • "abbreviation:" institutional abbreviation, fetched from .env.
          • "contact_id:" optional container, created from script params if included. Autocreated by DMPonline if the user and ORCID exist. Two subheadings:
              • "type:" default orcid.
              • "identifier:" ORCID. ID format: https://orcid.org/0000-0001-2345-6789
      • "contributor:" container for the contributors to the plan; several can be added. DMPonline adds the contact as an additional contributor here even if not included in SweCris. Subheadings include:
          • "name:" fetched from SweCris.
          • "mbox:" e-mail. Not in SweCris and thus not included in the data sent to DMPonline, but sometimes added by DMPonline if the user exists.
          • "role:" default other. However, DMPonline sometimes changes this to CRediT roles (e.g. http://credit.niso.org/contributor-roles//data-curation). Unclear why and based on what.
          • "affiliation:" container with two subheadings:
              • "name:" institutional name, fetched from .env.
              • "abbreviation:" institutional abbreviation, fetched from .env.
          • "contributor_id:" optional container, created from SweCris data if included. Autocreated by DMPonline if the user and ORCID exist. Problematic if the user exists without an ORCID in DMPonline but an ORCID exists in SweCris. Two subheadings:
              • "type:" default orcid.
              • "identifier:" ORCID. ID format: https://orcid.org/0000-0001-2345-6789
      • "project:" container for the project. Subheadings include:
          • "title:" fetched from SweCris. Needs to be identical to the DMP title.
          • "description:" fetched from SweCris. Needs to be identical to the DMP description.
          • "start:" fetched from SweCris.
          • "end:" fetched from SweCris.
          • "funding:" container for funder information. Subheadings include:
              • "name:" funder name, from script params.
              • "funder_id:" container with two subheadings, created based on script params:
                  • "type:" default ror.
                  • "identifier:" ROR. ID format: https://ror.org/03zttfo63. PROBLEM: DMPonline changes correct RORs to dummy ones (https://ror.org/123abc45y).
              • "grant_id:" container with two subheadings:
                  • "identifier:" grant number, generated from script params. NOTE: this can only be used once and needs to be unique, otherwise it is ignored by DMPonline.
                  • "type:" default other.
              • "funding_status:" default granted. PROBLEM: DMPonline changes this to planned.
              • "dmproadmap_funded_affiliations:" container added by DMPonline. Two subheadings:
                  • "name:" institutional name.
                  • "abbreviation:" institutional abbreviation.
      • "dataset:" container for an empty dataset. Subheadings include:
          • "type:" default dataset.
          • "title:" default Generic dataset.
          • "description:" default No individual datasets have been defined for this DMP.
      • "extension:" container for the template definition. Subheadings include:
          • "dmproadmap:" subcontainer.
              • "template:" subcontainer.
                  • "id:" fetched from script params. ID number for the template.
                  • "title:" default "". Gets filled in by DMPonline with the correct title based on the ID.
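
To make the structure described above more concrete, the following sketch assembles a reduced version of it as a Python dictionary. It is an illustration under stated assumptions, not the script's actual code: the function name build_dmp is hypothetical, only a subset of the fields is included, and wrapping project, dataset and extension in lists follows the RDA DMP Common Standard convention, which should be verified against what DMPonline accepts.

```python
import json


def build_dmp(title, description, name, email, template_id, language="eng"):
    """Assemble a reduced DMP body using the defaults listed in the schema."""
    return {
        "dmp": {
            "title": title,
            "description": description,
            "language": language,
            "ethical_issues_exist": "unknown",
            "contact": {"name": name, "mbox": email},
            # Project title and description must match those of the DMP.
            # Lists here are an assumption based on the RDA standard.
            "project": [{"title": title, "description": description}],
            "dataset": [{
                "type": "dataset",
                "title": "Generic dataset",
                "description": "No individual datasets have been defined for this DMP.",
            }],
            # The template title is left empty; DMPonline fills it in from the id.
            "extension": [{"dmproadmap": {"template": {"id": template_id, "title": ""}}}],
        }
    }


if __name__ == "__main__":
    body = build_dmp("Example project", "Example abstract.",
                     "Albert Einstein", "user@example.org", 439)
    print(json.dumps(body, indent=2))
```

Fields that DMPonline creates or overwrites itself, such as created, modified and dmp_id, are deliberately omitted from the sketch.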

The complete JSON is then displayed, and the script asks whether to upload it to DMPonline.

If yes, the script uploads the plan to DMPonline. The post data is printed in full and also stored as a JSON file, named by combining the grant ID and name (from the script parameters), in a subfolder, Uploaded_plans. The link to the DMP is printed and the script then exits.
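
The storage step could look roughly like the sketch below. The exact file naming used by the script is not documented here, so the "<grantid>_<name>.json" pattern and the helper name save_postdata are assumptions made for illustration.

```python
import json
import re
from pathlib import Path


def save_postdata(postdata, grant_id, name, folder="Uploaded_plans"):
    """Write the uploaded JSON to <folder>/<grant_id>_<name>.json and return the path."""
    # Replace spaces and other characters that are awkward in file names.
    safe_name = re.sub(r"[^\w-]+", "_", name).strip("_")
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{grant_id}_{safe_name}.json"
    # ensure_ascii=False keeps characters such as å, ä and ö readable.
    path.write_text(json.dumps(postdata, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path


if __name__ == "__main__":
    saved = save_postdata({"dmp": {"title": "Example"}}, "2012-12345", "Albert Einstein")
    print(f"Stored post data in {saved}")
```

Creating the subfolder on demand means the script also works on a fresh checkout where Uploaded_plans does not exist yet.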
