CONP dataset

The CONP dataset is a repository containing the datasets available in the Canadian Open Neuroscience Platform. It leverages DataLad to store metadata and references to data files, which are distributed across various storage spaces and are accessible according to each data owner's policy.

The instructions below explain how to find and get data from the dataset. You can also add data by following the instructions in our contribution guidelines. We welcome your feedback! 😃

Dataset structure

projects contains sub-datasets for projects.

Projects are responsible for the management and curation of their own sub-datasets.

Installing required software

git

sudo apt-get install git

It is useful to configure your git credentials to avoid having to enter them repeatedly:

git config --global user.name "yourusername"

git config --global user.email "[email protected]"

git-annex

First install the neurodebian package repository:

sudo apt-get install neurodebian

Then install the version of git-annex included in this repository:

sudo apt-get install git-annex-standalone

The version of git-annex installed can be verified with:

git annex version

As of May 12, 2020, this installs git-annex version 8.20200330, which works with CONP datasets. Earlier versions of git-annex may not work.
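If you want to automate this check, the sketch below compares the installed git-annex version with the version mentioned above. It is only an illustration: the helper names `version_at_least` and `parse_annex_version` are hypothetical, the script assumes that the first line of `git annex version` looks like `git-annex version: 8.20200330-...`, and it relies on the `-V` option of GNU `sort`.

```shell
#!/bin/sh
# Version known to work with CONP datasets, taken from the note above.
MIN_VERSION="8.20200330"

# Succeeds (exit status 0) when version $1 is greater than or equal to $2.
# Relies on the -V (version sort) option of GNU sort.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Reads the output of "git annex version" on stdin and prints the version
# number found on its first line (assumed output format, see above).
parse_annex_version() {
    sed -n '1s/^git-annex version: *\([0-9][0-9.]*\).*/\1/p'
}

if command -v git-annex >/dev/null 2>&1; then
    installed="$(git annex version | parse_annex_version)"
    if version_at_least "$installed" "$MIN_VERSION"; then
        echo "git-annex $installed is recent enough"
    else
        echo "git-annex $installed is older than $MIN_VERSION" >&2
    fi
else
    echo "git-annex not found" >&2
fi
```

Version sort is used instead of a plain string comparison so that a version such as 10.x is correctly treated as newer than 8.x.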

DataLad:

sudo apt-get install datalad

Getting the data

Install the main CONP dataset on your computer:

datalad install -r http://github.com/CONP-PCNO/conp-dataset

Get the files you are interested in:

datalad get <file_name>

This may require authentication depending on the data owner's configuration.
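As a rough sketch, fetching files can be wrapped in a small helper that changes into the installed dataset before calling `datalad get`. The function name `conp_get` and the `DATALAD` variable are hypothetical and used only for illustration; the helper assumes the dataset was already installed with the command above, which normally creates a `conp-dataset` directory.

```shell
#!/bin/sh
# Command used to invoke DataLad. Setting DATALAD=echo gives a dry run that
# only prints the arguments that would be passed (illustrative convention).
DATALAD="${DATALAD:-datalad}"

# Fetch one or more paths inside an already installed conp-dataset clone.
# $1: path to the clone; remaining arguments: paths relative to the clone.
conp_get() {
    dataset_dir="$1"
    shift
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dataset_dir" && $DATALAD get "$@" )
}
```

For example, `conp_get conp-dataset <file_name>` would fetch `<file_name>` relative to the root of the clone, prompting for authentication if the data owner requires it.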

You can also search for relevant files and sub-datasets:

datalad search T1

Tests

Execute python tests/create_tests.py from the root of the conp-dataset repository.
Run pytest tests/ to execute the tests for all datasets in projects and investigators.
To run the tests for a specific dataset, run pytest tests/test_<name of dataset>, for example pytest tests/test_projects_SIMON-dataset.

For detailed explanations of the tests, please consult the test suite documentation.
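The steps above can be combined in a small script. This is a minimal sketch: the helper names `test_target` and `run_dataset_tests` are hypothetical, and the mapping from a dataset path to a test name (slashes replaced by underscores, prefixed with `test_`) is an assumption inferred from the `tests/test_projects_SIMON-dataset` example.

```shell
#!/bin/sh
# Map a dataset path such as "projects/SIMON-dataset" to its test path,
# "tests/test_projects_SIMON-dataset" (naming convention assumed from the
# example above).
test_target() {
    printf 'tests/test_%s\n' "$(printf '%s' "$1" | tr '/' '_')"
}

# Regenerate the test files, then run the tests for a single dataset.
# Must be called from the root of the conp-dataset repository.
run_dataset_tests() {
    python tests/create_tests.py && pytest "$(test_target "$1")"
}
```

From the repository root, `run_dataset_tests projects/SIMON-dataset` would then regenerate the tests and run only those for that dataset.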

Coding standards

To keep the Python code maintainable and readable, a suite of QA pipelines checks the code against coding standards. Pull requests trigger a GitHub workflow that runs pre-commit.

To execute pre-commit locally, you will need to install pre-commit using your favorite method. Then, run:

pre-commit install

pre-commit run --all-files

Pre-commit won't let you commit until the reported issues are fixed. If this is a problem, you can skip pre-commit for a local commit by passing the --no-verify flag when committing; however, the QA checks will still run on your PR.
