
sekupy


sekupy is a Python package created for deterging your (dirty) (and) (multivariate) neuroimaging analyses. It was designed for decoding analyses, but it also includes basic univariate analyses.

It provides utilities for varying sets of analysis parameters without struggling with for and if statements.

It deterges your results by saving them in a safe manner, keeping BIDS in mind.

sekupy is the deterged version of pyitab.

Documentation

The documentation can be found here.

Install

You can install it with:

pip install sekupy

Example

The main idea is to use a dictionary to configure all the parameters of your analysis, feed the configuration into an AnalysisPipeline object, call fit to obtain the results, then call save to store them in a BIDS-ish way.

For example, if we want to perform a RoiDecoding analysis with some preprocessing steps, the script will look like this (this is not a complete example):

from sklearn.feature_selection import SelectKBest
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut

from sekupy.analysis.configurator import AnalysisConfigurator
from sekupy.analysis.pipeline import AnalysisPipeline
from sekupy.analysis.decoding.roi_decoding import RoiDecoding

_default_config = {
                    # Here we specify that we have to transform the dataset labels
                    # then select samples and then balance data
                    'prepro': ['target_transformer', 'sample_slicer', 'balancer'],
                    
                    # Here we set which attribute to choose (dataset is a pymvpa dataset)
                    'target_transformer__attr': "image_type",
                    # Here we select samples with an image_type equal to I or O and evidence equal to 1
                    'sample_slicer__attr': {'image_type':["I", "O"], 'evidence':[1]},
                    # Then we say that we want to balance image_type at subject-level
                    "balancer__attr": 'subject',

                    # We set up the estimator in a scikit-learn way
                    'estimator': [
                        ('fsel', SelectKBest(k=50)),
                        ('clf', SVC(C=1, kernel='linear'))],
                    'estimator__clf__C': 1,
                    'estimator__clf__kernel': 'linear',
                    
                    # Then the cross-validation object (also sklearn)
                    'cv': LeaveOneGroupOut,
                    
                    'scores': ['accuracy'],
                    
                    # Then the analysis
                    'analysis': RoiDecoding,
                    'analysis__n_jobs': -1,
                    
                    'analysis__permutation': 0,
                    
                    'analysis__verbose': 0,
                    
                    # Here we select the regions with values 1 to 5 in the image+type mask
                    'kwargs__roi_values': [('image+type', [1]), ('image+type', [2]), ('image+type', [3]),
                                            ('image+type', [4]), ('image+type', [5])],
                    
                    # We want to use subject for our cross-validation
                    'kwargs__cv_attr': 'subject'
                    }

configuration = AnalysisConfigurator(**_default_config,
                                     kind='configuration')
kwargs = configuration._get_kwargs()
a = AnalysisPipeline(configuration, name="roi_decoding_across_full").fit(ds, **kwargs)
a.save() 

Surf the code, starting from the classes used here!


pyitab's Issues

BIDS reader

Build a reader that loads files organized in BIDS and returns a pymvpa dataset.
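
A minimal sketch of what such a reader could look like, assuming pymvpa's fmri_dataset and vstack helpers; the glob pattern, the handled BIDS entities, and the function name are illustrative:

import glob
import os

from mvpa2.base.dataset import vstack
from mvpa2.datasets.mri import fmri_dataset


def load_bids_dataset(bids_root, subject, task, mask=None):
    """Collect the BOLD runs of one subject/task and stack them into a pymvpa ds."""
    pattern = os.path.join(bids_root, f"sub-{subject}", "func",
                           f"sub-{subject}_task-{task}*_bold.nii.gz")
    runs = sorted(glob.glob(pattern))
    if not runs:
        raise FileNotFoundError(f"no BOLD files match {pattern}")

    # One dataset per run; chunks encode the run index for cross-validation.
    datasets = [fmri_dataset(run, chunks=i, mask=mask)
                for i, run in enumerate(runs)]
    return vstack(datasets)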

sample_slicer bug in AnalysisIterator with configuration

When we use AnalysisIterator(kind='configuration'), it may happen that sample_slicer__attribute is not overwritten. Sometimes this is fine, since we want to slice different attributes, but sometimes it is not.

One solution could be to use a single dictionary instead of separate sample_slicer__attribute keys, as sketched below.
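
A sketch of the two shapes, with illustrative key names (today's per-attribute keys versus the proposed single dictionary-valued key):

# Today: each sliced attribute is its own key, so a later configuration
# may fail to overwrite an earlier one.
config_a = {'sample_slicer__image_type': ["I", "O"]}
config_b = {'sample_slicer__evidence': [1]}  # image_type from config_a can leak through

# Proposed: a single dictionary-valued key that is replaced atomically.
config_a = {'sample_slicer__attr': {'image_type': ["I", "O"]}}
config_b = {'sample_slicer__attr': {'evidence': [1]}}  # fully replaces the previous spec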

Cross-decoding

Cross-decoding should be performed with an ad-hoc split of examples during cross-validation:

Things to do:

  • Build a custom partitioner that uses the labels of one experiment for training and those of another for testing (see the sketch below).
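
A sketch of such a partitioner under sklearn's splitter conventions; the class name and the label-set semantics are assumptions:

import numpy as np


class CrossDecodingPartitioner:
    """Train on samples whose label is in train_labels, test on test_labels."""

    def __init__(self, train_labels, test_labels):
        self.train_labels = list(train_labels)
        self.test_labels = list(test_labels)

    def split(self, X, y, groups=None):
        y = np.asarray(y)
        train = np.flatnonzero(np.isin(y, self.train_labels))
        test = np.flatnonzero(np.isin(y, self.test_labels))
        yield train, test

    def get_n_splits(self, X=None, y=None, groups=None):
        return 1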

Issues

From @robbisg on June 13, 2018 12:38

  • PermutatorTransformer
  • Save function for Transformer
  • Cross-decoding How to
  • Within/Across Searchlight

Copied from original issue: robbisg/mvpa_itab_wu#24

States

From @robbisg on June 13, 2018 12:11

Implement states.

  • Build StateAnalyzer

Copied from original issue: robbisg/mvpa_itab_wu#22

FileNotFoundError in get_results

File "/home/robbis/git/joblib/joblib/externals/loky/process_executor.py", line 418, in _process_worker
r = call_item()
File "/home/robbis/git/joblib/joblib/externals/loky/process_executor.py", line 272, in call
return self.fn(*self.args, **self.kwargs)
File "/home/robbis/git/joblib/joblib/_parallel_backends.py", line 567, in call
return self.func(*args, **kwargs)
File "/home/robbis/git/joblib/joblib/parallel.py", line 225, in call
for func, args, kwargs in self.items]
File "/home/robbis/git/joblib/joblib/parallel.py", line 225, in
for func, args, kwargs in self.items]
File "/home/robbis/git/pyitab/pyitab/analysis/results/base.py", line 23, in get_values
with open(conf_fname) as f:
FileNotFoundError: [Errno 2] No such file or directory: configuration.json'

Generate an id for the Analyzer class

The Analyzer class should generate an id to be used in conjunction with analysis.name.

If we use the iterator, should the responsibility for id generation fall on the iterator?
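
One possible scheme, purely illustrative: combine the analysis name with a timestamp and a short random suffix. Where this lives (Analyzer or iterator) is exactly the open question above.

import uuid
from datetime import datetime


def make_analysis_id(name):
    """Build an id like 'roi_decoding_20180613-123800_a1b2c3'."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return f"{name}_{stamp}_{uuid.uuid4().hex[:6]}"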

Coverage

analysis

configurator.py

  • test _get_fname_info
  • test _get_kwargs (maybe deprecated)
  • test save

pipeline.py

  • Test fit
  • test lines 53/54
  • test save

iterator.py

  • Test iterator
  • test combination, list and dict types

base.py

  • test save

results/base.py

  • test get_values
  • test get_results
  • test filter_dataframe
  • test df_fx_over_keys

io

subjects.py

  • Test with selected_subjects

configuration.py

  • Test save_configuration

Loading package

  • Test fetch with n_subjects=1
  • Test load_ds with subjects without data (e.g. monks data)

Scoring Issue in TemporalDecoding

ValueError: Classification metrics can't handle a mix of binary and unknown targets

This error is raised if scoring is set.

The current workaround is to set self.scoring = None in temporal_decoding.py at line 100.
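
A hypothetical sketch of applying the workaround without editing the file; the module path mirrors roi_decoding and the fit signature is assumed to take a dataset plus keyword arguments:

from sekupy.analysis.decoding.temporal_decoding import TemporalDecoding


class PatchedTemporalDecoding(TemporalDecoding):
    """Nulls out the scorer before fitting (a workaround, not a real fix)."""

    def fit(self, ds, **kwargs):
        self.scoring = None  # sidestep the binary/unknown-targets ValueError
        return super().fit(ds, **kwargs)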

Write results functions.

From @robbisg on June 5, 2018 10:23

I need to write functions for writing results.

Questions:

  • Where should path be used?
  • Use a different saver for the transformers/analyzers?

Searchlight results

  • Maps for each cv fold
  • Average map if within-subject cross-validation is performed
  • Merged map for a statistical test
  • Files / something for AFNI tests
  • Permutation maps?

Decoding

  • Weight map
  • Feature selection
  • Cross validation indices
  • Accuracies / Avg. Accuracy
  • Permutation values

Connectivity

States

Cross-decoding

Copied from original issue: robbisg/mvpa_itab_wu#18

New features

Permutation

  • Is it better to use a transformer? #82

Cross-decoding

  • Build a custom CrossValidator

Transformers

Connectivity

See #60

States

  • Build StateAnalyzer
  • Build Transformers for states
  • Import metrics from mvpa_itab

Results

  • Refactor results using BIDS format of derivatives #27
  • Cope with subject-wise results.

Configurator, Pipelines and Iterator

  • Create a Configurator class that is general. #30
  • Create a DecodingConfigurator class that inherits from Configurator. #30
  • Add Dataset Loading into Configurator #31
  • Study the possibility of using the iterator to run subject-wise analyses.

Deprecated

  • Evaluate load_spatiotemporal_dataset
  • Evaluate read_configuration_json
  • Evaluate read_remote_configuration

Averager/PCA/MVPC Transformer

We need a transformer that, using the fa attributes of ROIs (or something else), transforms the data by averaging, extracting principal components, or computing MVPC.
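
A sketch (not sekupy's actual API) of the averaging variant; swapping the mean for a PCA or a multivariate-distance step would give the other variants:

import numpy as np


class RoiAverager:
    """Average features within each ROI, given per-feature ROI labels
    (e.g. taken from the dataset's fa attributes)."""

    def __init__(self, roi_labels):
        self.roi_labels = np.asarray(roi_labels)
        self.rois_ = np.unique(self.roi_labels)

    def transform(self, X):
        # X is samples x features; the result is samples x n_rois.
        return np.column_stack([X[:, self.roi_labels == roi].mean(axis=1)
                                for roi in self.rois_])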

Error on detrending

ValueError: Cannot detrend the dataset, since it neither provides location information of its samples in the space spanned by the polynomials, nor does it match the number of samples this mapper has been trained on (got: 360 and was trained on 240).

IndexError: too many indices for array

pyitab/analysis/searchlight/__init__.py in _split_name(self, X, y, cv, groups)
    171     X = X[..., 0]
    172
--> 173     split = [np.unique(groups[:, 1][test])[0] for train, test in cv.split(X, y=y, groups=groups)]
    174     return split

It depends on the groups attribute, which can sometimes be a plain list; in that case a different solution is needed, for example normalizing it first (see the sketch below).
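
A hedged sketch of one fix: normalize groups to a 2-D array before indexing its columns, so a plain list no longer triggers the IndexError. The helper name is illustrative:

import numpy as np


def as_group_matrix(groups):
    """Return groups as a 2-D (n_samples, n_keys) array, whatever came in."""
    groups = np.asarray(groups)
    if groups.ndim == 1:  # a plain list or 1-D array
        groups = groups[:, np.newaxis]
    return groups

# as_group_matrix(groups)[:, -1] could then replace groups[:, 1] in
# _split_name without assuming a second column exists.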

Within subject analyses

How do we cope with this?

  1. Use the iterator, iterating the fetch (#31) ➡️ this may cause problems in saving (#27) (but we can use the AnalysisPipeline save subdir function!)
  2. Maybe it is better to develop #27 first.
  3. For some analyses (connectivity) we need the iteration machinery anyway.

Create a general Configurator class.

This class is responsible for:

  • Loading the dataset loader
  • Saving the configuration
  • Preparing all the objects needed by the AnalysisPipeline class

Specific analysis configurators should inherit from this class!

Convert the results directory structure to a BIDS-ish layout!

Problem

Use the BIDS specs to create directories and filenames!

Every analysis must be included in the derivatives folder, and the structure must be:

<ds-root>/derivatives/<pipeline-name>/<subj-dir>

In the subject dir the files must be named as:

<source_keyword>[_keyword-value]_<suffix>.<ext>

In addition, a dataset-description.json must be included. A path-building sketch follows.
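
A sketch that composes this layout; the helper names and the entity handling are illustrative:

import os


def derivatives_dir(ds_root, pipeline_name, subject):
    return os.path.join(ds_root, "derivatives", pipeline_name, f"sub-{subject}")


def result_fname(source_keyword, suffix, ext, **entities):
    parts = [source_keyword]
    parts += [f"{key}-{value}" for key, value in sorted(entities.items())]
    return "_".join(parts + [suffix]) + "." + ext

# result_fname("sub-01", "data", "mat", mask="brain", target="image_type")
# -> 'sub-01_mask-brain_target-image_type_data.mat'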

  1. The best solution is that <pipeline-name> is the one provided by AnalysisPipeline or by Analyzer
  2. dataset-description.json must include PipelineDescription.Name
  3. Maybe we can use keyword-value in file to specify different analyses.
    • A possibility is to use <pipeline-name>-<variant>
  4. Each subdir file should be described by a filename.json file with:

TODO

  • Every analysis must have a pipeline name
  • We must check whether the dataset is composed of a single subject or of several subjects
  • The derivatives/<subdir> must be [pipeline-<pipeline_name>]_analysis-<searchlight | decoding | connectivity>_[<analysis-specific-key>-<value>]
  • Analyzer must implement _get_pipeline_info to build dir name
  • For searchlight analyses we must have derivatives/analysis-searchlight_radius-0.3
  • For decoding analyses we must have derivatives/analysis-roi_decoding_area-brain
  • Obtain <source_keywords>
  • Analyzer must implement _get_fname_info in order to build filename appropriately.
  • Filenames for searchlight analyses must be <source_keywords>_target-<values>_task-<value>_date-<datetime>_num-<number>_<keyf>-<value>_<avg | cv>.nii.gz
  • Filenames for decoding analyses must be <source_keywords>_target-<values>_task-<task>_mask-<mask>_value-<roi_value>_date-<datetime>_num-<num>_<key>-<value>_data.mat
  • Each filename is paired with filename.json which is our configuration file! #43
  • We can split information about dataset in dataset-description.json at the top of the pipeline dir #43

Fixes

  • Remove underscores from directory and file names #41
  • Analyses with sample_slicer__subjects must share the same derivatives dir #42
  • Results reader
  • Remove hard-coded configuration.json #43

Within subject searchlight

From @robbisg on June 5, 2018 16:02

The problem is that I need to run a within-subject searchlight using AnalysisScript and save the results.

Possible solutions:

  • Use a script iterator that generates a unique id, then collect the results using this id.
  • Insert a name in the ScriptConfigurator or AnalysisIterator that AnalysisScript should use to
    generate a folder in which to store the data.

Copied from original issue: robbisg/mvpa_itab_wu#20

Connectivity Analyses

From @robbisg on June 13, 2018 9:37

I need to implement several things to perform a connectivity analysis:

Transformers

  • Averager across ROI
  • PCA
  • Multivariate distance
  • ...

Analysis / Measures (see nilearn)

  • Seed based Correlation
  • ICA

Preprocessing / Transformers?

  • GLM regressor

Copied from original issue: robbisg/mvpa_itab_wu#21

Searchlight partial results

The problem with searchlight is that partial results may be lost if a machine goes down.
Maybe joblib's Memory utility could be used to store partial results, as sketched below.
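
A sketch of how joblib.Memory could cache per-sphere scores so a crashed run resumes from disk; the function name and its arguments are illustrative:

from joblib import Memory

memory = Memory(location="/tmp/searchlight_cache", verbose=0)


@memory.cache
def score_sphere(center, ds_hash):
    """Score the sphere centred at `center` (placeholder body).

    ds_hash keeps the cache keyed to a specific dataset.
    """
    # A real implementation would slice the dataset and cross-validate here.
    return 0.0

# On a re-run, spheres already computed are read back from disk
# instead of being recomputed.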
