brainiak / brainiak-tutorials Goto Github PK

Tutorials that cover topics from basics to advanced fMRI analysis

License: Apache License 2.0

Jupyter Notebook 3.79% Shell 0.05% Python 0.33% HTML 95.83%

brainiak-tutorials's Introduction

Brain Imaging Analysis Kit

The Brain Imaging Analysis Kit is a package of Python modules useful for neuroscience, primarily focused on functional Magnetic Resonance Imaging (fMRI) analysis.

The package was originally created by a collaboration between Intel and the Princeton Neuroscience Institute (PNI).

To reduce verbosity, we may refer to the Brain Imaging Analysis Kit using the BrainIAK abbreviation. Whenever lowercase spelling is used (e.g., Python package name), we use brainiak.

Quickstart

If you have Conda:

conda install -c brainiak -c defaults -c conda-forge brainiak

Otherwise, or if you want to compile from source, install the requirements (see docs/installation) and then install from PyPI:

python3 -m pip install brainiak

Note that to use the brainiak.matnormal package, you need to install additional dependencies. As of October 2020, the required versions are not available as Conda packages, so you should install from PyPI, even when using Conda:

python3 -m pip install -U tensorflow tensorflow-probability

Note that we do not support Windows.

Docker

You can also test BrainIAK without installing it using Docker:

docker pull brainiak/brainiak
docker run -it -p 8888:8888 -v brainiak:/mnt --name demo brainiak/brainiak

To run Jupyter notebooks in the running container, try:

python3 -m notebook --allow-root --no-browser --ip=0.0.0.0

Then visit http://localhost:8888 in your browser and enter the token. Protip: run screen before running the notebook command.

Note that we do not support MPI execution using Docker containers and that performance will not be optimal.

Support

If you have a question or feedback, chat with us on Gitter or email our list at [email protected]. If you find a problem with BrainIAK, you can also open an issue on GitHub.

Examples

We include BrainIAK usage examples in the examples directory of the code repository, e.g., funcalign/srm_image_prediction_example.ipynb.

To run the examples, download an archive of the latest BrainIAK release from GitHub. Note that we only support the latest release at this moment, so make sure to upgrade your BrainIAK installation.

Documentation

The documentation is available at http://brainiak.org/docs.

Contributing

We welcome contributions. Have a look at the issues labeled "easy" for starting contribution ideas. Please read the guide in CONTRIBUTING.rst first.

Citing

Please cite BrainIAK in your publications as: "Brain Imaging Analysis Kit, http://brainiak.org." Additionally, if you use RRIDs to identify resources, please mention BrainIAK as "Brain Imaging Analysis Kit, RRID:SCR_014824". Finally, please cite the publications referenced in the documentation of the BrainIAK modules you use, e.g., SRM.

brainiak-tutorials's People

Contributors

Stargazers

Watchers

brainiak-tutorials's Issues

installing brainiak using Conda

Greetings,

I am trying to install brainiak using conda but I get the following error. Could you please advise how to solve the issue?
The command I use is:
"conda install -c brainiak -c defaults -c conda-forge brainiak"
and the error is
"
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

brainiak

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.
"

Thanks much,
Mheid

Brainiak installation erased when opening a new collab tab

Hello,

I tried following the installation steps and installed brainiak in the dedicated collab notebook. However, when I opened another tab to run a different brainiak tutorial notebook, I got an error that python couldn't find the nilearn or brainiak modules. What should I have done differently?

Thank you!

03-classification: Updates

These are some suggestions from a reviewer of the notebook to enhance 03-classification:

The stimulus timing figure in section 1.2 should be labeled as time (s), not TR
Switching back and forth between measuring time in TRs or seconds is confusing. It might be useful to explicitly address this. In section 1.3 there is a hypothetical experiment with a TR of 2, which further adds to the potential confusion. Personally, I'd use seconds when displaying any continuous function, and TRs for discrete events, such as stimuli locked to the acquisition
The figure in section 1.3.2 is nice because the rest periods are explicitly marked. The second figure has the x-axis labeled as TR, when it should be time (seconds). The scale is apparently inconsistent with the array sizes.

Running FCMA - RuntimeError

Hello all,

I just started playing with brainiak. I'm trying to go through the tutorial and example dataset for FCMA locally. I have set up brainiak as instructed. When I run the bash script run_fcma_voxel_selection_cv.sh, I got the error saying

raise RuntimeError('one process cannot run the '
RuntimeError: one process cannot run the master-worker model

I checked the code this error was pointing to, it looks it is caused by MPI.COMM_WORLD.Get_size() == 1. I assume it means that it runs only 1 thread. So how/where should I specify that I want more cores to be involved (laptop has 8 cores, 16G RAM)? I have tried to change the value of the variable OMP_NUM_THREADS in run_fcma_voxel_selection_cv.sh, but it did not help.

Any insights would be super helpful, thank you all very much in advance!

Document minimum Python version required

Note that currently this is different from what BrainIAK requires. BrainIAK works with Python 3.5, but at least one of the packages used in the tutorials, Nxviz, requires Python 3.6:
https://github.com/ericmjl/nxviz#requirements

[Tutorial 1] Suggestion: use a neuroimaging viewer

Instead of matplotlib, I believe that this tutorial would be a lot more effective using nipy/niwidgets or nibabel's orthoview. The rationale is that niwidgets will allow students to navigate the 3D data.

02-data-handling: Updates

These are some suggestions to improve the 02-data-handling notebook:

There is a reference to the glossary in the introduction, but the glossary was quite far down in the document. Can you add a hyperlink?
In the glossary, "(TR) is the time interval at which pulses occur and signal is collected". It's not obvious what the pulses are. From the perspective of fMRI, the TR is the time in seconds between acquisition of volumes of data.
Consider using the term "Functional Localizer" to "Localizer".
Using line graphs in section 2.2 might suggest that the stimuli are continuous. In particular, the rest periods are not marked.
Remove third 't' from 'patttern' in section 4.1

Is there a solution to exercises?

I was doing exercise 8 from the Data Handling tutorial, but I'm not sure if I did it correctly. Is there a solution to this exercise?

License

Please add a license to clearly state the reuse potential of these tutorials

[Tutorial +2] What is a localizer dataset?

The concept of localizer is used but never described. A little note describing what kinds of analysis and the design of localizer datasets would be very much appreciated in Tutorials 2 and 3.

A new 'dimensionality reduction' slide needed

It's just a small issue. The dimensionality slide link is not working. An update link or a new slide might be needed.

04-dimensionality reduction

Some suggestions to enhance the 04-dimensionality reduction notebook:

Personally, I prefer a different formulation for correlation. See eq.3 in https://en.wikipedia.org/wiki/Pearson_correlation_coefficient. It's also easier to calculate since all the (n-1)s cancel
In section 3.3 I had problems coloring the scatter plot by label - one label was always missing when I added c = labels to the scatter call. I eventually tracked this down the the colormap, with one label ending up white. Adding cmp='plasma_r' sorted it out. I think that this issue came from the colorblind palette
In exercise 8 (PCA challenge), I'm not sure what is being asked for. Given that with n=7 I get perfect accuracy, I could try dropping components. Should this be a manual search, or an exhaustive search of all combinations of 1, 2, 3, 4, ... components?
In the univariate feature selection section, using the anova F as feature selection before doing SVC seems very circular. While it shouldn't contaminate the test set, it seems inevitable that it will give perfect performance on training. This is of course different to PCA, which is blind to the labels. This would seem an example for nested cross-validation in the next section

[Tutorial 11] Input shape usage of ISC in tutorial is not the same as it described in docstring

In code cell [12] of tutorial 11,

# Reorganize the data back into an appropriate space for ISC
raw_obj = np.zeros((train_data[0].shape[0], train_data[0].shape[1], len(train_data)))
for ppt in range(len(train_data)):
    raw_obj[:, :, ppt] = train_data[ppt]
    
# Perform ISC on all participants, collapsing across participants    
corr_raw = isc(raw_obj, summary_statistic='mean')
corr_raw = np.nan_to_num(corr_raw)

the input shape of raw_obj to func isc is (n_voxel, n_TRs, n_subjects).

However, I find it different in docstring of isc:

    Parameters
    ----------
    data : list or ndarray (n_TRs x n_voxels x n_subjects)
        fMRI data for which to compute ISC
...
    Returns
    -------
    iscs : subjects or pairs by voxels ndarray
        ISC for each subject or pair (or summary statistic) per voxel

It's clear that in tutorial 11 the n_voxel and n_TRs of raw_obj are in a reversed positions.
Is it with a special concern for special purpose, or just a mistake?

Thanks

FCMA batch scripts do not run on non-cluster environments

I'm not entirely sure to what extent this is by design, but here goes:
when I attempted to run tutorials/09-fcma/run_fcma_voxel_selection_cv.sh as part of the FCMA tutorial it failed with the following error message:

Programming starts in 1 process(es)
pixdim[0] (qfac) should be 1 (default) or -1; setting qfac to 1
2020-05-28 13:51:13,542 - nibabel.global - INFO - pixdim[0] (qfac) should be 1 (default) or -1; setting qfac to 1
2020-05-28 13:51:13,548 - brainiak.fcma.preprocessing - INFO - start to apply masks and separate epochs
Traceback (most recent call last):
  File "./fcma_voxel_selection_cv.py", line 78, in <module>
    vs = VoxelSelector(labels_subsampled, epochs_per_subj, num_subjs - 1, raw_data_subsampled)
  File "/home/cjunger/my-envs/mybrainiak/lib/python3.6/site-packages/brainiak/fcma/voxelselector.py", line 138, in __init__
    raise RuntimeError('one process cannot run the '
RuntimeError: one process cannot run the master-worker model

Of course the culprit here is that, in non-cluster configurations, the script calls python ./fcma_voxel_selection_cv.py without going through mpirun (as per the FCMA example scripts in the main brainiak package). prepending the python call with mpirun -np 2 fixes the issue.

Similarly, the script run_fcma_classification.sh is set up to completely refuse to run on non-cluster configurations. perhaps adding an option to run it using mpirun -np 2 (with a lower number for OMP_NUM_THREADS perhaps) could be added? I'd be happy to submit a pull request if you'd like.

Nilearn surface plotting incompatible with matplotlib 3.1

Surface plotting (e.g. 10-ISC) errors out, see here: neurolibre#6

The current workaround is to downgrade to an older version of matplotlib==3.0.3.

The fix for this issue will be released in the next version of nilearn.

Suggestions to Dockerfile

Three tiny suggestions:

Clean up apt cache after

brainiak-tutorials/Dockerfile

Line 30 in 64576de

libssl-dev

, adding:

    && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Remove line 32, by updating line 33 with

COPY brainiak /mnt/brainiak

Use the --no-cache-dir option of pip.

python3 -m pip install --no-cache-dir --user -U -r tutorials/requirements.txt

and so on, so forth.

[Tutorial 3] Suggestion: link to scikit-learn about cross-validation

https://brainiak.org/notebooks/tutorials/html/03-classification.html#2.-Classification would improve just by linking to https://scikit-learn.org/stable/modules/cross_validation.html

signal reconstruction weights in SRM tutorial

During signal reconstruction, when reconstructing the voxels for the full dataset, the tutorial applies the weights of subject 1 to the remaining subjects, rather than creating subject specific weightings.

signal_srm = np.zeros((test_data[0].shape[0], test_data[0].shape[1], len(test_data)))
for ppt in range(len(test_data)):
    signal_srm[:, :, ppt] = w0.dot(shared_test[ppt]) ##here the w0 array should be estimated for each subject

a potential fix could be

signal_srm = np.zeros((test_data[0].shape[0], test_data[0].shape[1], len(test_data)))
for ppt in range(len(test_data)):
    w = srm.w_[ppt]
    signal_srm[:, :, ppt] = w.dot(shared_test[ppt])

[Tutorial 3] Clearly state the limitations of LORO as such

I would suggest changing the title here

brainiak-tutorials/tutorials/03-classification.ipynb

Line 814 in 32f4e32

 "**Using LORO ensures that the classifier is tested on an unseen, independent dataset.**\n", 

with

**LORO is a good first step for cross-validation, but it is too limited for most real-world analyses.**\n

Then, the following paragraph should highlight why this is limited and why it is a better idea to leave one participant out, for instance.

01-Setup: Add information on how to determine type and size of object

It might be useful to include more information about determining the types and sizes of python objects. Combining python lists and numpy arrays can be a bit confusing.

import error-vdc mask

ImportError Traceback (most recent call last)
/scratch/local/ipykernel_188741/2146300536.py in
1 # load some helper functions
----> 2 from utils import load_vdc_mask, load_vdc_epi_data, load_vdc_masked_data
3 from utils import vdc_data_dir, vdc_all_ROIs, vdc_label_dict, vdc_n_runs, vdc_hrf_lag, vdc_TR, vdc_TRs_run # load some constants
4
5 get_ipython().run_line_magic('matplotlib', 'inline')

ImportError: cannot import name 'load_vdc_mask' from 'utils' (/home/pinarde/.conda/envs/venv/lib/python3.7/site-packages/utils/init.py)

FCMA tutorial CircosPlot epoch_coords

Hello all,

I have got to the plotting part of the FCMA tutorial. Everything so far makes perfect sense until the example code reads in the np array epoch_corr_coords.npy. I don't think other parts of the tutorial mentioned anything about what this array is and how it was generated. I'm new to CircosPlot and get confused when reading the code. It would be easier to navigate the code if any insights could be given on what information does this array contain and how it stores the info (i.e.,its shape).

It is the last code chunk in the FCMA tutorial, and I'm referring to the epoch_coords variable which reads in epoch_corr_coords.npy.

# %matplotlib inline

# What is the (absolute) correlation threshold
threshold = 0.95

# Load in the data
plot_out_dir = os.path.join(output_dir, 'plotting_out')
epoch_corr = np.load(os.path.join(plot_out_dir,"epoch_corr.npy"))
epoch_coords = np.load(os.path.join(plot_out_dir,"epoch_corr_coords.npy"))

# Preset the graph
G = nx.Graph()

# Create the edge list
nodelist = []
edgelist = []
for row_counter in range(epoch_corr.shape[0]):
    nodelist.append(str(row_counter))  # Set up the node names
    
    for col_counter in range(epoch_corr.shape[1]):
        
        # Determine whether to include the edge based on whether it exceeds the threshold
        if abs(epoch_corr[row_counter, col_counter]) > threshold:
            # Add a tuple specifying the voxel pairs being compared and the weight of the edge
            edgelist.append((str(row_counter), str(col_counter), {'weight': epoch_corr[row_counter, col_counter]}))
        
# Create the nodes in the graph
G.add_nodes_from(nodelist)

# Add the edges
G.add_edges_from(edgelist)

# Set the colors and grouping (specify a key in a dictionary that can then be referenced)
for n, d in G.nodes(data=True):
    
    # Is the x coordinate negative (left)
    if epoch_coords[0][int(n)] < 0:
        if epoch_coords[1][int(n)] < 0:
            G.node[n]['grouping'] = 'posterior_left'
        else:
            G.node[n]['grouping'] = 'posterior_right'
    else:
        if epoch_coords[1][int(n)] < 0:
            G.node[n]['grouping'] = 'anterior_left'
        else:
            G.node[n]['grouping'] = 'anterior_right'

# plot the data
c = CircosPlot(graph=G, node_grouping='grouping', node_color='grouping', group_label_position='middle',figsize=(10,6))
c.draw()
plt.title('Circos plot of epoch data')

Thank you all very much in advance!

[Tutorial 13] Figure output 14 - use niwidgets?

Using niwidgets to plot this figure would allow the student to navigate the weights.

[Tutorial 6] Please use a diverging colormap to plot correlations

... because these tutorials try to observe good practices ;)

EDIT: and if linear, diverging colormaps don't show clear differences that need to be seen, then make data transformation explicit (e.g., log on the colormap).

Reuse downloaded data and tutorial in Colab

Hi, I need to study searchlight and I want to use the BrainIAK tutorial on colab. Is there a way we don't have to run the following code and download the datasets every time? Because I may study this tutorial many times, it is not sufficient for me to run and download the corresponding data every time since it uses a lot of data!

  !pip install deepdish ipython matplotlib nilearn notebook pandas seaborn watchdog
  !pip install pip\<10
  !pip install -U git+https://github.com/brainiak/brainiak
  !git clone https://github.com/brainiak/brainiak-tutorials.git
  !cd brainiak-tutorials/tutorials/; cp -r 07-searchlight 09-fcma 13-real-time utils.py setup_environment.sh /content/
  !mkdir /root/brainiak_datasets

Tutorial fixes

Tutorial 2 - data-handling

attention instead of 'attendtion'
compare instead of 'compared' in Exercise 5

Tutorial 3 - Classification

for the comment '# Calculate the accuracy for the hold out run' , it should be 'It should be calculating the accuracy on the training set itself' as the model is double dipping on the same set without division between training and test.

Tutorial 4 - Dimensionality Reduction

The link in 'The steps below are based on this example in scikit-learn' does not work. Returns 404 error

Tutorial 5 - Classifier Optimization

For cost C hyperparameter, the dataset shows rank_test_score = 1 for all values of C > 0.0001, so hyperparameter does not affect the fit score (rank) strongly. Some C values less than 0.0001 can be added in the tutorial to see some difference in fit due to C value.
L1 regularisation in logistic regression does not work as default solver has changed from ‘liblinear’ to ‘lbfgs’ in Scikit learn 0.22. This can be solved by adding solver = 'liblinear'
logreg_l1 = LogisticRegression(penalty='l1',solver = 'liblinear')
The link in 'The steps below are based on this example in scikit-learn' does not work.

05-Optimzation: Enhancements and fixes

The following items can be improved in this notebook:

The early sections "Recap" and "Dataset" are almost identical, so redundant
Exercise 1
Presumably the expectation is to separate the train/test sets for the classifier and also for the voxel selection. It might be worth emphasizing that using all the data for voxel selection is a common but subtle error. There are probably quite a few good examples in the literature that got past less technical reviewers
In this example, I consistently get slightly below chance performance. I believe that this is driven by the cross-validation, see:
Classification based hypothesis testing in neuroscience: Below‐chance level classification rates and overlooked statistical properties of linear parametric classifiers. HBM 2016
Another subtle example of bias is given in the following by Watts et al 😊 Potholes and Molehills: Bias in the Diagnostic Performance of Diffusion-Tensor Imaging in Concussion. Radiology 2014
In 3.1 Grid search
Strictly, the dependence of the number of combinations on granularity of the grid search is not exponential
3.2 Regularization Example: L2 vs L1
L1 regularization now requires solver='saga' in LogisticRegression call for L1 penalty. This is probably a change in the default behavior of Scikit Learn
4. Build a Pipeline
As with 3.1, there seem to be a lot of parameters that give perfect accuracy. Maybe classifying by blocks is too easy, and the number of blocks is relatively low, so big steps in accuracy
c_steps = [10e-1, 10e0, 10e1, 10e2] is confusing notation for exponents

[Tutorial 11] Missing image

https://brainiak.org/notebooks/tutorials/html/imgs/lab11/srm_time_segment_matching.png

Current RSA tutorial does not reflect BRSA

I think this should be at least declared in the tutorial, in case people who are looking for BRSA examples get confused.

Google Colab: cannot run two notebooks in one runtime environment.

Details are in the link: googlecolab/colabtools#773

Google has changed their functionality for Colab. We will need to update instructions on the tutorials website and provide a workaround. For now, a user needs to copy the cells from our colab setup notebook and paste them into the notebook that they want to run, before executing the notebook.