plstuto's People

Contributors

allcontributors[bot], diiobo, htwangtw, leonieborne, nadinespy

plstuto's Issues

Tutorial 2. Data reduction

Would you like to participate in the writing of this tutorial?
Or do you have a question about this tutorial?
Let us know here!

Description

This tutorial focuses on dimensionality-reduction techniques (PCA, ICA, etc.), which can be a useful preprocessing step when the number of variables exceeds the number of samples.

Useful references

  • Section 5.2 CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)

Future of this project

If people are thinking of continuing the project, I am happy to advise and comment in discussions. Could we bring in more people as contributors to help monitor the project? @LeonieBorne will remain the owner of the project. @nadinespy seems really keen, and I think it would be great to have more people with fresh eyes.

On the practical side, I would highly recommend listing the project's dependencies, with their versions, so that everyone works with the same versions of the libraries.
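One possible way to do this, sketched here with the standard library only, is to record the versions installed in a working environment as pinned requirement lines. The helper name `pinned_requirements` and the package list are hypothetical and chosen for illustration; the project may prefer another mechanism.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return 'name==version' lines for the given installed packages."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Keep a visible note instead of failing on a missing package.
            lines.append(f"# {name} is not installed")
    return lines


# Hypothetical package list; adapt it to what the tutorials actually import.
requirements = pinned_requirements(["numpy", "scikit-learn"])
print("\n".join(requirements))
```

The printed lines can be saved to a requirements file that contributors install from.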

Tutorial 3. Model selection

Would you like to participate in the writing of this tutorial?
Or do you have a question about this tutorial?
Let us know here!

Description

This tutorial introduces the different techniques used to evaluate, validate, and select a model.

  • How to choose the optimal number of latent sources of variation to be extracted?
  • How to evaluate the contribution of each individual input variable to the overall modeling solution?
  • How to compare the models?

Useful references

  • Section 5.3 CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)
  • Section 4.6 PLS-PM: "PLS Path Modeling with R" Gaston Sanchez
  • Comparison CCA/PLS: Rahim, Mehdi, Bertrand Thirion, and Gaël Varoquaux. "Multi-output predictions from neuroimaging: assessing reduced-rank linear models." 2017 International Workshop on Pattern Recognition in Neuroimaging (PRNI). IEEE, 2017.
  • Permutation inference for CCA: Winkler, Anderson M., et al. "Permutation inference for Canonical Correlation Analysis." arXiv preprint arXiv:2002.10046 (2020).

Databases to showcase

To write the different tutorials, we need open-access databases to work with. If you have any ideas, feel free to suggest them here, or start looking for one on OpenNeuro!

Tutorial 1. Data preprocessing

Would you like to participate in the writing of this tutorial?
Or do you have a question about this tutorial?
Let us know here!

Description

This tutorial focuses on the minimal data preprocessing usually required, as for most machine-learning methods, including among other things:

  • z-scoring of each variable,
  • outlier detection,
  • missing values processing,
  • deconfounding procedures.
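The four steps above can be sketched with NumPy alone, as below. This is only one simple variant of each step: mean imputation, a threshold of 3 standard deviations for outliers, and linear regression of the confounds are assumptions chosen for the example, and the data and confound variables are synthetic.

```python
import numpy as np

# Synthetic data (assumption): 4 variables partly driven by 2 confounds.
rng = np.random.default_rng(0)
n_samples = 50
confounds = rng.standard_normal((n_samples, 2))
X = rng.standard_normal((n_samples, 4)) + confounds @ rng.standard_normal((2, 4))
X[3, 1] = np.nan  # inject one missing value

# 1. Missing values: replace each NaN with the mean of its column.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)

# 2. z-scoring: zero mean and unit variance for each variable.
X_z = (X_imputed - X_imputed.mean(axis=0)) / X_imputed.std(axis=0)

# 3. Outlier detection: flag entries more than 3 standard deviations away.
outlier_mask = np.abs(X_z) > 3

# 4. Deconfounding: regress out the confounds (plus an intercept) by
#    ordinary least squares and keep the residuals.
design = np.column_stack([np.ones(n_samples), confounds])
beta, *_ = np.linalg.lstsq(design, X_z, rcond=None)
X_clean = X_z - design @ beta

print(outlier_mask.sum(), X_clean.shape)
```

When the model is evaluated on held-out data, the means, standard deviations, and regression coefficients should be estimated on the training samples only and then applied to the test samples.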

Useful references

  • Section 5.1 CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)

Roadmap

This issue contains the roadmap of this project. It is a good place to start exploring the issues you can contribute to.

Please note that the list of proposed tutorials is by no means exhaustive. If you wish to add or modify some of them, do not hesitate to suggest it by creating a new issue!

General

Here is a (non-exhaustive) list of points to be dealt with before/during/after the tutorials have been written.

Tutorial 0. Introduction #5

The objective of this introductory tutorial is to explain the general principles of cross-decomposition algorithms, their possible applications and practical considerations. It should introduce and refer to the other tutorials.

This tutorial should also give an overview of the different cross-decomposition algorithms that exist, including CCA, PLS regression, PLS canonical, PLS-PM (for more than two blocks of variables), etc.

Useful references

  • Cross-decomposition in scikit-learn: scikit-learn documentation for the cross-decomposition module (CCA, PLS regression, PLS canonical). Note that the documentation should be updated soon (see current pull request, corresponding branch).
  • PLS-PM: "PLS Path Modeling with R" Gaston Sanchez
  • PLS-PM in Python
  • PLS methods for neuroimaging: Krishnan, Anjali, et al. "Partial Least Squares (PLS) methods for neuroimaging: a tutorial and review." Neuroimage 56.2 (2011): 455-475.
  • CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)

Tutorial 1. Data preprocessing #6

This tutorial focuses on the minimal data preprocessing usually required, as for most machine-learning methods, including among other things:

  • z-scoring of each variable,
  • outlier detection,
  • missing values processing,
  • deconfounding procedures.

Useful references

  • Section 5.1 CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)

Tutorial 2. Data reduction #7

This tutorial focuses on dimensionality-reduction techniques (PCA, ICA, etc.), which can be a useful preprocessing step when the number of variables exceeds the number of samples.

Useful references

  • Section 5.2 CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)

Tutorial 3. Model selection #8

This tutorial introduces the different techniques used to evaluate, validate, and select a model.

  • How to choose the optimal number of latent sources of variation to be extracted?
  • How to evaluate the contribution of each individual input variable to the overall modeling solution?
  • How to compare the models?

Useful references

  • Section 5.3 CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)
  • Section 4.6 PLS-PM: "PLS Path Modeling with R" Gaston Sanchez
  • Comparison CCA/PLS: Rahim, Mehdi, Bertrand Thirion, and Gaël Varoquaux. "Multi-output predictions from neuroimaging: assessing reduced-rank linear models." 2017 International Workshop on Pattern Recognition in Neuroimaging (PRNI). IEEE, 2017.
  • Permutation inference for CCA: Winkler, Anderson M., et al. "Permutation inference for Canonical Correlation Analysis." arXiv preprint arXiv:2002.10046 (2020).

Tutorial 0. Introduction

Would you like to participate in the writing of this tutorial?
Or do you have a question about this tutorial?
Let us know here!

Description

The objective of this introductory tutorial is to explain the general principles of cross-decomposition algorithms, their possible applications and practical considerations. It should introduce and refer to the other tutorials.

This tutorial should also give an overview of the different cross-decomposition algorithms that exist, including CCA, PLS regression, PLS canonical, PLS-PM (for more than two blocks of variables), etc.

Useful references

  • Cross-decomposition in scikit-learn: scikit-learn documentation for the cross-decomposition module (CCA, PLS regression, PLS canonical). Note that the documentation should be updated soon (see current pull request, corresponding branch).
  • PLS-PM: "PLS Path Modeling with R" Gaston Sanchez
  • PLS-PM in Python
  • PLS methods for neuroimaging: Krishnan, Anjali, et al. "Partial Least Squares (PLS) methods for neuroimaging: a tutorial and review." Neuroimage 56.2 (2011): 455-475.
  • CCA for neuroscientists: Wang, Hao-Ting, et al. "Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists." NeuroImage (2020)
