Git Product home page Git Product logo

ecophys-ecoviz's Introduction

EcoViz Ecophysiology Seal Sleep Classifier

Folder structure

├── 00_Data_Exploration.ipynb
├── DESCRIPTION
├── README.md
├── data
│   └── README.md
├── figs
│   ├── README.md
│   ├── light-sleep-spectral-power.png
│   ├── rem.png
│   ├── slow-wave-1.png
│   ├── slow-wave-2-spectral-power.png
│   └── slow-wave-2.png
├── notebooks
│   ├── 00_initial_feature_extraction.ipynb
│   ├── 01_initial_models.ipynb
│   ├── 01a_YASA_feature_extraction.ipynb
│   ├── 02_advanced_feature_generation_and_models.ipynb
│   ├── 03_recursive_feature_elimination.ipynb
│   ├── 04_lgbm_feature_viz.ipynb
│   ├── Basic GBTM Example.ipynb
│   ├── Boosted Decision Tree Experiments.ipynb
│   └── README.md
├── output
│   └── README.md
├── paper
│   └── README.md
├── pipeline
│   ├── 00_download_data.R
│   ├── 00_download_data.py
│   ├── _run_pipeline.R
│   └── _run_pipeline.py
├── requirements.txt
├── src
│   ├── Python
│   │   ├── README.md
│   │   ├── feature_extraction.py
│   │   └── feature_generation.py
│   └── R
│       └── README.md
└── traintest
    ├── test
    │   └── README.md
    └── train
        ├── README.md
        └── welch_feature_importance.csv
  • data/ - raw data only
  • figs/ - static figures generated during pipeline
  • notebooks/ - literate programming files (e.g., Jupyter, Quarto) explaining the pipeline
  • output/ - derived data generated during pipeline
  • paper/ - the analysis manuscript (preferably in .md format)
  • pipeline/ - pipeline scripts
  • src/ - source files supporting the pipeline (e.g., functions, classes, constants)
  • traintest/ - specifies train/test splits for reproducibility

Computational environment

├── DESCRIPTION
└── requirements.txt

These are conventional files used for recording package dependencies.

DESCRIPTION is a file used by R packages to capture package metadata. It is useful for analyses because it records package dependencies following commonly used conventions. Add packages (with version numbers) to the Imports field. When other users clone your repo, they can install dependencies by calling devtools::install_deps() in R.

requirements.txt is a file used to record Python package dependencies. Users can install dependencies by running pip install -r requirements.txt at the command line.

Pipeline steps

Here's how the pipeline flows.

00_initial_feature_extraction.ipynb

Initial notebook used for feature extraction, not the prettiest of notebooks, but walks through heart rate extraction from ECG using peak detection. Also goes through each of the features and plots their distributions to get a first-pass glimpse at the feature space.

01_initial_features_and_models.ipynb

Uses the features created in 00_initial_feature_extraction to create a few rudimentary scikit-learn machine learning models, including a Support Vector Machine Classifier (SVC), K Nearest Neighbors Classifier (KNN), and Random Forest Classifier (RFC). Also performs grid searching on each of these estimators to find ideal or close-to-ideal parameterizations of these models.

01a_YASA_feature_extraction.ipynb

Has a simple demonstration of applying the YASA sleep staging algorithm built for humans onto one seal. While the performance is not the best, many of the features implemented by YASA were used for this project, and in some cases their code was used as well (but adjusted for seals). Note that YASA requires a LOT of memory and the kernel often crashes for me if I have other notebooks running or too many Google Chrome tabs open.

02_advanced_feature_generation_and_models.ipynb

Uses the (semi) finalized feature extraction code in feature_extraction.py and feature_generation.py to generate all the available features for sleep detection created so far. Also creates a RFC using the grid-searched parameters from 01_initial_features_and_models, and plots the predictions against the true labels and features for a few naps

03_recursive_feature_elimination.ipynb

Performs recursive feature elimination using the model from 02_advanced_feature_generation_and_models.ipynb to explore which features are the most informative for seal sleep prediction.

04_lgbm_feature_viz.ipynb

Separating EEG bandpower features into smaller bands (from delta to 0-0.5 Hz, 0.5-1Hz, 1-2Hz, etc.) to Using gradient boosted decision tree lightGBM. Output is a CSV with WELCH _ Resolution (length in seconds of window) _ Name of feature (if two numbers, range of frequencies; if a name, type of feature).

Feature extraction usage

To run the full feature extraction start to finish, run from the terminal: python feature_generation.py <Input_EDF_Filepath> <Output_CSV_Filepath> [Config_Filepath]. The Config_Filepath should be a path to a json file that has key value pairs like the ones in the DEFAULT_CONFIG variable at the top of the feature_generation.py script. These parameters can be used to adjust window size, step size, and other parameters used by some of the feature functions (although some of the parameters have not been tested thoroughly so adjust them at your own risk). Any config keys not defined by the config file will use the default values defined at the top of feature_generation.py

Model usage

Something here about the models we tried and what went well

Use case interpretation

├── notebooks
└── paper

The literate programming documents in notebooks/ explain the pipeline components. They should render relatively quickly, so keep long-running commands in the pipeline scripts. Keep rendering times short by using the processed data in outputs/.

Rendering the documents in notebooks/ should be automated by the last pipeline script (e.g., pipeline/10_render_notebooks.R). For Jupyter notebooks, render them to HTML using nbconvert: jupyter nbconvert --to html notebooks/your_notebook.ipynb. For Quarto documents, use quarto render notebooks/your_notebook.qmd --to html.

paper/ contains a manuscript describing your use case. It should preferably be in Markdown or a literate programming script.

ecophys-ecoviz's People

Contributors

flukeandfeather avatar integerard avatar jmkendallbar avatar michaelsorenson avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.