njab's Introduction

(not) Just Another Biomarker (nJAB)

njab is a collection of Python functions building on top of pandas, scikit-learn, statsmodels, pingouin, numpy and more.

It aims to formalize a procedure for biomarker discovery which was first developed for a paper on alcohol-related liver disease, based on mass spectrometry-based proteomics measurements of blood plasma samples:

Niu, L., Thiele, M., Geyer, P. E., Rasmussen, D. N., Webel, H. E.,
Santos, A., Gupta, R., Meier, F., Strauss, M., Kjaergaard, M., Lindvig,
K., Jacobsen, S., Rasmussen, S., Hansen, T., Krag, A., & Mann, M. (2022).
“Noninvasive Proteomic Biomarkers for Alcohol-Related Liver Disease.”
Nature Medicine 28 (6): 1277–87.
nature.com/articles/s41591-022-01850-y

The approach was formalized for an analysis of inflammation markers in a cohort of patients with alcohol-related cirrhosis, based on Olink-based proteomics measurements of blood plasma samples:

Mynster Kronborg, T., Webel, H., O’Connell, M. B., Danielsen, K. V., Hobolth, L., Møller, S., Jensen, R. T., Bendtsen, F., Hansen, T., Rasmussen, S., Juel, H. B., & Kimer, N. (2023).
Markers of inflammation predict survival in newly diagnosed cirrhosis: a prospective registry study.
Scientific Reports, 13(1), 1–11.
nature.com/articles/s41598-023-47384-2

Installation

Install the released version from PyPI using pip:

pip install njab

or install directly from GitHub:

pip install git+https://github.com/RasmussenLab/njab.git

Tutorials

The tutorials can be found with their output in the project documentation, or can be run directly in Colab.

Explorative Analysis of survival dataset

The tutorial builds on an example dataset on survival in prostate cancer.

The main steps in the tutorial are:

  1. Data loading and inspection
  2. Uncontrolled binary tests and t-tests for binary and continuous variables, respectively
  3. ANCOVA analysis controlling for age and weight, corrected for multiple testing
  4. Kaplan-Meier plots for significant features
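
To make the last step more concrete, here is a minimal, dependency-free sketch of the Kaplan-Meier (product-limit) estimator that underlies such plots. It is not njab's API and not the tutorial's code. The function name `kaplan_meier` and the toy data are assumptions made up for illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per subject
    observed:  True if the event occurred, False if the subject was censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # subjects leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after time t.
        at_risk -= exits[t]
    return curve


# Toy data (hypothetical): months of follow-up and event indicators.
durations = [5, 6, 6, 8, 10, 12]
observed = [True, False, True, True, False, True]
curve = kaplan_meier(durations, observed)
```

Subjects censored at a given time are still counted as at risk at that time, which is the usual convention. In practice, a survival analysis library would also provide confidence intervals and plotting.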

Biomarker discovery tutorial

All steps are described in the tutorial, where you can load your own data with minor adaptations. The tutorial builds on a curated Alzheimer dataset from OmicLearn. See the Alzheimer Data section for more information.

The main steps in the tutorial are:

  1. Load and prepare data for machine learning
  2. Find a good set of features using cross validation
  3. Evaluate and inspect your model retrained on the entire training data
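
The three steps above can be sketched with plain scikit-learn. This is an illustrative sketch under stated assumptions, not njab's own implementation. The synthetic data, the univariate `SelectKBest` selector, the logistic regression model and the range of 1 to 10 features are all choices made here for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 1. Load and prepare data (synthetic stand-in for a real omics table).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def make_model(k):
    """Scale, keep the k best features by ANOVA F-score, then classify."""
    return make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=k),
        LogisticRegression(max_iter=1000),
    )


# 2. Pick the number of features by 5-fold cross-validated ROC AUC.
#    Selection happens inside the pipeline, so each fold selects features
#    on its own training part only.
cv_auc = {
    k: cross_val_score(
        make_model(k), X_train, y_train, cv=5, scoring="roc_auc"
    ).mean()
    for k in range(1, 11)
}
best_k = max(cv_auc, key=cv_auc.get)

# 3. Retrain on the entire training data and evaluate on held-out data.
final_model = make_model(best_k).fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)
```

Keeping feature selection inside the cross-validated pipeline avoids leaking information from the validation folds into the choice of features.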

Documentation

The documentation is available at njab.readthedocs.io

njab's Issues

Add functionality to get ROC and PRC with confidence intervals

Improve reporting of receiver operating characteristic (ROC) curves and precision-recall curves (PRC). Each run's result should be saved to a dataframe for inspection and calculation of statistics. This would then also enable ROC and PRC plots with confidence bands.

  • Check how this was done in the ALD study
  • Provide bootstrapped confidence intervals, as done by seaborn?
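
As a rough illustration of the bootstrapping idea raised above, the following dependency-free sketch computes a percentile bootstrap confidence interval for the ROC AUC. It covers the scalar AUC only, not full curves with confidence bands, and it is not existing njab functionality. The function names and the toy labels and scores are assumptions invented for this example.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the ROC AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        # Resample subjects with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy labels and predicted scores (hypothetical).
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.3, 0.35, 0.4, 0.5, 0.6, 0.65, 0.7, 0.8, 0.2]
auc = roc_auc(y_true, scores)
lower, upper = bootstrap_auc_ci(y_true, scores)
```

Storing the per-resample (or per-run) values in a table, as the issue suggests, would allow the same percentile approach to be applied pointwise along a curve.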

Make it possible to specify a desired feature set; start at njab.sklearn.find_n_best_features
