Mouse Kidney Atlas

We present the Mouse Kidney Atlas (MKA), a comprehensive atlas of cellular heterogeneity in the healthy mouse kidney, generated by carefully integrating data from eight publicly available studies. We integrate these datasets using scVI and scANVI. To overcome annotation inconsistencies, we learn the relationship between cell type transcriptomic profiles across datasets using scHPL. The resulting model can then automatically label unseen cell populations with unprecedented resolution and accuracy. We demonstrate the significance of our atlas by obtaining robust and novel markers for poorly described cell types.

The MKA is publicly available to download, visualize, and interact with at cellxgene.

For more details, refer to: A comprehensive mouse kidney atlas enables rare cell population characterization and robust marker discovery

File descriptions

  • models: Files containing the trained models used in the manuscript

  • notebooks: Notebooks used to generate the figures presented in the manuscript

    • QC_scVI_scANVI: Figure 1

    • scHPL_ManualReannotation: Figures 2 and 3; Supplementary Figures 1, 2 and 3

    • scHPL_Evaluation: Figure 4; Supplementary Figures 4 and 5

    • Downstream_analyses: Figure 5; Supplementary Figure 6

  • MKA_Metamarkers.xlsx: Excel file with the identified metamarkers for each cell type label in the MKA.

    • Rank: Overall ranking of the gene within a cell type. The higher the ranking, the better the marker is for the given population, accounting for batch differences and the number of datasets in which the gene is detected.
    • AUROC: Area under the receiver operating characteristic curve. This value indicates how well the gene performs in a classification scenario. For example, Podxl has an AUROC value of 0.9, which means that this gene is very good at classifying podocytes as such.
  • functions.py: Helper functions used across the code

  • hyper_tune.py: Ray Tune implementation to optimize scVI model hyperparameters
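To make the AUROC column of MKA_Metamarkers.xlsx easier to interpret, here is a minimal sketch of what an AUROC value means for a single gene. It is not the procedure used to compute the metamarkers; the function name `marker_auroc` and the expression values are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def marker_auroc(in_type: Sequence[float], out_of_type: Sequence[float]) -> float:
    """AUROC of one gene for separating a cell type from all other cells.

    Equals the probability that a randomly chosen cell of the target type
    expresses the gene more highly than a randomly chosen other cell
    (ties count as one half).
    """
    if not in_type or not out_of_type:
        raise ValueError("both groups need at least one cell")
    wins = 0.0
    for x in in_type:
        for y in out_of_type:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(in_type) * len(out_of_type))


# Hypothetical expression values, for illustration only.
podocytes = [5.0, 3.0, 4.0]
other_cells = [1.0, 2.0, 3.5]
score = marker_auroc(podocytes, other_cells)
print(f"AUROC = {score:.3f}")
```

An AUROC of 0.5 corresponds to a gene that carries no information about the cell type, while values close to 1 indicate that almost every cell of the target type expresses the gene more highly than cells outside it.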

Using the trained models

If you want to use the models for your own research, you will need the HVG-filtered matrix they were trained on. The AnnData object is available at Zenodo. Once downloaded, you can load the model as follows:

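import os

import scanpy as sc
import scvi

# Move into the MKA directory so the relative paths below resolve.
os.chdir("MKA")

# HVG-filtered AnnData object downloaded from Zenodo.
adata = sc.read_h5ad("adata.h5ad")

# Load the trained scANVI model and attach it to the AnnData object.
atlas_model = scvi.model.SCANVI.load("models/scANVI_model_full", adata=adata)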

Hyperparameter Optimization

Ray Tune was used to train 1000 different hyperparameter and model configurations.

The tracked metrics at each training epoch were 'elbo_validation', 'reconstruction_loss' and 'silhouette_score'. Batch and cell type silhouette scores computed on the latent space were used as objective functions to maximize during training.
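As an illustration of the silhouette metric mentioned above, the following is a minimal pure-Python sketch of the mean silhouette coefficient computed on a toy two-dimensional "latent space". It is not the code used in hyper_tune.py; the function name `mean_silhouette`, the coordinates, and the labels are hypothetical. How the batch silhouette was scaled so that it could be maximized is not described here, so the sketch only shows the raw coefficient.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence


def mean_silhouette(points: Sequence[Sequence[float]],
                    labels: Sequence[Hashable]) -> float:
    """Mean silhouette coefficient using Euclidean distance."""
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two distinct labels are required")

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical latent coordinates: two tight, well-separated groups.
latent = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
cell_type_score = mean_silhouette(latent, ["A", "A", "B", "B"])
batch_score = mean_silhouette(latent, ["b1", "b2", "b1", "b2"])
print(cell_type_score, batch_score)
```

In this toy case the cell type labels follow the two groups and give a score close to 1, whereas the batch labels are spread across both groups and give a negative score, which is the pattern expected from a latent space that separates cell types while mixing batches.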

The search space was defined as follows:

  • model configuration
    • dropout rate: loguniform distribution between 1e-4 and 1e-1
    • number of layers: random integer between 1 and 3
    • number of latent dimensions: random integer between 20 and 31
  • plan configuration
    • learning rate: loguniform distribution between 1e-4 and 1e-1
  • atlas architecture
    • subset: random boolean (True/False).

    The purpose of this parameter is to test the effect of filtering the feature space.

    • number of hvgs: random choice between 2000 and 8000 in 1000 increments
    • continious_covariates: random choice between 'pct_counts_mt' and None
    • categorical_covariates: random choice between 'Source' and None

    'Source' in this case refers to either nuclei or cell as the starting material

  • number of epochs: random number between 100 and 201
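The search space above can be sketched with the Python standard library alone. This is not the Ray Tune code from hyper_tune.py: the dictionary keys (other than `continious_covariates` and `categorical_covariates`, which are spelled as in the list above) and the function names are hypothetical. Two assumptions are made: integer upper bounds are treated as exclusive, which is the convention of Ray Tune's `tune.randint`, and the number of HVGs includes both 2000 and 8000.

```python
import math
import random
from typing import Any, Dict


def loguniform(rng: random.Random, low: float, high: float) -> float:
    """Draw a value whose logarithm is uniformly distributed in [low, high]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_config(rng: random.Random) -> Dict[str, Any]:
    """Draw one configuration from the search space described above."""
    return {
        "model": {
            "dropout_rate": loguniform(rng, 1e-4, 1e-1),
            "n_layers": rng.randrange(1, 3),    # assumption: upper bound exclusive
            "n_latent": rng.randrange(20, 31),  # assumption: upper bound exclusive
        },
        "plan": {"lr": loguniform(rng, 1e-4, 1e-1)},
        "atlas": {
            "subset": rng.choice([True, False]),
            # assumption: 2000, 3000, ..., 8000 (both ends included)
            "n_hvgs": rng.choice(list(range(2000, 9000, 1000))),
            "continious_covariates": rng.choice(["pct_counts_mt", None]),
            "categorical_covariates": rng.choice(["Source", None]),
        },
        "n_epochs": rng.randrange(100, 201),    # assumption: upper bound exclusive
    }


rng = random.Random(0)  # fixed seed so the draw is reproducible
config = sample_config(rng)
print(config)
```

Sampling the dropout rate and learning rate on a log scale spreads the trials evenly across orders of magnitude, rather than concentrating almost all of them near the upper end of the range.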

Datasets

The following table lists all studies included in the MKA:

| Publication | Abbreviation | Accession number |
| --- | --- | --- |
| Wu et al., 2019 | Wu19 | GSE119531 |
| Miao et al., 2021 | Miao21 | GSE157079 |
| Park et al., 2018 | Park18 | GSE107585 |
| Kirita et al., 2020 | Kirita20 | GSE139107 |
| Dumas et al., 2020 | Dumas20 | E-MTAB-8145 |
| Conway et al., 2020 | Conway20 | GSE140023 |
| Hinze et al., 2021 | Hinze21 | GSE145690 |
| Janosevic et al., 2021 | Janosevic21 | GSE151658 |
