scArches - single-cell architecture surgery

scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralized training and integration of multiple datasets by different groups.

Updates

  • (7.07.2022) We have added treeArches to the scArches code base. treeArches enables building cell-type hierarchies to identify novel states (e.g. disease, subpopulations) in the query data when mapped to the reference. See the tutorials here.
  • (6.02.2022) We have added expiMap to the scArches code base. expiMap allows interpretable reference mapping. See the tutorials here.

What can you do with scArches?

  • Construct single or multi-modal (CITE-seq) reference atlases and share the trained model and the data (if possible).
  • Download a pre-trained model for your atlas of interest, update it with new datasets, and share it with your collaborators.
  • Project and integrate query datasets on top of a reference and use the latent representation for downstream tasks, e.g. differential testing, clustering, and classification.
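
As a rough illustration of the last point, the sketch below classifies query cells from a shared latent space by majority vote among their nearest reference cells. It is a generic k-nearest-neighbour example and not part of the scArches API: the function name `knn_transfer_labels`, the toy 2-D latent coordinates, and the labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_transfer_labels(ref_latent, ref_labels, query_latent, k=3):
    """Give each query cell the majority label of its k nearest reference cells."""
    if len(ref_latent) != len(ref_labels):
        raise ValueError("ref_latent and ref_labels must have the same length")
    predictions = []
    for q in query_latent:
        # Rank reference cells by Euclidean distance to the query cell.
        ranked = sorted(range(len(ref_latent)),
                        key=lambda i: math.dist(q, ref_latent[i]))
        # Majority vote among the k closest reference cells.
        votes = Counter(ref_labels[i] for i in ranked[:k])
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Toy latent coordinates (hypothetical): two well-separated reference groups.
ref_latent = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
              (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
ref_labels = ["B cell"] * 3 + ["T cell"] * 3
query_latent = [(0.1, 0.1), (5.1, 5.0)]

print(knn_transfer_labels(ref_latent, ref_labels, query_latent))
# prints ['B cell', 'T cell']
```

In practice the latent vectors would come from a trained model rather than being written by hand, and a library implementation of nearest-neighbour search would scale better than this pure-Python loop.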

What are the different models?

scArches is an algorithm for projecting query data on top of reference datasets, and it can be applied to different models. Here we provide a short explanation and hints on when to use which model. Our models are divided into the following categories:

Unsupervised

This class of algorithms requires no cell type labels, meaning that you can create a reference and project a query without having access to cell type labels. We implemented two algorithms:

  • scVI (Lopez et al., 2018): Requires access to raw count values for data integration and assumes a count distribution on the data (NB, ZINB, Poisson).

  • trVAE (Lotfollahi et al., 2020): Supports either normalized, log-transformed data or count data as input and applies an additional MMD loss to achieve better merging in the latent space.
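
To give an intuition for the MMD (maximum mean discrepancy) term mentioned for trVAE, here is a minimal sketch of a biased squared-MMD estimate between two samples. The single RBF kernel, the `gamma` bandwidth, and the toy batches are assumptions made for illustration. The kernel and weighting that trVAE actually uses are not described here.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2) between two points."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    def mean_kernel(a, b):
        return sum(rbf_kernel(u, v, gamma) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy latent points (hypothetical): one batch overlaps batch_a, the other does not.
batch_a = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1)]
batch_b_mixed = [(0.05, 0.0), (0.0, 0.1), (-0.1, -0.05)]
batch_b_shifted = [(3.0, 3.0), (3.1, 2.9), (2.9, 3.1)]

print(mmd_squared(batch_a, batch_b_mixed) < mmd_squared(batch_a, batch_b_shifted))
# prints True
```

A small value means the two batches occupy similar regions of the latent space, so adding this quantity to the training loss penalizes batches that remain separated.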

Supervised and Semi-supervised

This class of algorithms assumes the user has access to cell type labels when creating the reference data, and it usually performs better integration than unsupervised methods. However, the query data can still be unlabeled. In addition to integration, you can classify your query cells using these methods.

  • scANVI (Xu et al., 2019): It needs cell type labels for reference data. Your query data can be either unlabeled or labeled. In the case of unlabeled query data, you can use this method to also classify your query cells using reference labels.
  • scGen (Lotfollahi et al., 2019): This method requires cell-type labels for both reference building and query mapping. The query mapping for this method relies solely on the integrated reference and requires no fine-tuning.

Biologically informed

  • expiMap (Lotfollahi, Rybakov et al., 2022): This method takes prior knowledge from gene set databases or from users, allowing you to analyze your query data in the context of known gene programs.

Multi-modal

These algorithms can be used to construct multi-modal reference atlases and map query data from either modality on top of the reference.

  • totalVI (Gayoso et al., 2019): This model can be used to build multi-modal CITE-seq reference atlases. Query datasets can be either scRNA-seq or CITE-seq. In addition to integrating the query with the reference, one can use this model to impute the proteins in the query datasets.
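
The hints above can be condensed into a small helper. This is only a sketch of one reading of the descriptions in this section: the function `suggest_models` and its flags are hypothetical and not part of scArches, and it only encodes the requirements stated above (for example, it does not check input data types for scANVI or scGen, because none are stated here).

```python
def suggest_models(has_reference_labels, has_query_labels, has_raw_counts,
                   is_cite_seq=False, has_gene_sets=False):
    """Return the model names from this section whose stated requirements are met."""
    suggestions = []
    if is_cite_seq:
        # Multi-modal CITE-seq reference atlases.
        suggestions.append("totalVI")
    if has_gene_sets:
        # Prior knowledge in the form of gene sets is available.
        suggestions.append("expiMap")
    if has_reference_labels:
        # scANVI needs labels for the reference only.
        suggestions.append("scANVI")
        if has_query_labels:
            # scGen needs labels for both reference and query.
            suggestions.append("scGen")
    if has_raw_counts:
        # scVI requires raw counts.
        suggestions.append("scVI")
    # trVAE accepts normalized log-transformed data or counts and needs no labels.
    suggestions.append("trVAE")
    return suggestions


print(suggest_models(has_reference_labels=True, has_query_labels=False,
                     has_raw_counts=True))
# prints ['scANVI', 'scVI', 'trVAE']
```

The returned list is a set of candidates, not a ranking. Which model works best for a given atlas still depends on the data.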

Usage and installation

See here for documentation and tutorials.

Support and contribute

If you have a question, or a new architecture or model that could be integrated into our pipeline, you can post an issue or reach us by email.

Reference

If scArches is useful in your research, please consider citing the following paper:

@article{lotfollahi2021mapping,
  title={Mapping single-cell data to reference atlases by transfer learning},
  author={Lotfollahi, Mohammad and Naghipourfar, Mohsen and Luecken, Malte D and Khajavi,
  Matin and B{\"u}ttner, Maren and Wagenstetter, Marco and Avsec, {\v{Z}}iga and Gayoso,
  Adam and Yosef, Nir and Interlandi, Marta and others},
  journal={Nature Biotechnology},
  pages={1--10},
  year={2021},
  publisher={Nature Publishing Group}}

Contributors

naghipourfar, m0hammadl, koncopd, cottoneyejoe95, aidinbii, mohsennaghipourfar, matinkhajavi, lcmmichielsen, mbuttner, mohsennp, cuongqn, hrovatin, zethson, evanbiederstedt
