scran.js

Single-cell RNA-seq analysis in JavaScript

Overview

This repository contains scran.js, a JavaScript library for single-cell RNA-seq (scRNA-seq) analysis in the browser. All calculations are performed directly by the client, which lets us take advantage of the ubiquity of the browser as a standalone analysis platform. Users can analyze their data without managing any dependencies or paying for access to a backend. scran.js is heavily inspired by the scran R package and contains most of its related methods. Indeed, much of the implementation in this repository is taken directly from scran and its related R packages.

Key scRNA-seq analysis steps

Currently, the library and web app support the key steps in a typical scRNA-seq analysis:

  • Quality control (QC) to remove low-quality cells. This is done based on detection of outliers on QC metrics like the number of detected genes.
  • Normalization and log-transformation, to remove biases and mitigate the mean-variance trend. We use scaling normalization with size factors defined from the library size for each cell.
  • Feature selection to identify highly variable genes. This is based on residuals from a trend fitted to the means and variances of the log-normalized data for each gene.
  • Principal components analysis (PCA) on the highly variable genes, to compress and denoise the data. We use an approximate method to quickly obtain the top few PCs.
  • Clustering using multi-level community detection (a.k.a. "Louvain clustering"). This is performed on the top PCs.
  • Dimensionality reduction with t-distributed stochastic neighbor embedding (t-SNE), again using the top PCs.
  • Marker detection using a variety of effect sizes such as Cohen's d and the area under the curve (AUC).
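The first two steps above can be sketched in plain JavaScript. This is a minimal illustration, not the scran.js API: the function names (detectedGenes, lowOutliers, logNormalize) are hypothetical, and several details are assumptions not stated in the text, namely the outlier rule (more than 3 median absolute deviations below the median), the centering of size factors to a mean of 1, and the log2 transformation with a pseudo-count of 1.

```javascript
// Illustrative sketch only; these helpers are hypothetical, not scran.js functions.
// counts[g][c] holds the count for gene g in cell c.

// Median of a numeric array (sorts a copy, leaving the input untouched).
function median(values) {
  const sorted = Array.from(values).sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// QC metric: number of genes with a non-zero count in each cell.
function detectedGenes(counts) {
  const detected = new Array(counts[0].length).fill(0);
  for (const row of counts) {
    row.forEach((x, c) => { if (x > 0) detected[c] += 1; });
  }
  return detected;
}

// Flag cells whose metric is more than `nmads` MADs below the median.
// Assumption: a MAD-based rule; the text only says "detection of outliers".
function lowOutliers(metric, nmads = 3) {
  const med = median(metric);
  const mad = 1.4826 * median(metric.map((x) => Math.abs(x - med)));
  const threshold = med - nmads * mad;
  return metric.map((x) => x < threshold);
}

// Scaling normalization with library-size factors, then a log-transformation.
// Assumptions: size factors are centered to a mean of 1, and the transform
// is log2(x / sizeFactor + 1). Cells with zero total count should be removed
// by QC first, otherwise the division below is undefined.
function logNormalize(counts) {
  const ncells = counts[0].length;
  const libSizes = new Array(ncells).fill(0);
  for (const row of counts) {
    row.forEach((x, c) => { libSizes[c] += x; });
  }
  const meanSize = libSizes.reduce((a, b) => a + b, 0) / ncells;
  const sizeFactors = libSizes.map((s) => s / meanSize);
  return counts.map((row) =>
    row.map((x, c) => Math.log2(x / sizeFactors[c] + 1))
  );
}

// Example: the last cell has far fewer detected genes than the others.
console.log(lowOutliers([100, 110, 90, 105, 95, 10]));
// → [ false, false, false, false, false, true ]

// Example: two cells that differ only in sequencing depth.
console.log(logNormalize([[1, 2], [3, 6]]));
```

In the second example the two cells have the same composition at different depths, so their normalized values agree within each gene, which is the bias that scaling normalization is meant to remove.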

Coming soon:

  • Clustering using k-means.
  • Dimensionality reduction by uniform manifold approximation and projection (UMAP).
  • Batch correction via the mutual nearest neighbors method.

The theory behind these methods is described in more detail in the Orchestrating Single Cell Analysis with Bioconductor book.

Efficient analysis with WebAssembly

We use WebAssembly (Wasm) to enable efficient client-side execution of common steps in an scRNA-seq analysis. Code to perform each step is written in C++ and compiled to Wasm using the Emscripten toolchain. Some of the relevant C++ libraries are listed below:

  • libscran provides C++ implementations of key functions in scran and its fellow packages scater and scuttle. This includes quality control, normalization, feature selection, PCA, clustering and dimensionality reduction.
  • tatami provides an abstract interface to different matrix classes, focusing on row and column extraction.
  • knncolle wraps a number of nearest neighbor detection methods in a consistent interface.
  • CppIrlba contains a C++ port of the IRLBA algorithm for approximate PCA.
  • CppKmeans contains C++ ports of the Hartigan-Wong and Lloyd algorithms for k-means clustering.
  • qdtsne contains a refactored C++ implementation of the Barnes-Hut t-SNE dimensionality reduction algorithm.
  • umappp contains a refactored C++ implementation of the UMAP dimensionality reduction algorithm.
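To give a feel for what one of these libraries computes, here is a minimal JavaScript sketch of the Lloyd algorithm for k-means mentioned above. It is not the CppKmeans implementation or its interface: the function name lloydKmeans, its arguments, and the choice to pass the starting centers explicitly are all assumptions made for illustration.

```javascript
// Illustrative sketch only; not the CppKmeans API.

// Squared Euclidean distance between two points of equal dimension.
function sqDist(a, b) {
  let total = 0;
  for (let i = 0; i < a.length; i++) total += (a[i] - b[i]) ** 2;
  return total;
}

// points: array of coordinate arrays (e.g., cells in PC space).
// initialCenters: array of k starting centers (supplied by the caller here,
// which is an assumption; real implementations pick them automatically).
function lloydKmeans(points, initialCenters, maxIter = 100) {
  let centers = initialCenters.map((c) => c.slice());
  const assignments = new Array(points.length).fill(-1);

  for (let iter = 0; iter < maxIter; iter++) {
    // Assignment step: attach each point to its closest center.
    let changed = false;
    points.forEach((p, i) => {
      let best = 0;
      for (let k = 1; k < centers.length; k++) {
        if (sqDist(p, centers[k]) < sqDist(p, centers[best])) best = k;
      }
      if (best !== assignments[i]) {
        assignments[i] = best;
        changed = true;
      }
    });
    if (!changed) break; // converged: no point switched cluster

    // Update step: move each center to the mean of its assigned points.
    const sums = centers.map((c) => new Array(c.length).fill(0));
    const sizes = new Array(centers.length).fill(0);
    points.forEach((p, i) => {
      const k = assignments[i];
      sizes[k] += 1;
      p.forEach((x, d) => { sums[k][d] += x; });
    });
    // Empty clusters keep their previous center.
    centers = centers.map((c, k) =>
      sizes[k] > 0 ? sums[k].map((s) => s / sizes[k]) : c
    );
  }
  return { centers, assignments };
}

// Example: two well-separated groups of points.
const points = [[0, 0], [0, 1], [10, 10], [10, 11]];
const clustering = lloydKmeans(points, [[0, 0], [10, 10]]);
console.log(clustering.assignments); // → [ 0, 0, 1, 1 ]
console.log(clustering.centers);     // → [ [ 0, 0.5 ], [ 10, 10.5 ] ]
```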

For each step, we use Emscripten to compile the associated C++ functions into Wasm and generate JavaScript-visible bindings. We can then load the Wasm binary into a web application and call the desired functions on user-supplied data.
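The general mechanism of loading a Wasm binary and calling an exported function can be sketched with the standard WebAssembly JavaScript API. This example does not use the scran.js binary or its bindings: it embeds a tiny hand-written module exporting a single hypothetical function, add, so that it is self-contained. A real application would typically go through the loader script that Emscripten generates alongside the .wasm file.

```javascript
// Illustrative sketch only: a minimal Wasm module exporting add(i32, i32) -> i32.
// This stands in for a compiled binary; it is not the scran.js module.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic number and version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of that type
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export it as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                    // local.get 0, local.get 1, i32.add, end
]);

// Compile and instantiate synchronously, which is fine for a module this small.
// Larger binaries are usually loaded with the asynchronous
// WebAssembly.instantiate() or WebAssembly.instantiateStreaming() instead.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// Exported Wasm functions are callable like ordinary JavaScript functions.
console.log(wasmInstance.exports.add(2, 3)); // → 5
```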

Building the Wasm binary

This directory contains the files required to create the scran.js Wasm binary. We use CMake to manage the compilation process as well as the dependencies, namely the scran C++ library. Compilation of the Wasm binary is done using Emscripten:

emcmake cmake -S . -B build
(cd build && emmake make)

This will build the .js and .wasm files within the build/ subdirectory.

Contributors

ltla, jkanche
