GJSrMap: The smallRNA mapping pipeline

Overview:

An open-source, fully customizable pipeline that:

  • Maps all small non-coding RNAs to customized and comprehensive reference sequences
  • Is modularized and iterative
  • Runs on an HPCC cluster (default) or on a single computer/server
  • Performs quality control on the data
  • Provides a detailed summary of mapping:
    • FastQC plots for every iteration
    • MultiQC plots for every iteration
    • Summary plots and statistics of smallRNA distribution and abundance
  • Provides raw and normalized (RPKM) counts
  • Writes detailed logs for every iteration and step
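
As a rough illustration of the RPKM normalization mentioned above, the sketch below applies the standard formula, RPKM = count × 10^9 / (feature length in bp × total mapped reads), using awk. The `rpkm_table` function, the three-column input layout and the toy numbers are assumptions made for illustration; the pipeline's own normalization code, and the feature length it uses, may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of gjsrmap): reads a tab-separated table of
#   feature name, feature length in bp, raw read count
# and prints feature name, raw count and RPKM.
rpkm_table() {
  awk -F'\t' '
    # Store every row and accumulate the total number of mapped reads.
    { name[NR] = $1; len[NR] = $2; cnt[NR] = $3; total += $3 }
    END {
      for (i = 1; i <= NR; i++) {
        # RPKM = count * 1e9 / (feature length * total mapped reads);
        # rows that would divide by zero are reported as 0.
        rpkm = (len[i] > 0 && total > 0) ? cnt[i] * 1e9 / (len[i] * total) : 0
        printf "%s\t%d\t%.2f\n", name[i], cnt[i], rpkm
      }
    }' "$@"
}

# Toy input with made-up feature names and counts.
printf 'miR-A\t22\t500\nmiR-B\t20\t1500\n' | rpkm_table
```

Because the total is taken from the table itself, the input should contain all counted features of a library, not a subset.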

Description

  1. gjsrmap: SmallRNA mapping and analysis pipeline schematic:

    fig_gjsrmap_overview

    • Pre-processing of the sequences:

      • This is iteration 0 in the schematic above
      • Preprocessing of the sequences to avoid multi-mapping of reads
      • Build custom reference sequence indexes
      • Removal of low quality reads
      • Removal of 3' adapter sequences and size reduction of the reads
    • Iterative mapping of processed and filtered reads:

      • Iteration 1: Map reads of 16 to 33 bp to custom reference sequences of mature microRNAs and piRNAs

      • Iteration 2: Map reads greater than 32 bp to custom reference sequences of other small non-coding RNAs. These are:

        | Other small non-coding RNAs | Description                                          |
        |-----------------------------|------------------------------------------------------|
        | rRNA                        | Ribosomal RNA                                        |
        | scRNA                       | Small cytoplasmic RNA                                |
        | snRNA                       | Small nuclear RNA                                    |
        | snoRNA                      | Small nucleolar RNA                                  |
        | premiRNA                    | microRNA precursors                                  |
        | osncRNA                     | Other small noncoding RNA                            |
        | - tRNA                      | - Transfer RNA                                       |
        | - Mt-tRNA                   | - Transfer RNA located in the mitochondrial genome   |
        | - misc_RNA                  | - Miscellaneous other RNA                            |
        
        
      • Iteration 3: Map the unmapped reads from iterations 1 and 2 to the species reference genome

    • Count the reads and distribute them to individual smallRNA classes

    • Generate QC, mapping and summary report

      fig_qc

      • Sequence quality information
      • Bar plot of library sizes
      • Small non-coding RNA read distribution
      • Profile of expressed small non-coding RNAs (miRNAs in the above figure). Plots are also generated for the other classes
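
The length-based routing of reads into iterations 1 and 2 described above can be sketched with awk as follows. This is only an illustration: `split_by_length`, `fq_record`, the output file names and the handling of reads shorter than 16 bp are hypothetical. The text gives 16 to 33 bp for iteration 1 and greater than 32 bp for iteration 2, so the treatment of 33 bp reads is also an assumption here (they go to iteration 1).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of gjsrmap): split an adapter-trimmed FASTQ
# into per-iteration files by read length.
#   shorter than min_len   -> <prefix>.short.fastq (set aside)
#   min_len to max_len bp  -> <prefix>.iter1.fastq (miRNA/piRNA references)
#   longer than max_len    -> <prefix>.iter2.fastq (other small ncRNA references)
split_by_length() {
  local fastq=$1 prefix=$2 min_len=${3:-16} max_len=${4:-33}
  awk -v lo="$min_len" -v hi="$max_len" -v p="$prefix" '
    NR % 4 == 1 { header = $0 }   # @read_id line
    NR % 4 == 2 { bases = $0 }    # sequence line
    NR % 4 == 3 { plus = $0 }     # separator line
    NR % 4 == 0 {                 # quality line completes the record
      n = length(bases)
      if (n < lo)       out = p ".short.fastq"
      else if (n <= hi) out = p ".iter1.fastq"
      else              out = p ".iter2.fastq"
      printf "%s\n%s\n%s\n%s\n", header, bases, plus, $0 > out
    }' "$fastq"
}

# Emit one FASTQ record with a constant quality string of the same length.
fq_record() {
  printf '@%s\n%s\n+\n%s\n' "$1" "$2" "${2//?/I}"
}

# Toy run with three made-up reads of 10, 22 and 40 bp.
workdir=$(mktemp -d)
{
  fq_record r1 ACGTACGTAC
  fq_record r2 TGAGGTAGTAGGTTGTATAGTT
  fq_record r3 ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
} > "$workdir/reads.fastq"
split_by_length "$workdir/reads.fastq" "$workdir/sample"
wc -l "$workdir"/sample.*.fastq
```

Each of the three output files should contain one four-line record. In the pipeline, reads left unmapped after iterations 1 and 2 are then mapped to the species reference genome in iteration 3.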

Dependencies:

  • bedtools
  • bowtie
  • bowtie2
  • cutadapt
  • fastqc
  • matplotlib
  • multiqc
  • numpy
  • samtools
  • scipy
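
Before a run it can help to confirm that the dependencies above are installed. The sketch below is one possible check, not something shipped with the pipeline: `missing_tools` and `missing_py_modules` are hypothetical helpers, and the use of `python3` as the interpreter for matplotlib, numpy and scipy is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of gjsrmap).

# Print every command-line tool from the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
  return 0
}

# Print every Python module that cannot be imported.
# Assumption: python3 is the interpreter the pipeline would use.
missing_py_modules() {
  local mod
  for mod in "$@"; do
    python3 -c "import ${mod}" >/dev/null 2>&1 || printf '%s\n' "$mod"
  done
  return 0
}

echo "Missing command-line tools:"
missing_tools bedtools bowtie bowtie2 cutadapt fastqc multiqc samtools
echo "Missing Python modules:"
missing_py_modules matplotlib numpy scipy
```

An empty list under both headings means every dependency was found.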

Usage:

  • Run the wrapper with the following positional arguments:
    SPC=${1}                                            # Species: hsa or mmu or some other species
    IFD=${2}                                            # Input Fastq Dir: input/fastq/test
    ORD=${3}                                            # Output Results Dir: output/test
    BWD=${4}                                            # path/to/bowtie/indexes
    QUE=${5:-"fat"}                                     # mpi, fat, mpi-short, fat-short, mpi-long, fat-long
    SPK=${6:-""}                                        # exiseq_spikein_dna_unique.fa or spike_rna1_unique.fa
    threePadapter=${7:-"TGGAATTCTCGGGTGCCAAGG"}         # TruSeq adapter
    JID=${8:-"$(echo $HOME)/gjsrmap"}                   # Job dir
    NCL=${9:-"input/annotation/rna_classes"}            # ncrna folder containing ncrna class fasta
  • Example command:
    bash 06_run_ncRNA_mapping_usage.sh <above mentioned arguments>
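
To make the positional arguments and their defaults concrete, the sketch below mirrors the assignments listed above in a small function that only echoes the resolved values. `show_args` is a hypothetical stand-in, not the wrapper itself, and the replacement adapter in the second call is just an example value. That the first four arguments are required is inferred from their lack of defaults.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the wrapper's argument handling: it prints the
# value each of the nine positional parameters would resolve to.
show_args() {
  local SPC=${1} IFD=${2} ORD=${3} BWD=${4}
  local QUE=${5:-"fat"}
  local SPK=${6:-""}
  local threePadapter=${7:-"TGGAATTCTCGGGTGCCAAGG"}
  local JID=${8:-"$HOME/gjsrmap"}
  local NCL=${9:-"input/annotation/rna_classes"}
  # printf reuses the format for each NAME VALUE pair.
  printf '%s=%s\n' SPC "$SPC" IFD "$IFD" ORD "$ORD" BWD "$BWD" QUE "$QUE" \
    SPK "$SPK" threePadapter "$threePadapter" JID "$JID" NCL "$NCL"
}

# Only the first four arguments: everything else falls back to its default.
show_args hsa input/fastq/test output/test path/to/bowtie/indexes

# Overriding the adapter (argument 7) requires arguments 5 and 6 to be present.
# Empty strings keep their defaults because ${N:-default} treats empty as unset.
show_args hsa input/fastq/test output/test path/to/bowtie/indexes "" "" AGATCGGAAGAGC
```

The same pattern applies to the real wrapper: arguments are matched purely by position, so a later argument can only be set when all earlier ones are supplied, even if only as empty strings.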
