RASLseqTools

RASLseq FASTQ reads to RASLprobe counts.

RASL-seq is a powerful and inexpensive method to assess gene expression without the need for RNA isolation [1]. We recently published a modified protocol using the RNA ligase Rnl2, which demonstrated dramatically increased ligation efficiency [2]. This Python package offers an alignment method leveraging BLASTn, pandas, py-editdist, and NumPy. Optimizations will follow in the near future.



20 August 2015


Public Release RASLseqTools Version 0.2

    Added Normalization Module

    Added Levenshtein Barcode Analysis functions to RASLseqBCannot


19 April 2015


Initial public release of RASLseqTools. The default aligner is now STAR (https://github.com/alexdobin/STAR).

Example IPython Notebook: http://nbviewer.ipython.org/github/erscott/RASLseqTools/blob/master/ipynb/RASLseqTools_STAR_example.ipynb

STAR Usage

python /path/to/RASLseqAnalysis_STAR.py [required args -f -a -p -w -d -o] [optional args -P -A -n -o5 -o3 -ws -we]

-f : str, absolute path to fastq file(s) (accepts gzip files, comma-separated list if multiple fastqs)

-a : str, absolute path to STAR bin directory

-p : str, absolute path to probes file

-w : str, absolute path to annotations file

-d : str, absolute path to output directory

-o : str, absolute path of output file


-P : bool, verbose printing

-A : bool, write STAR alignments to disk (written to the output directory)

-n : int, number of jobs, currently requires 2 processors

-o5: int, number of bases to clip from 5-prime end of read to isolate probe sequence, default=24

-o3: int, number of bases to clip from 3-prime end of read to isolate probe sequence, default=22

-ws: int, index position of the wellbarcode start base in read, default=0

-we: int, index position of the wellbarcode end base in read, default=8
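
The clipping and barcode options above can be illustrated with plain Python slicing. This is only a sketch of how the parameters read from their descriptions, not the package's own code: `split_read` is a hypothetical helper, the read is a made-up toy sequence, and treating `-ws`/`-we` as a half-open slice is an assumption.

```python
def split_read(read, o5=24, o3=22, ws=0, we=8):
    """Return (well_barcode, probe_sequence) for one read.

    Hypothetical helper. Assumes the well barcode is read[ws:we]
    (half-open, Python-style) and that o5 / o3 bases are clipped from
    the 5-prime / 3-prime ends to leave the probe sequence.
    """
    if o5 + o3 >= len(read):
        raise ValueError("read is too short for the requested clipping")
    well_barcode = read[ws:we]
    probe_sequence = read[o5:len(read) - o3]
    return well_barcode, probe_sequence


# Toy read: 8-base barcode, 16 filler bases, 40-base probe region, 22-base 3' tail
read = "ACGTACGT" + "N" * 16 + "A" * 20 + "C" * 20 + "G" * 22
barcode, probe = split_read(read)
print(barcode, len(probe))
```

With the documented defaults, the toy read yields the first eight bases as the barcode and the 40-base middle region as the probe sequence.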


example command:

python /path/to/RASLseqAnalysis_STAR.py -f /path/to/your.fastq.gz -a /path/to/STAR_binary/ -p /path/to/RASL.probes -w /path/to/annotations.bc -d /path/to/write_directory/ -o /path/to/blastdb/write_file.txt -P -A -n 1 -o5 25 -o3 20 -ws 0 -we 8

example data: can be found in the data directory
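
When several FASTQ files or runs are involved, the STAR command can be assembled in Python instead of typed by hand. The sketch below uses only the flags documented above; `build_star_command` is a hypothetical helper (not part of RASLseqTools), and all paths are placeholders.

```python
import shlex


def build_star_command(script, fastqs, star_bin, probes, annotations,
                       out_dir, out_file, verbose=False,
                       write_alignments=False, n_jobs=None,
                       o5=None, o3=None, ws=None, we=None):
    """Assemble the argument list for RASLseqAnalysis_STAR.py (hypothetical helper)."""
    cmd = [
        "python", script,
        "-f", ",".join(fastqs),  # multiple FASTQs are passed comma-separated
        "-a", star_bin,
        "-p", probes,
        "-w", annotations,
        "-d", out_dir,
        "-o", out_file,
    ]
    if verbose:
        cmd.append("-P")
    if write_alignments:
        cmd.append("-A")
    # Optional integer settings are added only when given explicitly
    for flag, value in (("-n", n_jobs), ("-o5", o5), ("-o3", o3),
                        ("-ws", ws), ("-we", we)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_star_command(
    "/path/to/RASLseqAnalysis_STAR.py",
    ["/path/to/a.fastq.gz", "/path/to/b.fastq.gz"],
    "/path/to/STAR_binary/", "/path/to/RASL.probes",
    "/path/to/annotations.bc", "/path/to/write_directory/",
    "/path/to/output.txt", verbose=True, o5=25, o3=20,
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` to launch the analysis without going through a shell.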


BLASTn Usage

python /path/to/RASLseqAnalysis_BLAST.py [required args -f -s -p -w -d -b -o] [optional args -P]

-f : absolute path to fastq file (accepts gzip files)

-p : absolute path to probes file

-w : absolute path to annotations file

-d : absolute path to write directory for blast database

-b : absolute path to blast bin directory

-o : absolute path of output file

-s : specifies sequencer id in fastq index line, e.g. @HISEQ


-P : verbose printing


example command:
python /path/to/RASLseqAnalysis_BLAST.py -f /path/to/your.fastq.gz -s @HISEQ -p /path/to/RASL.probes -w /path/to/annotations.bc -d /path/to/blastdb/write_dir/ -b /path/to/blast/ncbi-blast-2.2.26+/bin/ -P -o /path/to/output.txt

example data: can be found in the data directory

NOTE: RASLseqAnalysis_NAR.py is provided for transparency and requires manual parameter settings to run.


Input File Formats

FASTQ: standard FASTQ format (optionally gzipped)

X.probes: tab-separated file describing the RASLseq Probes with the following columns and column headers

AcceptorProbeSequence
DonorProbeSequence
AcceptorAdaptorSequence
DonorAdaptorSequence
ProbeName

Please see example file in data/ directory
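
Before starting a run, it can help to confirm that a probes file carries the expected column headers. The following is a minimal sketch using only the standard library; `read_probes` is a hypothetical helper, not a RASLseqTools function, and the sequences in the example row are placeholders rather than real probes.

```python
import csv
import io

REQUIRED_PROBE_COLUMNS = [
    "AcceptorProbeSequence",
    "DonorProbeSequence",
    "AcceptorAdaptorSequence",
    "DonorAdaptorSequence",
    "ProbeName",
]


def read_probes(handle):
    """Parse a tab-separated probes file and check its column headers."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_PROBE_COLUMNS if col not in header]
    if missing:
        raise ValueError("probes file is missing columns: " + ", ".join(missing))
    return list(reader)


# Made-up single-probe file held in memory; a real file would be opened with open(path)
example = io.StringIO(
    "\t".join(REQUIRED_PROBE_COLUMNS) + "\n"
    + "\t".join(["ACGTACGT", "TGCATGCA", "GGAAGG", "CCTTCC", "GENE1_probe1"]) + "\n"
)
probes = read_probes(example)
print(probes[0]["ProbeName"])
```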

X.bc: tab-separated file describing each well in the experiment with the following columns and column headers
REQUIRED:
PlateBarcode
WellBarcode
OPTIONAL: additional columns with well metadata, column headers are user defined, e.g. drug_concentration
Please see example file in data/ directory
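
The annotations layout can be sketched the same way: two required barcode columns plus any number of user-defined metadata columns. `read_annotations` below is a hypothetical helper, the barcodes are invented, and rejecting duplicate plate/well pairs is an assumption about what a well-formed file looks like rather than documented behavior.

```python
import csv
import io


def read_annotations(handle):
    """Map (PlateBarcode, WellBarcode) to the remaining metadata columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    for column in ("PlateBarcode", "WellBarcode"):
        if column not in header:
            raise ValueError("annotations file is missing column: " + column)
    wells = {}
    for row in reader:
        key = (row.pop("PlateBarcode"), row.pop("WellBarcode"))
        if key in wells:
            # Assumption: each plate/well barcode pair describes exactly one well
            raise ValueError("duplicate plate/well barcode pair: %s/%s" % key)
        wells[key] = row  # only the optional metadata columns remain
    return wells


# Made-up two-well file with one optional metadata column
example = io.StringIO(
    "PlateBarcode\tWellBarcode\tdrug_concentration\n"
    "AAGGTT\tACGTACGT\t0.1\n"
    "AAGGTT\tTGCATGCA\t1.0\n"
)
wells = read_annotations(example)
print(wells[("AAGGTT", "ACGTACGT")])
```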


Dependencies

    STAR aligner
    BLASTn
    pandas
    Levenshtein editdist
    NumPy


References

  1. H. Li, J. Qiu, X.-D. Fu, RASL-seq for massive parallel and quantitative analysis of gene expression, Curr. Protocol. Mol. Biol., 98 (2012), pp. 4.13.1–4.13.9

  2. Larman HB, Scott ER, Wogan M, Oliveira G, Torkamani A, Schultz PG, Sensitive, multiplex and direct quantification of RNA sequences using a modified RASL assay, Nucleic Acids Res. 2014;42(14):9146-57

raslseqtools's Issues

Describe the three output files

Describe the three output files in detail in the README, possibly offer the user the choice to opt-out of writing them to disk

how to modify for cases without plate barcodes?

I'm attempting to modify your tool to accommodate other designs with respect to barcodes, etc.

The one case I'm dealing with at the moment is data without a plate barcode. It's only 4 plates' worth of data. There's a 7bp barcode that specifies both plate and well therein.

I am hoping that you could suggest where in RASLseqTools I should handle the lack of a plate barcode.

As of now, I'm reading through RASLseqBCannot.py to see if there's a good handle there to deal with the lack of plate barcodes.
