
asap_to_kite's Introduction

ASAP to kite

A script to process fastqs from ASAP-seq for downstream processing with kite (kallisto | bustools).

The options are designed to mirror those of CellRanger/CellRanger-ATAC for convenience in processing.

About

There are two scripts in this repository. First, asap_to_kite_vX.py (where X is an integer) represents the tool most commonly used for the simple reformatting. This will work most of the time.

In rare cases, you may have stained with both TSA and TSB antibodies. For this case, asap_to_kite_A_B.py separates the two using their different PCR handles. This requires that the valid barcodes for both TSA and TSB have been identified beforehand; they are supplied through the additional -a and -b flags.

Sample use cases

One sample, one directory

The most basic use case is when we have one library sequenced once. After demultiplexing, we should see files that look like this:

test/data1/test1_S1_L001_R1_001.fastq.gz
test/data1/test1_S1_L001_R2_001.fastq.gz
test/data1/test1_S1_L001_R3_001.fastq.gz
test/data1/test1_S1_L002_R1_001.fastq.gz
test/data1/test1_S1_L002_R2_001.fastq.gz
test/data1/test1_S1_L002_R3_001.fastq.gz
test/data1/test1_S1_L003_R1_001.fastq.gz
test/data1/test1_S1_L003_R2_001.fastq.gz
test/data1/test1_S1_L003_R3_001.fastq.gz
test/data1/test1_S1_L004_R1_001.fastq.gz
test/data1/test1_S1_L004_R2_001.fastq.gz
test/data1/test1_S1_L004_R3_001.fastq.gz

Here, the sequencing run is in the folder test/data1 and we are interested in the test1 sample.

We can process these fastqs:

python asap_to_kite_v1.py -f test/data1 -s test1 -o one_one

Here, the -s specifies the sample name; -f specifies the fastq folder; -o specifies the output naming convention.
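Before running the script, it can help to confirm that every lane of the sample has a complete set of reads. The following is a minimal sketch, not part of asap_to_kite: the helper names group_by_lane and incomplete_lanes are hypothetical, the regular expression assumes the naming pattern shown in the listing above, and treating R1, R2 and R3 as all required is an assumption based on that listing. Unlike the -s option, which selects by filename prefix, this sketch matches the sample name exactly.

```python
import re
from collections import defaultdict

# Assumed naming pattern, taken from the listing above:
#   <sample>_S<n>_L<lane>_<read>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_(?P<read>[RI]\d)_001\.fastq\.gz$"
)


def group_by_lane(filenames, sample):
    """Return {lane: {read: filename}} for files whose sample name equals `sample`."""
    lanes = defaultdict(dict)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match and match.group("sample") == sample:
            lanes[match.group("lane")][match.group("read")] = name
    return dict(lanes)


def incomplete_lanes(lanes, required=("R1", "R2", "R3")):
    """Return the sorted lanes that lack at least one of the required reads."""
    return sorted(
        lane
        for lane, reads in lanes.items()
        if not all(read in reads for read in required)
    )
```

Passing the result of os.listdir("test/data1") as filenames and "test1" as sample would report any lane where one of the three files is absent.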

One sample, multiple directories

If multiple sequencing rounds are performed, we can supply all sequencing libraries as a comma-separated list:

python asap_to_kite_v1.py -f test/data1,test/data2 -s test1 -o one_many

One sample, named multiple ways, in multiple directories

Suppose that the sequencing library is named two different ways over the two sequencing runs. We can combine a comma-separated list of sample names with a comma-separated list of sequencing runs to merge the libraries into one output:

python asap_to_kite_v1.py -f test/data1,test/data2 -s test1,test2 -o many_many
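One way to picture how the two comma-separated lists combine is as a search for every sample name in every folder. The sketch below illustrates that idea only; it is an assumption about the behaviour, not the script's actual code, and the function name candidate_patterns and the glob suffix are hypothetical.

```python
from itertools import product


def candidate_patterns(fastqs, samples):
    """Expand comma-separated folders and sample prefixes into R1 glob patterns.

    Assumption: every sample prefix is looked up in every folder, and a
    folder that holds no files for a given prefix simply contributes nothing.
    """
    folders = [f for f in fastqs.split(",") if f]
    prefixes = [s for s in samples.split(",") if s]
    return [
        f"{folder}/{prefix}*_R1_001.fastq.gz"
        for folder, prefix in product(folders, prefixes)
    ]
```

Each pattern could then be passed to glob.glob to list the R1 files that would be pooled into the single output.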

Write to a different output destination

Finally, the output files can be written to a different path by including that path in the -o value:

python asap_to_kite_v1.py -f test/data1,test/data2 -s test1,test2 -o test/many_many

Important

This code works for one biological sample at a time. If multiple sample names are supplied in a single command, their reads will be merged, on the assumption that they are the same library under different names. For multiple biological samples, run the script once per sample, one after another.
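Running the script once per biological sample can be automated with a small driver. This is a sketch under stated assumptions: build_command and run_all are hypothetical helpers, the script is assumed to sit in the working directory, and only the -f, -s and -o flags documented here are used.

```python
import subprocess
import sys


def build_command(fastq_dirs, sample_names, run_id, script="asap_to_kite_v1.py"):
    """Build the argument list for one biological sample.

    All names in `sample_names` are treated as aliases of the same library,
    because the script merges everything passed in a single call.
    """
    return [
        sys.executable,
        script,
        "-f", ",".join(fastq_dirs),
        "-s", ",".join(sample_names),
        "-o", run_id,
    ]


def run_all(jobs):
    """Run one job per biological sample, sequentially, stopping on the first failure."""
    for job in jobs:
        subprocess.run(build_command(**job), check=True)
```

A call such as run_all([{"fastq_dirs": ["runA"], "sample_names": ["donor1"], "run_id": "donor1"}, {"fastq_dirs": ["runA"], "sample_names": ["donor2"], "run_id": "donor2"}]) would process two samples in turn; the folder and sample names in that call are placeholders.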

Options

python asap_to_kite_v1.py --help

yields

Usage: asap_to_kite_v1.py [options] [inputs] Script to reformat raw sequencing 
data from CellRanger-ATAC demultiplexing to a format 
compatible with kite (kallisto|bustools)

Options:
  -h, --help            show this help message and exit
  -f FASTQS, --fastqs=FASTQS
                        Path of folder created by mkfastq or bcl2fastq; can be
                        comma separated that will be collapsed into one
                        output.
  -s SAMPLE, --sample=SAMPLE
                        Prefix of the filenames of FASTQs to select; can be
                        comma separated that will be collapsed into one output
  -o ID, --id=ID        A unique run id, used to name output.
  -c CORES, --cores=CORES
                        Number of cores for parallel processing. Default = 4.
  -n NREADS, --nreads=NREADS
                        Maximum number of reads to process in one iteration.
                        Decrease this if in a low memory environment (e.g.
                        laptop). Default = 10,000,000.
  -r, --no-rc-R2        By default, the reverse complement of R2 (barcode) is
                        performed (when sequencing with, for example, the
                        NextSeq). Throw this flag to keep R2 as is-- no
                        reverse complement (rc).
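For reference, the reverse complement that the -r flag switches off can be sketched in a few lines. This illustrates the operation itself, not the script's own implementation; the function name is hypothetical.

```python
# Translation table for DNA bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which gives a quick check when deciding whether a barcode read needs the flag.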



asap_to_kite's People

Contributors

caleblareau, liviuspenter
