Attempting to run through the pipeline, I get thwarted at the initial step. The output

Illumina Processing results in zero sequences (amptk issue, closed)

nextgenusfs commented on September 23, 2024

Illumina Processing results in zero sequences

from amptk.

Comments (4)

nextgenusfs commented on September 23, 2024

These are the only two files in the 'processed' folder? Based on the logfile, the script appears to be merging the paired-end reads correctly, although if these are 250 bp reads you should change the read length setting. In that processed folder, do you also see a single file for each of your samples, e.g. V4.fastq?

What primers did you use for amplification and sequencing? The default settings target the ITS2 region using the fITS7 and ITS4 primers. You can specify different primers using the -f and -r options. The default settings also assume that you used the Illumina TruSeq dual barcoding approach, where the primers are intact and the reads you get from the sequencing center look like this:
5'primer-read-3'primer
So the script will only output reads in which it can find the forward primer. If that is not your read structure, i.e. your primers have already been removed, then you need to pass the --require_primer off option at runtime.
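
A quick way to check whether your reads actually have that structure is to count how many of them carry the forward primer near the 5' end. The following is only a rough sketch, not what the script itself does: the names `primer_regex`, `count_primer_hits` and `fastq_sequences` are made up for illustration, it assumes plain four-line uncompressed FASTQ, and it only accepts exact matches (with IUPAC degenerate bases expanded), whereas a real primer search may tolerate mismatches. The example primer is the ITS1-F sequence quoted elsewhere in this thread.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer (possibly with degenerate bases) into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def count_primer_hits(seqs, primer, window=None):
    """Return (hits, total): reads with an exact primer match near the 5' end.

    `window` is how many leading bases to search; the default of
    primer length + 10 is an arbitrary choice for this sketch.
    """
    pattern = primer_regex(primer)
    if window is None:
        window = len(primer) + 10
    hits = total = 0
    for seq in seqs:
        total += 1
        if pattern.search(seq.upper()[:window]):
            hits += 1
    return hits, total


def fastq_sequences(path):
    """Yield the sequence line of each record in a four-line FASTQ file."""
    with open(path) as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:
                yield line.strip()


if __name__ == "__main__":
    its1f = "CTTGGTCATTTAGAGGAAGTAA"
    reads = [
        "CTTGGTCATTTAGAGGAAGTAAACGTACGT",  # starts with the primer
        "ACGTACGTACGTACGTACGTACGTACGTAC",  # primer absent
    ]
    hits, total = count_primer_hits(reads, its1f)
    print(f"{hits} of {total} reads contain the forward primer")
```

If nearly none of the reads match, that would be consistent with the primers having been removed already (or with a different primer pair having been used), which is the situation the --require_primer off option is meant for.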

You should also set --read_length 250 if you have paired-end (PE) 250 bp reads.
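
If you are not sure what read length your run used, you can look at the raw reads directly. This is a minimal sketch under a few assumptions: the function name `read_length_summary` is hypothetical, the input is plain four-line FASTQ text, and the longest read is taken as a hint for the run's cycle count (reads that were trimmed by the sequencing center would make this an underestimate).

```python
from collections import Counter


def read_length_summary(lines):
    """Return (longest, most_common) read length from four-line FASTQ lines.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Returns None if no sequence lines are found.
    """
    lengths = Counter(
        len(line.strip())
        for index, line in enumerate(lines)
        if index % 4 == 1  # the second line of every record is the sequence
    )
    if not lengths:
        return None
    longest = max(lengths)
    most_common = lengths.most_common(1)[0][0]
    return longest, most_common


if __name__ == "__main__":
    example = [
        "@read1", "ACGT", "+", "IIII",
        "@read2", "ACGTAC", "+", "IIIIII",
        "@read3", "ACGT", "+", "IIII",
    ]
    print(read_length_summary(example))
```

For gzipped files, opening them with `gzip.open(path, "rt")` and passing the handle in should work the same way.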


MycoMap commented on September 23, 2024

I do see a file for each of the samples. The issue may be that this dataset only targets ITS1.

What would the rest of the command look like for setting new forward and reverse primers?


nextgenusfs commented on September 23, 2024

You would do something like this if you used the ITS1-F and ITS2 primers. Note that you should supply the actual primer sequences you used, i.e. the part that remains after Illumina trims off its adapters and index sequences (typically this is just the normal primer). If you used the custom sequencing primers that are used in the community, e.g. from Smith et al. 2014, then you need to pass the --require_primer off option. The --rescue_forward option keeps the forward reads if the paired reads cannot be merged.

ufits illumina -i rawdata -o process_ITS1 -f CTTGGTCATTTAGAGGAAGTAA \
-r GCTGCGTTCTTCATCGATGC --read_length 250 --rescue_forward

Remember that running any of the commands in UFITS without any options will output a help menu:

ufits illumina

Usage:       ufits illumina <arguments>
version:     0.5.5

Description: Script takes a folder of Illumina MiSeq data that is already de-multiplexed and processes it for
             clustering using UFITS.  The default behavior is to: 1) merge the PE reads using USEARCH, 2) find and
             trim away primers, 3) rename reads according to sample name, 4) trim/pad reads to a set length.

Arguments:   -i, --fastq         Input folder of FASTQ files (Required)
             -o, --out           Output folder name. Default: ufits-data
             --reads             Paired-end or forward reads. Default: paired [paired, forward]
             --read_length       Illumina Read length (250 if 2 x 250 bp run). Default: 300 
             --rescue_forward    Rescue Forward Reads if PE do not merge, e.g. abnormally long amplicons
             -f, --fwd_primer    Forward primer sequence. Default: fITS7
             -r, --rev_primer    Reverse primer sequence Default: ITS4
             --require_primer    Require the Forward primer to be present. Default: on [on, off]
             -n, --name_prefix   Prefix for re-naming reads. Default: R_
             -m, --min_len       Minimum length read to keep. Default: 50
             -l, --trim_len      Length to trim/pad reads. Default: 250
             --full_length       Keep only full length sequences.
             --cpus              Number of CPUs to use. Default: all
             -u, --usearch       USEARCH executable. Default: usearch8
             --cleanup           Remove intermediate files.
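
To make step 4 of the description ("trim/pad reads to a set length") concrete, here is a small sketch of what such a step could look like. It is not taken from the UFITS source: the function name `trim_or_pad` is hypothetical, padding with N is an assumption, and so is applying the minimum-length filter before trimming. The defaults of 250 and 50 mirror the -l and -m defaults in the help text above.

```python
def trim_or_pad(seq, trim_len=250, min_len=50, pad_char="N"):
    """Return `seq` forced to exactly `trim_len` bases, or None if too short.

    Reads shorter than `min_len` are dropped (None). Longer reads are
    truncated at `trim_len`; shorter ones are right-padded with `pad_char`
    (assumed here to be N).
    """
    if len(seq) < min_len:
        return None
    return seq[:trim_len].ljust(trim_len, pad_char)


if __name__ == "__main__":
    for read in ("ACGT" * 100, "ACGT" * 20, "ACGT"):
        result = trim_or_pad(read)
        print(len(read), "->", None if result is None else len(result))
```

The point of forcing every read to the same length is that downstream clustering then compares sequences over the same span; the --full_length option in the help text suggests there is a way to keep only reads that do not need padding.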


MycoMap commented on September 23, 2024

Thank you very much. I think this should get me to where I need to be.

