Pipeline for processing shotgun metagenomics sequencing data. It includes pre-processing with KneadData and two options for taxonomic classification: MetaPhlAn 4, or Kraken2 followed by Bracken for abundance estimation. The steps of the pipeline are described below, with detailed documentation available here.

KneadData (01_build_kneaddata_db.sh -> 01_run_kneaddata.sh)
- Trim adapters
- Trim repetitive sequences
- Remove host (human) DNA
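
The KneadData stage above can be sketched as two shell functions. This is a minimal sketch, not the contents of 01_build_kneaddata_db.sh or 01_run_kneaddata.sh: the function names, the `_R1`/`_R2` file naming, and the `THREADS` and `DRY_RUN` variables are assumptions made for illustration. The flags follow recent KneadData releases (`--input1`/`--input2`); older releases take `--input` twice instead. Trimming, repeat removal, and host read removal all happen inside the single `kneaddata` call, subject to the installed version's defaults.

```shell
# Hypothetical helper: download the prebuilt human Bowtie2 index used for host removal.
# $1 = directory in which to store the database
build_host_db() {
  local db_dir="$1"
  # When DRY_RUN is set, ${DRY_RUN:+echo} prints the command instead of running it.
  ${DRY_RUN:+echo} kneaddata_database --download human_genome bowtie2 "$db_dir"
}

# Hypothetical helper: run KneadData on one paired-end sample.
# $1 = sample name, $2 = directory holding <sample>_R1.fastq.gz and <sample>_R2.fastq.gz
# $3 = host database directory, $4 = output directory
knead_sample() {
  local sample="$1" in_dir="$2" db_dir="$3" out_dir="$4"
  ${DRY_RUN:+echo} kneaddata \
    --input1 "${in_dir}/${sample}_R1.fastq.gz" \
    --input2 "${in_dir}/${sample}_R2.fastq.gz" \
    --reference-db "$db_dir" \
    --output "${out_dir}/${sample}" \
    --threads "${THREADS:-4}"
}
```

Setting `DRY_RUN=1` before calling `knead_sample S1 raw hostdb out` prints the command line for inspection; leaving `DRY_RUN` unset runs it.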

Kraken2
- Build custom database of k-mer minimizer sequences (02_kraken2-build.sh)
- Taxonomic classification for each sample (02_kraken2-classification.sh)
- Generate BIOM table of merged Kraken2 report outputs (02_kraken2-generate-biom.sh)
- Generate table of k-mer minimizer counts (02_kraken2-generate-table.R)
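
The Kraken2 steps above might look roughly like the following. This is a sketch under stated assumptions rather than the pipeline's own scripts: the function names, output file names, and the `THREADS` and `DRY_RUN` variables are hypothetical, the `bacteria` library is only an example of what a custom database could contain, and the use of `kraken-biom` for the BIOM table is a guess based on the script name.

```shell
# Hypothetical helper: build a Kraken2 database in directory $1.
# The "bacteria" library is an example choice, not necessarily what the pipeline uses.
k2_build_db() {
  local db="$1"
  # When DRY_RUN is set, ${DRY_RUN:+echo} prints each command instead of running it.
  ${DRY_RUN:+echo} kraken2-build --download-taxonomy --db "$db"
  ${DRY_RUN:+echo} kraken2-build --download-library bacteria --db "$db"
  ${DRY_RUN:+echo} kraken2-build --build --db "$db" --threads "${THREADS:-4}"
}

# Hypothetical helper: classify one paired-end sample.
# $1 = sample name, $2/$3 = cleaned forward/reverse reads, $4 = database, $5 = output directory
k2_classify() {
  local sample="$1" r1="$2" r2="$3" db="$4" out_dir="$5"
  ${DRY_RUN:+echo} kraken2 \
    --db "$db" \
    --threads "${THREADS:-4}" \
    --paired \
    --report "${out_dir}/${sample}.kreport" \
    --output "${out_dir}/${sample}.kraken" \
    "$r1" "$r2"
}

# Hypothetical helper: merge per-sample reports into one BIOM table.
# $1 = output BIOM file, remaining arguments = Kraken2 report files
k2_biom() {
  local out="$1"
  shift
  ${DRY_RUN:+echo} kraken-biom "$@" --fmt json -o "$out"
}
```

The per-sample report written by `--report` is the file that both the BIOM merge and the Bracken stage consume, which is why the sketch gives it a predictable name.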

Bracken
- Build k-mer distribution file (03_bracken-build.sh)
- Taxonomic relative abundance estimation for each sample (03_bracken-abundance-estimation.sh)
- Filter Bracken report outputs for each sample (03_bracken-filter-report.sh)
- Generate table of merged Bracken report outputs (03_bracken-combined-report.sh)
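
As a rough illustration of the Bracken steps above, the following sketch wraps the build, per-sample estimation, and merge commands. The function names, file names, and the `READ_LEN`, `KMER_LEN`, `LEVEL`, `MIN_READS`, `THREADS`, and `DRY_RUN` variables are assumptions, and their fallback values are the sketch's own, not necessarily the pipeline's. It also assumes `combine_bracken_outputs.py` (shipped with Bracken) is executable on the `PATH`. The filtering step is omitted because its criteria are not stated here.

```shell
# Hypothetical helper: build the k-mer distribution file inside Kraken2 database $1.
bracken_build() {
  local db="$1"
  # When DRY_RUN is set, ${DRY_RUN:+echo} prints the command instead of running it.
  ${DRY_RUN:+echo} bracken-build -d "$db" -t "${THREADS:-4}" \
    -k "${KMER_LEN:-35}" -l "${READ_LEN:-150}"
}

# Hypothetical helper: estimate abundances for one sample from its Kraken2 report.
# $1 = sample name, $2 = database, $3 = Kraken2 report, $4 = output directory
bracken_estimate() {
  local sample="$1" db="$2" report="$3" out_dir="$4"
  ${DRY_RUN:+echo} bracken \
    -d "$db" \
    -i "$report" \
    -o "${out_dir}/${sample}.bracken" \
    -w "${out_dir}/${sample}_bracken.kreport" \
    -r "${READ_LEN:-150}" \
    -l "${LEVEL:-S}" \
    -t "${MIN_READS:-10}"
}

# Hypothetical helper: merge per-sample Bracken outputs into one table.
# $1 = merged output file, remaining arguments = per-sample .bracken files
bracken_combine() {
  local out="$1"
  shift
  ${DRY_RUN:+echo} combine_bracken_outputs.py --files "$@" -o "$out"
}
```

The read length passed to `bracken` with `-r` has to match a length for which `bracken-build` produced a distribution file, so both functions read the same `READ_LEN` variable.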

MetaPhlAn 4
- Taxonomic classification for each sample (04_run_mp4.sh)
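
A per-sample MetaPhlAn 4 call could be wrapped as below. This is a sketch, not the contents of 04_run_mp4.sh: the function name, output file names, and the `THREADS` and `DRY_RUN` variables are hypothetical. The `--bowtie2out` option is the one used by MetaPhlAn 4.0 and 4.1; newer releases may name it differently, so check `metaphlan --help` for the installed version.

```shell
# Hypothetical helper: profile one paired-end sample with MetaPhlAn 4.
# $1 = sample name, $2/$3 = cleaned forward/reverse reads, $4 = output directory
mp4_profile() {
  local sample="$1" r1="$2" r2="$3" out_dir="$4"
  # MetaPhlAn takes multiple input files as one comma-separated argument.
  # When DRY_RUN is set, ${DRY_RUN:+echo} prints the command instead of running it.
  ${DRY_RUN:+echo} metaphlan "${r1},${r2}" \
    --input_type fastq \
    --bowtie2out "${out_dir}/${sample}.bowtie2.bz2" \
    --nproc "${THREADS:-4}" \
    -o "${out_dir}/${sample}_profile.txt"
}
```

Keeping the intermediate alignment file lets a later run re-profile the sample without repeating the read mapping.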