Comments (10)
I am glad to hear that you've found AdapterRemoval to be useful.
Unfortunately, it is currently not possible to trim only the 3' end of reads using the --trimqualities option. However, it shouldn't be much trouble to add an option for that, so I'll include it in the next update to AdapterRemoval.
Best,
Mikkel
from adapterremoval.
Hi all,
I've just released AdapterRemoval 2.3.1, which adds a new option (--preserve5p) that prevents quality-based trimming at the 5' termini when any of the --trimns, --trimqualities, or --trimwindows options are used. This also entirely disables quality-based trimming of collapsed reads, since both ends of these are informative for PCR duplicate filtering (see [1] and [2] for scripts that can be used for this).
Thank you for your patience and feel free to re-open this issue or open a new issue if you run into any (related) problems.
Best,
Mikkel
[1] FilterUniqueBAM.py/FilterUniqueSAMCons.py from https://www.ncbi.nlm.nih.gov/pubmed/22237537
[2] paleomix rmdup_collapsed from https://www.ncbi.nlm.nih.gov/pubmed/24722405
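To illustrate the --preserve5p behaviour described in the release note above, here is a minimal sketch for non-collapsed reads. It is a simplified model, not AdapterRemoval's actual implementation: the `trim_termini` function, its parameters, and the rule that a terminal base is trimmed when it is an N or has a quality at or below the threshold are all assumptions made for illustration.

```python
def trim_termini(seq, quals, min_quality, preserve_5p=False):
    """Trim consecutive low-quality bases and Ns from the read termini.

    Simplified, hypothetical model: a terminal base is removed when it is
    'N' or its quality is at or below min_quality. With preserve_5p=True
    the 5' end is left untouched and only the 3' end is trimmed.
    """
    def is_bad(i):
        return seq[i] == "N" or quals[i] <= min_quality

    start, end = 0, len(seq)
    # Always trim from the 3' end.
    while end > start and is_bad(end - 1):
        end -= 1
    # Trim from the 5' end only when it is not being preserved.
    if not preserve_5p:
        while start < end and is_bad(start):
            start += 1
    return seq[start:end], quals[start:end]


# Made-up read with one bad base at each end.
seq = "NACGTACGTT"
quals = [2, 30, 30, 30, 30, 30, 30, 30, 30, 2]

both_ends = trim_termini(seq, quals, min_quality=20)
only_3p = trim_termini(seq, quals, min_quality=20, preserve_5p=True)
print(both_ends[0])  # 5' N and 3' T removed
print(only_3p[0])    # only the 3' T removed
```

With the 5' end preserved, the read keeps its original start, which is what duplicate filtering relies on.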
from adapterremoval.
It turns out that trimming the base at the 5' end has serious consequences for low coverage genomes because duplicate removal will no longer recognise PCR duplicates.
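A minimal sketch of why this happens, assuming a simplified duplicate filter that treats forward-strand reads sharing the same 5' alignment start as PCR duplicates. The `five_prime_key` and `count_duplicates` helpers and the coordinates are hypothetical and not taken from any particular tool.

```python
from collections import Counter


def five_prime_key(ref_start, trimmed_5p=0):
    """Return the 5' alignment start of a forward-strand read.

    Removing bases from the 5' end shifts the position at which the
    remaining read aligns by the number of bases removed.
    """
    return ref_start + trimmed_5p


def count_duplicates(keys):
    """Count every read beyond the first at a given key as a duplicate."""
    return sum(n - 1 for n in Counter(keys).values())


# Two PCR copies of the same molecule, both starting at position 1000.
untrimmed = [five_prime_key(1000), five_prime_key(1000)]
# The same two copies, but one loses its first base to quality trimming.
trimmed = [five_prime_key(1000), five_prime_key(1000, trimmed_5p=1)]

print(count_duplicates(untrimmed))
print(count_duplicates(trimmed))
```

In this toy model the pair is flagged as a duplicate only while both copies keep their original 5' base.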
from adapterremoval.
I am glad to hear that you've found AdapterRemoval to be useful.
Unfortunately, it is currently not possible to trim only the 3' end of reads using the --trimqualities option. However, it shouldn't be much trouble to add an option for that, so I'll include it in the next update to AdapterRemoval.
Best,
Mikkel
An add-on to the --trim5p / --trim3p options?
from adapterremoval.
Hi!
We just noticed that when using the --preserve5p option, AdapterRemoval still trims merged reads at the 3' end. This shouldn't happen, because that end was originally a 5' end. Is there any further setting to prevent this?
from adapterremoval.
Sorry for the late reply.
Here are the command-line settings:
AdapterRemoval --file1 sample1.fq.gz --basename sample1.fq --gzip --threads 4 --trimns --trimqualities --preserve5p --adapter1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --adapter2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA --minlength 30 --minquality 20 --minadapteroverlap 1
An example of a trimmed read (the last T is trimmed):
After AdapterRemoval with --minquality 0:
@M_NS500382:27:HJH3LBGXX:2:13211:2150:9711 1:N:0:GTACTCGA+AACCTCAG
CACGGTATCGGCCGCAACGTTTTCAGCACGTGTTGGGTCAGAAGTTTGTAGTGGCAACACTGTAAAAATCTCTTGAGGAGT
+
AAAAAAEEEEAEEEEEAEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEAEEEAEEEAEEEEE/EEEEEEEAEEEAAAAA/
After AdapterRemoval with the command above:
@M_NS500382:27:HJH3LBGXX:2:13211:2150:9711 1:N:0:GTACTCGA+AACCTCAG
CACGGTATCGGCCGCAACGTTTTCAGCACGTGTTGGGTCAGAAGTTTGTAGTGGCAACACTGTAAAAATCTCTTGAGGAG
+
AAAAAAEEEEAEEEEEAEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEAEEEAEEEAEEEEE/EEEEEEEAEEEAAAAA
I should mention that this data had already been trimmed and merged before being run through AdapterRemoval again. Maybe that explains why the trimming is happening?
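For reference, the difference between the two records above can be checked mechanically. The following is a small sketch using the sequences and quality strings exactly as posted; the `trimmed_suffix` helper is hypothetical and only exists for this comparison.

```python
def trimmed_suffix(before, after):
    """Return the part of `before` that is missing from the end of `after`."""
    if not before.startswith(after):
        raise ValueError("the trimmed string is not a prefix of the original")
    return before[len(after):]


before_seq = "CACGGTATCGGCCGCAACGTTTTCAGCACGTGTTGGGTCAGAAGTTTGTAGTGGCAACACTGTAAAAATCTCTTGAGGAGT"
after_seq = "CACGGTATCGGCCGCAACGTTTTCAGCACGTGTTGGGTCAGAAGTTTGTAGTGGCAACACTGTAAAAATCTCTTGAGGAG"
before_qual = "AAAAAAEEEEAEEEEEAEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEAEEEAEEEAEEEEE/EEEEEEEAEEEAAAAA/"
after_qual = "AAAAAAEEEEAEEEEEAEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEAEEEAEEEAEEEEE/EEEEEEEAEEEAAAAA"

lost_bases = trimmed_suffix(before_seq, after_seq)
lost_quals = trimmed_suffix(before_qual, after_qual)
print(lost_bases, lost_quals)
```

This confirms that the only change is a single base removed from the end of the read, together with its quality character.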
from adapterremoval.
Dear Mikkel,
I am sorry for taking so long to reply. My colleague was reprocessing the data and, to keep it consistent, he re-ran all the steps as he did previously, without realising that the reads were already trimmed and merged.
I understand that if you rerun your data, some trimming may happen because of resemblance to the adapters. However, in the data from the example I mentioned, we have reads that were duplicates before rerunning AdapterRemoval: they have the same sequence and differ only in the quality of the first base at the 5' end. After we ran AdapterRemoval, one of the reads was trimmed (the example above) while the other was not. If that trimming were due to the base resembling the adapter, I think both reads should have been trimmed, the one with low quality at the 5' end as well as the one with good quality at the 5' end. Is that correct, or is there another behaviour that I am not taking into account that could explain this?
Thank you for your time!
Best,
Aida
from adapterremoval.
Dear Aida,
You are right that the T does not match the adapter sequence. Looking closer, the cause of the base being trimmed is the "--minquality 20" option. The T has a Phred-encoded quality score of "/", which corresponds to a quality of 14 under the standard Phred+33 encoding, and because AdapterRemoval is treating the reads as SE data, the --preserve5p option does not stop it from trimming that base.
Best,
Mikkel
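The quality lookup in the reply above can be reproduced with a few lines of Python. This is a sketch that assumes the standard Phred+33 FASTQ encoding; the `phred_score` helper is hypothetical, and treating scores at or below the threshold as trimmable is an assumption about how --minquality is applied.

```python
PHRED_OFFSET = 33  # assumption: standard Phred+33 FASTQ encoding


def phred_score(char, offset=PHRED_OFFSET):
    """Decode one FASTQ quality character into its Phred score."""
    return ord(char) - offset


min_quality = 20  # the value passed to --minquality in the command above

# The three quality characters that occur in the example read.
for char in "/AE":
    score = phred_score(char)
    status = "below threshold" if score <= min_quality else "ok"
    print(char, score, status)
```

Only "/" falls below the threshold of 20, which is consistent with the terminal T being the one base that was removed.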
from adapterremoval.