Hi Mikkel, I have used your pipeline in my Master's thesis to trim a

Should the .rmdup.collapsed.bam and .rmdup.normal.bam be merged? about paleomix HOT 1 CLOSED

mikkelschubert commented on July 29, 2024

Should the .rmdup.collapsed.bam and .rmdup.normal.bam be merged?

from paleomix.

Comments (1)

MikkelSchubert commented on July 29, 2024

Hi Vitali,

The two files you mention (.rmdup.collapsed.bam and .rmdup.normal.bam) are intermediate files generated by the pipeline, prior to it filtering PCR duplicates. You should not use these files without good reason. The BAM file you should use is the ${Sample}.${Genome}.bam file located in the root of your output directory (by default, the same directory as the YAML file). That file will have been appropriately filtered for PCR duplicates unless you specifically turned that step off.

See here for a detailed description of the output files of the pipeline:
https://paleomix.readthedocs.io/en/stable/bam_pipeline/filestructure.html
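
As a rough illustration of the naming scheme described above, here is a minimal Python sketch that builds the expected path of the final BAM file. The function name, the output directory, and the sample and genome names are all hypothetical placeholders; only the ${Sample}.${Genome}.bam pattern comes from the description above.

```python
from pathlib import Path


def final_bam_path(output_dir, sample, genome):
    """Build the expected path of the final BAM in the output root.

    Follows the ${Sample}.${Genome}.bam pattern; all arguments are
    placeholders chosen by the caller.
    """
    return Path(output_dir) / f"{sample}.{genome}.bam"


# Hypothetical names, for illustration only.
bam = final_bam_path("my_project", "ExampleSample", "ExampleGenome")
print(bam.name)  # ExampleSample.ExampleGenome.bam
```

If the file does not exist at that location, check the directory containing your YAML file, since that is the default output root.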

I would generally recommend merging paired-end (PE) reads, regardless of read quality. The only exception is if the DNA fragment size is much greater than about 300 bp (i.e., twice your read length), since false-positive merged reads will likely dominate in that case.
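
To make the "twice your read length" threshold concrete, the following sketch computes how many bases two mates are expected to share for a given fragment size. It assumes 150 bp reads (implied by the 300 bp figure above), reads sequenced from opposite ends of the fragment, and adapters already trimmed; the function name is hypothetical and this is not how the pipeline itself decides whether to merge.

```python
def expected_overlap(fragment_len, read_len=150):
    """Expected number of bases covered by both mates of a read pair.

    Assumes each mate covers min(fragment_len, read_len) bases starting
    from opposite ends of the fragment (adapters already trimmed).
    """
    covered = min(fragment_len, read_len)
    return max(0, 2 * covered - fragment_len)


for fragment_len in (100, 200, 300, 400):
    print(fragment_len, expected_overlap(fragment_len))
```

With 150 bp reads, fragments of 300 bp or longer give an expected overlap of zero, so any merge reported for such a pair would have to be a false positive.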

Best regards,
Mikkel

