Comments (6)
Maybe something like the attached patch will do the trick (it needs testing)...
from paleomix.
Oops, try this one instead.
from paleomix.
Hey Graham,
Thank you very much for the report and for the patches!
While I see the problem, I am not sure that treating clipped bases as part of the alignment (which they are not, by definition) is something that you can rely on. The possibility of clipped reads extending past the contig termini also poses some questions, since the clipped bases could, potentially, map to different contigs (or not at all).
I have investigated this briefly, and between Picard MarkDuplicates and SAMTools rmdup, Picard appears to follow your strategy while SAMTools does not. I intend to look into this further, as time permits, and potentially add your patch in an upcoming update.
Best,
Mikkel
from paleomix.
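For readers following the discussion above, here is a minimal sketch of what "treating clipped bases as part of the alignment" can mean in practice. It is not the attached patch and not the PALEOMIX implementation. The helper names `parse_cigar` and `unclipped_span` are hypothetical, positions are assumed to be 0-based, and counting both soft clips (`S`) and hard clips (`H`) is an assumption of this sketch.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
_REF_CONSUMING = "MDN=X"
_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '5S20M3S' into (length, op) tuples."""
    return [(int(length), op) for length, op in _CIGAR_RE.findall(cigar)]


def unclipped_span(pos, cigar):
    """Return (start, end) of an alignment with clipped bases included.

    `pos` is the 0-based leftmost aligned position; `end` is exclusive.
    Hypothetical helper for illustration only.
    """
    ops = parse_cigar(cigar)

    # Extend the start leftwards by any leading clipped bases.
    start = pos
    for length, op in ops:
        if op not in "SH":
            break
        start -= length

    # Aligned end, then extend rightwards by any trailing clipped bases.
    end = pos + sum(length for length, op in ops if op in _REF_CONSUMING)
    for length, op in reversed(ops):
        if op not in "SH":
            break
        end += length

    return start, end


# Two reads with different aligned starts but identical unclipped spans.
full_read = unclipped_span(100, "25M")
clipped_read = unclipped_span(105, "5S20M")
# A clipped read near the start of a contig yields a negative start.
near_terminus = unclipped_span(2, "5S20M")
```

Under these assumptions, `full_read` and `clipped_read` compare equal even though their aligned start positions differ (100 versus 105), which is the case that comparing aligned coordinates alone would miss. `near_terminus` starts at -3, illustrating the concern about clipped reads extending past the contig termini.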
Hey Graham,
I apologize for taking so long, but unfortunately I have been busy wrapping up several projects these last few months.
I have been working on an updated version of the script, inspired by your patch, which I intend to finalize by next week. The current version is attached, in case you want to take a look at it now.
Best,
Mikkel
from paleomix.
Hi Mikkel,
I've looked over your new code and it looks right. I've not tested it much though. Do you have a test framework that you use for such things?
-G
from paleomix.
Hi Graham,
Unfortunately, I do not yet have a framework for testing that automatically, though it is something I am interested in implementing. In the meantime, I have been manually running tests on various datasets, big and small, to ensure that the new version of rmdup_collapsed performs as expected. Long story short, I have now released a version of PALEOMIX (v1.2.9) that includes the improved script, which should address this issue.
Once again, thank you for reporting this issue, and apologies for taking so long. Do not hesitate to open additional issues if you run into other problems with PALEOMIX.
Cheers
from paleomix.