
Comments (6)

grahamgower commented on June 27, 2024

Maybe something like the attached patch will do the trick (it needs testing)...

softclip-dedup.patch.txt

from paleomix.

grahamgower commented on June 27, 2024

Oops, try this one instead.

softclip-dedup.patch2.txt


MikkelSchubert commented on June 27, 2024

Hey Graham,

Thank you very much for the report and for the patches!

While I see the problem, I am not sure that treating clipped bases as part of the alignment (which they are not, by definition) is something that you can rely on. The possibility of clipped reads extending past the contig termini also poses some questions, since the clipped bases could, potentially, map to different contigs (or not at all).

I have investigated this briefly, and between Picard MarkDuplicates and SAMTools rmdup, Picard appears to follow your strategy while SAMTools does not. I intend to look into this further, as time permits, and potentially add your patch in an upcoming update.

Best,
Mikkel
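
To make the point of contention concrete, here is a minimal sketch (not the attached patch, and not the PALEOMIX implementation) of keying duplicates on clipping-adjusted coordinates. The names `unclipped_span` and `dedup_key` are hypothetical, and the sketch assumes that only soft clips (`S`) are added back and that reads are compared on both ends plus strand.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")


def unclipped_span(pos, cigar):
    """Return (start, end), 1-based inclusive, extended over soft-clipped bases.

    `pos` is the 1-based leftmost aligned position (the SAM POS field).
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    ref_len = sum(length for length, op in ops if op in REF_CONSUMING)
    end = pos + ref_len - 1

    # Hard clips (H) may sit outside the soft clips; ignore them here (assumption).
    core = [(length, op) for length, op in ops if op != "H"]
    lead = core[0][0] if core and core[0][1] == "S" else 0
    trail = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return pos - lead, end + trail


def dedup_key(contig, pos, cigar, is_reverse):
    """Hypothetical duplicate key: contig, clipping-adjusted ends, and strand."""
    start, end = unclipped_span(pos, cigar)
    return (contig, start, end, is_reverse)


# Two reads that differ only in how many bases were soft-clipped at the 5' end.
full = dedup_key("chr1", 100, "50M", False)
clipped = dedup_key("chr1", 103, "3S47M", False)
print(full == clipped)

# A clipped read near the contig start yields a start coordinate below 1.
print(unclipped_span(2, "3S47M"))
```

With aligned coordinates alone, the two reads above start at 100 and 103 and would not be treated as duplicates; with clipping-adjusted coordinates they share a key. The last call shows the concern raised above: the adjusted start can fall before the first base of the contig, so the key no longer corresponds to a real reference position.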


MikkelSchubert commented on June 27, 2024

Hey Graham,
I apologize for taking so long, but unfortunately I have been busy wrapping up several projects these last few months.

I have been working on an updated version of the script, inspired by your patch, which I intend to finalize by next week. The current version is attached, in case you want to look at it now.

Best,
Mikkel

rmdup_collapsed_softclip.txt


grahamgower commented on June 27, 2024

Hi Mikkel,

I've looked over your new code and it looks right. I've not tested it much though. Do you have a test framework that you use for such things?

-G


MikkelSchubert commented on June 27, 2024

Hi Graham,
Unfortunately, I do not yet have a framework for automatically testing that, though it is something I am interested in implementing. So I have been manually carrying out tests on various datasets, big and small, to ensure that the new version of rmdup_collapsed performs as expected. Long story short, I have now released a version of PALEOMIX (v1.2.9) that includes the improved script, which should address this issue.

Once again thank you for reporting this issue, and apologies for taking so long. Do not hesitate to open additional issues if you should run into other problems with PALEOMIX.

Cheers

