
Comments (10)

onordesjo commented on June 23, 2024

Hi,

Thanks for the question. It's not yet possible, but I suspect it would be useful. We hope to release an improved version of template/complement splitting today, which should work better than adapter splitting for duplex.

from duplex-tools.

jagos01 commented on June 23, 2024

Thanks for your quick reply. I will try it out when it is released.


onordesjo commented on June 23, 2024

Hi @jagos01, v0.2.20 is now out, and you can use this to recover reads which are non-split.

To try it out:

  1. Simplex-call with move tables (a fast model is OK):
$ dorado basecaller [email protected] pod5s/ --emit-moves > unmapped_reads_with_moves.sam
  2. Run split_pairs like this:
$ duplex_tools split_pairs unmapped_reads_with_moves.sam pod5s/ pod5s_splitduplex/

This should give you new pod5s in the pod5s_splitduplex directory (with new read-ids), together with the pair_ids that correspond to the new read_ids.
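
The two steps above could also be scripted. The following is only a minimal Python sketch: it assembles the same two command lines quoted in this comment and prints them. `build_commands` and the `MODEL` placeholder are hypothetical names introduced here for illustration (the model name is not legible in the thread), and no flags beyond those shown above are assumed.

```python
import shlex

# Hypothetical placeholder: substitute the simplex model you actually use.
MODEL = "<fast-model>"


def build_commands(model, pod5_dir="pod5s/",
                   sam_path="unmapped_reads_with_moves.sam",
                   out_dir="pod5s_splitduplex/"):
    """Return the two argv lists from the steps above (illustrative helper)."""
    # Step 1: simplex basecalling with move tables; stdout is redirected to sam_path.
    basecall = ["dorado", "basecaller", model, pod5_dir, "--emit-moves"]
    # Step 2: split reads into template/complement and write new pod5 files.
    split = ["duplex_tools", "split_pairs", sam_path, pod5_dir, out_dir]
    return basecall, split


basecall_cmd, split_cmd = build_commands(MODEL)
# Print the commands instead of running them, so the sketch works without the tools installed.
print(shlex.join(basecall_cmd), ">", "unmapped_reads_with_moves.sam")
print(shlex.join(split_cmd))
```

To actually execute them, the lists could be passed to `subprocess.run(..., check=True)`, with the first command's stdout directed to the SAM file.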

Feel free to try it out and let me know how things are working.


jagos01 commented on June 23, 2024

Hello @onordesjo, I followed the directions in the README for duplex calling with dorado. I generated the pair_id files for both step 2a and 2b. They contained 4667 and 7867 pairs respectively. When stereo basecalling those reads, dorado only basecalled 4114 and 1338 reads. Why is the number of stereo basecalled reads lower than the number of read pairs?
Thanks
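
For reference, the counts quoted above work out to roughly 88% of pairs basecalled for step 2a and 17% for step 2b. A minimal Python sketch of that arithmetic follows; `retention` is a hypothetical helper written for this illustration, not part of dorado or duplex_tools.

```python
def retention(n_pairs, n_duplex_reads):
    """Fraction of candidate pairs that ended up as stereo basecalled reads."""
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    return n_duplex_reads / n_pairs


# (pairs in the pair_id file, reads dorado basecalled), as reported in this comment.
reported = {"2a": (4667, 4114), "2b": (7867, 1338)}

for step, (pairs, called) in reported.items():
    print(f"step {step}: {called}/{pairs} pairs basecalled ({retention(pairs, called):.1%})")
```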


onordesjo commented on June 23, 2024

Hi @jagos01. Can I ask what type of data you have been looking at? Whole genome? Any amplification? Dorado applies some filtering to ensure that bad pairs don't get through, so some loss is to be expected. I would expect fewer pairs to be generated in step 2b than in 2a, but greater retention of good pairs. 2a would also have to be generated from the complete data set rather than a subset (or, alternatively, from a selection of channels).

Any of this information would help to explain what you are seeing.


jagos01 commented on June 23, 2024

Hello @onordesjo. This is bacterial whole genome sequence data. No amplification was carried out. The data is split over two runs (I had to restart the sequencer a couple of hours into the run). I was also expecting fewer pairs from 2b. 2a was generated from the complete data set.


ollenordesjo commented on June 23, 2024


jagos01 commented on June 23, 2024

I inspected the pod5 reads for each run and the unmapped BAM file contains reads from both runs.


ollenordesjo commented on June 23, 2024


jagos01 commented on June 23, 2024

Thanks, I have emailed a link to the bam file.

