Comments (6)

rhpvorderman commented on August 28, 2024

I agree. I wanted to offer a quick fix that could be used now, and then failed to answer the original question. Sorry.

> Note that a proper function should also work for wildcards and lowercase characters (and the example you gave is missing the "reversing" part.)

And then I gave a bad answer too by not reversing... It's not a great day today.

> I wouldn’t mind moving this to dnaio. Just as we have methods for slicing a SequenceRecord, computing a reverse complement is also a thing one regularly needs to do with DNA sequences, so it’s IMO natural to offer this.

I see Cutadapt uses a translate table and then reverses the result in Python. This is fast enough, but there are two caveats:

  • Python's str.translate works from a dict (as produced by str.maketrans), because there are so many Unicode code points. Since we guarantee 1-byte kind strings, we can use a 256-character table instead (bytes.maketrans automatically creates one of these as well).
  • We can do the reversing and the translation at the same time, by creating a new unicode object and populating it with the translated nucleotides in reverse order.

When we do that in Cython, it should be quite a bit faster than the pure-Python method, so this should also be a win for Cutadapt.
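The two points above can be sketched in pure Python. This is only an illustration of the idea, not the Cython implementation being proposed, and the names COMPLEMENT_TABLE and reverse_complement_ascii are made up for the example. It assumes ASCII-only input and only complements A, C, G and T in both cases; every other byte is passed through unchanged.

```python
# bytes.maketrans returns a 256-byte lookup table; bytes that are not
# listed map to themselves.
COMPLEMENT_TABLE = bytes.maketrans(b"ACGTacgt", b"TGCAtgca")


def reverse_complement_ascii(sequence: str) -> str:
    """Translate and reverse in one pass over the input (hypothetical helper)."""
    src = sequence.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    n = len(src)
    out = bytearray(n)
    for i in range(n):
        # Write the complement of base i at the mirrored position.
        out[n - 1 - i] = COMPLEMENT_TABLE[src[i]]
    return out.decode("ascii")


print(reverse_complement_ascii("GATTACA"))
```

In pure Python this loop will likely be slower than translate() followed by slicing; the single pass only pays off once it is compiled, which is the point of doing it in Cython.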

I think SequenceRecord and BytesSequenceRecord can easily share the underlying method, just as with the is_mate function.

from dnaio.

rhpvorderman commented on August 28, 2024

It is a Python builtin:

from dnaio import Sequence

complement_table = str.maketrans(dict(A="T", C="G", G="C", T="A"))
my_sequence = Sequence("my_seq", "GATTACA", "HHHHHHH")
my_sequence.sequence.translate(complement_table)

Results in 'CTAATGT'

Does this work for you?

marcelm commented on August 28, 2024

Reverse complements aren’t hard, but there are things one can get wrong, so it’s worth having a method for it. I would definitely not call it "built in". Note that a proper function should also work for wildcards and lowercase characters (and the example you gave is missing the "reversing" part.)

Here is what I use in Cutadapt:
https://github.com/marcelm/cutadapt/blob/26d3f39b6a37bcee8383d7d8f4a95879528137e4/src/cutadapt/utils.py#L172-L189

I wouldn’t mind moving this to dnaio. Just as we have methods for slicing a SequenceRecord, computing a reverse complement is also a thing one regularly needs to do with DNA sequences, so it’s IMO natural to offer this.
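For illustration only, here is a minimal sketch of a function that handles IUPAC wildcards and lowercase characters and also does the reversing. It is not the Cutadapt code linked above; the names COMPLEMENT_TABLE and reverse_complement are hypothetical, and the table assumes the standard IUPAC nucleotide codes with U treated like T.

```python
# Standard IUPAC nucleotide codes and their complements, position by position:
# A-T, C-G, G-C, T-A, U-A, M-K, R-Y, W-W, S-S, Y-R, K-M, V-B, H-D, D-H, B-V, N-N
_CODES = "ACGTUMRWSYKVHDBN"
_COMPLEMENTS = "TGCAAKYWSRMBDHVN"

# Cover both uppercase and lowercase; characters not in the table are kept as-is.
COMPLEMENT_TABLE = str.maketrans(
    _CODES + _CODES.lower(),
    _COMPLEMENTS + _COMPLEMENTS.lower(),
)


def reverse_complement(sequence: str) -> str:
    """Complement every base, then reverse the result (hypothetical helper)."""
    return sequence.translate(COMPLEMENT_TABLE)[::-1]


print(reverse_complement("GATTACA"))
print(reverse_complement("acgtRYn"))
```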

marcelm commented on August 28, 2024

Oh great, it sounded as if you were opposed to adding the functionality. And I’m of course all for speeding things up!

jazberna1 commented on August 28, 2024

Hi all,

Thanks so much for the comments. I'll have a go with str.maketrans and translate.

Best
Jorge

rhpvorderman commented on August 28, 2024

> Thanks so much for the comments. I'll have a go with str.maketrans and translate

And for the reverse part there is a built-in reversed function (https://docs.python.org/3/library/functions.html#reversed). Good luck!
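One thing to keep in mind: reversed() returns an iterator, not a string, so the result has to be joined back together. Slicing with [::-1] is an equivalent alternative. A small sketch, using a plain str instead of a dnaio record and only the four uppercase bases:

```python
complement_table = str.maketrans("ACGT", "TGCA")

sequence = "GATTACA"
complemented = sequence.translate(complement_table)

# reversed() yields the characters one by one, so join them into a str again.
via_reversed = "".join(reversed(complemented))

# Slicing with a step of -1 gives the same result directly.
via_slice = complemented[::-1]

print(via_reversed, via_slice)
```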

In the meantime we may hammer out a method, but I can give no ETA at this point. My time is rather limited now.
