genefinder.jl's Introduction

Hello there, I'm Camilo

This is my README profile. I hope you browse through my repos and find something useful.

genefinder.jl's People

Contributors

camilogarciabotero, dependabot[bot]

Forkers

vdejager

genefinder.jl's Issues

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Using score to filter what `getorfs` delivers

After #26 and #32, findorfs can be used more flexibly with multiple ORF finder methods, with or without a scoring scheme. We can now leverage that to extend getorfs with a scoring filter that returns only the sequences scoring above a threshold. For instance, applying argmax to the orf.score field selects the highest-scoring ORF:

orfs[argmax([orf.score for orf in orfs])]

We can also use a combination of sorting and filtering:

sortedorfs = sort(orfs, by = orf -> -orf.score)
sortedorfs[1:min(10, end)]

The function will gain a min_score kwarg:

function getorfs(
    sequence::NucleicSeqOrView{DNAAlphabet{N}},
    ::DNAAlphabet{N},
    method::M;
    min_score=0,
    kwargs...
) where {N,M<:GeneFinderMethod}
    ...
end
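
As a rough illustration of the threshold idea, here is a minimal sketch that does not depend on the package. ScoredORF and filterbyscore are hypothetical names made up for this example, not part of GeneFinder.jl; only the score field and the min_score keyword come from the discussion above.

```julia
# Hypothetical stand-in for the package's ORF type; only `score` matters here.
struct ScoredORF
    location::UnitRange{Int64}
    strand::Char
    score::Float64
end

# Keep only the ORFs whose score is at least `min_score` (hypothetical helper).
filterbyscore(orfs; min_score = 0.0) = filter(orf -> orf.score >= min_score, orfs)

orfs = [
    ScoredORF(1:33, '+', -1.2),
    ScoredORF(40:99, '-', 3.5),
    ScoredORF(120:180, '+', 0.7),
]

# Two of the three ORFs score at or above 0.5.
kept = filterbyscore(orfs; min_score = 0.5)

# The best-scoring ORF among the survivors, as in the argmax example.
best = kept[argmax([orf.score for orf in kept])]
```

Whether the comparison should be >= or > and what the default threshold should be are open choices; the sketch simply picks one.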

Still to define...

The `iscoding` should be more generic.

Since we eventually want to apply this predicate to any sequence, we have to account for the fact that different algorithms/implementations decide in different ways whether a sequence is likely to encode information. Some, like the current naivefinder, rely on models and will use that input information. To be more generic, the general iscoding (currently renamed to isnaivecoding) should be something like:

function iscoding(sequence::LongSeqOrView{DNAAlphabet{N}}, method::Function; kwargs...) where {N}
    ...
end
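
One way this could work is for the predicate to delegate the decision to whatever function is passed in. The sketch below uses plain strings instead of BioSequences types so that it runs on its own; gccriterion and its mingc keyword are toy assumptions for illustration only, not an actual coding criterion from the package.

```julia
# Generic predicate: delegates the decision to `method`, forwarding any
# keyword arguments that the chosen method understands.
iscoding(sequence::AbstractString, method::Function; kwargs...) = method(sequence; kwargs...)

# Toy criterion (an assumption for illustration): the GC fraction of the
# sequence must reach a threshold.
function gccriterion(sequence::AbstractString; mingc = 0.5)
    gc = count(c -> c == 'G' || c == 'C', sequence)
    return gc / length(sequence) >= mingc
end

seq = "ATGGCGCGCTAA"  # 7 of 12 bases are G or C

iscoding(seq, gccriterion)               # uses the default mingc
iscoding(seq, gccriterion; mingc = 0.6)  # stricter threshold via kwargs
```

With this shape, each algorithm only has to supply its own criterion function, and iscoding itself stays agnostic about models or thresholds.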

Reconsider start and stop in the IO methods

The ORF struct is normally defined with a location field of type UnitRange{Int64}, which always uses the default step (i.e., 1). So even if the strand field of the ORF is -, the start is always determined by the "positive" strand range.

This is not an issue for the get_orfs_* methods since they use the following treatment:

Base.getindex(sequence::NucleicSeqOrView{A}, orf::ORF) where {A} =
    orf.strand == '+' ? (@view sequence[orf.location]) : reverse_complement(@view sequence[orf.location])

The inverted range is, for instance, how negative-stranded ORFs are displayed in PHANOTATE outputs (cf. the source code).

The things to reconsider are:

  • Are the other ORF applications using this convention as well?
  • Would revamping this bring some benefits to the performance?
  • The write methods should at least advertise this. However, a previous test with IGV suggests that only positive ranges are used for start and stop.
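
To make the two conventions concrete, here is a small self-contained sketch. SimpleORF and strandcoords are hypothetical names for this example only; the idea is that the stored range stays positive while a helper reports the coordinates swapped for the - strand, as in the inverted display mentioned above.

```julia
# Hypothetical, simplified stand-in for the package's ORF type.
struct SimpleORF
    location::UnitRange{Int64}
    strand::Char
end

# Report (start, stop) in reading direction: swapped on the '-' strand.
function strandcoords(orf::SimpleORF)
    lo, hi = first(orf.location), last(orf.location)
    return orf.strand == '+' ? (lo, hi) : (hi, lo)
end

orfplus = SimpleORF(10:30, '+')
orfminus = SimpleORF(10:30, '-')

strandcoords(orfplus)   # positive-strand convention
strandcoords(orfminus)  # inverted convention for the '-' strand

# A UnitRange cannot run backwards: 30:10 is empty, and a descending
# range such as 30:-1:10 is a StepRange, not a UnitRange.
isempty(30:10)
(30:-1:10) isa StepRange
```

Storing an inverted range directly would therefore mean changing the type of the location field, whereas a display helper like this leaves the struct untouched.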
