
imminfo / ymir

(Under development) Computational framework for probabilistic models of immune receptor assembly.

Home Page: https://imminfo.github.io/ymir

License: Apache License 2.0

C++ 98.90% CMake 0.04% Python 0.66% R 0.13% C 0.26%
bioinformatics immunoinformatics immunology probabilistic-models vdj-recombination t-cell-receptor immunoglobulin tcr ig statistical-immunoinformatics

ymir's Issues

Roadmap / todos

2.0 version

MAAG

  • Implement shifts in event probabilities.
    • In Clonotypes.
    • In MAAGBuilder.
  • Replace generation probability computing with forward-algo-like procedure.
  • Add MarkovChain with errors.
    • Tests
  • Implement errors in MAAGBuilder.
    • V.
    • D.
    • J.
  • Implement errors in MAAGForward-backward.
    • VJ (waiting for 100K test)
    • VDJ
  • Implement errors in alignments.
  • Fix replacement of MAAG event probe in MAAGBuilder.
  • Add move assignment operator to MAAG.
  • Decide which value the error probability should be initialised with.
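
The "forward-algo-like procedure" item above can be illustrated with a minimal sketch. It assumes the assembly graph can be viewed as a chain of layers, each stored as a matrix of edge probabilities between consecutive sets of nodes. The names `Matrix` and `forwardProbability` are hypothetical and are not part of Ymir's API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical layer type: layer[i][j] is the probability of the edge
// from node i of the previous layer to node j of the next layer.
using Matrix = std::vector<std::vector<double>>;

// Sum the probabilities of all paths through a chain of layers without
// enumerating the paths: a vector of partial sums is pushed forward
// layer by layer, as in the forward algorithm for hidden Markov models.
double forwardProbability(const std::vector<double>& start,
                          const std::vector<Matrix>& layers) {
    std::vector<double> alpha = start;
    for (const Matrix& layer : layers) {
        assert(layer.size() == alpha.size());
        const std::size_t width = layer.empty() ? 0 : layer[0].size();
        std::vector<double> next(width, 0.0);
        for (std::size_t i = 0; i < layer.size(); ++i) {
            assert(layer[i].size() == width);
            for (std::size_t j = 0; j < width; ++j) {
                next[j] += alpha[i] * layer[i][j];
            }
        }
        alpha = std::move(next);
    }
    // The total probability is the sum over the nodes of the last layer.
    double total = 0.0;
    for (double a : alpha) {
        total += a;
    }
    return total;
}
```

The cost grows with the sum of the layer sizes rather than with the number of paths, which is the usual motivation for this kind of procedure.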

PAM

  • Implement a PAM + inference algorithm with errors in alignments.
    • VJ
    • VDJ
  • Fix segfault
    "There are four common mistakes that lead to segmentation faults: dereferencing NULL, dereferencing an uninitialized pointer, dereferencing a pointer that has been freed (or deleted, in C++) or that has gone out of scope (in the case of arrays declared in functions), and writing off the end of an array.
    A fifth way of causing a segfault is a recursive function that uses all of the stack space. On some systems, this will cause a "stack overflow" report, and on others, it will merely appear as another type of segmentation fault."

IO

  • Fix Python converter (V / D / J alignments column instead of starts/ends columns)
  • Fix writer
  • Refactor parser.
  • Refactor parser with the new aligner with virtual functions instead of templates.
  • Implement a separate class for aligning all genes to clonotype sequences. The user can optionally pass it as an object to Parser.
    • Implement SW local aligner for Variable genes.
    • Implement SW local aligner for Joining genes.
  • Add translation subroutine.
  • Add aligner parameters for alignment - thresholds for length / score, etc.
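
As an illustration of the SW (Smith-Waterman) local aligner items above, here is a minimal scoring sketch with a linear gap penalty. The `SWParams` struct, its default values and the function name are assumptions made for this example and do not reflect Ymir's actual aligner or its defaults.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative scoring parameters (assumed values, not Ymir's defaults).
struct SWParams {
    int match = 2;
    int mismatch = -1;
    int gap = -2;
};

// Return the best local alignment score between a gene segment and a read,
// using the Smith-Waterman recurrence with a linear gap penalty.
// Only two rows of the dynamic programming table are kept in memory.
int smithWatermanScore(const std::string& gene, const std::string& read,
                       const SWParams& p = SWParams()) {
    std::vector<int> prev(read.size() + 1, 0);
    std::vector<int> curr(read.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= gene.size(); ++i) {
        curr[0] = 0;
        for (std::size_t j = 1; j <= read.size(); ++j) {
            const int diag = prev[j - 1] +
                (gene[i - 1] == read[j - 1] ? p.match : p.mismatch);
            const int up = prev[j] + p.gap;
            const int left = curr[j - 1] + p.gap;
            // Local alignment: a cell never drops below zero.
            curr[j] = std::max({0, diag, up, left});
            best = std::max(best, curr[j]);
        }
        std::swap(prev, curr);
    }
    return best;
}
```

A caller could compare the returned score with a user-supplied threshold to decide whether to keep an alignment, in the spirit of the aligner parameters item.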

2.1 version

MAAG

  • Add MarkovChain to MAAG (for amino acids).
    • VJ
    • VDJ
  • Implement MAAGaa
    • VJ
    • VDJ
  • Implement amino acid sequence MAAG builder.
    • Tests.

IO

  • Implement amino acid aligner.
    • VJ
      • Tests.
    • VDJ
      • Tests.

2.2 version

PAM

  • Data diversity measure.
  • Implement and test new secret EM algorithm.
    • Save #iter for each parameter, not globally.

2.3 version

Optimisations

2.4 version

Docs

  • Add support for high precision numbers or decide to work only with long doubles.
  • Write API documentation using Doxygen.
  • Write general / usage documentation using MkDocs.
  • Publish all documentation on GitHub pages.

2.5 version

IO

  • MAAG serialization.
    • Binary representation.
      • Tests.
    • Reading.
      • Tests.
    • Writing.
      • Tests.
  • ??? Memory mapped MAAG repertoire in case of very large files (align -> save to disk -> read from the memory mapped file).
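
A minimal sketch of what a binary representation with a round-trip test might look like, shown for a plain vector of probabilities rather than a real MAAG. The layout (a 64-bit element count followed by raw doubles) and the names `writeProbs`, `readProbs` and `roundTrips` are assumptions for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Write a probability vector as a 64-bit element count followed by raw doubles.
// Assumes writer and reader share endianness and floating point format.
void writeProbs(std::ostream& out, const std::vector<double>& probs) {
    const std::uint64_t n = probs.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof(n));
    out.write(reinterpret_cast<const char*>(probs.data()),
              static_cast<std::streamsize>(n * sizeof(double)));
}

// Read a vector written by writeProbs. Returns false if the stream is truncated.
// A production reader should also validate n before allocating.
bool readProbs(std::istream& in, std::vector<double>& probs) {
    std::uint64_t n = 0;
    if (!in.read(reinterpret_cast<char*>(&n), sizeof(n))) {
        return false;
    }
    probs.resize(static_cast<std::size_t>(n));
    if (n == 0) {
        return true;
    }
    in.read(reinterpret_cast<char*>(probs.data()),
            static_cast<std::streamsize>(n * sizeof(double)));
    return static_cast<bool>(in);
}

// Round-trip check through an in-memory binary stream.
bool roundTrips(const std::vector<double>& probs) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    writeProbs(buffer, probs);
    std::vector<double> restored;
    return readProbs(buffer, restored) && restored == probs;
}
```

The same fixed layout is what would make a memory-mapped reader feasible later, since element offsets can be computed without parsing.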

Far Future

MAAG

  • Add checks for zero or error gene segments and other events in MAAG builder.

AAPAG

  • Implement AAPAG (Amino Acid Pattern Assembly Graph).
  • Implement fast generation of neighbour amino acid sequences.

Optimisations

  • Play with SIMD https://github.com/p12tic/libsimdpp
    • markov chains, probs in forward-backward
    • computing of full probabilities
  • Rewrite everything using templates so that the code has no unnecessary "ifs". Provide basic scripts (compute, inference and generate) for each possible recombination.
  • Do return value optimisation everywhere when possible.
  • Check if lazy evaluation can be added anywhere.
  • Decide whether to refactor MarkovChain in MAAGBuilder.
  • Branching (if - statements) optimisations.
    • Try to always build event indices MMC, just do not include them in the resulting MAAG.
    • Move the if (full_build) checks out of the loops into their own outer branches, leaving only one loop per branch in MAAGBuilder.
    • ?: instead of if-else in MAAGBuilder deletions and insertions.
  • Check speed in ClonotypeBuilder in returning void vs returning ClonotypeBuilder& procedures.
  • Use fixed-size matrices in some cases, such as VJ deletions, because all VJ gene segment sequences are fairly similar in size. (???)
  • Rewrite ModelParameterVector with plain arrays.
  • Optimise sequence class (currently std::string, need speed and memory improvements using bit vectors).
  • Add a compilation option that removes all verbose output for speed.
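
The template and branching items above could be combined roughly as follows. This is a sketch under assumptions: `BuildResult`, `buildImpl` and `build` are hypothetical names, and the loop body only stands in for whatever MAAGBuilder actually does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: probabilities are always stored,
// event indices only when a full build is requested.
struct BuildResult {
    std::vector<double> probs;
    std::vector<std::size_t> events;
};

// FullBuild is a compile-time constant, so the condition inside the loop
// is known to the compiler and the unused branch can typically be dropped.
template <bool FullBuild>
BuildResult buildImpl(const std::vector<double>& eventProbs) {
    BuildResult res;
    res.probs.reserve(eventProbs.size());
    for (std::size_t i = 0; i < eventProbs.size(); ++i) {
        res.probs.push_back(eventProbs[i]);
        if (FullBuild) {
            res.events.push_back(i);
        }
    }
    return res;
}

// The run-time flag is tested once, outside the loop.
BuildResult build(const std::vector<double>& eventProbs, bool full_build) {
    return full_build ? buildImpl<true>(eventProbs)
                      : buildImpl<false>(eventProbs);
}
```

Whether this is measurably faster than a well-predicted run-time branch would need to be benchmarked. The sketch only shows the mechanics.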

Refactoring

  • Replace all raw pointers with std::unique_ptr.
  • Replace the custom tests with Google Test.
  • Use std::shared_ptr for VDJRecombinationGenes.
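
A small sketch of the raw pointer to std::unique_ptr replacement mentioned above. `GeneSegments`, `Model` and `makeModel` are hypothetical stand-ins, not the real Ymir classes. The code sticks to C++11 and therefore avoids std::make_unique.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a class that used to be held by raw pointer.
struct GeneSegments {
    std::string name;
};

class Model {
public:
    // Taking a unique_ptr by value makes the ownership transfer explicit.
    explicit Model(std::unique_ptr<GeneSegments> genes)
        : genes_(std::move(genes)) {}

    // Non-owning access for callers that only need to read.
    const GeneSegments* genes() const { return genes_.get(); }

private:
    // Released automatically in the destructor, so no manual delete is needed.
    std::unique_ptr<GeneSegments> genes_;
};

std::unique_ptr<Model> makeModel(const std::string& name) {
    std::unique_ptr<GeneSegments> genes(new GeneSegments{name});
    return std::unique_ptr<Model>(new Model(std::move(genes)));
}
```

With this ownership model, freeing an object twice or forgetting to free it is ruled out by construction. Moving from a std::unique_ptr leaves it null, so a later dereference of the moved-from pointer is still a bug.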

Other
