Git Product home page Git Product logo

amof's Introduction

aMoF - aptamers Motif Finder - alpha version - In progress...

aMoF is a collection of Python scripts developed to to provide a full featured 'motif finder' tool focused on the analysis of APTAMERS selection results. Classical motifs (or kmers) finding tools take as input long bio-sequences like entire genomes or sets of genes. aMoF instead, focuses on sets of short sequences that are the typical results of an aptamer selection experiment. aMoF is meant to be user friendly and "dummy proof" so the code is compatible with both Python2 and Python3 and it relies ONLY on the Python Standard Library.

##Some background info Aptamers are short sequence of nucleotides or aminoacids (DNA, RNA, peptides, circular peptides and small proteins) that bind to a specific target molecule. They can be used for both basic research and clinical purposes as macromolecular drugs or as "atibody-like-particles" for molecular detection. Aptamers are naturally present in living organisms where they often act as molecular switches to regulate many different biological pathways. In molecular biology, there are two main methods to select aptamers: SELEX (for DNA and RNA) and Phage Display (for peptides). Both methods rely on the analysis of sequences to discriminate specifically enriched aptamers from background noise.

To know more about aptamers here are some links ;)

##Disclaimer Since I am not a professional programmer you may find the code structure "not conventional", you are free to comment and suggest improvements.

##Overview and How-To-Use #####aMoF.py It is the core program. It contains the "motifs finder" algorithm and the main variables.

#####seqPrep.py It prepares an appropriate formatted file to serve as input to aMof.py . It reads all the fas files (obtained by sequencing) contained in the folder FASfiles and it extracts each sequence and the relative id and writes them in a file. This program allow the user to better define from which kind of selection experiment the sequences are coming from.

#####aMoF webapp A Flask-based application with a minimalistic user interface for aMoF - In progress at http://beta-amof.rhcloud.com/

You can chose among this type of experiments

  • M13KE - p3 N-term display - NEB PhD-12 - Completed!
  • M13KE - p3 N-term display - NEB PhD-7 - Completed!
  • M13KE - p3 N-term display - NEB PhD-C7C - Completed!
  • M13KE - p8 N-term display - custom - Completed!
  • T7 Phage Display - In progress...
  • Ribosome Display - In progress...
  • DNA SELEX - In progress...
  • RNA SELEX - In progress...

With this information program will automatically slice out the sequence of interest from the whole sequencing result and it translate it if needed. For example, in case of M13 Phage Display PhD-C7C, a classic sequencing output, using company provided sequencing primers, looks like this:

original sequencing output
>12826357.seq - ID: CB8 - Phage Display - PhDC7C - Acid Elution 3 - Plate 2_A01-Amply M13 pIII CS - R on 2014/1/16-0:53:50 automatically edited with PhredPhrap, start with base no.: 41 Internal Params: Windowsize: 20, Goodqual: 19, Badqual: 10, Minseqlength: 50, nbadelimit: 1
agTTTTgTCGTCTTTcCagACGTTAGTAAATGAATTTTCtGtAtGGgattTTGCTAAACAACTTTCAACAGTTTCGGCCGAACCTCCACCGCACTCATAAGGCGAACCAGTATTACAAGCAGAGTGAGAATAGAAAGGTACCACTAAAGGAATTGCGAATAATAATTTTTTCACGTTGAAAATCTCCAAAAAAAAGGCTCCAAAAGGAGCCTTTAATTGTATCGGTTTATCAGCTTGCTTTCGAGGTGAATTTCTTAAACAGCTTGATACCGATAGTTGCGCCgACnATGACAACAACCATCGCCCACGCATAACCGATATATTCGGTCGCTGAGGCTTGCAGGGAGTTAAAGGCCGCTTTTGCGGGATCGTCACCCTCAGCAGCGAaagAcAGCATCGGAACGAGGGTAGCAACGGCTACAGAGGCTTTGAGGACTAAAGACTTTTTCATGAGGAAGTTTCCATTAAACGGGTAAAATACgTAATGCCACTACGAAGGCACCaaCCTAAAACgAaagAGGCAaaanAATACactAaaACACTCATCTTTGACCCCCAgcGATTATACCAAGCGCGAAACAAAGtACAACGGAGATTTGTATCATCGCCTGATAAATTGTGTCgAAnTCCGCGAcCTGCTCCATGTTACTTAgCCGGAACGAGGCGCAGACGGnCAATcannnnGGAaCCGAaCTGACCaACtTTGAAAgAggacnnAngAaCGGTGTACAGACCAGGCGCATAGGCTGGCTGAnCtTtCaTcaAnAgtAnnctTGAcAAGAACCGGATATTCATTAcCCnAAnTCaacGtACnaAGCTnGCtCaTTccannGAAnnAAGGctTGcCCTGAcgAgAAACacCananngAgtAGTAAATTGGgcTTtGAgATGgnnttattTCaaCTTTaatCaTTGTGAATTACcttntGCgAtTTnanAAcTGGCTcatta
after using seqPrep.py
>A01
NTGSPYE

Note that the repeated sequence at N-ter (AC) and C-term (CGGGS), distinctive of any PhD-C7C sequence, has been sliced out to exclude them from future statistical analysis

amof's People

Contributors

alec-djinn avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.