Git Product home page Git Product logo

repeatmatcher's Introduction

RepeatMatcher

Simple analysis tool to manually annotate RepeatModeler sequences.

** Under development, please contact me before using it.**

MOTIVATION After running RepeatModeler you probably will have a bunch of sequences to be manually annotated. RepeatMatcher is designed to help in this hard task. We provide a simple pipeline to generate all the required files and a graphical user interface to manipulate and annotate the sequences. The GUI is intented to show all the information to make the decisions easy for each sequence.

THE PIPELINE We started with the RepeatModeler output file (fasta), the steps are:

  1. Masking all the low complexity regions with RepeatMasker and discard sequences with filters for a minimal composition.
  2. All sequences will be compared with itself using Cross_match.
  3. All sequences are compared in nucleotide space with known repeats consensi with Cross_match.
  4. All sequences are compared in peptide space with known repeats consensi with Blastx.
  5. All sequences are compared in peptide space withthe peptides in NCBI NR database with Blastx.
  6. All sequences are folded with RNAVienna and the principal structure is plotted with R:R4RNA.

To run the pipeline, first you need to resolve all the dependencies:

  • RepeatMasker
  • Blast
  • Cross_match
  • RNAVienna
  • R
  • R:R4RNA
  • Databases: NCBI NR, known repeats (in nucleotides and peptides).
  • Perl:Tk (for the GUI)

Then, check if it's required to adjust the parameters in RepeatMatcher.conf.

Launching the pipeline: perl RepeatMatcher.pl -s sequences.fa -k knownrepeats.fa -o NewAnnotation

for a complete list of options: perl RepeatMatcher.pl --help

THE GUI After you ran the pipeline, you have all the files required for RepeatMatcherGUI.pl. The GUI uses a text file to keep all the parameters and annotations made (the LOG file). The first lines of that file record the files used, the folowing lines records all changes for each sequence.

Launching the GUI for the first time: perl RepeatMatcherGUI.pl -o OUT -i SEQS -s SELF -a ALIGN -b BLAST -n BLAST -f FOLD -l LOG

Launching the GUI with a generated LOG: perl RepeatMatcherGUI.pl -r LOG

Juan Caballero, Institute for Systems Biology @ 2012

repeatmatcher's People

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.