RNA Pseudoknotted Secondary Structure Prediction Using Relaxed Hierarchical Folding

Home Page: https://www.researchgate.net/publication/262810273_A_fast_and_robust_iterative_algorithm_for_prediction_of_RNA_pseudoknotted_secondary_structures

HFold Iterative

Description:

Software implementation of Iterative HFold.
Iterative HFold is an algorithm for predicting the pseudoknotted secondary structures of RNA using relaxed Hierarchical Folding.

Paper: Jabbari, H., Condon, A. A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures. BMC Bioinformatics 15, 147 (2014). https://doi.org/10.1186/1471-2105-15-147 (https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-147)

On the dataset tested in this paper, Iterative HFold generally has better accuracy than its predecessor, HFold.

Supported OS:

Linux, macOS

Conda Package:

conda install -c uvic-cobra iterative-hfold

Works for Linux and macOS

Source code Installation:

Requirements: A compiler that supports C++11 standard (tested with g++ version 4.9.0 or higher), Pthreads, and CMake version 3.1 or greater.

CMake version 3.1 or greater must be installed in a way that HFold can find it.
To test if your Mac or Linux system already has CMake, you can type into a terminal:

cmake --version

If it does not print a cmake version greater than or equal to 3.1, you will have to install CMake depending on your operating system.

Mac:

The easiest way is to install Homebrew and use it to install CMake.
To do so, run the following from a terminal to install Homebrew:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

When that finishes, run the following from a terminal to install CMake.    

brew install cmake   

Linux:

Run the following from a terminal:

wget http://www.cmake.org/files/v3.8/cmake-3.8.2.tar.gz
tar xzf cmake-3.8.2.tar.gz
cd cmake-3.8.2
./configure
make
make install

Linux instructions source

Steps for installation

  1. Download the repository and extract the files onto your system.
  2. From a command line in the root directory (where this README.md is), run:

cmake -H. -Bbuild
cmake --build build

If you need to use a particular compiler, such as g++, you can instead run something like

cmake -H. -Bbuild -DCMAKE_CXX_COMPILER=g++
cmake --build build

This can be useful if you are getting errors about your compiler not having C++11 features.

After installing, you can move the executables wherever you wish, but do not delete or move the simfold folder; if you do, you must recompile the executables. If you move the folders and wish to recompile, first delete the created "build" folder.

How to use:

Arguments:
    HFold_iterative:
        -r <structure>
        -i </path/to/file>
        -o </path/to/file>
        -v
        -V
        -n <number of outputs>
        -p

    Remarks:
        Make sure the <arguments> are enclosed in "", for example -r "..().." instead of -r ..()..
        The sequence does not need to be enclosed and can be given before or after the other arguments.
        If no structure is provided through -r, the input structure will be the hotspot with the lowest free energy.
        If -i is given a file name without a path, the file is assumed to be in the directory where the executable is called.
        If -o is given a file name without a path, the output file is generated in the directory where the executable is called.
        If -v is provided, verbose output is given (the method used is printed).
        If -V is provided, the version is printed.
        If -n is provided with a number, it changes the number of hotspots examined and outputs given (the default is 1).
        If -p is provided, the output is changed to pseudoknot-free.


Sequence requirements:
    containing only characters GCAUT

Structure requirements:
    -pseudoknot free
    -containing only characters ._(){}[]
    Remarks:
        Restricted structure symbols:
            () restricted base pair
            _ no restriction
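
The sequence and structure requirements above can be checked before calling the program. The following is a minimal sketch, not the validation code used by HFold_iterative itself. The function names are hypothetical, and rejecting an empty sequence is an assumption. The pseudoknot-free rule is interpreted here as "brackets never cross", which a stack can verify.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Sequence rule from the README: only the characters G, C, A, U, T.
// Assumption: an empty sequence is treated as invalid.
bool is_valid_sequence(const std::string& seq) {
    if (seq.empty()) return false;
    return seq.find_first_not_of("GCAUT") == std::string::npos;
}

// Structure rules from the README: only the characters ._(){}[] and
// pseudoknot-free. A structure is accepted when every closing bracket
// matches the most recently opened bracket, so no two pairs cross.
bool is_valid_structure(const std::string& structure) {
    std::stack<char> open;
    for (char c : structure) {
        switch (c) {
            case '.': case '_':
                break;  // unpaired or unrestricted position
            case '(': case '[': case '{':
                open.push(c);
                break;
            case ')': case ']': case '}': {
                if (open.empty()) return false;  // closing with nothing open
                char o = open.top();
                open.pop();
                bool match = (o == '(' && c == ')') ||
                             (o == '[' && c == ']') ||
                             (o == '{' && c == '}');
                if (!match) return false;        // crossing pairs
                break;
            }
            default:
                return false;  // character outside ._(){}[]
        }
    }
    return open.empty();  // every opened bracket must be closed
}
```

A caller could run both checks on its input and report an error early instead of relying on the program's exit code.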

Input file requirements:
        Line1: Name (optional, but must be in FASTA format, i.e. start with '>'; ignored in final input)
        Line2: Sequence (required)
        Line3: Structure (optional)
    sample:
        >Srp_005
        GCAACGAUGACAUACAUCGCUAGUCGACGC
        (____________________________)
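
To make the file layout concrete, here is a rough sketch of a reader for the three-line format described above. It is an illustration under assumptions, not the parser shipped with HFold_iterative: the FoldInput struct and the read_fold_input and trim helpers are hypothetical names, and surrounding whitespace on each line is assumed to be insignificant.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical container for the parsed input file.
struct FoldInput {
    std::string name;       // optional, taken from a leading '>' line
    std::string sequence;   // required
    std::string structure;  // optional
};

// Remove leading and trailing whitespace (including a trailing '\r').
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Read: [>name], sequence, [structure]. Returns false if no sequence is found.
bool read_fold_input(std::istream& in, FoldInput& out) {
    out = FoldInput();
    std::string line;
    if (!std::getline(in, line)) return false;
    line = trim(line);
    if (!line.empty() && line[0] == '>') {
        out.name = line.substr(1);          // name line is optional
        if (!std::getline(in, line)) return false;
        line = trim(line);
    }
    if (line.empty()) return false;         // the sequence is required
    out.sequence = line;
    if (std::getline(in, line)) out.structure = trim(line);  // optional
    return true;
}
```

Reading from a std::istream rather than a file name keeps the sketch easy to try with a std::istringstream as well as a std::ifstream.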

Example:

Assume you are in the directory where the HFold_iterative executable is located:
./HFold_iterative -i "/home/username/Desktop/myinputfile.txt"
./HFold_iterative -i "/home/username/Desktop/myinputfile.txt" -o "outputfile.txt"
./HFold_iterative -i "/home/username/Desktop/myinputfile.txt" -o "/home/username/Desktop/some_folder/outputfile.txt"
./HFold_iterative GCAACGAUGACAUACAUCGCUAGUCGACGC -r "(____________________________)"
./HFold_iterative GCAACGAUGACAUACAUCGCUAGUCGACGC -r "(____________________________)" -o "outputfile.txt"
./HFold_iterative GCAACGAUGACAUACAUCGCUAGUCGACGC
./HFold_iterative GCAACGAUGACAUACAUCGCUAGUCGACGC -n 10
./HFold_iterative GCAACGAUGACAUACAUCGCUAGUCGACGC -p

Exit code:

0       success
1       invalid argument error
3       thread error
4       i/o error
5       pipe error
6       positive energy error

Exit codes with special meaning (see http://tldp.org/LDP/abs/html/exitcodes.html):

2       Misuse of shell builtins (according to Bash documentation)
126     Command invoked cannot execute
127     "command not found"
128     Invalid argument to exit
128+n   Fatal error signal "n"
130     Script terminated by Control-C
255     Exit status out of range (range is 0-255)
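
A wrapper that launches HFold_iterative may want to turn these codes into messages. The sketch below simply restates the table above as a lookup; describe_exit_code is a hypothetical helper and is not part of the HFold_iterative sources.

```cpp
#include <cassert>
#include <string>

// Map an exit status to the description listed in the README.
// Codes 0, 1, 3, 4, 5, 6 are program-specific; the rest are the
// shell-reserved codes the README lists.
const char* describe_exit_code(int code) {
    switch (code) {
        case 0:   return "success";
        case 1:   return "invalid argument error";
        case 2:   return "misuse of shell builtins";
        case 3:   return "thread error";
        case 4:   return "i/o error";
        case 5:   return "pipe error";
        case 6:   return "positive energy error";
        case 126: return "command invoked cannot execute";
        case 127: return "command not found";
        case 128: return "invalid argument to exit";
        case 130: return "script terminated by Control-C";
        case 255: return "exit status out of range";
        default:  break;
    }
    // Remaining values above 128 follow the 128+n convention.
    if (code > 128 && code < 255) return "fatal error signal (128+n)";
    return "unknown exit code";
}
```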

Contributors:

hosnajabbari, ianwark, mahyarhosseini, mateog4712


iterative-hfold's Issues

ERROR: could not find energy

The program gives the following output for some of the inputs.

Input:
./HFold_iterative -s "TGCCTCCGGTTCTGAAGGTGTTCTT" -r "______________((((___))))"

Output:

Energy method 1: 0.000000 
Energy method 2: 0.000000 
Energy method 3: 0.000000 
Energy method 4: 0.000000 
ERROR: could not find energy

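In the program, final_energy is initialized to infinity, and the code currently ignores the result of any method whose energy is zero. When every method returns zero, as in the output above, no result is accepted and the error is reported.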
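
The behaviour described in this issue can be reproduced with a small sketch. This is a hypothetical reconstruction based only on the description above, not the actual HFold_iterative source; pick_final_energy and its parameters are made-up names.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical reconstruction: start from infinity, skip zero-energy
// results, and keep the lowest remaining energy. Returns false when no
// method produced an accepted energy (the "could not find energy" case).
bool pick_final_energy(const std::vector<double>& method_energies,
                       double& final_energy) {
    const double inf = std::numeric_limits<double>::infinity();
    final_energy = inf;
    for (double e : method_energies) {
        if (e == 0.0) continue;                  // zero results are ignored
        if (e < final_energy) final_energy = e;  // keep the minimum
    }
    return final_energy != inf;
}
```

With the four zero energies from the output above, the function returns false, which corresponds to the reported error. An input where at least one method yields a nonzero energy is accepted.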

The same input was run through the separate components used by Iterative HFold, with the following results:

         Output structure             Output energy
simfold: ..............((((...))))    1.41
pkonly:  .[[...........((((.]]))))    0.00
hfold:   .[[...........((((.]]))))    0.00

Other inputs with the same problem:

./HFold_iterative -s "GCCTCCGGTTCTGAAGGTGTTCTTG" -r "_____________((((___))))_"
./HFold_iterative -s "CTTAACGTCAAATGGTCCTTCTTGG" -r "_______((((__________))))"

Memory leak

A memory leak causes the program to crash, especially when many instances of the program are run on the same machine.

The error is:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

The valgrind result is:

==27356== LEAK SUMMARY:
==27356==    definitely lost: 321,952 bytes in 40,008 blocks
==27356==    indirectly lost: 3,201,375,161 bytes in 589 blocks
==27356==      possibly lost: 19,200,480,000 bytes in 6 blocks
==27356==    still reachable: 72,704 bytes in 1 blocks
==27356==         suppressed: 0 bytes in 0 blocks
==27356== Reachable blocks (those to which a pointer was found) are not shown.
==27356== To see them, rerun with: --leak-check=full --show-reachable=yes
