Host4Phage

Host4Phage is a tool that identifies bacterial hosts for phages from the genomic sequences of bacteriophages and bacteria. It uses the bacterial CRISPR-Cas system for this purpose: CRISPR spacers found in bacterial genomes are compared against phage genomes. The tool supports multithreading.

1. Tools used by Host4Phage

Host4Phage relies on the following external tools:

PILER-CR --> Reference | Source
CRT --> Reference | Source
MinCED --> Source
CRISPRDetect --> Reference | Source
Kmer-db --> Reference | Source

All of the tools listed above are called from the tool/bin folder.

2. Requirements

  • Running host4phage.py requires Python 3.8.8 or greater.
  • Python dependencies: tqdm (pip install tqdm) and joblib (pip install joblib). See the tqdm and joblib project pages for details.
  • The CRT and MinCED tools require the Java Runtime Environment.
  • The CRISPRDetect tool requires the following tools: clustalw, water, seqret, RNAfold, cd-hit-est, blastn. See CRISPRDetect for details.
  • Input files must have a FASTA extension (*.fasta, *.fna, *.fa).
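The requirements above can be checked before a run. The following is a minimal sketch, not part of Host4Phage itself: the function names are hypothetical, and it assumes that the Java Runtime Environment is exposed as a `java` executable and that the CRISPRDetect helper tools are installed under exactly the names listed above.

```python
import shutil
import sys

# Executable names taken from the requirements list; "java" is an assumed
# name for the Java Runtime Environment launcher.
CRISPRDETECT_TOOLS = ["clustalw", "water", "seqret", "RNAfold", "cd-hit-est", "blastn"]
JAVA_TOOLS = ["java"]
MIN_PYTHON = (3, 8, 8)


def missing_executables(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter version is at least `minimum`."""
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    if not python_is_recent_enough():
        print("Python 3.8.8 or greater is required")
    for name in missing_executables(CRISPRDETECT_TOOLS + JAVA_TOOLS):
        print(f"not found on PATH: {name}")
```

The Java check only matters for the crt and minced methods, and the CRISPRDetect helpers only for the crisprdetect method, so a missing tool does not necessarily block every run.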

3. Description and usage

The tool uses two subcommands: spacers and compare.

  • spacers subcommand is responsible for identifying and extracting spacers.
  • compare subcommand is responsible for finding common sequences for hosts and bacteriophages by using k-mers.

The spacers subcommand can be called from the command line as follows (quick usage):
python tool/host4phage.py spacers -i host_20_test -o output_spacers/piler -m piler

The compare subcommand can be called from the command line as follows (quick usage):
python tool/host4phage.py compare -s output_spacers -v virus_20_test -o output_compare
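The two quick-usage calls can be chained into one run that extracts spacers with every method and then compares them against the phage genomes. This is only a sketch under a few assumptions: the helper names (`spacers_command`, `compare_command`, `run_pipeline`) are hypothetical, the script is started from the repository root, and each method writes into its own subdirectory of `output_spacers`, as in the quick-usage example.

```python
import subprocess
import sys

# Method names as accepted by the -m parameter of the spacers subcommand.
METHODS = ["piler", "crt", "minced", "crisprdetect"]
SCRIPT = "tool/host4phage.py"


def spacers_command(host_dir, spacers_root, method):
    """Build the spacers call for one method, writing to <spacers_root>/<method>."""
    return [sys.executable, SCRIPT, "spacers",
            "-i", host_dir, "-o", f"{spacers_root}/{method}", "-m", method]


def compare_command(spacers_root, virus_dir, output_dir):
    """Build the compare call that reads all spacers found under spacers_root."""
    return [sys.executable, SCRIPT, "compare",
            "-s", spacers_root, "-v", virus_dir, "-o", output_dir]


def run_pipeline(host_dir, virus_dir, spacers_root="output_spacers",
                 compare_dir="output_compare", runner=subprocess.run):
    """Run spacers once per method, then a single compare over all results."""
    for method in METHODS:
        runner(spacers_command(host_dir, spacers_root, method), check=True)
    runner(compare_command(spacers_root, virus_dir, compare_dir), check=True)
```

With the test data from the quick-usage lines, this would be started as `run_pipeline("host_20_test", "virus_20_test")`. Passing `check=True` stops the run at the first subcommand that exits with an error.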

Parameters - spacers subcommand:

Name          Requiredness  Description
-input/-i     obligatory    Directory path with bacterial genomes. Files must
                            have a FASTA extension (*.fasta, *.fna, *.fa).
-method/-m    obligatory    Method for CRISPR sequence identification:
                            piler/crt/minced/crisprdetect.
-threads/-t   optional      Number of threads. Defaults to the number of
                            processor threads on the user's computer.
-output/-o    optional      Directory path where two subdirectories will be
                            created: output, containing the result files of the
                            selected method, and fasta, containing the
                            extracted spacers. By default, a directory named
                            spacers is created.
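To see which files an input directory would contribute, and what the default thread count would be on a given machine, something like the following can be used. It is an illustrative sketch only: `is_fasta`, `find_fasta_files` and `default_threads` are hypothetical helpers, the extension match is assumed to be case-sensitive, and the tool's own file discovery may differ.

```python
import os
from pathlib import Path

# Extensions listed in the parameter description.
FASTA_EXTENSIONS = {".fasta", ".fna", ".fa"}


def is_fasta(path):
    """Return True if the file name ends with one of the FASTA extensions."""
    return Path(path).suffix in FASTA_EXTENSIONS


def find_fasta_files(directory):
    """Return the sorted FASTA files located directly inside `directory`."""
    return sorted(p for p in Path(directory).iterdir() if p.is_file() and is_fasta(p))


def default_threads():
    """Number of logical processors, falling back to 1 if it is unknown."""
    return os.cpu_count() or 1
```

Calling `find_fasta_files("host_20_test")` lists the genomes that match the documented extensions, which helps spot files that would be skipped because of a different suffix such as .fas or .txt.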

Parameters - compare subcommand:

Name          Requiredness  Description
-spacers/-s   obligatory    Directory path with extracted spacers. Results from
                            all methods for identifying CRISPR sequences can be
                            combined in two ways. The first is to pass a
                            directory in which the subdirectories with the
                            result files are located (e.g., if the
                            output_spacers directory contains subdirectories
                            with spacers for all methods, -s output_spacers is
                            sufficient). The second is to pass the paths to the
                            results of each method separately in a single
                            command. Files with spacers must have a FASTA
                            extension (*.fasta, *.fna, *.fa).
-virus/-v     obligatory    Directory path with bacteriophage genomes. Files
                            must have a FASTA extension (*.fasta, *.fna, *.fa).
-k            optional      Length of k-mers. Viral genomes and CRISPR spacers
                            found in hosts are divided into sequences of the
                            given length. By default, k = 18.
-threads/-t   optional      Number of threads. Defaults to the number of
                            processor threads on the user's computer.
-output/-o    optional      Directory path where a file with the .CSV extension
                            will be created. By default, the directory is named
                            comparison. The file contains the number of common
                            k-mers for each bacterial and bacteriophage species.
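The idea of counting common k-mers can be illustrated with a few lines of plain Python. This is a simplified sketch and not how Host4Phage or Kmer-db implement the comparison: the function names are hypothetical, only distinct k-mers on the forward strand are counted, and reverse complements are ignored.

```python
def kmers(sequence, k=18):
    """Return the set of all length-k substrings of `sequence` (upper-cased)."""
    seq = sequence.upper()
    # range() is empty when the sequence is shorter than k, giving an empty set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def common_kmer_count(spacers, genome, k=18):
    """Count distinct k-mers that occur in at least one spacer and in the genome."""
    spacer_kmers = set()
    for spacer in spacers:
        spacer_kmers |= kmers(spacer, k)
    return len(spacer_kmers & kmers(genome, k))


# Toy example with k = 3: the spacer shares ACG and CGT with the genome.
print(common_kmer_count(["ACGTA"], "TTACGTT", k=3))  # prints 2
```

A smaller k makes chance matches between unrelated sequences more likely, while a larger k requires longer exact matches between a spacer and a phage genome.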


The parameter descriptions are also available by running python tool/host4phage.py --help.

Contributors: ambioinformatics