This repository contains the original implementation of NORTH, a machine learning based tool to predict ortholog clusters.
See our server implementation : North Server
See our desktop implementation : North App
NORTH works on predefined ortholog clusters. A Multinomial Naive Bayes model is trained to predict the ortholog clusters, drawing inspiration from typical BLAST based pipelines.
However, being trained on a set of predefined clusters, cases may occur when we are faced with a gene out of those predefined clusters. To overcome this issue NORTH also has a robust outlier detection method.
NORTH has been able to outperform the existing methods in the OrthoBench benchmark, when evaluated using a stratified 5-fold cross validation test.
The codes for NORTH are written in python and can be found here
- Numpy
- Scipy
- BioPython
- Scikit-learn
- tqdm
- Matplotlib
- Seaborn
The NORTH pipeline is demonstrated in the following Jupyter Notebook