mlpeptide's Introduction

Machine Learning Designs Non-Hemolytic Antimicrobial Peptides

Scope

Antibiotic resistance is one of the major threats to global public health, and antimicrobial peptides (AMPs) are an important resource against it. However, discovering novel AMPs is challenging because they are often toxic towards human blood cells. Here, we set out to understand whether machine learning (ML) can help design non-hemolytic AMPs and accelerate the discovery of novel AMPs.

Methodology

We extracted a highly reliable dataset of AMPs and non-AMPs, as well as hemolytic and non-hemolytic peptides, from DBAASP, a manually curated antimicrobial peptide database. We used the data to train a generative peptide model (prior model), an AMP activity classifier, and a hemolysis classifier. Two copies of the prior model were fine-tuned on non-hemolytic peptides active against specific strains: one for P. aeruginosa/A. baumannii and one for S. aureus. The fine-tuned models were sampled, and the generated sequences were filtered using the implemented classifiers, basic physicochemical properties, and novelty criteria to obtain short peptides of at most 15 residues that differ by at least 5 mutations from the sequences in DBAASP.
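The length and novelty criteria in the filtering step above can be illustrated with a minimal sketch. It is not the code from the notebooks: it assumes that "mutations" can be approximated by the Levenshtein edit distance to every reference sequence, and the names `edit_distance`, `is_novel`, and `filter_candidates` are hypothetical. The classifier and physicochemical filters are not reproduced here.

```python
from typing import Iterable, List

MAX_LENGTH = 15    # maximum peptide length stated in the methodology
MIN_MUTATIONS = 5  # minimum number of mutations from any known sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as an assumed measure of 'mutations'."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def is_novel(candidate: str, known: Iterable[str],
             min_mutations: int = MIN_MUTATIONS) -> bool:
    """True if the candidate is at least min_mutations away from every known sequence."""
    return all(edit_distance(candidate, ref) >= min_mutations for ref in known)


def filter_candidates(candidates: Iterable[str], known: Iterable[str],
                      max_length: int = MAX_LENGTH,
                      min_mutations: int = MIN_MUTATIONS) -> List[str]:
    """Keep candidates that are short enough and novel with respect to `known`."""
    known = list(known)
    return [
        seq for seq in candidates
        if len(seq) <= max_length and is_novel(seq, known, min_mutations)
    ]
```

In this sketch, `known` would hold the reference sequences extracted from DBAASP and `candidates` the sequences sampled from a fine-tuned model. The pairwise comparison is quadratic in the number of sequences, so a large reference set may call for a faster distance implementation.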

Results

Our best compounds, GN2 and GP6, are highly active and non-hemolytic, and they also show antimicrobial activity against multidrug-resistant strains.

This repository contains the code for:

  • extraction of DBAASP AMP activity data (notebook 01) and hemolysis data (notebook 01b)
  • implementation of the AMP activity classifiers (NB, RF, SVM notebook 02; RNN notebook 03; RNN scr. labels notebook 04) and their evaluations
  • implementation of the hemolysis classifiers (NB, RF, SVM notebook 02b; RNN notebook 03b; RNN scr. labels notebook 04b) and their evaluations
  • implementation of the AMP RNN generative prior model (notebook 05)
  • fine-tuning of the generative model through transfer learning (P. aeruginosa/A. baumannii notebook 06; S. aureus notebook 07)
  • calculation of the properties of the generated sequences (notebook 08)
  • analysis of the properties of the generated sequences (notebook 09)
  • procedure for selecting sequences for synthesis (notebook 10)

To predict the alpha-helix percentage, we used SPIDER3.

Required environment installation:

  • conda env create -f environment.yml
  • conda activate aipep
