afcarl / flu-sequence-predictor

This project forked from ericmjl/flu-sequence-predictor

An experimental deep learning & genotype network-based system for predicting new influenza protein sequences.

Home Page: https://fluforecaster.herokuapp.com/

License: BSD 3-Clause "New" or "Revised" License

Languages: Jupyter Notebook 97.52%, HTML 1.56%, Python 0.90%, Makefile 0.03%

flu-sequence-predictor's Introduction

Flu Forecaster

Rationale

Flu Forecaster was first fully developed during my time as an Insight Health Data Fellow. The projected business use case is to predict what future strains of flu will look like, which would help inform the pre-emptive development of vaccines. We would no longer have to select from currently circulating strains; instead, we could forecast what strains will look like six months down the road, synthesize them ahead of time (using synthetic biology methods), and rapidly scale up production of the ones chosen for the flu shot.

Getting Started

Viewing Flu Forecaster

Flu Forecaster is hosted on Heroku, but can also be hosted locally. An internet connection is required for the display components and for downloading data from this GitHub repository.

Developing Flu Forecaster

To develop Flu Forecaster, it is recommended that you have access to a GPU.

Architecture

Flu Forecaster's dashboard is written using Flask, Bokeh and pandas.

Flu Forecaster's back-end is powered by Jupyter notebooks that use machine learning algorithms implemented in Keras and PyMC3.

Keras is used for the variational autoencoders, which learn a continuous, lower-dimensional latent embedding of influenza protein sequence space and can generate new sequences from that space.

PyMC3 is used for Gaussian process regression, which lets us model arbitrary time-varying functions and thus forecast where in the latent space influenza is evolving. Once we have the forecasted latent-space coordinates, we can decode them back into protein sequences.

Science

Flu Biology

Behind all of this is some real influenza biology. Influenza evolves under selective pressure from our immune systems, vaccines, drug treatments, and possibly other sources not mentioned here. Evolution essentially takes currently circulating viruses, weeds out those that are "unfit", and makes new viruses based on the ones in circulation. Thus, current strains are the substrate for future strains.

Forecasting Evolution with VAEs & GPs

Yet, when we talk about "sequences", there's no notion of "direction". Who's to say that the mutation of one position from a K to an L is moving "forward" in time? An alternative is to learn vectors in continuous space. This is why we use a variational autoencoder (VAE). The VAE translates the discrete representation of our sequence data (a one-of-K encoding of amino acids at each position) into a continuous, "latent" representation.
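To make the discrete representation concrete, here is a minimal sketch of a one-of-K encoding in plain Python. The 20-letter alphabet, its ordering, and the function names are assumptions chosen for illustration; the project's notebooks may encode sequences differently (for example, with extra symbols for gaps or ambiguous residues).

```python
# The 20 standard amino acids in a fixed order. The ordering is arbitrary and
# is an assumption of this sketch, not necessarily what the notebooks use.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one row per position, each a one-of-K vector with K = 20."""
    rows = []
    for aa in sequence:
        row = [0] * len(AMINO_ACIDS)
        row[INDEX[aa]] = 1
        rows.append(row)
    return rows


def decode(rows):
    """Map each row back to the amino acid with the largest value.

    Works for exact one-hot rows and for per-position scores or probabilities.
    """
    return "".join(
        AMINO_ACIDS[max(range(len(row)), key=row.__getitem__)] for row in rows
    )


encoded = one_hot_encode("MKL")
print(len(encoded), len(encoded[0]))  # 3 20
print(decode(encoded))                # MKL
```

Note that a K-to-L mutation only moves the single 1 in a row from one slot to another, so the encoding by itself carries no sense of direction. That is the gap the continuous latent representation is meant to fill.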

Once we're in the latent representation, we can train a Gaussian process (GP) regression model on the data and use it to extrapolate.
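As a rough illustration of the extrapolation step, the sketch below computes a GP posterior mean with plain NumPy rather than PyMC3. Everything in it is an assumption made for illustration: the RBF kernel, the fixed hyperparameters (`length_scale`, `variance`, `noise`), the function names, and the synthetic two-dimensional latent coordinates. In a full Bayesian treatment the hyperparameters would be inferred rather than fixed.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=2.0, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of time points."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior_mean(t_train, y_train, t_new, noise=1e-2):
    """Posterior mean of a GP with an RBF kernel, evaluated at times t_new."""
    K = rbf_kernel(t_train, t_train) + noise * np.eye(len(t_train))
    K_star = rbf_kernel(t_new, t_train)
    # Fit the GP to deviations from the training mean, then add the mean back.
    y_mean = y_train.mean(axis=0)
    return K_star @ np.linalg.solve(K, y_train - y_mean) + y_mean


# Hypothetical data: one 2-D latent coordinate per time step.
t_train = np.arange(6, dtype=float)
z_train = np.column_stack([0.5 * t_train, np.sin(t_train)])

# Forecast the latent coordinates at two future time steps.
t_new = np.array([6.0, 7.0])
z_forecast = gp_posterior_mean(t_train, z_train, t_new)
print(z_forecast.shape)  # (2, 2)
```

With a stationary kernel like this one, predictions far beyond the observed time range revert to the training mean, so forecasts of this kind are most meaningful over short horizons.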

Contributors

ericmjl, imgbotapp
