
tfold-release

AlphaFold-based pipeline for prediction of peptide-MHC structures.

Please cite as:
V. Mikhaylov, A. J. Levine, "Accurate modeling of peptide-MHC structures with AlphaFold,"
bioRxiv 2023.03.06.531396; doi: https://doi.org/10.1101/2023.03.06.531396

Download and install

  1. Download AlphaFold and its parameters. (This pipeline was tested with AlphaFold 2.1.0.) There is no need to download the PDB or the protein databases.

  2. Clone this repository:

git clone https://github.com/v-mikhaylov/tfold-release.git

Enter the tfold-release folder.

  3. Install the dependencies. With conda, you should be able to create an environment that works for both the TFold pipeline and AlphaFold:
conda env create --file tfold-env.yml
conda activate tfold-env

(This environment for running AlphaFold outside of Docker is due to https://github.com/kalininalab/alphafold_non_docker.)

  4. Download the data file data.tar.gz, which contains templates and other information, from Zenodo: https://zenodo.org/record/7803946. This can be done in a web browser or with zenodo-get:
pip install zenodo-get
zenodo_get 7803946

Unpack data.tar.gz into the tfold-release folder. This will create a folder data.

  5. Set the paths to a couple of folders in tfold/config.py and tfold_patch/tfold_config.py.

  6. That should be it.
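Before the first run, it can help to confirm that the files and folders named in this README are where the pipeline expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and the list only covers paths mentioned in this README, so it does not check the folder paths you set inside the two config files.

```python
from pathlib import Path

# Paths named in this README, relative to the tfold-release folder.
EXPECTED_PATHS = [
    "tfold-env.yml",
    "tfold/config.py",
    "tfold_patch/tfold_config.py",
    "data",
    "data/examples/sample.csv",
]


def missing_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files and folders are present.")
```

Run it from inside the tfold-release folder. A missing `data` entry usually means data.tar.gz has not been unpacked yet.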

Model pMHCs

  1. Prepare an input file. An example can be found in data/examples/sample.csv. It should be a .csv file with a header, containing a column pep and either a column MHC allele or a column MHC sequence.
  • The format for MHC alleles is SpeciesId-Locus*Allele for class I and SpeciesId-LocusA*AlleleA/LocusB*AlleleB for class II. Some examples: HLA-A*02:01, H2-K*d, HLA-DRA*01:01/DRB4*01:144, H2-IEA*d/IEB*k.
  • For class II, the MHC sequence should contain alpha-chain and beta-chain sequences separated by '/'.
  • For more details and options, please see details.ipynb.
  2. Activate the conda environment:
conda activate tfold-env
  3. Choose an output folder $working_dir and run the script as follows:
model_pmhcs.sh $input_file $working_dir [-d YYYY-MM-DD]

Here [-d YYYY-MM-DD] is an optional cutoff on the allowed template dates.

  4. The models will be saved in $working_dir/outputs, with a separate folder for each pMHC. There will also be a summary .csv file in $working_dir with information about the best models (by predicted score).
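The input file from step 1 can also be generated programmatically. The sketch below is one way to do it, under stated assumptions: the header names `pep` and `MHC allele` are taken from the description above and should be checked against data/examples/sample.csv, the regular expressions only test the general shape of the allele strings (they do not check that an allele exists), and the helper names `mhc_class` and `write_input` are hypothetical.

```python
import csv
import re

# Class I:  SpeciesId-Locus*Allele, e.g. HLA-A*02:01 or H2-K*d.
# Class II: SpeciesId-LocusA*AlleleA/LocusB*AlleleB,
#           e.g. HLA-DRA*01:01/DRB4*01:144 or H2-IEA*d/IEB*k.
_LOCUS_ALLELE = r"[A-Za-z0-9]+\*[A-Za-z0-9:]+"
CLASS_I = re.compile(rf"^[A-Za-z0-9]+-{_LOCUS_ALLELE}$")
CLASS_II = re.compile(rf"^[A-Za-z0-9]+-{_LOCUS_ALLELE}/{_LOCUS_ALLELE}$")


def mhc_class(allele):
    """Return 'I' or 'II' if the string has the documented shape, else None."""
    if CLASS_I.match(allele):
        return "I"
    if CLASS_II.match(allele):
        return "II"
    return None


def write_input(path, rows):
    """Write (peptide, allele) pairs to a .csv file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Header names are an assumption; compare with sample.csv.
        writer.writerow(["pep", "MHC allele"])
        for pep, allele in rows:
            if mhc_class(allele) is None:
                raise ValueError(f"unexpected allele format: {allele!r}")
            writer.writerow([pep, allele])
```

For example, `write_input("my_input.csv", [("NLVPMVATV", "HLA-A*02:01")])` writes a one-row input file. The peptide here is only an illustration. Rows that give an MHC sequence instead of an allele would need a different column and are not covered by this sketch.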

Details

The notebook details.ipynb contains some additional details on the pipeline that can be useful e.g. for splitting the jobs over multiple GPUs. It also contains a description of our cleaned pMHC and TCR structure database and associated tools.
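As a rough illustration of the multi-GPU case, one simple approach is to split the input file into shards and run the script once per shard, each with its own working folder. This is a sketch under assumptions and not necessarily the method described in details.ipynb: the helper names `split_csv` and `shard_commands` are hypothetical, the shard and working-folder names are made up, and pinning a run to a GPU through the `CUDA_VISIBLE_DEVICES` environment variable is assumed to work for this pipeline.

```python
import csv


def split_csv(input_file, n_shards, prefix="shard"):
    """Split a .csv with a header into up to n_shards files, round-robin by row.

    Returns the list of shard paths that were written.
    """
    with open(input_file, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = list(reader)

    paths = []
    for i in range(n_shards):
        chunk = rows[i::n_shards]
        if not chunk:
            continue  # fewer rows than shards
        path = f"{prefix}_{i}.csv"
        with open(path, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)  # every shard keeps the header
            writer.writerows(chunk)
        paths.append(path)
    return paths


def shard_commands(shard_paths, working_dir_prefix="run"):
    """Build one (environment, argv) pair per shard, one GPU index per shard."""
    commands = []
    for gpu, path in enumerate(shard_paths):
        env = {"CUDA_VISIBLE_DEVICES": str(gpu)}  # assumed GPU selection
        argv = ["model_pmhcs.sh", path, f"{working_dir_prefix}_{gpu}"]
        commands.append((env, argv))
    return commands
```

Each pair can then be launched with `subprocess.Popen(argv, env={**os.environ, **env})` from inside the activated tfold-env environment. The per-shard summary files would have to be merged by hand afterwards.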
