hshadman / 2d_conformational_landscape_map

Use PyConforMap to generate a simple scatter plot to map conformational landscapes of intrinsically disordered proteins, and quantify conformational diversity.

License: MIT License

Jupyter Notebook 93.62% Python 6.38%
conformational-analysis data-science data-structures data-visualization intrinsically-disordered matplotlib molecular-dynamics-simulation monte-carlo-simulation object-oriented-programming pearson-correlation-coefficient

2d_conformational_landscape_map's Introduction

DOI

PyConforMap: Draw pretty maps of your polymer or disordered protein conformational ensembles!

This repository provides an easy-to-use Python module called PyConforMap that generates scatter plots of the instantaneous shape ratio (Rs) against the relative radius of gyration (Rg/Rgmean).

PLEASE READ ALL DOCUMENTATION

There are two main metrics: the relative radius of gyration (Rg/Rgmean) and the instantaneous shape ratio (Rs). Rs is computed as Rs = Ree^2/Rg^2, where Ree and Rg are the (instantaneous) end-to-end distance and the (instantaneous) radius of gyration, respectively.

Rg/Rgmean is a measure of the (relative) size of a protein or polymer chain, and Rs is a measure of its shape. Rs is expected to be low (~2 or lower) for compact structures and high (~12 or higher) for highly extended structures. Together, a single Rg/Rgmean value and the corresponding Rs value define what we call an instantaneous conformation of a polymer. When all the Rg/Rgmean and Rs values of a polymer are plotted together, they constitute what we call a 2D map of the conformational landscape of that polymer.
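The definitions above can be illustrated with a minimal sketch that computes Ree^2, Rg^2 and Rs from the bead coordinates of a single snapshot. The function names below are hypothetical and are not taken from 'pyconformap.py'; the sketch also assumes equal-mass beads, which the text does not specify.

```python
import numpy as np


def rg_squared(coords):
    """Squared radius of gyration of one snapshot (equal bead masses assumed)."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)  # shift the centroid to the origin
    return float((centered ** 2).sum(axis=1).mean())


def ree_squared(coords):
    """Squared end-to-end distance: first bead to last bead."""
    coords = np.asarray(coords, dtype=float)
    return float(((coords[-1] - coords[0]) ** 2).sum())


def shape_ratio(coords):
    """Instantaneous shape ratio Rs = Ree^2 / Rg^2."""
    return ree_squared(coords) / rg_squared(coords)


# A fully extended chain: 11 beads equally spaced along the x axis.
rod = np.column_stack([np.arange(11.0), np.zeros(11), np.zeros(11)])
rod_rs = shape_ratio(rod)  # Ree^2 = 100, Rg^2 = 10, so Rs = 10
```

For a straight rod of N equally spaced beads, Rs = 12(N-1)/(N+1), which approaches 12 as N grows. This is consistent with the high Rs values quoted above for highly extended structures.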

The PyConforMap Module Generates Two-Dimensional Scatter Plots

This module generates 2D scatter plots of Rs against Rg/Rgmean for two data sets: a protein/polymer simulation (data and protein label/identity provided by the user) and a Gaussian Walk (GW) polymer model simulation (data for 720000 snapshots of a GW model of length 100 are included with the repository). Each point on the scatter plot, whether it belongs to the GW or to the given protein/polymer, represents one conformation snapshot and has coordinates (Rg/Rgmean, Rs). The GW model is intended as a reference model: its conformational landscape map, i.e. the set of all its (Rg/Rgmean, Rs) points, serves as a 'universal' reference map for other proteins/polymers.

From the 2D scatter plot, the module automatically calculates fC, the fraction of GW points that are 'close' (i.e. within a pre-defined radius) to at least one protein/polymer point. fC quantifies the conformational diversity of the protein/polymer provided, and can be used to rank the conformational diversities of different proteins/polymers.

The included GW file is 'GW_chainlen100.csv'. The Python module can also run a new GW simulation with a different chain length and number of snapshots, should the user wish to do so.

On the scatter plot, it is important that the protein/polymer points do not significantly exceed the boundaries defined by the reference (GW) points. Most of the protein/polymer points should be 'close' (i.e. within a pre-defined radius) to at least one GW point.
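As a rough illustration of the idea (not the module's actual implementation), the sketch below simulates a small Gaussian walk reference map and computes an fC-like fraction with a k-d tree nearest-neighbour query. The function names, the radius of 0.1, the use of plain Euclidean distance in the raw (Rg/Rgmean, Rs) plane, and taking Rgmean as the arithmetic mean of Rg are all assumptions made for this example.

```python
import numpy as np
from scipy.spatial import cKDTree


def gaussian_walk_map(n_chains, chain_len, rng):
    """Return (Rg/Rgmean, Rs) points for independently generated Gaussian walks."""
    # Each bond vector is drawn from a 3D standard normal distribution.
    steps = rng.normal(size=(n_chains, chain_len - 1, 3))
    start = np.zeros((n_chains, 1, 3))
    coords = np.concatenate([start, np.cumsum(steps, axis=1)], axis=1)
    ree2 = ((coords[:, -1] - coords[:, 0]) ** 2).sum(axis=1)
    centered = coords - coords.mean(axis=1, keepdims=True)
    rg2 = (centered ** 2).sum(axis=2).mean(axis=1)
    rg = np.sqrt(rg2)
    return np.column_stack([rg / rg.mean(), ree2 / rg2])


def fraction_covered(reference_pts, sample_pts, radius):
    """Fraction of reference points within `radius` of at least one sample point."""
    tree = cKDTree(sample_pts)
    nearest_dist, _ = tree.query(reference_pts, k=1)
    return float(np.mean(nearest_dist <= radius))


rng = np.random.default_rng(0)
gw_points = gaussian_walk_map(n_chains=2000, chain_len=100, rng=rng)

# Stand-in for a protein ensemble: a small subset of the GW points.
fake_protein_points = gw_points[:50]
fc_subset = fraction_covered(gw_points, fake_protein_points, radius=0.1)

# Tiny hand-checkable case: only the first reference point is covered.
toy_fc = fraction_covered([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]], [[0.0, 0.05]], radius=0.1)
```

Because Rg/Rgmean and Rs span different numerical ranges, the choice of radius and distance metric affects the resulting fraction; the 'code_input_output.md' file is the place to check what the module itself uses.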

The Module Code Requires One Input File

The required input is a CSV file (for a given protein/polymer simulation) with 2 columns. The first column contains Rg^2 values and the second column contains Ree^2 values. In this user-provided file, each row represents one protein/polymer conformation snapshot from the simulation. An example input is the 'example_protein.csv' file (included with the repository).
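A minimal sketch of turning such a two-column file into map coordinates with pandas is shown below. The helper name `load_map_points`, the output column names, and the `has_header` switch are hypothetical; whether 'example_protein.csv' carries a header row is not stated here, so that is left as a parameter. Rgmean is again assumed to be the arithmetic mean of Rg over all snapshots.

```python
import io

import numpy as np
import pandas as pd


def load_map_points(csv_source, has_header=False):
    """Read (Rg^2, Ree^2) rows and return (Rg/Rgmean, Rs) per snapshot."""
    df = pd.read_csv(csv_source, header=0 if has_header else None, usecols=[0, 1])
    rg2 = df.iloc[:, 0].to_numpy(dtype=float)   # first column: Rg^2
    ree2 = df.iloc[:, 1].to_numpy(dtype=float)  # second column: Ree^2
    rg = np.sqrt(rg2)
    return pd.DataFrame({"rg_over_rgmean": rg / rg.mean(), "rs": ree2 / rg2})


# Three made-up snapshots standing in for a real simulation file.
demo_csv = io.StringIO("4.0,24.0\n9.0,18.0\n16.0,96.0\n")
points = load_map_points(demo_csv)
```

The resulting two columns are the x and y coordinates of the scatter plot, so they could be passed directly to matplotlib's `scatter` function.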

Files Included with Repository

The 'code_input_output.md' file provides technical details (input arguments, expected outputs) of the module. The 'pyconformap.py' file contains the source code for the module. The 'illustrated_example.ipynb' Jupyter notebook shows examples that illustrate how to use the code. The 'GW_chainlen100.csv' file is the reference GW simulation and 'example_protein.csv' is the simulation of an example protein.

Packages Required

The module requires the pandas, numpy, matplotlib, scipy and more_itertools Python packages, as well as the itertools and random modules, which are part of the Python standard library. They are imported automatically when the 'pyconformap.py' file is executed, as shown in the illustrated examples.

Publication

PyConforMap is a companion to a paper that is under review for publication, as of 15 Feb, 2024.

How to Cite

If you use this module, please cite us using the provided DOI.

Contact Information

If you have comments, suggestions or a bug report, please feel free to email me at [email protected], or contact me through the social media links provided on my home page.
