compass's Introduction

Compass 🧭: A Comprehensive Tool for Accurate and Efficient Molecular Docking in Inference and Fine-Tuning

Navigating Future Drugs with Compass 🧭

Official implementation of the paper "Compass: A Comprehensive Tool for Accurate and Efficient Molecular Docking in Inference and Fine-Tuning".

Developed by Ahmet Sarıgün*, Vedran Franke, and Altuna Akalin, Compass is designed for accurate and efficient molecular docking in both inference and fine-tuning phases. This repository provides the necessary code and instructions to utilize the method effectively.

Should you have any questions or encounter issues, please feel free to open an issue on this repository or contact us directly at [email protected].

Check out our paper below for more details:

Compass: A Comprehensive Tool for Accurate and Efficient Molecular Docking in Inference and Fine-Tuning,
Ahmet Sarıgün, Vedran Franke, Altuna Akalin
arXiv, 2024

Usage

Setup Environment

Set up your development environment using Anaconda. Start by cloning the repository:

git clone https://github.com/BIMSBbioinfo/Compass.git

Once you have cloned the repository, navigate to its root directory and execute the following commands to create and activate the compass environment:

conda env create --file environment.yml
conda activate compass

For additional details on managing conda environments, refer to the conda documentation.

Docking with Compass 🧭 in Inference Mode

Inference follows the same procedure as DiffDock and accepts the same data formats.

For protein inputs, you can use .pdb files or provide sequences that will be folded using ESMFold. For the ligands, inputs can be in the form of a SMILES string or files readable by RDKit, such as .sdf or .mol2.

To process a single complex, specify the protein using --protein_path protein.pdb or --protein_sequence GIQSYCTPPYSVLQDPPQPVV, and the ligand using --ligand_description ligand.sdf or --ligand_description "COc(cc1)ccc1C#N".
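Because --ligand_description accepts either a file or a SMILES string, it can help to check what a given value looks like before launching a run. The following is a minimal sketch, not Compass's own logic: the helper name classify_ligand_description and the extension-based heuristic are assumptions made for illustration, using only the .sdf and .mol2 extensions named above.

```python
from pathlib import Path

# File extensions the README names as RDKit-readable ligand inputs.
LIGAND_FILE_SUFFIXES = {".sdf", ".mol2"}


def classify_ligand_description(ligand_description: str) -> str:
    """Return "file" for a ligand file name and "smiles" otherwise.

    Hypothetical helper: Compass may decide this differently internally.
    """
    suffix = Path(ligand_description).suffix.lower()
    return "file" if suffix in LIGAND_FILE_SUFFIXES else "smiles"


if __name__ == "__main__":
    for value in ("ligand.sdf", "COc(cc1)ccc1C#N"):
        print(value, "->", classify_ligand_description(value))
```

A value classified as "smiles" is not validated here; parsing it with RDKit would be the natural next check.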

To redock with recursion, pass --max_recursion_step.

You are now ready to run Compass inference on a single complex:

python -W ignore -m main_inference --config DiffDock/default_inference_args.yaml  --protein_path example/proteins/1a46_protein_processed.pdb  --ligand_description  "C1=CN=C(N1)CCNC(=O)CCCC(=O)NCCC2=NC=CN2"  --out_dir results/user_predictions_small --max_recursion_step 2

The run reports the binding affinity energy, the strain energy of the ligand, the number of steric clashes in the complex, and interaction information for the complex. It also writes the protein pocket as a .pdb file to pockets/ inside the directory given by --out_dir, which helps show where the docked molecule sits in the protein pocket.
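When a single-complex run is launched from a script rather than typed by hand, the command can be assembled as an argument list. This is a sketch under stated assumptions: the function names build_inference_command and run_inference are hypothetical, only flags shown in this README are used, and the script is assumed to be started from the repository root inside the compass environment.

```python
import subprocess
import sys


def build_inference_command(protein_path, ligand_description, out_dir,
                            max_recursion_step=None,
                            config="DiffDock/default_inference_args.yaml"):
    """Assemble the single-complex inference command as an argument list."""
    cmd = [
        sys.executable, "-W", "ignore", "-m", "main_inference",
        "--config", config,
        "--protein_path", protein_path,
        "--ligand_description", ligand_description,
        "--out_dir", out_dir,
    ]
    # Only add the recursion flag when the caller asks for it.
    if max_recursion_step is not None:
        cmd += ["--max_recursion_step", str(max_recursion_step)]
    return cmd


def run_inference(cmd):
    """Run the command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    command = build_inference_command(
        "example/proteins/1a46_protein_processed.pdb",
        "C1=CN=C(N1)CCNC(=O)CCCC(=O)NCCC2=NC=CN2",
        "results/user_predictions_small",
        max_recursion_step=2,
    )
    print(" ".join(command))  # inspect first; call run_inference(command) to execute
```

Passing the arguments as a list avoids shell quoting problems with SMILES strings, which often contain parentheses and `#`.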

To run multiple protein targets and multiple ligands, pass the directory containing the protein files with --protein_dir and select a range of them with --protein_start and --protein_end. If the SMILES strings are stored in a .txt file, pass its path with --smiles_dir and select a range with --smiles_start and --smiles_end.

You can then run several proteins and ligands in a single inference run:

python -W ignore -m main_inference --config DiffDock/default_inference_args.yaml  --protein_dir example/proteins  --smiles_dir  example/smiles.txt  --out_dir results/user_predictions_small --max_recursion_step 1  --protein_start 0 --protein_end 2 --smiles_start 0 --smiles_end 2
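To preview which entries a start/end pair would pick from a SMILES file, a small helper like the one below can be used. It is only a sketch: it assumes one SMILES string per line and that the range is zero-based and half-open (start included, end excluded), which this README does not confirm. The names read_smiles and select_range are hypothetical.

```python
def read_smiles(text):
    """Split file contents into SMILES strings, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def select_range(items, start, end):
    """Return items[start:end].

    Assumption: zero-based, half-open indexing; check the Compass source
    for the actual semantics of the *_start and *_end flags.
    """
    return items[start:end]


if __name__ == "__main__":
    # Illustrative contents, not the real example/smiles.txt.
    contents = "CCO\nc1ccccc1\n\nCC(=O)O\n"
    print(select_range(read_smiles(contents), 0, 2))
```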

Datasets

Only the PDBBind dataset is used in this project. The data processing guidelines provided in DiffDock, including the steps for generating ESM embeddings, apply here as well.

Compass 🧭 in Fine-Tuning Mode

After generating the ESM embeddings, run Inference Mode once to download the pretrained DiffDock-L model. You are then ready to fine-tune DiffDock with Compass:

python -W ignore -m finetune --config experiments/model_parameters.yml

Citation

Please cite the following paper if you use this code or repository in your research:

@article{sarigun2024compass,
  title={Compass: A Comprehensive Tool for Accurate and Efficient Molecular Docking in Inference and Fine-Tuning},
  author={Sarigun, Ahmet and Franke, Vedran and Akalin, Altuna},
  journal={arXiv preprint arXiv:2406.06841},
  year={2024}
}

License

This code is available for non-commercial scientific research purposes, as defined in the LICENSE file (Attribution-NonCommercial-NoDerivatives 4.0 International). By downloading and using this code, you agree to the terms in the LICENSE. Third-party datasets and software are subject to their respective licenses.

Components of the following projects are integrated into this repository: spyrmsd by Rocco Meli (MIT license), DiffDock by Gabriele Corso (MIT license), AA-Score by Xiaolin Pan (GNU General Public License v2.0), and PoseCheck by Charlie Harris (MIT license).

Acknowledgements

We extend our deepest gratitude to the following teams for open-sourcing their valuable Repos:

compass's People

Contributors

asarigun


compass's Issues

for reduce method

Hi,

Thank you for providing this interesting work! I have a question about the "reduce" method. It seems that the developer only supports Unix, and I am using Windows, which may not be compatible.

When I try to run the Compass script without reduce, I get the following warnings:

    print("***** WARNING: reduce is not installed      *****")
    print("***** WARNING: clashes and interaction fingerprinting may not work *****")
    print("***** WARNING: we highly recommend using reduce as in the paper for comparison *****")
    print("***** WARNING: Install instructions in README.md *****")

If I run Compass without reduce, will it still function correctly? Will it affect the accuracy and precision of the predicted docking results?

Many thanks,

Best,
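A quick way to see whether a reduce executable is visible on the current PATH is sketched below. It assumes the binary is named "reduce"; how PoseCheck itself locates reduce is not shown in this thread, so treat this as an independent check rather than a description of its detection logic. The helper name find_reduce is hypothetical.

```python
import shutil


def find_reduce(executable: str = "reduce"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_reduce()
    if path is None:
        print("reduce was not found on PATH")
    else:
        print("reduce found at", path)
```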

AttributeError: type object 'Molecule' has no attribute 'from_file'

Hi,

Please see the error below.

Traceback (most recent call last):
  File "c:\Users\lsy\anaconda3\envs\diffdock112\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\Users\lsy\anaconda3\envs\diffdock112\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "d:\Cheminfo_Workshop\5_Docking_Lab\Compass-main\main_inference.py", line 327, in <module>
    main()
  File "d:\Cheminfo_Workshop\5_Docking_Lab\Compass-main\main_inference.py", line 324, in main
    recursive_docking_and_processing(args)
  File "d:\Cheminfo_Workshop\5_Docking_Lab\Compass-main\main_inference.py", line 87, in recursive_docking_and_processing
    binding_aff, clashes, strain, confidence_value = process_sdf_file(write_dir, sdf_file, args, protein_path_list, iteration, ligand_description)
  File "d:\Cheminfo_Workshop\5_Docking_Lab\Compass-main\main_inference.py", line 168, in process_sdf_file
    clashes, strain, inter_dict = posecheck_eval(protein_path, input_sdf_path)
  File "d:\Cheminfo_Workshop\5_Docking_Lab\Compass-main\compass.py", line 83, in posecheck_eval
    pc.load_protein_from_pdb(protein_file)
  File "d:\cheminfo_workshop\5_docking_lab\compass-main\posecheck\posecheck\posecheck.py", line 59, in load_protein_from_pdb
    self.protein = load_protein_from_pdb(pdb_path, reduce_path=self.reduce_path)
  File "d:\cheminfo_workshop\5_docking_lab\compass-main\posecheck\posecheck\utils\loading.py", line 85, in load_protein_from_pdb
    prot = plf.Molecule.from_file(tmp_path)
AttributeError: type object 'Molecule' has no attribute 'from_file'

I used ProLIF version 2.0.3, as specified in environment.yml. However, the current version of ProLIF has removed the 'from_file' function from the Molecule class. Which version of ProLIF are you using that still provides 'from_file'?

Many thanks,

Best,
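To narrow down a mismatch like this, it can help to print the installed ProLIF version and test for the attribute directly. The sketch below uses only the standard library plus an optional prolif import; the helper name installed_version is hypothetical, and the distribution name "prolif" is assumed to match how the package was installed.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("ProLIF version:", installed_version("prolif"))
    try:
        import prolif as plf
    except ImportError:
        print("prolif could not be imported in this environment")
    else:
        # The attribute the traceback reports as missing.
        print("Molecule.from_file available:", hasattr(plf.Molecule, "from_file"))
```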
