lig3dlens's Introduction

Ligand-based 3D Virtual Screening

Lig3DLens

Bill Tatsis, Matt Seddon, Dan Mason, Dan O'Donovan, Gio Cincilla, Azedine Zoufir and Nath Brown

Lig3DLens performs the following tasks:

  1. Prepares a commercial compound library to be used for a VS campaign. This task involves: i) compound standardisation and ii) filtering out compounds outside a predefined range of physicochemical properties.
  2. Generates conformers for all of the compounds in the commercial library and calculates their 3D similarity (shape & electrostatics) to a reference compound.
  3. Finally, it can cluster the highest scoring hits and select a set number of representative compounds that can be ordered and tested.
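The physicochemical filtering in task 1 can be pictured with a minimal sketch. The property names, ranges and compound records below are hypothetical and only illustrate the idea of keeping compounds whose properties all fall inside a predefined range; the actual properties and limits come from the YAML file passed to the tool, and this is not the package's own code.

```python
from typing import Any, Mapping, Tuple

# Hypothetical property names and ranges, for illustration only;
# the real values live in the YAML file passed via --filter.
EXAMPLE_RANGES = {
    "MW": (200.0, 500.0),
    "logP": (-1.0, 5.0),
}


def within_ranges(props: Mapping[str, Any],
                  ranges: Mapping[str, Tuple[float, float]]) -> bool:
    """Return True if every ranged property is present and inside [low, high]."""
    for name, (low, high) in ranges.items():
        value = props.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True


# Made-up compound records (not taken from any real library).
compounds = [
    {"ID": "cmpd-1", "MW": 320.4, "logP": 2.1},
    {"ID": "cmpd-2", "MW": 612.7, "logP": 4.0},  # MW above the upper limit
    {"ID": "cmpd-3", "MW": 250.3, "logP": 6.2},  # logP above the upper limit
]

kept = [c["ID"] for c in compounds if within_ranges(c, EXAMPLE_RANGES)]
print(kept)
```

Treating a missing property as a failure is a design choice made here for safety; a real workflow might prefer to flag such compounds instead.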

Installation

python -m pip install -r requirements.txt .

Running a ligand-based 3D VS campaign

  1. Prepare a chemical library for a 3D VS campaign.

Note: To keep track of the library compounds, the input file should have a column containing the text "ID".

lig3dlens-prepare --in input_SD_file --filter physchem_yaml_file --out output_SD_file

  2. Generate 3D conformers for both the library and reference compounds, and score the library compounds against the reference molecule using a 3D shape & electrostatics similarity function.

Note: To keep track of the library compounds, the input file should have a column containing the text "ID".

lig3dlens-align --ref input_reference_molecule_file --lib input_library_file_name --conf num_conformers --out output_SD_file

  3. Cluster the highest-scoring molecules and select a representative (diverse) set of compounds. The user can input the number of clusters (num_clusters), the fingerprint type (fingerprint_type) and its dimension (fingerprint_dimension) used for the clustering.

lig3dlens-cluster --in input_SD_file --clusters num_clusters --out output_file --dim fingerprint_dimension --fp_type fingerprint_type
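To make the idea of "one representative per cluster" concrete, here is a minimal sketch that assumes each hit already carries a 3D score and a cluster label, and that the representative is simply the highest-scoring member of its cluster. Both the selection rule and the function name pick_representatives are assumptions for illustration; the README does not state how lig3dlens-cluster chooses its representatives.

```python
from typing import Dict, Iterable, List, Tuple


def pick_representatives(hits: Iterable[Tuple[str, float, int]]) -> List[str]:
    """Return the top-scoring compound ID of each cluster, ordered by cluster label.

    Each hit is a (compound_id, score, cluster_label) tuple.
    """
    best: Dict[int, Tuple[str, float]] = {}
    for cmpd_id, score, label in hits:
        # Keep the first compound seen for a cluster, or replace it with a better one.
        if label not in best or score > best[label][1]:
            best[label] = (cmpd_id, score)
    return [best[label][0] for label in sorted(best)]


# Made-up hits: (ID, 3D similarity score, cluster label).
hits = [
    ("A", 0.91, 0),
    ("B", 0.88, 0),
    ("C", 0.75, 1),
    ("D", 0.80, 1),
    ("E", 0.60, 2),
]

representatives = pick_representatives(hits)
print(representatives)
```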

Example

To run lig3dlens with the data samples included in this repository:

... preparing curated compounds

lig3dlens-prepare --in tests/test_data/Enamine_hts_collection_202303_first500_VS_results.sdf \
    --filter lig3dlens/physchem_properties.yaml \
    --out curated_compounds.sd
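Because the preparation step expects a column containing the text "ID", it can help to check the input SD file first. The sketch below reads the data field names straight from the SD text (data header lines start with ">" and carry the field name in angle brackets). Interpreting "column" as an SD data field is an assumption, and the helper names and sample record are hypothetical.

```python
import re
from typing import Set

# SD data header lines start with ">" and hold the field name in angle
# brackets, e.g. ">  <Catalog ID>".
_FIELD_RE = re.compile(r"^>.*<([^>]+)>")


def sd_field_names(sd_text: str) -> Set[str]:
    """Collect the data field names found in the text of an SD file."""
    names = set()
    for line in sd_text.splitlines():
        match = _FIELD_RE.match(line)
        if match:
            names.add(match.group(1))
    return names


def has_id_field(sd_text: str) -> bool:
    """Return True if at least one data field name contains the text 'ID'."""
    return any("ID" in name for name in sd_field_names(sd_text))


# Made-up record with the molblock body left out; only the data items matter here.
SAMPLE = """\
mol-1
  (molblock omitted)
M  END
>  <Catalog ID>
Z123

>  <MW>
320.4

$$$$
"""

print(sorted(sd_field_names(SAMPLE)), has_id_field(SAMPLE))
```

For a real file, the text could be loaded with pathlib.Path(path).read_text() before calling has_id_field.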

... generate 3D conformers

lig3dlens-align --ref input_reference_molecule_file --lib input_library_file_name --conf num_conformers --out output_SD_file

... clustering the highest-scoring hits

lig3dlens-cluster --in input_SD_file --clusters num_clusters --out output_file --dim fingerprint_dimension --fp_type fingerprint_type

Development

Use the Makefile commands to help tidy the codebase periodically. The following will reformat the code according to PEP8, and logically sort the imported modules:

make tidy

Tests

Run pytest from the lig3dlens directory:

pytest tests

Requirements

Note: The whole VS workflow was tested in a Linux (Ubuntu) environment. In that environment, MKL (used by NumPy) had to be told to use the GNU OpenMP runtime instead of the Intel OpenMP runtime by setting the following environment variable:

export MKL_THREADING_LAYER=GNU
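When the workflow is driven from a Python script or notebook rather than a shell, the same variable can be set from Python. This is a minimal sketch; it assumes the assignment runs before NumPy (or anything that imports it) is loaded in the process, since the setting is expected to be read when MKL initialises.

```python
import os

# Set the variable only if the user has not already chosen a value.
# This should run before NumPy (or any package that imports it) is imported.
os.environ.setdefault("MKL_THREADING_LAYER", "GNU")

print(os.environ["MKL_THREADING_LAYER"])
```

Child processes started afterwards from the same Python process inherit the variable.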

The open-source quantum chemistry package Psi4 is required for the QM calculation of the partial charges. More information about installing Psi4 on different CPU architectures (arm64 included) is provided on Psi4's website.

Future improvements

  • Compound library preparation:

    • Apply a set of structural filters (for example REOS or PAINS) - either remove or flag compounds.
    • Provide more autonomy to the drug designer when setting the physicochemical properties filters.
  • Compound selection:

    • Multi-parameter selection of compounds using a score function that includes the 3D score, 2D similarity to the reference compound, and the physchem properties. The aim is to balance high-scoring compounds against the other properties.
    • Select an optimal number of clusters instead of a predefined one (e.g. using Silhouette or affinity propagation methods). Alternatively, use another method for the maximum score-diversity selection problem (e.g. the Score Erosion algorithm).
    • Provide tools to analyse the chemical diversity of the final selected compound set.

lig3dlens's Issues

ValueError: Number of processes must be at least 1

When I use the

!lig3dlens-align --ref reference_mol.smi --lib filted_Enamine_hts_collection_202303_first500.sdf --conf 500 --out conf_filted_Enamine_hts_collection_202303_first500.sdf

command, I get an error:

Traceback (most recent call last):
  File "/usr/local/bin/lig3dlens-align", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lig3dlens/main.py", line 55, in main
    run_alignment(search_config, num_conformers, output_file)
  File "/usr/local/lib/python3.10/dist-packages/lig3dlens/alignment.py", line 55, in run_alignment
    with multiprocessing.Pool(num_processes) as pool:
  File "/usr/lib/python3.10/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 205, in __init__
    raise ValueError("Number of processes must be at least 1")
ValueError: Number of processes must be at least 1

Does anyone have the same error as me? And is there any way to debug it?
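The traceback shows multiprocessing.Pool(num_processes) rejecting a value below 1. How lig3dlens computes num_processes is not shown here, so the following is only a guess at the cause: a count derived from the number of available cores can drop to 0 on a single-core machine, as some hosted notebook environments provide. The sketch reproduces the message and shows a defensive guard; safe_num_processes and its "leave one core free" policy are hypothetical.

```python
import multiprocessing
import os


def safe_num_processes(requested=None):
    """Return a worker count that is always at least 1."""
    if requested is None:
        # Hypothetical policy: leave one core free for the main process.
        requested = (os.cpu_count() or 1) - 1
    return max(1, requested)


# Asking for zero workers reproduces the message from the traceback.
try:
    multiprocessing.Pool(0)
except ValueError as err:
    message = str(err)
    print(message)

print(safe_num_processes(0), safe_num_processes(4))
```

Checking os.cpu_count() in the failing environment is a quick way to test this hypothesis.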

Meso-ionic crash library prep

Very nice!

Meso-ionic compounds are crashing the "calculating physchem properties" step of the library prep.
e.g. O=C([N-]c1cn+no1)C1CCCCC1

Also is there a way to read the library at this stage in .smi format rather than .sdf?

Thanks

Allow use of predetermined reference ligand conformation

Hi,

Thank you for making this code publicly available. I just started to play with it. What I understand is that during the alignment step, conformers are generated for both the reference ligand and the library to screen. Would it be possible to add an option to skip this for the reference ligand? This is relevant when one has a protein-bound conformation of a reference ligand; in such cases you would want to align the library to that particular conformation.

Best wishes/Evert

AttributeError: module 'pandas.io.formats.format' has no attribute 'get_adjustment'

Hi,

I tried to install on another machine (Ubuntu 22.04) in a new conda environment. When running the example I get:

lig3dlens-prepare --in tests/test_data/Enamine_hts_collection_202303_first500_VS_results.sdf     --filter lig3dlens/physchem_properties.yaml     --out curated_compounds.sd
2024-04-30 13:45:17.177 | INFO     | lig3dlens.prep_cmpd_library:main:119 - Initialising the compound library preparation workflow
2024-04-30 13:45:17.177 | INFO     | lig3dlens.prep_cmpd_library:main:125 - Loading cmpds from tests/test_data/Enamine_hts_collection_202303_first500_VS_results.sdf to a Pandas dataframe
Traceback (most recent call last):
  File "/home/evehom/miniconda3/envs/lig3dlens/bin/lig3dlens-prepare", line 8, in <module>
    sys.exit(main())
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/lig3dlens/prep_cmpd_library.py", line 128, in main
    mols_lib = dm.read_sdf(input_cmpd_lib, as_df=True, mol_column="ROMol")
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/datamol/__init__.py", line 186, in __getattr__
    mod = importlib.import_module(obj_mod)
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/datamol/io.py", line 16, in <module>
    from rdkit.Chem import PandasTools
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/rdkit/Chem/PandasTools.py", line 653, in <module>
    InstallPandasTools()
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/rdkit/Chem/PandasTools.py", line 622, in InstallPandasTools
    PandasPatcher.patchPandas()
  File "/home/evehom/miniconda3/envs/lig3dlens/lib/python3.10/site-packages/rdkit/Chem/PandasPatcher.py", line 263, in patchPandas
    if getattr(pandas_formats.format, get_adjustment_name) != patched_get_adjustment:
AttributeError: module 'pandas.io.formats.format' has no attribute 'get_adjustment'

Any clues what is causing this? Thank you,

Evert
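The traceback ends in RDKit's PandasPatcher looking up get_adjustment on pandas.io.formats.format, which points to a version mismatch between the installed pandas and RDKit rather than a problem in lig3dlens itself. To the best of my knowledge, newer pandas releases moved this helper to pandas.io.formats.printing, so using a newer RDKit or an older pandas is a likely remedy, though this has not been verified against this environment. The diagnostic sketch below only reports where the attribute exists; module_has_attr is a hypothetical helper.

```python
import importlib


def module_has_attr(module_name, attr):
    """Return True or False for an importable module, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr)


# Report where (if anywhere) the installed pandas exposes get_adjustment.
for name in ("pandas.io.formats.format", "pandas.io.formats.printing"):
    print(name, module_has_attr(name, "get_adjustment"))
```

Printing pandas.__version__ and rdkit.__version__ alongside this output gives the information needed to choose compatible versions.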
