openfreeenergy / lomap
Alchemical mutation scoring map
Home Page: https://lomap.readthedocs.io
License: MIT License
Currently Lomap sometimes gets confused by symmetry. E.g. here's a naphthalene to benzoxazole transform:
In these two structures, the 6-membered ring on each is aligned near perfectly. The 6-membered ring is identified as the MCS, but it's being mapped to the wrong location in naphthalene.
The "choose best MCS" function here seems to make sense:
Then shifting the molecules to align first was added here:
This is what is causing the wrong ring to be picked
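A minimal sketch of the effect described above, using plain coordinate tuples rather than RDKit molecules. The function names and the toy coordinates are hypothetical and are not Lomap's code. It only illustrates that ranking symmetry-equivalent matches by RMSD after translating one substructure onto the other can pick a different match than ranking by RMSD in the input frame.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of 3D points."""
    sq = sum((a - b) ** 2
             for pa, pb in zip(coords_a, coords_b)
             for a, b in zip(pa, pb))
    return math.sqrt(sq / len(coords_a))

def centroid(coords):
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))

def best_match(ref_coords, mol_coords, candidate_matches, align_first):
    """Return the candidate (tuple of atom indices) with the lowest RMSD to ref_coords.

    If align_first is True, each candidate is translated so that its centroid
    sits on the reference centroid before the RMSD is computed.
    """
    best, best_score = None, float("inf")
    for match in candidate_matches:
        sub = [mol_coords[i] for i in match]
        if align_first:
            delta = tuple(r - s for r, s in zip(centroid(ref_coords), centroid(sub)))
            sub = [tuple(c + d for c, d in zip(p, delta)) for p in sub]
        score = rmsd(ref_coords, sub)
        if score < best_score:
            best, best_score = match, score
    return best

# Toy data: candidate (0, 1) overlaps the reference in the input frame,
# candidate (2, 3) has the same shape but sits 5 units away.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mol = [(0.0, 0.1, 0.0), (1.1, 0.1, 0.0), (5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
candidates = [(0, 1), (2, 3)]

in_place = best_match(ref, mol, candidates, align_first=False)
shifted = best_match(ref, mol, candidates, align_first=True)
print(in_place, shifted)
```

In this toy case the in-place ranking keeps the overlapping candidate, while the shift-then-rank variant prefers the displaced one because the translation hides where the substructure actually sits.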
This repo needs to not show as a fork of the old one. Being a fork will cause confusion, and people may try to contribute to the unmaintained upstream.
Since we decided against transferring the MobleyLab repo, we should detach our fork. https://support.github.com/request/fork
We should also ask that MobleyLab archive their Lomap repo, or at the very least add a note saying that development is continuing here (ideally both).
The docs might not be complete yet, but advertising the RTD build would be useful?
error message says:
AttributeError: 'int' object has no attribute 'componentA'
Hi,
I've been trying to use the radial function of lomap and have been having issues. I am using the same set of ligands in /tests/radial and the functions in /examples/example_radial_graph.py.
import lomap
db_mol = lomap.DBMolecules('/root/Downloads/Lomap/lomap/tests/radial/', output=True, radial=True, hub = 'ejm44.mol2')
strict, loose = db_mol.build_matrices()
strict_numpy = strict.to_numpy_2D_array()
loose_numpy = loose.to_numpy_2D_array()
nx_graph = db_mol.build_graph()
# Drawing the network
import networkx as nx
import matplotlib.pyplot as plt
pos = nx.spring_layout(nx_graph)
nx.draw(nx_graph, pos)
nx.draw_networkx_labels(nx_graph, pos)
plt.show()
Running with radial=False and no hub ligand selected creates the same, non-radial network.
Lomap networks can sometimes be disconnected.
For one system from the PLB (pde2), there's a morpholine ring that is slightly shifted, which leads to kartograf not mapping it and therefore to a very low lomap score. This results in two ligands being disconnected from the rest of the network. I slightly increased the atom_max_distance and now got a connected network, mapping the ring.
Should generate_lomap_network warn the user if the network is not connected?
Our network visualization (plot_atommapping_network(ligand_network)) overlays ligand names, making it very difficult to see visually when networks are not connected.
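One way such a warning could look, as a library-independent sketch. The function names are hypothetical and the network is represented as a plain list of node names plus a list of edge pairs, not as an OpenFE or Lomap object. For a networkx undirected graph, `networkx.is_connected` answers the same question.

```python
import warnings

def connected_components(nodes, edges):
    """Group the nodes of an undirected graph into connected components."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components

def warn_if_disconnected(nodes, edges):
    """Emit a warning naming the smallest component when the network is split."""
    components = connected_components(nodes, edges)
    if len(components) > 1:
        smallest = sorted(min(components, key=len))
        warnings.warn(
            f"Network has {len(components)} disconnected components; "
            f"smallest component: {smallest}"
        )
    return components
```

Naming the smallest component in the message points straight at the ligands that were cut off, which is the information the overlaid plot hides.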
Currently we hide any errors coming from LOMAP in the atom mapper. This is so that we can feed it ligands that might not be able to be mapped at all, and it acts as a sort of filter.
But this also means that it is tedious to debug any issues that pop up in LOMAP, because we just get back no mappings, with no feedback as to what went wrong. To get any details, you need to drop out of the OpenFE world and into raw LOMAP/RDKit.
Maybe we should turn the errors we catch into an INFO level log event?
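A sketch of what that could look like with the standard `logging` module. The wrapper name `safe_mappings`, the logger name, and the `mapper_func` callable are assumptions for illustration, not the OpenFE mapper API.

```python
import logging

logger = logging.getLogger("mapper_sketch")  # hypothetical logger name

def safe_mappings(mapper_func, mol_a, mol_b):
    """Return the mappings from mapper_func, or an empty list if it raises.

    The swallowed exception is reported at INFO level, so it stays out of the
    way by default but is available once the user raises the log level.
    """
    try:
        return list(mapper_func(mol_a, mol_b))
    except Exception as exc:
        logger.info(
            "No mapping for %r -> %r: %s: %s",
            mol_a, mol_b, type(exc).__name__, exc,
        )
        return []
```

The filter behaviour is unchanged (unmappable pairs still yield nothing), and enabling the messages is a one-liner such as `logging.basicConfig(level=logging.INFO)`.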
It would be nice if MCS had an option to disallow element changes in proposed mappings.
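Until such an option exists, one workaround is to filter a finished mapping. The sketch below is a hypothetical post-filter on a plain `{index_in_a: index_in_b}` dictionary with element symbols supplied as lists. It is not an `MCS` option, and unlike constraining the search itself it can leave the remaining mapping disconnected.

```python
def drop_element_changes(mapping, elements_a, elements_b):
    """Keep only the atom pairs whose element symbols are identical."""
    return {i: j for i, j in mapping.items() if elements_a[i] == elements_b[j]}

# Toy example: atom 2 is carbon in A but nitrogen in B, so that pair is dropped.
elements_a = ["C", "C", "C"]
elements_b = ["C", "C", "N"]
filtered = drop_element_changes({0: 0, 1: 1, 2: 2}, elements_a, elements_b)
print(filtered)
```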
from rdkit import Chem
from lomap import mcs
test_dir = "/Users/dwhs/omsf/src/openfe/openfe/tests/data/lomap_basic/"
toluene = Chem.MolFromMol2File(test_dir + "toluene.mol2")
methylcyclohexane = Chem.MolFromMol2File(test_dir + "methylcyclohexane.mol2")
mapping = mcs.MCS(toluene, methylcyclohexane)
# behaves as desired
toluene2 = Chem.MolFromSmiles(Chem.MolToSmiles(toluene))
methylcyclohexane2 = Chem.MolFromSmiles(Chem.MolToSmiles(methylcyclohexane))
mapping = mcs.MCS(toluene2, methylcyclohexane2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/mambaforge/envs/openfe/lib/python3.9/site-packages/lomap/mcs.py in __init__(self, moli, molj, time, verbose, max3d, threed)
586 try:
--> 587 trim_mcs_mol(max_deviation=max3d)
588 except Exception as e:
~/mambaforge/envs/openfe/lib/python3.9/site-packages/lomap/mcs.py in trim_mcs_mol(max_deviation)
156 while True:
--> 157 (mapi,mapj) = best_substruct_match_to_mcs(self._moli_noh, self._molj_noh, by_rmsd=True)
158 # Compute the translation to bring molj's centre over moli
~/mambaforge/envs/openfe/lib/python3.9/site-packages/lomap/mcs.py in best_substruct_match_to_mcs(moli, molj, by_rmsd)
121 if by_rmsd:
--> 122 coord_delta = (substructure_centre(moli,mapi)
123 - substructure_centre(molj,mapj))
~/mambaforge/envs/openfe/lib/python3.9/site-packages/lomap/mcs.py in substructure_centre(mol, mol_sub)
90 for i in mol_sub:
---> 91 sum += mol.GetConformer().GetAtomPosition(i)
92 return sum / len(mol_sub)
ValueError: Bad Conformer Id
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
/var/folders/vj/28c107496sq5z4y10rjz5gdm0000gn/T/ipykernel_30741/2330220248.py in <module>
1 toluene2 = Chem.MolFromSmiles(Chem.MolToSmiles(toluene))
2 methylcyclohexane2 = Chem.MolFromSmiles(Chem.MolToSmiles(methylcyclohexane))
----> 3 mapping = mcs.MCS(toluene2, methylcyclohexane2)
~/mambaforge/envs/openfe/lib/python3.9/site-packages/lomap/mcs.py in __init__(self, moli, molj, time, verbose, max3d, threed)
587 trim_mcs_mol(max_deviation=max3d)
588 except Exception as e:
--> 589 raise ValueError(str(e))
590
591 # Trim the MCS further to remove chirality mismatches
ValueError: Bad Conformer Id
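The traceback comes from `GetConformer()` being called on molecules that have no coordinates: molecules built with `Chem.MolFromSmiles` carry no conformer. A small guard like the one below would turn this into a clearer error before `mcs.MCS` is called. The helper name is hypothetical; it only relies on the RDKit `Mol.GetNumConformers()` method.

```python
def require_conformers(**mols):
    """Raise a descriptive error if any named molecule has no conformer.

    Each keyword argument is a label mapped to an object exposing
    GetNumConformers(), as RDKit Mol objects do.
    """
    missing = [name for name, mol in mols.items() if mol.GetNumConformers() == 0]
    if missing:
        raise ValueError(
            "No 3D coordinates (conformers) for: " + ", ".join(missing)
        )
```

It would be called as `require_conformers(toluene=toluene2, methylcyclohexane=methylcyclohexane2)` ahead of the mapping step.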
Is it really necessary to compute the full MCS matrix for radial maps when a hub ligand is provided? It seems like you would just need to compute the array of values for the N ligands to the hub compound, or am I missing something?
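To make the saving concrete, here is a sketch that scores only the hub row. The `score` callable stands in for whatever pairwise similarity is used and is an assumption, as are the function and ligand names. For N ligands this needs N - 1 evaluations instead of the N(N - 1)/2 of a full symmetric matrix.

```python
def hub_scores(names, hub, score):
    """Score every ligand against the hub only, returning {ligand: score}."""
    return {name: score(hub, name) for name in names if name != hub}

# Toy usage with a placeholder scoring function that records its calls.
calls = []

def toy_score(a, b):
    calls.append((a, b))
    return 0.5

names = ["lig_a", "lig_b", "lig_c", "hub"]
scores = hub_scores(names, "hub", toy_score)
full_matrix_pairs = len(names) * (len(names) - 1) // 2
print(len(calls), "hub evaluations versus", full_matrix_pairs, "for the full matrix")
```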
This is a summary of my issue raised in the openFE developers channel on 05/02/2022:
My original message:
3:00 PM
"There is an issue with the LOMAP similarity scoring that I wanted to bring up as you all work on incorporating it into OpenFE.
On Friday, I found a bug/limitation in the existing LOMAP similarity metric. I ran my planner for the PLBenchmark sets and saw abnormally low similarity scores in a particular use case, which I will describe. The attached example is from the PLBenchmarks bace_hunt ligand set, and the ligands are shown in optimization_inputs.pdf. In output.pdf, you can see the similarity scores are all near zero in the heatmap where Distance = 1 - similarity(LOMAP). These ligands shouldn’t all have zero or near zero similarity.
I have also attached bace_hunt.zip, which includes sdf files and the LOMAP generated similarity scores (strict_numpy.csv) for your reference.
In this case, LOMAP produces poor similarities due to chirality. MCSS in LOMAP is 2D and doesn’t handle chirality properly, so it deletes chiral centers. In the bace_hunt set, the whole ring system attached to the chiral center is not considered in the similarity scoring. What remains is the aromatic ring at the top left. Since there is no variation in chirality across the rings in the series, the ring should not be deleted.
It looks like the similarity metric should be refined then to better handle MCSS and chirality."
optimization_inputs.pdf
output.pdf
bace_hunt.zip
David Mobley responded with added detail:
"...Basically, as originally implemented LOMAP had to delete chiral centers when computing MCS because I only had 2D graphs without chirality and there was no way to ensure the mapping worked out correctly.
When Mary applied this to the real sets, it turns out deleting chiral centers in several systems results in essentially no similarity so things will basically fail.
This was never intended to be a long-term solution when I first whipped this together back in maybe 2010 in like a week (LOL), more of a “I’ll just delete these until I come up with a better solution, because this is just a proof-of-concept…” and… here we are a dozen years later.
(I know OpenEye has done something better in theirs, but that’s proprietary.)"
Clara Christ provided a solution she implemented:
"...here is what I did at the time to handle stereo atoms when mapping (see below). In short, I chose the MCS of the many possible MCSs with the minimum atom-to-atom distances AND distances greater than 0.5A and atoms differing in stereochemistry were removed from the MCS (the code is with BI). https://pubs.acs.org/doi/pdf/10.1021/ci4004199 "
I had in my head that this is lomap2 on conda-forge? If so add this to the main README.md?
Also, should the repo be renamed LOMAP2 for consistency with that? (And this would also help mark the transition from the MobleyLab one to the OpenFE one.)
Current author list isn't reflecting the changes we've made in the last few months, probably should update it as such. Maybe we can even move the author list out of the readme and into a separate AUTHORS.md file?
should be able to see settings from the repr
I'm working on migrating LOMAP to setup.cfg (and moving to pip-installability). However, the current setup.py has some strange behavior that I think needs to change.
Currently, setup receives packages=find_packages() + ['test']. This means that you get 2 packages installed into your site-packages: lomap and test. This copies the test files into site-packages/test/, which:
- means site-packages/test/ will be whichever package last tried to install something as a package called test.
- is shadowed by the standard library's test package, which you'll get on import test (unless you did something freaky with your environment).
Options:
1. Move test/ -> lomap/tests/. It gets installed as usual. Things can be imported from lomap.tests. This is what we do with all our other packages.
2. Leave the files where they are currently, but make the tests available at lomap.tests. Honestly, this isn't an approach I use, and my attempts to get things like this to work in the past were a bit of a headache.
3. We could just say that the tests have to be downloaded separately. I'm not a fan of this.
Personally, I'd vote for option 1. I don't think this counts as an API break (technically, we're moving tests to a different namespace, but since it was impossible to access tests with a default Python setup, I don't think it is a problem.)
Is this the current preferred citation for LOMAP? https://link.springer.com/article/10.1007/s10822-013-9678-y
(This repo involved rewriting the original OpenEye/Schrödinger based implementation using RDKit; not sure if there's a newer citation for that, or what the preferred citations are.)
While working with Lomap I noticed that some of the ligands in the perturbation mapping weren't part of a thermodynamic cycle, with only one edge connecting them to the rest of the network. I tried using the radial parameter but still observed the same.
Is there a way to force this behavior so that all ligands are part of a thermodynamic cycle?
I also noticed a slight misalignment between some of my ligands (3D coordinates). Could that play a role in the scoring from Lomap since they're obtained from gufe?
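As a quick diagnostic for the single-edge ligands described above, the sketch below lists every node with exactly one edge. It works on a plain list of edge pairs and the function name is hypothetical. Note that having two or more edges is necessary for sitting on a cycle but not sufficient, so this finds the obvious offenders rather than proving cycle closure.

```python
from collections import Counter

def single_edge_ligands(edges):
    """Return the sorted names of nodes that appear in exactly one edge."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(name for name, d in degree.items() if d == 1)

# Toy network: a, b and c form a cycle, d hangs off c by a single edge.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
dangling = single_edge_ligands(edges)
print(dangling)
```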
Currently a placeholder: https://lomap.readthedocs.io/en/latest/getting_started.html
mappings generated by MCS should be a single connected fragment.
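A sketch of how that property could be checked, assuming the mapped atoms are given as a set of atom indices and the molecule's bonds as index pairs. Both the representation and the function name are assumptions for illustration. An empty mapping is treated as trivially connected here.

```python
def is_single_fragment(mapped_atoms, bonds):
    """True if the mapped atoms form one connected fragment via the given bonds."""
    mapped = set(mapped_atoms)
    if not mapped:
        return True
    # Only bonds with both ends inside the mapping connect mapped atoms.
    neighbours = {atom: set() for atom in mapped}
    for a, b in bonds:
        if a in mapped and b in mapped:
            neighbours[a].add(b)
            neighbours[b].add(a)
    stack, seen = [next(iter(mapped))], set()
    while stack:
        atom = stack.pop()
        if atom in seen:
            continue
        seen.add(atom)
        stack.extend(neighbours[atom] - seen)
    return seen == mapped

# Toy chain 0-1-2-3: dropping atom 2 from the mapping splits it in two.
chain_bonds = [(0, 1), (1, 2), (2, 3)]
print(is_single_fragment({0, 1, 2}, chain_bonds), is_single_fragment({0, 1, 3}, chain_bonds))
```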
There's a longstanding deprecation warning about this.
Somewhat a continuation of MobleyLab/Lomap#50
We'll do as we usually do: Sphinx docs covering the API, with room to add extra info on usage as needed.
For style/etc. we should probably point to a centralized OpenFE definition. (Not that we're too picky -- we'll just ask for PEP8). We should probably also have a centralized guide for absolute beginners -- basically links to relevant GitHub docs.