Comments (6)
I presume we currently want to use the canonical isomeric SMILES string as our way of judging whether fragments are unique. Later on, we may relax that.
from openff-fragmenter.
For the incoming molecules to fragment - should we enforce isomeric SMILES? If it's not isomeric, should we generate an isomeric SMILES before we generate conformations of the molecule to fragment?
In general I need to add a cleanup step for incoming molecules, and generating an isomeric SMILES can be part of that step. However, shouldn't we also enumerate stereoisomers if several stereocenters exist? The fragments can be different for different stereoisomers.
To clarify here:
- For incoming molecules, we need one stage that expands ambiguous stereochemistry and enumerates likely protonation/tautomeric states. After this stage, every molecule has fully specified stereochemistry and explicit hydrogens.
- We should expand the JSON generated to specify both explicit-hydrogen canonical isomeric SMILES and non-isomeric SMILES so that either can be used as a key for indexing.
- The resulting molecules can then be distributed in parallel to another function, which fragments each of these molecules and passes fragments on to the next stage.
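The staged flow in the list above could be sketched roughly as follows. This is only a minimal outline under stated assumptions: `run_pipeline`, `toy_expand`, and `toy_fragment` are hypothetical names, and the two toy functions are string stand-ins rather than real state enumeration or fragmentation. In practice a cheminformatics toolkit would supply both callables.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def run_pipeline(
    inputs: Iterable[str],
    expand_states: Callable[[str], List[str]],
    fragment: Callable[[str], List[str]],
    max_workers: int = 4,
) -> Dict[str, List[str]]:
    """Expand each input into fully specified states, then fragment each state."""
    # Stage 1: expand stereochemistry / protonation / tautomers.
    # dict.fromkeys de-duplicates states while keeping first-seen order.
    states = list(
        dict.fromkeys(state for smi in inputs for state in expand_states(smi))
    )
    # Stage 2: each state is independent, so fragmentation can run in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fragments = list(pool.map(fragment, states))
    # Key the fragments by the state they came from.
    return dict(zip(states, fragments))


# Hypothetical stand-ins (NOT real chemistry): they only label strings.
def toy_expand(smiles: str) -> List[str]:
    return [f"{smiles}|state{i}" for i in range(2)]


def toy_fragment(state: str) -> List[str]:
    return [f"{state}|frag"]


if __name__ == "__main__":
    print(run_pipeline(["A", "B"], toy_expand, toy_fragment))
```

Passing the two stages in as callables keeps the parallel driver independent of whichever toolkit ends up doing the expansion and fragmentation.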
> We should expand the JSON generated to specify both explicit-hydrogen canonical isomeric SMILES and non-isomeric SMILES so that either can be used as a key for indexing.
@jchodera, the reason we want to expand the JSON to include explicit-hydrogen canonical SMILES is to avoid ambiguity for charged states. However, when the charged states are expanded, explicit hydrogens are already added where needed. So is the explicit-hydrogen SMILES redundant?
Example for positively charged imatinib:
Cc1ccc(cc1Nc2nccc(n2)c3ccc[nH+]c3)NC(=O)c4ccc(cc4)CN5CCN(CC5)C
This is what it looks like with explicit hydrogens:
[H]c1c(c(c([n+](c1[H])[H])[H])c2c(c(nc(n2)N([H])c3c(c(c(c(c3C([H])([H])[H])[H])[H])N([H])C(=O)c4c(c(c(c(c4[H])[H])C([H])([H])N5C(C(N(C(C5([H])[H])([H])[H])C([H])([H])[H])([H])[H])([H])[H])[H])[H])[H])[H])[H])[H]
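As a rough check on the two strings above, the snippet below lists the bracket atoms that carry a formal charge in each one. It is plain text matching with a regular expression, not SMILES parsing, and the helper name `charged_atoms` is made up for this illustration. It only shows that the charge is written out in both forms; whether the tautomeric state is equally well pinned down would still need a toolkit to confirm.

```python
import re

# Bracket atom containing a '+' or '-' sign, optionally followed by digits.
# Text-level heuristic only; it does not validate the SMILES.
CHARGED_ATOM = re.compile(r"\[[^\]]*[+-]\d*\]")


def charged_atoms(smiles: str) -> list:
    """Return the bracket atoms in `smiles` that carry a formal charge."""
    return CHARGED_ATOM.findall(smiles)


# Both strings are copied from the comment above.
implicit_h = "Cc1ccc(cc1Nc2nccc(n2)c3ccc[nH+]c3)NC(=O)c4ccc(cc4)CN5CCN(CC5)C"
explicit_h = (
    "[H]c1c(c(c([n+](c1[H])[H])[H])c2c(c(nc(n2)N([H])c3c(c(c(c(c3C([H])([H])[H])"
    "[H])[H])N([H])C(=O)c4c(c(c(c(c4[H])[H])C([H])([H])N5C(C(N(C(C5([H])[H])([H])"
    "[H])C([H])([H])[H])([H])[H])([H])[H])[H])[H])[H])[H])[H])[H]"
)

print(charged_atoms(implicit_h))  # ['[nH+]']
print(charged_atoms(explicit_h))  # ['[n+]']
```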
Currently each fragment has five SMILES associated with it:
1. Explicit-hydrogen, index-tagged SMILES (canonical and isomeric)
2. Canonical isomeric SMILES
3. Canonical isomeric explicit-hydrogen SMILES
4. Canonical SMILES
5. Canonical explicit-hydrogen SMILES
Once the database starts getting larger, this might become too large to maintain.
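To make the trade-off concrete, here is a small sketch of a per-fragment record holding the five strings, plus an index that can be keyed on any one of them. The class, field, and function names are hypothetical, and the example SMILES (the two enantiomers of bromochlorofluoromethane) were written by hand for illustration rather than produced by a toolkit's canonicalizer.

```python
import json
from collections import defaultdict
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FragmentRecord:
    # Field names are hypothetical; one field per SMILES flavor listed above.
    mapped_explicit_h_isomeric: str
    canonical_isomeric: str
    canonical_isomeric_explicit_h: str
    canonical: str
    canonical_explicit_h: str


def build_index(records, key):
    """Group records by the chosen SMILES field."""
    index = defaultdict(list)
    for record in records:
        index[getattr(record, key)].append(record)
    return dict(index)


# Hand-written illustrative strings, not toolkit output.
records = [
    FragmentRecord(
        "[H:5][C@:1]([F:2])([Cl:3])[Br:4]",
        "[C@H](F)(Cl)Br",
        "[H][C@](F)(Cl)Br",
        "C(F)(Cl)Br",
        "[H]C(F)(Cl)Br",
    ),
    FragmentRecord(
        "[H:5][C@@:1]([F:2])([Cl:3])[Br:4]",
        "[C@@H](F)(Cl)Br",
        "[H][C@@](F)(Cl)Br",
        "C(F)(Cl)Br",
        "[H]C(F)(Cl)Br",
    ),
]

by_isomeric = build_index(records, "canonical_isomeric")  # two distinct keys
by_flat = build_index(records, "canonical")               # one shared key

# The records serialize to JSON without any custom encoder.
payload = json.dumps([asdict(record) for record in records], indent=2)
```

Indexing on the isomeric key keeps the two enantiomers apart, while the non-isomeric key groups them, which is the behavior the either-key proposal asks for.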
I think all five will be useful! But it's possible that 3 and 5 (the explicit-hydrogen forms) are redundant if the tautomeric and charge states are both already encoded by 2 and 4 (the canonical isomeric and canonical SMILES). This is probably a question for @bannanc or @cbayly13.
Stereoisomer (cis/trans, R/S) enumeration was addressed with #10.