
openff-fragmenter's Introduction

Fragmenter

License: MIT

A package for fragmenting molecules for quantum mechanics torsion scans.

More information about using this package and its features can be found in the documentation.

Warning: This code is currently experimental and under active development. If you are using this code, be aware that it is not guaranteed to provide the correct results, and the API can change without notice.

Installation

The package and its dependencies can be installed using the conda package manager:

conda install -c conda-forge openff-fragmenter

Getting Started

We recommend viewing the getting started example in a Jupyter notebook. This full example can be found here.

Here we will show how a drug-like molecule can be fragmented using this framework, and how those fragments can be easily visualised using its built-in helper utilities.

To begin, we load the molecule to be fragmented. Here we create Cobimetinib directly from its SMILES representation using the Open Force Field toolkit:

from openff.toolkit.topology import Molecule

parent_molecule = Molecule.from_smiles(
    "OC1(CN(C1)C(=O)C1=C(NC2=C(F)C=C(I)C=C2)C(F)=C(F)C=C1)[C@@H]1CCCCN1"
)

Next we create the fragmentation engine which will perform the actual fragmentation. Here we will use the recommended WBOFragmenter with default options:

from openff.fragmenter.fragment import WBOFragmenter

frag_engine = WBOFragmenter()
# Export the engine's settings directly to JSON
frag_engine.json()

Use the engine to fragment the molecule:

result = frag_engine.fragment(parent_molecule)
# Export the result directly to JSON
result.json()

Any generated fragments will be returned in a FragmentationResult object. We can loop over each of the generated fragments and print both the SMILES representation of the fragment as well as the map indices of the bond that the fragment was built around:

for fragment in result.fragments:
    print(f"{fragment.bond_indices}: {fragment.smiles}")

Finally, we can visualize the produced fragments:

from openff.fragmenter.depiction import depict_fragmentation_result

depict_fragmentation_result(result=result, output_file="example_fragments.html")

Copyright

Copyright (c) 2018, Chaya D. Stern

Acknowledgements

Project based on the Computational Molecular Science Python Cookiecutter version 1.5.

openff-fragmenter's People

Contributors

chayast, dependabot[bot], dgasmith, j-wags, jchodera, jthorton, mattwthompson, pre-commit-ci[bot], simonboothroyd, yoshanuikabundi


openff-fragmenter's Issues

Return parent fragment mapping

During the QCSubmit call today we talked about validating the highlighted dihedrals for torsiondrives. We found that it would be good to check that each dihedral is correctly identified by returning a mapping between each fragment and the parent.
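
A minimal sketch of what such a mapping could look like, assuming that the parent and each fragment carry an atom_map dictionary of the form {atom_index: map_index} and that a fragment atom keeps the map index of the parent atom it came from. The function name and the example maps are hypothetical and are not part of the fragmenter API:

```python
def fragment_to_parent_indices(parent_atom_map, fragment_atom_map):
    """Map fragment atom indices to parent atom indices via shared map indices."""
    # Invert the parent's {atom_index: map_index} dictionary.
    parent_by_map_index = {
        map_index: atom_index for atom_index, map_index in parent_atom_map.items()
    }
    # Keep only fragment atoms whose map index also exists in the parent
    # (capping atoms added to a fragment would be skipped here).
    return {
        fragment_index: parent_by_map_index[map_index]
        for fragment_index, map_index in fragment_atom_map.items()
        if map_index in parent_by_map_index
    }


# Hypothetical atom maps, for illustration only.
parent_atom_map = {0: 1, 1: 3, 2: 5, 3: 4, 4: 2}
fragment_atom_map = {0: 5, 1: 4, 2: 2}

mapping = fragment_to_parent_indices(parent_atom_map, fragment_atom_map)
print(mapping)  # {0: 2, 1: 3, 2: 4}
```

With a mapping like this, a dihedral highlighted on a fragment could be translated to parent atom indices and compared against the intended torsion.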

Use xtb for the WBO

At the free energy workshop, it was mentioned that Cresset is now using xtb to calculate the WBO for fragmentation as they do not have access to openeye. This seems to be more reliable than the use of ambertools SQM and even fixed some fragmentation issues they saw with large ligands such as those in the ptp1b (I think this was the set). This would probably need to be a toolkit wrapper to work but I am not sure if we would want to support that fully although the functionality should be quite minimal similar to the ambertools wrapper.

Fragmentation API

The API for fragmenting is a little awkward:

f = fragmenter.fragment.WBOFragmenter(mol, functional_groups=None, verbose=False) # some of the options go here
f.fragment(threshold=0.01, keep_non_rotor_ring_substituents=True, heuristic='path_length', ...) # some of the options go here
for bond in f.fragments:
    # use fragments
    ...

Instead of making the Fragmenter object only operate on a single molecule, and splitting all the options that control its behavior across the constructor (WBOFragmenter()) and the fragment() method, what about making this a factory?

You could still produce an object for each molecule that had all the information about the fragments, like FragmentSet. Using this API would look something like this:

# Create the fragmenter factory
f = fragmenter.fragment.WBOFragmenter()
# Configure the factory for any non-default options you want to set
f.functional_groups = ...
f.threshold = 0.01
f.keep_non_rotor_ring_substituents = False
# Fragment one or more molecules
for oemol in oemols:
    fragment_set = f.fragment(oemol)
    # do something with the fragments
    for bond in fragment_set.fragments:
        # use fragments
        ...

Flag torsions in linear molecules

torsiondrive chokes on dihedrals where one of the angles is linear. For example, in C#CC (prop-1-yne), the methyl rotor H-C-C3-C3 has the same energy for all torsion angles. For any torsion 1234, check that neither angle 123 nor angle 234 is linear.
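
The check described above can be sketched in plain Python. This is an illustration under stated assumptions, not code from fragmenter: the function names, the 5 degree tolerance, and the idealised coordinates are all made up for the example.

```python
import math


def angle_degrees(a, b, c):
    """Return the angle a-b-c in degrees for three Cartesian points."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def has_linear_angle(coords, torsion, tolerance=5.0):
    """True if angle 1-2-3 or 2-3-4 of the torsion is within `tolerance` of 180."""
    i, j, k, l = torsion
    return any(
        abs(180.0 - angle_degrees(coords[a], coords[b], coords[c])) < tolerance
        for a, b, c in ((i, j, k), (j, k, l))
    )


# Idealised, hypothetical coordinates: H, C, C, C with the last three collinear.
linear_coords = [(-0.36, 1.03, 0.0), (0.0, 0.0, 0.0), (1.46, 0.0, 0.0), (2.66, 0.0, 0.0)]
# Same atoms with the last carbon moved off the axis.
bent_coords = [(-0.36, 1.03, 0.0), (0.0, 0.0, 0.0), (1.46, 0.0, 0.0), (2.0, 1.2, 0.0)]

print(has_linear_angle(linear_coords, (0, 1, 2, 3)))  # True
print(has_linear_angle(bent_coords, (0, 1, 2, 3)))    # False
```

Torsions flagged this way could be skipped before they are submitted to torsiondrive.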

Move openeye imports inside functions that need them

By importing openeye at the top level, we guarantee that the fragmenter package cannot be imported unless the openeye toolkits are installed, even if we are only using components of fragmenter that do not require openeye. This also causes problems for testing packages built and deployed via omnia.

To enable package testing and use of the non-openeye components, we can move the import openeye statement inside of methods that actually need it.
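
One way this deferred import could look is sketched below. The helper name require_module and the example function are hypothetical; only importlib.import_module is a real standard-library call, and "openeye.oechem" is used purely as the name of the optional dependency.

```python
import importlib


def require_module(name):
    """Import an optional dependency on demand, with a clear error if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as error:
        raise ImportError(
            f"This function needs the optional package '{name}', "
            "which is not installed."
        ) from error


def function_that_needs_openeye():
    # The import happens only when this function is called, so importing the
    # enclosing package never requires openeye to be present.
    oechem = require_module("openeye.oechem")
    return oechem
```

Functions that never touch openeye then remain importable and testable in environments where the toolkit is absent.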

Omega returned error code 0

Hi,

I am trying to fragment a molecule. I have a simple script that has worked for the other molecules that I've fragmented, but when I try to run this molecule (C[CH]1CN=C(N1)C2=CC=C(NC(=O)NC3=CC=C(C=C3)C(=O)NC4=CC=C(C=C4)C5=N[CH](C)CN5)C=C2) it gives me the following error:


runfile('/Users/emmawu/Downloads/fragment_molecules.py', wdir='/Users/emmawu/Downloads')
Warning: ~{N}-[4-(4-methyl-4,5-dihydro-1~{H}-imidazol-2-yl)phenyl]-4-[[4-(5-methyl-4,5-dihydro-1~{H}-imidazol-2-yl)phenyl]carbamoylamino]benzamide: Failed due to unspecified stereochemistry
Traceback (most recent call last):

File "/Users/emmawu/Downloads/fragment_molecules.py", line 17, in
file.fragment()

File "/Users/emmawu/anaconda3/lib/python3.6/site-packages/fragmenter/fragment.py", line 873, in fragment
self.calculate_wbo()

File "/Users/emmawu/anaconda3/lib/python3.6/site-packages/fragmenter/fragment.py", line 900, in calculate_wbo
self.molecule = get_charges(self.molecule)

File "/Users/emmawu/anaconda3/lib/python3.6/site-packages/fragmenter/chemi.py", line 75, in get_charges
charged_copy = generate_conformers(molecule, max_confs=max_confs, strict_stereo=strict_stereo, **kwargs) # Generate up to max_confs conformers

File "/Users/emmawu/anaconda3/lib/python3.6/site-packages/fragmenter/chemi.py", line 171, in generate_conformers
raise(RuntimeError("omega returned error code %d" % status))

RuntimeError: omega returned error code 0


Here is the code:

import json

from fragmenter import chemi, fragment

m = chemi.smiles_to_oemol('C[CH]1CN=C(N1)C2=CC=C(NC(=O)NC3=CC=C(C=C3)C(=O)NC4=CC=C(C=C4)C5=N[CH](C)CN5)C=C2')
file = fragment.WBOFragmenter(m)
file.fragment()
name = '67735'
file.depict_fragments(fname="{}.pdf".format(name))

# Save the fragmentation result as JSON
with open('{}.json'.format(name), 'w') as f:
    json.dump(file.to_json(), f, indent=2, sort_keys=True)

# Read the JSON back and collect its top-level keys
with open('{}.json'.format(name), 'r') as f:
    data = list(dict(json.load(f)).keys())

# Write one key per line to a .smi file
with open('{}.smi'.format(name), 'w') as f:
    f.writelines([x + "\n" for x in data])

As always, thank you so much for any help and input!

Best,
Emma

Charging molecules while only returning the original coordinates returns a wonky molecule

Charging a molecule and then returning only the original coordinates, which is the default behaviour of chemi.get_charges, yields a molecule with wonky coordinates. This does not happen when all Omega-generated conformers are returned with chemi.get_charges(mol, keep_confs=-1).

This is a 2D rendition of the wonky molecule. This may also be the reason why stereochemistry flips during fragmentation or why charging fails in the next round.
image

Grow fragments when stereochemistry can not be fixed

In some cases, fragmenter will stop and raise an error if a fragment molecule's stereochemistry cannot be fixed to match the parent. In this case it may be better to keep growing the fragment until the stereochemistry can be matched, in the worst case all the way until the parent molecule itself is returned, where there would be no issues. This would save downstream workflows like bespokefit from having to do this themselves, where they might miss a smaller possible fragment that does fix the issue.

An example molecule which fails:

mol = Molecule.from_smiles(
    "[H][O][C]([H])([H])[C]([H])([H])[C]([H])([N]([H])[C]1=[N][C]([H])=[C]2[C](=[N]1)[N]([C]([H])([H])[H])[C](=[O])[C]([O][C]1=[C]([F])[C]([H])=[C]([F])[C]([H])=[C]1[H])=[C]2[H])[C]([H])([H])[C]([H])([H])[O][H]"
)

Support openforcefield.topology.Molecule?

A bunch of functions in cmiles and fragmenter support a random/arbitrary subset of QCSchema JSON, QCSchema dict, SMILES, OEMol, and RDKit Mol.

It would be very helpful to standardize support for all of these through the openforcefield.topology.Molecule class, which is now stable enough to use as a standard exchange format.

Fragment JSON

After working with an example Fragment JSON, the following would be good to change:

  • dihedrals labels should start from zero, not one.
  • Is it possible to have a "canonical" dihedral order?
  • Connection graphs for molecules should be provided as [[index1, index2, bond order], ...]
  • Coordinates should be in Bohr, not angstrom (1 Bohr = 0.52917720859 angstrom)
  • Dihedrals should always be double list [[0, 1, 2, 3]]
  • It was mentioned that we will likely have a new starting molecule per dihedral. The overall structure might want to alter to reflect that.

Hydrogen peroxide (HOOH) example:

{
    "name": "HOOH",
    "geometry": [1.848671612718783, 1.4723466699847623, 0.6446435664312682, 1.3127881568370925, -0.1304193792618355, -0.2118922703584585, -1.3127927010942337, 0.1334187339129038, -0.21189641512867613, -1.8386801669381663, -1.482348324549995, 0.6446369709610646
    ],
    "symbols": ["H", "O", "O", "H"],
    "connectivity": [[0, 1, 1], [1, 2, 1], [2, 3, 1]]
}
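
The unit and layout requests in the list above can be sketched as a small conversion and consistency check. The function names and the placeholder coordinates are hypothetical; the conversion factor is the one quoted in the list, and the record layout follows the example.

```python
# Angstroms per Bohr, the value quoted in the list above.
ANGSTROM_PER_BOHR = 0.52917720859


def angstrom_to_bohr(geometry):
    """Convert a flat [x0, y0, z0, x1, ...] coordinate list from angstrom to Bohr."""
    return [value / ANGSTROM_PER_BOHR for value in geometry]


def check_fragment_record(record):
    """Raise ValueError if geometry or connectivity disagree with the symbols."""
    n_atoms = len(record["symbols"])
    if len(record["geometry"]) != 3 * n_atoms:
        raise ValueError("geometry must hold three coordinates per atom")
    for index_1, index_2, _bond_order in record["connectivity"]:
        # Indices are zero-based, matching the request for zero-based labels.
        if not (0 <= index_1 < n_atoms and 0 <= index_2 < n_atoms):
            raise ValueError("connectivity refers to an atom that does not exist")


# Placeholder coordinates in angstrom, not a real geometry.
geometry_angstrom = [0.9, 0.8, 0.3, 0.7, 0.0, -0.1, -0.7, 0.0, -0.1, -0.9, -0.8, 0.3]

record = {
    "name": "HOOH",
    "geometry": angstrom_to_bohr(geometry_angstrom),
    "symbols": ["H", "O", "O", "H"],
    "connectivity": [[0, 1, 1], [1, 2, 1], [2, 3, 1]],
}
check_fragment_record(record)
```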

Provide a way to map from `WBOFragmenter` input molecule to parent molecule

Summary

Many users probably want to follow specific atoms/bonds through the fragmentation process. However, as a result of performing canonicalization of input molecules, there's no way to map from an atom in the input molecule of a Fragmenter job to an atom in the result's parent_molecule or any of the output fragments.

Reproducing example

from openff.toolkit.topology import Molecule
from openff.fragmenter.fragment import WBOFragmenter
mol1 = Molecule.from_smiles('ClCCCF')
frag_engine = WBOFragmenter()
result = frag_engine.fragment(mol1)

def draw_mol_and_label_atom_index(mol):
    rdmol = mol.to_rdkit()
    for atom in rdmol.GetAtoms():
        atom.SetAtomMapNum(atom.GetIdx())
    return rdmol

mol1 has the following atom map, and the drawing below shows the atom indices (NOT map indices)

print(mol1.properties)
draw_mol_and_label_atom_index(mol1)

{'atom_map': {0: 1,
1: 2,
2: 3,
3: 4,
4: 5,
5: 6,
6: 7,
7: 8,
8: 9,
9: 10,
10: 11}}

image

The result's parent_molecule has the following atom map, and the drawing below shows the atom indices (NOT map indices)

result.parent_molecule.properties
draw_mol_and_label_atom_index(result.parent_molecule)

{'atom_map': {0: 1,
1: 3,
2: 5,
3: 4,
4: 2,
5: 8,
6: 9,
7: 10,
8: 11,
9: 6,
10: 7}}

image

We see that the Cl switched from being atom index 0 in the input, to atom index 4 in the parent. There's no direct way to map from 0 to 4 given the information in the result object.

Solutions

I think the root cause of this issue is that the OpenFF toolkit performs canonicalization, but it doesn't have a way to return the atom mapping that got applied. I'll open an issue on the Toolkit repo to have that mapping get returned, and once that's available we can provide it to the user here.

How to handle multiple starting conformation for coupled torsions

#28 adds the ability to start torsiondrive from multiple starting conformations. It does this by driving all dihedrals provided to generate_grid_conformers.

There are several open questions:

  1. The reason I added the grid conformer generator was because I couldn't get omega to generate conformers by driving protons but I should be able to do it.
  2. How should we handle coupled torsions?
    As seen in the ethylene glycol example, the torsion profile looks very different if one or several starting conformations are used. torsiondrive will find the low energy path, but it's not clear that this is the profile we should use to fit current torsion functionals that do not consider torsion correlations.
    If fitting current torsion parameters requires minimizing rotation around other torsions, which profile should we use when the profiles differ so much with respect to the other degrees of freedom?
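
To make the idea of a grid of starting conformations concrete, the sketch below enumerates every combination of angles for a set of dihedrals. It only produces the target angles; it does not build conformers and does not reproduce the signature of the function added in #28. The function name and the interval values are assumptions for illustration.

```python
import itertools


def dihedral_grid(n_dihedrals, interval=60):
    """Return every combination of target angles (degrees) for n dihedrals."""
    angles = range(-180, 180, interval)
    return list(itertools.product(angles, repeat=n_dihedrals))


# Two coupled dihedrals sampled every 120 degrees give 3 x 3 starting points.
grid = dihedral_grid(2, interval=120)
print(len(grid))  # 9
print(grid[0])    # (-180, -180)
```

The number of starting points grows as a power of the number of dihedrals, which is part of why the handling of coupled torsions needs a deliberate choice.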

Fragments have missing stereochemistry

In some cases, fragments have missing stereochemistry despite it being fully defined for the parent molecule. This results in an error when creating the torsiondrive JSON after a successful fragmentation. It causes issues for QCSubmit, as this is the only way to get the mapping between the parent and the fragment, but it will be fixed once the mapping is returned in the fragmenter update.

Segmentation Fault

Hi,
I am new to fragmenter and I am trying to learn how to use it. I have installed fragmenter through conda. I am trying to run the example code, fragment_molecules.py, in the terminal but it gives me the following error:


Fatal Python error: Segmentation fault

Current thread 0x0000000119aa0dc0 (most recent call first):
File "", line 219 in _call_with_frames_removed
File "", line 922 in create_module
File "", line 571 in module_from_spec
File "", line 658 in _load_unlocked
File "", line 955 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "/Users/emmawu/opt/anaconda3/lib/python3.6/site-packages/rdkit/init.py", line 2 in
File "", line 219 in _call_with_frames_removed
File "", line 678 in exec_module
File "", line 665 in _load_unlocked
File "", line 955 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "/Users/emmawu/opt/anaconda3/lib/python3.6/site-packages/cmiles/utils.py", line 10 in
File "", line 219 in _call_with_frames_removed
File "", line 678 in exec_module
File "", line 665 in _load_unlocked
File "", line 955 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "/Users/emmawu/opt/anaconda3/lib/python3.6/site-packages/cmiles/generator.py", line 8 in
File "", line 219 in _call_with_frames_removed
File "", line 678 in exec_module
File "", line 665 in _load_unlocked
File "", line 955 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "/Users/emmawu/opt/anaconda3/lib/python3.6/site-packages/cmiles/init.py", line 11 in
File "", line 219 in _call_with_frames_removed
File "", line 678 in exec_module
File "", line 665 in _load_unlocked
File "", line 955 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 941 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "/Users/emmawu/opt/anaconda3/lib/python3.6/site-packages/fragmenter/fragment.py", line 2 in
File "", line 219 in _call_with_frames_removed
File "", line 678 in exec_module
File "", line 665 in _load_unlocked
File "", line 955 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 1023 in _handle_fromlist
File "/Users/emmawu/opt/anaconda3/lib/python3.6/site-packages/fragmenter/init.py", line 12 in
File "", line 219 in _call_with_frames_removed
File "", line 678 in exec_module
File "", line 665 in _load_unlocked
File "", line 955 in _find_and_load_unlocked
File "", line 971 in _find_and_load
File "frag_test.py", line 8 in
zsh: segmentation fault python -q -X faulthandler frag_test.py


I am using the following python: Python 3.6.10 :: Anaconda, Inc.

The command I use to run the script is python frag_test.py
(I just copied and pasted the code from fragment_molecules.py to frag_test.py)

I am not sure how to fix this error. Any input is appreciated! Thanks!

`find_rotatable_bonds` fails unless an atom map is present

This fails:

from openff.toolkit.topology import Molecule
from openff.fragmenter.fragment import WBOFragmenter
frag_engine = WBOFragmenter()
offmol = Molecule.from_smiles('CCCC')
tbs = '[#6:1]~[#6:2]'
frag_engine.find_rotatable_bonds(offmol, 
                                 target_bond_smarts=[tbs])

yields

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/kc/z88p9wb140727hwlvbzybd2r0000gn/T/ipykernel_42374/314705335.py in <module>
     21 print(tbs)
     22 print(offmol.properties)
---> 23 frag_engine.find_rotatable_bonds(offmol, 
     24                                  target_bond_smarts=[tbs])

~/miniconda3/envs/2021-bespokefit-blog-post/lib/python3.8/site-packages/openff/fragmenter/fragment.py in find_rotatable_bonds(cls, molecule, target_bond_smarts)
    354             }
    355 
--> 356         return [
    357             (
    358                 get_map_index(molecule, match[0]),

~/miniconda3/envs/2021-bespokefit-blog-post/lib/python3.8/site-packages/openff/fragmenter/fragment.py in <listcomp>(.0)
    356         return [
    357             (
--> 358                 get_map_index(molecule, match[0]),
    359                 get_map_index(molecule, match[1]),
    360             )

~/miniconda3/envs/2021-bespokefit-blog-post/lib/python3.8/site-packages/openff/fragmenter/utils.py in get_map_index(molecule, atom_index, error_on_missing)
     62 
     63     if atom_map_index is None and error_on_missing:
---> 64         raise KeyError(f"{atom_index} is not in the atom map ({atom_map}).")
     65 
     66     return 0 if atom_map_index is None else atom_map_index

KeyError: '0 is not in the atom map ({}).'

But adding a trivial atom map succeeds:

from openff.toolkit.topology import Molecule
from openff.fragmenter.fragment import WBOFragmenter
frag_engine = WBOFragmenter()
offmol = Molecule.from_smiles('CCCC')
offmol.properties["atom_map"] = {i: i for i in range(offmol.n_atoms)}
tbs = '[#6:1]~[#6:2]'
frag_engine.find_rotatable_bonds(offmol, 
                                 target_bond_smarts=[tbs])

yields

[(0, 1), (1, 2), (2, 3)]

We should either document that WBOFragmenter.find_rotatable_bonds requires an atom map, or make it work on molecules without atom maps.
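
A sketch of the second option, written against plain dictionaries rather than the real Molecule class. The helper name is hypothetical. It numbers atoms from 1 because the atom maps printed elsewhere on this page start at 1 and get_map_index in the traceback returns 0 for a missing entry; that choice is an assumption and differs from the zero-based workaround shown above.

```python
def ensure_atom_map(properties, n_atoms):
    """Return a copy of `properties` that is guaranteed to contain an atom map."""
    if properties.get("atom_map"):
        # An existing, non-empty atom map is left untouched.
        return dict(properties)
    # Assumed convention: map index = atom index + 1, so that 0 can keep
    # meaning "unmapped".
    default_map = {index: index + 1 for index in range(n_atoms)}
    return {**properties, "atom_map": default_map}


print(ensure_atom_map({}, 4))
# {'atom_map': {0: 1, 1: 2, 2: 3, 3: 4}}
print(ensure_atom_map({"atom_map": {0: 7}}, 4))
# {'atom_map': {0: 7}}
```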

Wiberg Order Not Found

Hi,
I am having issues running the following simple script:
from fragmenter import fragment, chemi

m = chemi.smiles_to_oemol('OC1(CN(C1)C(=O)C1=C(NC2=C(F)C=C(I)C=C2)C(F)=C(F)C=C1)[C@@H]1CCCCN1')
f = fragment.WBOFragmenter(m)
f.fragment()
f.depict_fragments(fname="example.pdf")

Here is the error I am getting:


Traceback (most recent call last):

File "/Users/emmawu/Downloads/frag_test.py", line 23, in
frag_engine.fragment()

File "/Users/emmawu/anaconda3/lib/python3.6/site-packages/fragmenter/fragment.py", line 874, in fragment
self._get_rotor_wbo()

File "/Users/emmawu/anaconda3/lib/python3.6/site-packages/fragmenter/fragment.py", line 958, in _get_rotor_wbo
self.rotors_wbo[bond] = b.GetData('WibergBondOrder')

File "/Users/emmawu/anaconda3/lib/python3.6/site-packages/openeye/oechem.py", line 11734, in GetData
return _oechem.OEBase_GetData(self, *args)

ValueError: GetData: WibergBondOrder not found.


I get the same error when trying to run the example script. I am not entirely sure what is causing the error. Any feedback is appreciated! Thank you!

Best,
Emma

Fragmentation issue because of stereochem when the bond involves Sulfur

Hi,

I bumped into the following issue while processing this sulfoxide compound ("O=S(c1ccc(Nc2nc(OCC3CCCCC3)c4c([nH]cn4)n2)cc1)C"). Dissecting the code, it looks like it builds fragments around all rotatable bonds without issue but raises the error when the bond involves sulfur. Please let me know if I am missing something trivial or if this is an edge case.

I also tried loading the molecule without specifying allow_undefined_stereo=True, but that immediately raised an UndefinedStereochemistryError:

UndefinedStereochemistryError: Unable to make OFFMol from RDMol: Unable to make OFFMol from SMILES: RDMol has unspecified stereochemistry. Undefined chiral centers are:
 - Atom S (index 1)

I am using fragmenter version: '0.1.2'

sample_mol = Molecule.from_smiles("O=S(c1ccc(Nc2nc(OCC3CCCCC3)c4c([nH]cn4)n2)cc1)C", allow_undefined_stereo=True)
result = frag_engine.fragment(sample_mol)
Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. RDMol name: Undefined chiral centers are:
 - Atom S (index 18)

Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:
 - Atom S (index 7)

A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:
 - Atom S (index 8)

A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:
 - Atom S (index 11)

A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:
 - Atom S (index 12)

A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:
 - Atom S (index 13)

A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:
 - Atom S (index 1)

A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
A new stereocenter formed at atom 19
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-e154dd498291> in <module>
----> 1 result = frag_engine.fragment(cdk2_31_from_smi)

~/miniconda3/envs/py39/lib/python3.9/site-packages/openff/fragmenter/fragment.py in fragment(self, molecule, target_bond_smarts, toolkit_registry)
    914         with global_toolkit_registry(toolkit_registry):
    915
--> 916             result = self._fragment(molecule, target_bond_smarts)
    917
    918             result.provenance["toolkits"] = [

~/miniconda3/envs/py39/lib/python3.9/site-packages/openff/fragmenter/fragment.py in _fragment(self, molecule, target_bond_smarts)
   1017         wbo_rotor_bonds = self._get_rotor_wbo(molecule, rotatable_bonds)
   1018
-> 1019         fragments = {
   1020             bond: self._build_fragment(
   1021                 molecule,

~/miniconda3/envs/py39/lib/python3.9/site-packages/openff/fragmenter/fragment.py in <dictcomp>(.0)
   1018
   1019         fragments = {
-> 1020             bond: self._build_fragment(
   1021                 molecule,
   1022                 stereochemistry,

~/miniconda3/envs/py39/lib/python3.9/site-packages/openff/fragmenter/fragment.py in _build_fragment(cls, parent, parent_stereo, parent_groups, parent_rings, bond_tuple, parent_wbo, threshold, heuristic, cap, **kwargs)
   1202         while fragment is not None and wbo_difference > threshold:
   1203
-> 1204             fragment, has_new_stereocenter = cls._add_next_substituent(
   1205                 parent,
   1206                 parent_stereo,

~/miniconda3/envs/py39/lib/python3.9/site-packages/openff/fragmenter/fragment.py in _add_next_substituent(cls, parent, parent_stereo, parent_groups, parent_rings, atoms, bonds, target_bond, heuristic)
   1407             neighbour_atom_and_bond = cls._select_neighbour_by_wbo(parent, atoms)
   1408         elif heuristic == "path_length":
-> 1409             neighbour_atom_and_bond = cls._select_neighbour_by_path_length(
   1410                 parent, atoms, target_bond
   1411             )

~/miniconda3/envs/py39/lib/python3.9/site-packages/openff/fragmenter/fragment.py in _select_neighbour_by_path_length(cls, molecule, atoms, target_bond)
   1262         target_indices = [get_atom_index(molecule, atom) for atom in target_bond]
   1263
-> 1264         path_lengths_1, path_lengths_2 = zip(
   1265             *(
   1266                 (

ValueError: not enough values to unpack (expected 2, got 0)

Refactor into an OpenFF namespace

Description

As this is a key OpenFF package it would seem to make sense for it to live in the openff namespace, and have the package renamed to openff-fragmenter to match the other key OpenFF packages.

@ChayaSt @j-wags would you have any objections here?

Stereochemistry

How should we handle stereochemistry?

The phase angle in the Fourier series is restricted to 0 or 180 degrees. This allows the same torsion types to be used for enantiomers. If that's the case, do we need to treat enantiomers as different molecules?
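
The reasoning can be checked numerically. Mirroring a molecule flips the sign of every dihedral angle, so a torsion term gives both enantiomers the same energy exactly when it is an even function of the angle. The sketch below assumes a single periodic term of the common form k * (1 + cos(n * phi - phase)); the function name and the parameter values are illustrative only.

```python
import math


def torsion_energy(phi_degrees, k=1.0, periodicity=2, phase_degrees=0.0):
    """One periodic torsion term: k * (1 + cos(n * phi - phase))."""
    phi = math.radians(phi_degrees)
    phase = math.radians(phase_degrees)
    return k * (1.0 + math.cos(periodicity * phi - phase))


for phase in (0.0, 180.0, 90.0):
    forward = torsion_energy(40.0, phase_degrees=phase)
    mirrored = torsion_energy(-40.0, phase_degrees=phase)
    print(phase, math.isclose(forward, mirrored, abs_tol=1e-12))
```

With a phase of 0 or 180 degrees the two energies agree, while a phase of 90 degrees breaks the symmetry, which is why the restriction lets enantiomers share torsion types.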

For the time being I'll add both the canonical isomeric SMILES and a canonical SMILES to the JSON specs for fragments.

Ideas for future refactor

This is a "master" issue for tracking various ideas relating to a future refactor:

  • What shall be included
  • What shall not be included
  • ?

I believe @j-wags has some ideas for this 🚀

I've also added a refactor tag to track other issues.

Fragmentation fails with new openeye toolkit.

During QCSubmit testing the fragmenter tests have started to fail with the release of the new openeye toolkit. See here for the logs. I will start to look into refactoring fragmenter to use the openforcefield toolkit to avoid these problems soon.

Handle different charging failures appropriately

Calculating AM1 WBOs can fail for several reasons. Some are:

  1. Missing BCC parameters
  2. Missing stereochemistry
  3. Total charge does not equal formal charge.

The current workaround is just to continue growing the fragment until the failures stop.
Tagging all the right functional groups to avoid creating fragments with weird chemistry so this doesn't happen is probably the best solution. It might also be good to handle each kind of failure appropriately.

Make fragmenters stateless

Description

Currently the fragmenter classes store state which is modified over the course of a fragmentation. This includes the parent molecule, the stereochemistry of the parent, ring systems, etc. It would seem to make more sense to have the fragmenter classes be mostly stateless: the object itself would be initialized with only the options to use for each fragmentation, and the fragment method would accept only the parent molecule and return an object containing both the fragments and the provenance of how the fragmentation was performed.

This would likely be a two step change:

  1. all of the fragmenter state (except the parent molecule) is moved into the fragment method, and most of the current stateful functions changed to class methods.

  2. the API of the fragmenter classes would be changed to reflect their now stateless nature.

This is related to #49
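
The target design can be sketched with frozen dataclasses. Everything here is a toy: the class names are hypothetical, the default option values are illustrative, and the "fragmentation" simply splits a SMILES-like string on "." as a stand-in for the real algorithm. The point is only the shape of the API, where options live on the engine and all per-molecule data lives on the returned result.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToyResult:
    """Everything produced by one fragmentation, including how it was run."""

    parent: str
    fragments: tuple
    provenance: dict


@dataclass(frozen=True)
class ToyFragmenter:
    """Holds options only; no per-molecule state is stored on the object."""

    threshold: float = 0.01  # illustrative value
    heuristic: str = "path_length"

    def fragment(self, parent):
        # Toy stand-in for real fragmentation: split on '.'.
        fragments = tuple(parent.split("."))
        return ToyResult(
            parent=parent,
            fragments=fragments,
            provenance={"options": asdict(self)},
        )


# One engine can be reused for many molecules without carrying state over.
engine = ToyFragmenter(threshold=0.03)
results = [engine.fragment(parent) for parent in ("CCO.CCN", "c1ccccc1")]
print(results[0].fragments)   # ('CCO', 'CCN')
print(results[1].provenance)  # {'options': {'threshold': 0.03, 'heuristic': 'path_length'}}
```

Because the engine is frozen, the same instance can safely be shared across molecules or worker processes.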
