
denoptim-project / denoptim

DENOPTIM is a software package for de novo design and virtual screening of functional molecules of any kind.

License: GNU Affero General Public License v3.0

Shell 3.54% Java 81.33% Python 0.33% Roff 4.19% Jupyter Notebook 10.60% Batchfile 0.01%
molecular-design molecular-modeling computational-chemistry de-novo-design virtual-screening genetic-algorithm combinatorial

denoptim's People

Contributors

dependabot[bot], dgrell, einarkjellback, marcellocostamagna, marco-foscato

denoptim's Issues

External tasks within internal fitness provider

The internal fitness provider needs to be coupled with an external molecular modeling task. This way we could get decent geometries to be used with descriptors calculated in the internal fitness provider AND use the expression of the fitness from within the internal fitness provider.

Also, this should be coupled with the possibility of reading descriptors from the SDF file resulting from the external task. These descriptors should be defined in the GUI alongside the atom-specific descriptors.

Separate responsibility of Fragment and IAtomContainer

Fragment is essentially supposed to work as an Adapter [https://en.wikipedia.org/wiki/Adapter_pattern] for an IAtomContainer. The responsibility of the IAtomContainer is to represent a molecule, and the Fragment's responsibility is to allow the IAtomContainer to interact properly with the graph model that DENOPTIM uses. This suggests that the IAtomContainer should be provided to the Fragment by dependency injection, i.e. you finish building the IAtomContainer before providing it to the Fragment's constructor. This is further supported by the fact that, after the IAtomContainer is initialized, there is, as of writing, no place in the code where it is modified any further.

The immutability of IAtomContainer should be reflected in the code by disallowing any public methods in Fragment to modify the IAtomContainer.

This is not the case however, as there are a number of methods in Fragment that modify the IAtomContainer. Most of these methods have a direct analog in the IAtomContainer's interface (e.g. removeBond(…)). There are, in my opinion, several advantages of making IAtomContainer immutable:

  1. There is a clear separation between the responsibilities of Fragment and of IAtomContainer.
  2. Reduced code bloat as identical methods in IAtomContainer and Fragment are removed.
  3. Looser coupling between Fragment and IAtomContainer. If IAtomContainer changes, Fragment has fewer methods that must be updated to reflect the changes.
  4. Disambiguating the initialization of Fragments. If Fragment in practice never modifies IAtomContainer then it shouldn't be provided as an option either.

My suggestion for fixing this is to make the IAtomContainer-field final and remove any methods in Fragment that modify IAtomContainer.
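The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: MoleculeView and SimpleMolecule are hypothetical stand-ins for CDK's IAtomContainer (reduced to read-only queries), and FragmentSketch is not DENOPTIM's actual Fragment class.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for CDK's IAtomContainer, reduced to read-only queries. */
interface MoleculeView {
    int getAtomCount();
    String getAtomSymbol(int index);
}

/** A molecule that is built completely before it is handed to a fragment. */
final class SimpleMolecule implements MoleculeView {
    private final List<String> symbols;

    SimpleMolecule(List<String> symbols) {
        this.symbols = List.copyOf(symbols); // defensive, unmodifiable copy
    }

    @Override
    public int getAtomCount() {
        return symbols.size();
    }

    @Override
    public String getAtomSymbol(int index) {
        return symbols.get(index);
    }
}

/** Adapter sketch: exposes read access only and never mutates the molecule. */
final class FragmentSketch {
    // final: the molecule is injected once and never replaced
    private final MoleculeView molecule;

    FragmentSketch(MoleculeView molecule) {
        this.molecule = Objects.requireNonNull(molecule);
    }

    int getAtomCount() {
        return molecule.getAtomCount();
    }

    String getAtomSymbol(int index) {
        return molecule.getAtomSymbol(index);
    }
}
```

With this shape there is simply no method on the adapter through which the wrapped molecule could be changed, which is the separation of responsibilities argued for above.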

Running tests on W10

Hi everyone,
Running the suggested runAllTests.sh from a Git Bash shell does not complete. Can you test it?
Does this have to do with specific bash requirements, or was runAllTests.sh created only for Mac/Linux?
Best
andrea

Load Library of Vertexes

In the graph handler, there is a button called "Load Library of Vertexes".

  1. I think vertex should be vertices in plural form.
  2. I might be wrong, but I think the name of this button is somewhat misleading, as it takes you to the input of all the different parts of the chemical space, such as the scaffold and the compatibility matrix, and not only vertices.

Strategies for setting the probabilities of crossover, mutation and generating from scratch

Right now the probabilities of crossover, p(xover), mutation, p(mut), and generating a new molecule from scratch, p(new), are set by the user and remain unchanged throughout a run (can @marco-foscato confirm this?). This is a simple strategy that works well for many applications, but I think in DENOPTIM's case we could use a more sophisticated scheme where these probabilities change dynamically throughout a run. I am confident these changes will increase the performance of DENOPTIM. Here are my suggestions:

  1. Drawing inspiration from Simulated Annealing, we start with a high p(mut) and lower it as time goes on. We can make this even more sophisticated by increasing p(mut) if we see that the gene pool is stagnating and lowering it if we are improving.
  2. We use the generation of a new molecule to implement a sensible restart strategy.

Suggestion number 1 is a common way of making a GA "smarter". If the results from the Simulated Annealing approach look promising, we can choose to implement the more sophisticated variant at a later stage. For the user this would mean that, instead of setting the mutation probability, he/she would set some alpha that controls how quickly p(mut) responds to changes.
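One possible reading of the alpha parameter is sketched below. The class name, the bounds, and the update rule (an exponential move toward a lower or upper bound) are all assumptions made for illustration; nothing here reflects existing DENOPTIM code.

```java
/** Sketch of a mutation probability that reacts to stagnation (hypothetical). */
final class AdaptiveMutationRate {
    private final double alpha; // assumed response rate in (0, 1]
    private final double min;   // lowest allowed p(mut)
    private final double max;   // highest allowed p(mut)
    private double pMut;

    AdaptiveMutationRate(double initial, double alpha, double min, double max) {
        if (alpha <= 0 || alpha > 1 || min < 0 || max > 1 || min > max) {
            throw new IllegalArgumentException("invalid parameters");
        }
        this.alpha = alpha;
        this.min = min;
        this.max = max;
        this.pMut = Math.max(min, Math.min(max, initial)); // clamp into [min, max]
    }

    /**
     * Moves p(mut) a fraction alpha of the way toward min when the population
     * improved in the last generation, and toward max when it stagnated.
     */
    double update(boolean improved) {
        double target = improved ? min : max;
        pMut += alpha * (target - pMut);
        return pMut;
    }

    double get() {
        return pMut;
    }
}
```

A small alpha makes p(mut) respond slowly and smoothly; an alpha close to 1 makes it jump almost directly to the bound after each generation.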

Suggestion number 2 is both a common way of escaping local maxima and of actually making convergence faster. Here is a quote from "Handbook of Meta-Heuristics" that explains why convergence may be faster if we choose a good restart strategy:

[…] the algorithm [Greedy Randomized Adaptive Path-Relinking] finds a target solution in relatively
few iterations: about 25% of the runs take at most 101 iterations; about 50% take at
most 192 iterations; and about 75% take at most 345. However, some runs take much
longer: 10% take over 1000 iterations; 5% over 2000; and 2% over 9715 iterations.

The book goes further on to suggest that it is best to restart at regular intervals. The difficulty lies of course in choosing the length of these intervals. I suggest that we simply generate a new molecule from scratch after n successful modifications. We can find the best n by doing some experiments ourselves where we compare convergence times for different n for two or three different experiments. If the optimal n is very different between the runs then n should be set by the user. If it is not then we can hard-code this value into DENOPTIM.

Inconsistent order of list of attachment points

While working on a method, I had to sort the list of attachment points (APs) belonging to the Fragments that constitute the inner graph of a Template, and the code broke in several places (reported by @marco-foscato). @marco-foscato suggested that several places in the code assume that the list of APs is in order of AP ID, i.e. in the order of AP creation.

We should find a solution that prevents programmers from inadvertently changing the order of the APs.

One solution is to change the type of the data structure that stores the APs to one that enforces a particular ordering. An example of such a data structure can be a heap.

Another work-around is to let .getAttachmentPoints() return a copy of the AP list. This will incur a small performance loss, but will solve the problem. It is also generally considered good practice to return copies instead of references of objects' fields from getters.
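The copy-returning getter could look roughly like this. VertexWithAPs is a hypothetical class, and APs are represented as plain strings only to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical vertex that protects the creation order of its APs. */
final class VertexWithAPs {
    private final List<String> attachmentPoints = new ArrayList<>();

    void addAP(String ap) {
        attachmentPoints.add(ap); // stored in order of creation
    }

    /** Returns a snapshot: sorting or editing it cannot reorder the internal list. */
    List<String> getAttachmentPoints() {
        return new ArrayList<>(attachmentPoints);
    }
}
```

A caller that sorts the returned list now only reorders its own copy, so code relying on the creation order keeps working.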

The last possibility is to get rid of the assumption altogether and change the code where the assumption is made.

SerConverter fails converting graphs to structures

I am playing around with DENOPTIM and found an issue when using the SerConverter. It seems as if something goes wrong with the fragment space because the converted *.sdf files are messed up. Using DENOPTIM GUI, 1st loading the fragment space and 2nd loading a *.ser file works well, i.e. the structure of the molecule is reasonable.

I have this issue with my own fragments and also tested it with the PtCOLX2_FSE example/test provided in the project.

It would also be nice to have a feature in the GUI, that allows the conversion of all graphs at once.

Remove IdFragmentAndAP

The IdFragmentAndAP class made sense when denoptim used an index-based approach for identifying building blocks and attachment points.
It should be possible to remove it and replace the related index-based code with reference-based one.

From testing on Windows

For compatibility of the GUI with Windows platforms:

  • Escape the initial \ of file names for the tmp file location.
  • Reading in the fitness parameters must tick the corresponding check-box, or the print-out of those parameters will not work.

test t7

Running t7 on MacOS Darwin.

dg_96 is the following:
96 1_1_0_-1,2_4_1_0,8_4_1_0,3_3_1_0,7_1_2_0,17_1_2_1,14_2_1_1,18_1_2_1,10_1_2_1,11_3_1_1,151_1_2_2,152_1_2_2,153_1_2_2, 1_0_2_0_1_co:0_coa:1,1_3_8_0_1_co:0_coa:1,1_1_3_0_1_ccb:0_cca:0,1_2_7_0_1_ch:0_hyd:1,3_2_17_0_1_ch:0_hyd:1,3_1_14_0_1_ccb:0_ATminus:0,3_3_18_0_1_ch:0_hyd:1,2_1_10_0_1_cob:1_hyd:1,8_1_11_0_1_cob:1_cca:0,11_1_151_0_1_ccb:0_hyd:1,11_2_152_0_1_ch:0_hyd:1,11_3_153_0_1_ch:0_hyd:1, => 96 16 [3, 0, 1]

which is properly converted to SDF file:

CDK 0625191250

13 12 0 0 0 0 0 0 0 0999 V2000
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0001 0.0001 0.0001 O 0 0 0 0 0 0 0 0 0 0 0 0
0.0001 0.0001 0.0001 O 0 0 0 0 0 0 0 0 0 0 0 0
0.0001 0.0001 0.0001 C 0 0 0 0 0 0 0 0 0 0 0 0
0.1930 -2.4982 1.1563 H 0 0 0 0 0 0 0 0 0 0 0 0
0.1930 -2.4982 1.1563 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0001 0.0001 0.0001 ATM 0 0 0 0 0 0 0 0 0 0 0 0
0.1930 -2.4982 1.1563 H 0 0 0 0 0 0 0 0 0 0 0 0
0.1930 -2.4982 1.1563 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0001 0.0001 0.0001 C 0 0 0 0 0 0 0 0 0 0 0 0
0.1930 -2.4982 1.1563 H 0 0 0 0 0 0 0 0 0 0 0 0
0.1930 -2.4982 1.1563 H 0 0 0 0 0 0 0 0 0 0 0 0
0.1930 -2.4982 1.1563 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
1 3 1 0 0 0 0
1 4 1 0 0 0 0
1 5 1 0 0 0 0
4 6 1 0 0 0 0
4 7 1 0 0 0 0
4 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
10 11 1 0 0 0 0
10 12 1 0 0 0 0
10 13 1 0 0 0 0
M END

96

96 1_1_0_-1,2_4_1_0,8_4_1_0,3_3_1_0,7_1_2_0,17_1_2_1,14_2_1_1,18_1_2_1,10_1_2_1,11_3_1_1,151_1_2_2,152_1_2_2,153_1_2_2, 1_0_2_0_1_co:0_coa:1,1_3_8_0_1_co:0_coa:1,1_1_3_0_1_ccb:0_cca:0,1_2_7_0_1_ch:0_hyd:1,3_2_17_0_1_ch:0_hyd:1,3_1_14_0_1_ccb:0_ATminus:0,3_3_18_0_1_ch:0_hyd:1,2_1_10_0_1_cob:1_hyd:1,8_1_11_0_1_cob:1_cca:0,11_1_151_0_1_ccb:0_hyd:1,11_2_152_0_1_ch:0_hyd:1,11_3_153_0_1_ch:0_hyd:1,

NEW

$$$$

but runt7.sh expects SDF file with ' 16 15 0 0 . 0 0 0 0 '

Errors in code snippets and variable names

Several code snippets and variable names in the documentation display escape characters or are not formatted consistently.
Examples are:

  • cd $DENOPTIM\_HOME should appear as cd $DENOPTIM_HOME see user_manual.md:44 and many other places where this occurs.
  • $DENOPTIM\_HOME\\target\\denoptim should be $DENOPTIM_HOME\target\denoptim see user_manual.md:52
  • To terminate the GA, the keyword should be STOP_GA not __STOP_GA__. see user_manual.md:292. Similar problem with REMOVE_CANDIDATE and ADD_CANDIDATE
  • See around user_manual.md:396 for wrong formatting of subscripts. In general C_i should be displayed with the i as subscript of C.

Restart from previous run

It would be practical to be able to start a new run from a previous one, without having to prepare a starting population file, but rather just from linking to the previous run in the parameters.

Libraries of fragments

It would be nice to be able to append a library of fragments, and to allow GUIFragmentInspector to take some fragments and append them to the currently loaded library.

Undirected graph

Speculative thread on the possibility of making the graph undirected. Is it at all possible?
Some notes and comments on this possibility:

  • Edges should become undirected: no more distinction between "source" and "target" ends.
  • The concept of the graph's seed (root) vertex would be lost. This is used to identify the scaffold, so in an undirected graph the role of the scaffold can be lost. Scaffolds are the starting point when building a graph, and keeping them distinct guarantees that all graphs have a scaffold vertex (e.g., the metal center in a transition metal complex). So, with an undirected graph with no seed vertex, there is a need for a mechanism to ensure all graphs have a required vertex. New graph builds, as well as genetic operators, must comply with this mechanism.
  • Mutation and crossover on branches will have to decide which side of an undirected edge is going to be kept/edited.
  • The conversion to molecular representation using stiff 3D building blocks for ring-closing conformational reaches results in slightly different geometries for isomorphic (undirected) graphs.
  • Visualizing the graph with JUNG2 will require multiple edge labels and a customized renderer (vv.getRenderContext().setEdgeLabelRenderer( yourEdgeLabelRendererHere );).

Bug in constructor of PathSubGraph class

While making unit test testExtractPattern_twoSeparatedRings() in the DENOPTIMGraphOperationsTest class I came across a bug. I was trying to convert the ring of a graph into a PathSubGraph of that graph, but the graph of the PathSubGraph output had an edge with a source AP that did not exist in the list of APs of the same edge's source vertex. This is a bug and should be fixed.

As a first step to fix this bug I think it will be very beneficial to rewrite the constructor so that it follows a depth-first search (DFS) scheme rather than whatever strategy it follows now. DFS is a tried and true algorithm which is familiar to most programmers and DFS is particularly well suited to this kind of task, namely finding paths between two vertices in a graph. DFS should therefore greatly increase the clarity of the method.
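For reference, a DFS-based path search between two vertices can be sketched as below. The graph is a plain adjacency map of integer vertex IDs, which is an assumption made to keep the example self-contained; it does not mirror the data structures used by PathSubGraph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: depth-first search for one path between two vertices. */
final class DfsPathSketch {

    /** Returns one path from start to goal (inclusive), or an empty list if none exists. */
    static List<Integer> findPath(Map<Integer, List<Integer>> adjacency, int start, int goal) {
        Deque<Integer> path = new ArrayDeque<>();
        Set<Integer> visited = new HashSet<>();
        if (dfs(adjacency, start, goal, visited, path)) {
            return new ArrayList<>(path); // head-to-tail order is start-to-goal
        }
        return new ArrayList<>();
    }

    private static boolean dfs(Map<Integer, List<Integer>> adjacency, int current, int goal,
                               Set<Integer> visited, Deque<Integer> path) {
        visited.add(current);
        path.addLast(current);
        if (current == goal) {
            return true;
        }
        for (int next : adjacency.getOrDefault(current, List.of())) {
            if (!visited.contains(next) && dfs(adjacency, next, goal, visited, path)) {
                return true;
            }
        }
        path.removeLast(); // dead end: backtrack
        return false;
    }
}
```

The visited set prevents the search from looping around rings, and the path deque always holds exactly the vertices on the current branch, which makes it easy to check that every edge on the result refers to APs of vertices that are actually on the path.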

I will attach the hash of the last commit that reproduces this bug. To reproduce it, run the unit tests. A unit test called testExtractPattern_twoSeparatedRings() should fail. Debug accordingly.

Commit hash: 2252a45

Open Recent...

In the GUI's main toolbar, we could add an "Open Recent..." menu item that gives shortcut to the last N (10?) experiments.
Need a hidden file (~/.denoptim_recent) to store the list of links
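A minimal sketch of how such a list could be stored, assuming one pathname per line with the most recent entry first. The class name and the file layout are assumptions; only the idea of a hidden file such as ~/.denoptim_recent comes from the note above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Sketch: keeps the last N opened pathnames in a plain text file. */
final class RecentFilesSketch {
    private final Path store;     // e.g. a hidden file in the user's home directory
    private final int maxEntries; // the N in "last N experiments"

    RecentFilesSketch(Path store, int maxEntries) {
        this.store = store;
        this.maxEntries = maxEntries;
    }

    /** Reads the stored list, most recent first; empty if the file does not exist. */
    List<String> load() {
        try {
            return Files.exists(store)
                    ? new ArrayList<>(Files.readAllLines(store))
                    : new ArrayList<>();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Moves pathname to the top of the list and trims the list to maxEntries. */
    void record(String pathname) {
        List<String> entries = load();
        entries.remove(pathname); // avoid duplicates
        entries.add(0, pathname); // most recent first
        while (entries.size() > maxEntries) {
            entries.remove(entries.size() - 1);
        }
        try {
            Files.write(store, entries);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The GUI would call record(...) whenever an experiment is opened and load() when the menu is built.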

Templates built on the fly should have highest symmetry

Say we have a template that is generated on the fly, and our desire is to append it to the library of known building blocks. Say the graph embedded in the template, let's call it Ga, is asymmetric. Such a graph might be isomorphic with a symmetric graph, call it Gb. We would probably want to store Gb rather than Ga, because from Ga we can build ONLY asymmetric graphs, while Gb allows building both symmetric and asymmetric graphs.

See FragmentSpace.addFusedRingsToFragmentLibrary()

OutOfBounds when removing atoms from fragment

java.lang.IndexOutOfBoundsException: No atom at index: 21 from removing last C in the following:

CDK 05082109083D

22 23 0 0 0 0 0 0 0 0999 V2000
-0.1837 1.8726 0.1896 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7448 2.7993 0.5502 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.9543 2.4365 -0.7651 N 0 0 0 0 0 0 0 0 0 0 0 0
0.7779 3.8962 -0.4194 C 0 0 0 0 0 0 0 0 0 0 0 0
1.1804 5.3094 -0.0423 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6532 3.8813 -0.9019 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4085 3.5828 -1.2500 H 0 0 0 0 0 0 0 0 0 0 0 0
1.0505 6.1527 -1.3164 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2060 5.3278 0.3232 H 0 0 0 0 0 0 0 0 0 0 0 0
0.5248 5.6950 0.7370 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.3313 6.0164 -1.9787 C 0 0 0 0 0 0 0 0 0 0 0 0
1.8171 5.8445 -2.0257 H 0 0 0 0 0 0 0 0 0 0 0 0
1.2260 7.1987 -1.0691 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.7651 4.5840 -2.2255 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.0727 6.5000 -1.3443 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.3168 6.5454 -2.9305 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.7914 4.5527 -2.5885 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.1204 4.1110 -2.9649 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.2603 4.4233 -0.1785 H 0 0 0 0 0 0 0 0 0 0 0 0
1.7623 2.6731 1.6128 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.2846 1.9413 -1.1719 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.4663 1.3874 -2.4590 C 0 0 0 0 0 0 0 0 0 0 0 0
2 1 1 0 0 0 0
3 1 1 0 0 0 0
4 2 1 0 0 0 0
7 4 1 0 0 0 0
5 4 1 0 0 0 0
9 5 1 0 0 0 0
10 5 1 0 0 0 0
8 5 1 0 0 0 0
12 8 1 0 0 0 0
13 8 1 0 0 0 0
11 8 1 0 0 0 0
15 11 1 0 0 0 0
16 11 1 0 0 0 0
14 11 1 0 0 0 0
17 14 1 0 0 0 0
18 14 1 0 0 0 0
6 3 1 0 0 0 0
19 6 1 0 0 0 0
4 6 1 0 0 0 0
14 6 1 0 0 0 0
2 20 1 0 0 0 0
22 21 1 0 0 0 0
3 21 1 0 0 0 0
M END

1#MImidazolidinylidene:1:-0.1153%-0.0449%0.7394 22#sArOrtho1:1:-1.3519%1.2880%-3.4523

<ATTACHMENT_POINT>
1:1 22:1

$$$$

GUI - improvements

In the graph inspector: change "fragId" to "building block ID" and remove "sprite"

Origin of best candidates in final summary of GA run

Users want a link to the location of the files pertaining to the candidates that made it to the final population. Although the SDF with fitness is copied into the Final folder, it would be good to have a pathname/link in Final.txt and in any other GenXXX.txt file.

GUI: file name in card name

The name of the file that was read in (if any), combined with the tab identifier, should appear somewhere in the GUI. The frame title is not a good place because it is not shown when the window is full size. The best place is the "Active Tab" menu: add the filename (if the content comes from a file) and a tick mark to indicate which tab is currently on top.

random numbers

Need to streamline RandomUtils so that one can ask for a random number generator without having to do RandomUtils.initialiseRNG() and RandomUtils.getRNG().
This should use a machine random seed, unless a seed is given by the user.
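One way this could look is sketched below. RandomUtilsSketch and its methods are hypothetical and do not reproduce the existing RandomUtils API; the generator is created lazily, so no explicit initialisation call is needed.

```java
import java.util.Random;

/** Sketch: a random number generator that needs no explicit initialisation. */
final class RandomUtilsSketch {
    private static Random rng;

    private RandomUtilsSketch() {
    }

    /** Optional: fixes the seed, e.g. when the user asks for a reproducible run. */
    static synchronized void setSeed(long seed) {
        rng = new Random(seed);
    }

    /** Returns the shared generator, creating it on first use if no seed was given. */
    static synchronized Random getRNG() {
        if (rng == null) {
            rng = new Random(); // seed chosen by the JVM
        }
        return rng;
    }
}
```

Callers would only ever use RandomUtilsSketch.getRNG(); setSeed(...) is needed solely when a seed is supplied by the user.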

running examples on Git-bash under Windows?

here is an example of what needs to be done to make the current run*.sh scripts work when running on Windows via Git-bash:

wrkDir=`pwd`
logFile="t2.log"
paramFile="t2.params"

wdToDenoptim="$wrkDir/"
if [[ "$(uname)" == CYGWIN* ]] || [[ "$(uname)" == MINGW* ]] || [[ "$(uname)" == MSYS* ]]
then
    wdToDenoptim="$(cd "$wrkDir" ; pwd -W | sed 's/\//\\\\/g')"
fi

mv data/* "$wrkDir"
rm -rf data

#Adjust path in scripts and parameter files
filesToModify=$(find . -type f | xargs grep -l "OTF")
for f in $filesToModify
do
    sed "$sedInPlace" "s|OTF_WDIR\/|$wdToDenoptim\\\\|g" "$f"
    sed "$sedInPlace" "s|OTF_WDIR|$wdToDenoptim|g" "$f"
    sed "$sedInPlace" "s|OTF_PROCS|$DENOPTIMslaveCores|g" "$f"
done

...

Note, however, that this should be done in the external fitness provider scripts as well.

vertex vs fragment

We should probably make a proper distinction between DENOPTIMVertex and DENOPTIMFragment. This would, for instance, make it possible to create a DENOPTIMFragment object from an IAtomContainer without having a defined fragment space.

Different addAP(…) methods

Fragment has an addAP(…)-method with a different signature than the addAP(…)-method it inherits from Vertex. These methods also have different behaviors related to issue #49. We should definitely unify both the syntax and behavior of these methods.

I suggest we keep the signature from Vertex, except for changing the type of the direction vector parameter from double[] to Point3d, as Fragment's signature uses. Point3d conveys better than double[] that the direction vector is a 3-dimensional vector, both in name and in implementation.

Add Chemical representation to JSON?

The chemical representation (i.e., atoms, bonds, and all associated data) of a building block is stored in the library of building blocks, and upon reading in a graph we need to fetch the molecular representation of each building block. This has several consequences:

  1. The molecular object corresponding to a graph saved in JSON can be known only if the corresponding fragment space is loaded.
  2. Geometrical changes to the building blocks (e.g., geometrical adaptation to enable relaxation of cyclic graphs embedded in templates) are not stored in JSON format.
  3. A template scaffold cannot include molecular building blocks (i.e., "fragments") because when we read in the scaffold's library we have not yet read in the library of fragments.
  4. The direction vector of an AP has no coordinates for its starting point until that AP, via its owner vertex, is associated with a molecular fragment and with a source atom in that fragment.

These are arguments in favor of including a light molecular representation in the JSON format of graphs and vertices.

PS: Note that IAtoms can be part of multiple IAtomContainers: so an atom may belong to both the vertex AtomContainer and the whole molecule AtomContainer.

GUI: location of tmp files' space

Need a generalized way of identifying a tmp file system. Once it is identified, the tmp location should be passed to the general parameters to avoid having to specify the same info multiple times.

GraphViewer from run inspector

We could add the possibility to open the graph representation in the left panel of the GUIInspectFSERun.
The best option seems to be to let the user choose whether to display only the molecular representation of the overall graph (current strategy), or to include that as part of the GUIGraphHandler. In the latter case, in addition to displaying the panel with the molecular representation of the overall graph, we also display the graph representation and the node content (upon clicking on a node).

New Class: Ring Closing Vertex

In making a unit test I discovered a bug caused by the assumption that Ring Closing Vertices (RCVs) are not at the scaffold level. After some discussion with @marco-foscato I was made aware that the content of RCVs is not part of the final molecule, i.e. they do not contain real atoms or molecules, so it doesn't make sense to place them at the scaffold level. Furthermore, all Rings have a head and a tail vertex, and these are assumed to be RCVs, but this is not explicitly checked, so programmers are not prevented from breaking this assumption. No doubt this is a slippery slope and will surely produce, in the future, subtle bugs like the one I found, which may be hard to detect. I'm sure there are other places in the code where vertices are assumed to be RCVs.

We should solve this issue by making the assumption that a vertex is an RCV explicit where appropriate to prevent programmers from breaking this constraint.

Right now an RCV is represented as a normal Fragment with a dummy atom inside and exactly 1 attachment point (AP). This AP has an APClass which signifies that it is a Ring Closing Attractor (RCA). The APClass can be one of three choices: ATPlus, ATMinus, and ATNeutral.

I suggest we make RCV its own class called "RingCloser" which inherits from DENOPTIMVertex. We should also make a separate Enum called "Attractor" which contains the three choices discussed above. Getting rid of the dummy atom may be difficult as it relates to issue #49.
After that, the next step will be to require RingClosers where the code assumes so. A good place to start can be to change the type of the head and tail vertex in the DENOPTIMRing class from DENOPTIMVertex to RingCloser.
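The proposal could be sketched roughly as follows. BaseVertex is a hypothetical stand-in for DENOPTIMVertex and RingSketch for DENOPTIMRing; the enum constant names are an assumed Java-style spelling of ATPlus, ATMinus, and ATNeutral.

```java
import java.util.Objects;

/** The three kinds of ring-closing attractor named above. */
enum Attractor {
    AT_PLUS, AT_MINUS, AT_NEUTRAL
}

/** Hypothetical stand-in for DENOPTIMVertex. */
class BaseVertex {
    private final int id;

    BaseVertex(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }
}

/** A ring-closing vertex as its own type, carrying its attractor explicitly. */
final class RingCloser extends BaseVertex {
    private final Attractor attractor;

    RingCloser(int id, Attractor attractor) {
        super(id);
        this.attractor = Objects.requireNonNull(attractor);
    }

    Attractor getAttractor() {
        return attractor;
    }
}

/** Hypothetical stand-in for DENOPTIMRing: head and tail must be RingClosers. */
final class RingSketch {
    private final RingCloser head;
    private final RingCloser tail;

    RingSketch(RingCloser head, RingCloser tail) {
        this.head = Objects.requireNonNull(head);
        this.tail = Objects.requireNonNull(tail);
    }

    RingCloser getHead() {
        return head;
    }

    RingCloser getTail() {
        return tail;
    }
}
```

With these types, passing an ordinary vertex as the head or tail of a ring becomes a compile-time error instead of an unchecked assumption.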

Use of IAtomContainer tags

CDK's IChemObject provides a property map which can hold arbitrary information about a CDK object. DENOPTIM uses this property map to store information in the form of strings about which attachment points (APs) of a Fragment belong to which of its IAtoms, found in Fragment's IAtomContainer. This information is needed when converting a graph to an actual molecule. The information is duplicated as it is also stored in an AP's source atom-field which is really the only place it should be stored. The reason for also storing this in the property map is related to the GUI (maybe @marco-foscato can elaborate?).

There are two main disadvantages of storing the source atom information in the map:

  1. The IAtom and IAtomContainer take on some of the responsibility which belongs to the graph, specifically APs, in integrating different Vertices into a coherent molecule. This is also related to issue #47.
  2. Increased coupling between the presentation layer and the domain layer [https://en.wikipedia.org/wiki/Business_logic#Business_logic_layer]. More specifically, the AP is stored on the IAtom as the AP would be written to the SDF files that DENOPTIM uses. Ideally the presentation layer should not dictate the design of the domain layer.

We should remove the use of the property map and provide a solution for the problem related to the GUI. Maybe @marco-foscato can provide some more concrete first steps?

Shortcuts

The word "run" in the two bottom shortcut buttons should probably be capitalized for consistency.

Replace "Reaction" with "APClass"

In ancient times the word "Reaction" was used to indicate what is effectively an attachment point class. This is why the txt format of the compatibility matrix used the "RCN" and "RBO" keywords, where the "R" stands for "Reaction". Now that the APClass concept is established, the use of RCN and RBO is no longer understandable. Replace RCN with CMP (as in "compatibility") and RBO with CBO (as in class-to-bond order).

AP source atom

The identification of source atom could be made the responsibility of Fragment and removed from AttachmentPoint.

building large graphs that are rejected by filter

The EAUtils.buildGraph method seems to build graphs irrespective of the molecular size limits, and the graph is then rejected by EAUtils.evaluateGraph. This is inefficient. Make EAUtils.buildGraph detect that the graph is becoming too large and stop adding more fragments.

To trigger this behavior, just use a substitution probability function that returns large values at high level ID.
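The idea of stopping early can be illustrated with the sketch below. It is deliberately simplified: fragments are reduced to their heavy-atom counts, and the class, method, and parameter names are hypothetical rather than taken from EAUtils.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: stop growing as soon as the next fragment would break the size limit. */
final class SizeLimitedBuilder {

    /**
     * Accepts candidate fragments in order and stops at the first one that
     * would push the total heavy-atom count above maxHeavyAtoms.
     */
    static List<Integer> build(List<Integer> candidateSizes, int maxHeavyAtoms) {
        List<Integer> accepted = new ArrayList<>();
        int total = 0;
        for (int size : candidateSizes) {
            if (total + size > maxHeavyAtoms) {
                break; // too large: stop instead of building and rejecting later
            }
            accepted.add(size);
            total += size;
        }
        return accepted;
    }
}
```

Checking the limit while building avoids spending time on graphs that the later evaluation step would reject anyway.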

Search for fragments

It would be nice to be able to search a library for fragments with specific APClasses (or other features) in the GUI.
The search functionality should be visible from both GUIFragmentInspector and GUIFragmentSelector, but not in every FragmentViewPanel; for instance, it should not be visible when displaying fragments from GUIGraphHandler.
So the search bar could be placed in the FragmentViewPanel, but displayed only upon request of the parent component hosting that panel.

Problems running tests

Dear developers,

I have installed DENOPTIM under MacOS Big Sur, and the first test fails in the first molecule with error (from the log file)

denoptim.exception.DENOPTIMException: java.io.FileNotFoundException: /tmp/denoptim_test/t1/MOL000001_cs0.int_2 (No such file or directory)
at denoptim.integration.tinker.TinkerUtils.readTinkerIC(TinkerUtils.java:274).

I have the Tinker directory well set, and I am using java and javac versions 1.8. Python and bash are also in the $PATH. This is all troubleshooting I did following the installation instructions, but I can't get it to work... Some help would be appreciated.

Best,
Ferran

fitness providers

Need to move parameters for fitness provider to their own class, possibly within denoptim package

JUnit

We need to add JUnit for unit tests
