Hello. I've just seen your pre-print about this work, and the comparison with spyrmsd performance caught my eye (I'm the author of spyrmsd). Could you please clarify which backend you used? I was unable to find it in your manuscript.
In our original published work, spyrmsd could use either the networkx or graph-tool backends. The former is the default (since it's widely available), but it's known to be slow.
Since version 0.7.0 of spyrmsd, released on the 5th of April 2024, rustworkx is also supported, which is both fast and widely available.
For a fair comparison and for benchmarks, either rustworkx or graph-tool should be used as the backend.
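For reference, here is a minimal sketch of how a fast backend could be selected explicitly before benchmarking. It assumes spyrmsd >= 0.7.0 and that `available_backends()`, `set_backend()`, and `get_backend()` are exposed at the package top level. `choose_backend` and `select_spyrmsd_backend` are hypothetical helper names used only for illustration, and the preference order is a suggestion rather than anything spyrmsd enforces.

```python
def _key(name: str) -> str:
    """Normalise a backend name so 'graph-tool' and 'graph_tool' compare equal."""
    return name.lower().replace("-", "").replace("_", "").replace(" ", "")


def choose_backend(available, preferred=("rustworkx", "graph-tool", "networkx")):
    """Return the first entry of `available` that matches the preference order."""
    by_key = {_key(name): name for name in available}
    for wanted in preferred:
        if _key(wanted) in by_key:
            # Return the name exactly as the library reported it.
            return by_key[_key(wanted)]
    raise RuntimeError("no supported graph backend is installed")


def select_spyrmsd_backend():
    """Switch spyrmsd to the preferred installed backend and return its name."""
    # Assumption: spyrmsd >= 0.7.0 exposing these three top-level helpers.
    import spyrmsd

    spyrmsd.set_backend(choose_backend(spyrmsd.available_backends()))
    return spyrmsd.get_backend()
```

Calling `select_spyrmsd_backend()` once, before any RMSD calculation, and recording the returned name alongside the timings would make it unambiguous which backend a benchmark used.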
These are the timings I get running with the rustworkx and graph-tool backends for the molecules in testsets (Table 1 in your manuscript) on an Apple M1 Pro:
| System | spyrmsd [rustworkx] (s) | spyrmsd [graph-tool] (s) | CanonizedRMSD.py (s) |
|---|---|---|---|
| a | 0.309 | 0.266 | 0.230 |
| b | 0.305 | 0.257 | 0.260 |
| c | 0.307 | 0.250 | 0.402 |
| d | * | * | * |
| e | 6.617 | 1.698 | 307.3 |
These timings are much faster than the ones you reported in your manuscript. For a fair comparison, I also ran the test cases on the Apple M1 Pro with CanonizedRMSD.py (same Python environment used for the spyrmsd timings).
Therefore, it seems that with the high-performance backends (rustworkx and graph-tool), spyrmsd is much faster than claimed in your pre-print, and for system e it is significantly faster than CanonizedRMSD.py (about 2 or 7 seconds instead of about 5 minutes).
I'm not sure why the timing for e is not reported for spyrmsd in your Table 1.
I was unable to run test d with spyrmsd because I hit the following:
[21:43:36] Explicit valence for atom # 0 N, 4, is greater than permitted
[21:43:36] ERROR: Could not sanitize molecule ending on line 446
[21:43:36] ERROR: Explicit valence for atom # 0 N, 4, is greater than permitted
[21:43:36] Explicit valence for atom # 0 N, 4, is greater than permitted
[21:43:36] ERROR: Could not sanitize molecule ending on line 446
[21:43:36] ERROR: Explicit valence for atom # 0 N, 4, is greater than permitted
Given the size of the molecule (217 atoms), I haven't checked where this is coming from, but the error would indicate an issue with your input file.
Interestingly, I also get the same exception using CanonizedRMSD.py:
rdkit.Chem.rdchem.AtomValenceException: Explicit valence for atom # 0 N, 4, is greater than permitted
Therefore, I'm not entirely sure whether the molecule d available in the repository is the same one used for the benchmarks.
If you only used the networkx backend, I would kindly ask you to update your pre-print with the correct timings for spyrmsd using the graph-tool and/or rustworkx backends, especially Table 1, Figure 7, and Figure 8.
I'd be happy to provide any assistance, and I'll make rustworkx the default backend in future releases, given the potential for misrepresentation of performance when using the current default, networkx.