grimme-lab / dxtb Goto Github PK

View Code? Open in Web Editor NEW

8.0 3.0 1.0 2.91 MB

Efficient And Fully Differentiable Extended Tight-Binding

License: Apache License 2.0

Python 100.00%

computational-chemistry pytorch quantum-chemistry tight-binding automatic-differentiation

dxtb's Introduction

Fully Differentiable Extended Tight-Binding

- Combining semi-empirical quantum chemistry with machine learning in PyTorch -

The xTB methods (GFNn-xTB) are a series of semi-empirical quantum chemical methods that provide a good balance between accuracy and computational cost.

With dxtb, we provide a re-implementation of the xTB methods in PyTorch, which allows for automatic differentiation and seamless integration into machine learning frameworks.

Installation

pip

dxtb can easily be installed with pip.

pip install dxtb

conda

dxtb is also available on conda.

conda install dxtb

Other

For more options, see the installation guide in the documentation.

Example

The following example demonstrates how to compute the energy and forces using GFN1-xTB.

import torch
import dxtb

dd = {"dtype": torch.double, "device": torch.device("cpu")}

# LiH
numbers = torch.tensor([3, 1], device=dd["device"])
positions = torch.tensor([[0.0, 0.0, 0.0], [0.0, 0.0, 1.5]], **dd)

# instantiate a calculator
calc = dxtb.calculators.GFN1Calculator(numbers, **dd)

# compute the energy
pos = positions.clone().requires_grad_(True)
energy = calc.get_energy(pos)

# obtain gradient (dE/dR) via autograd
(g,) = torch.autograd.grad(energy, pos)

# Alternatively, forces can directly be requested from the calculator.
# (Don't forget to reset the calculator manually when the inputs are identical.)
calc.reset()
pos = positions.clone().requires_grad_(True)
forces = calc.get_forces(pos)

assert torch.equal(forces, -g)

For more examples and details, check out the documentation.

Citation

If you use dxtb in your research, please cite the following paper:

M. Friede, C. Hölzer, S. Ehlert, S. Grimme, dxtb -- An Efficient and Fully Differentiable Framework for Extended Tight-Binding, J. Chem. Phys., 2024

The Supporting Information can be found here.

For details on the xTB methods, see

C. Bannwarth, E. Caldeweyher, S. Ehlert, A. Hansen, P. Pracht, J. Seibert, S. Spicher, S. Grimme, WIREs Comput. Mol. Sci., 2020, 11, e01493. (DOI)
C. Bannwarth, S. Ehlert, S. Grimme, J. Chem. Theory Comput., 2019, 15, 1652-1671. (DOI)
S. Grimme, C. Bannwarth, P. Shushkov, J. Chem. Theory Comput., 2017, 13, 1989-2009. (DOI)

Contributing

This is a volunteer open source projects and contributions are always welcome. Please, take a moment to read the contributing guidelines.

License

This project is licensed under the Apache License, Version 2.0 (the "License"); you may not use this project's files except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

dxtb's People

Contributors

Stargazers

Watchers

Forkers

aqhali

dxtb's Issues

Hessian Matrix

Calculation of hessian matrix does not work with McMurchie-Davidson implementation.

Logic behind new interface

Arguably the main functionality of dxtb will be to calculate

Energies
Gradients
(Higher derivatives)

To better grasp the design choices behind the new interface (so far docs are not online), I am wondering about the design choices. What is the intended way to calculate energies and forces?

calc = Calculator(numbers, parametrisation, opts)

results = calc.singlepoint(numbers, positions, charges)

gradient = -calc.forces_analytical(numbers, positions, charges).detach()

Is that how it should be done? Ideally, a new user will intuitively do this right.
Imo in the upper approach calc.forces_analytical should not require a -1 , as per definition F = - nabla(U).
Also, I am wondering about the content of the results object the keys gradient and total_grad for instance are not relevant anymore, or are they? And which key unambiguously returns the (total) energy as required by most users?

Suggestions

Include wrapper function calc.get_energy(...) which under the hood returns calc.singlepoint(...).total.detach().sum()
Include wrapper function calc.get_force(...) which under the hood returns -calc.forces_XXX(...) (whereby XXX can be analytical per default)

To me, this feels like a more natural approach. Happy to discuss.

Faster integral evaluation

We need to get much faster in evaluating the integrals as the Hamiltonian and overlap calculation is a substantial part of the runtime of the SCF (~10s out of ~11s calculation, in tblite ~0.5s for integrals + SCF).

Potential solutions are

optimize with just-in-time compilation via torchscript and tracing of routines
use libtorch to implement the integral evaluation directly in C++
use libcint and create a torch compatible wrapper

IndexHelper reduce/spread not working in batched mode for two dimensions

Reducing/Spreading a tensor of shape (b, n, n) does not work.
Something like reduce_orbitals_to_atom(x, dim=(-2, -1)) fails.

CPU parallelization

The current code is not CPU parallelized and practically uses only a single core.

Tested on standard Ubuntu 20.04 Machine

Batching leads to longer runtime

Larger batch sizes lead to longer run times per epoch (a singlepoint calculation for all samples in the dataset). This should not be the case. Probably this is related to the missing complete vectorization of the singlepoint calculation.

See minimal example:

dataset = SampleDataset.from_json(path_to_data)
BATCH_SIZE = 64

dataloader = DataLoader(
        dataset,
        batch_size=BATCH_SIZE,
        collate_fn=Sample.pack,
    )


def singlepoint(x: Sample):
    options = {"verbosity": 0}
    parametrisation = GFN1_XTB.copy(deep=True)

    positions = x.positions.type(torch.float64)
    calc = Calculator(x.numbers, parametrisation, opts=options)
    results = calc.singlepoint(x.numbers, positions, x.charges, grad=True)

    return results.total, results.total_grad

# measure time for epoch in dependence of batch size
start_time = time.time()
for batch in dataloader:
    e, g = singlepoint(batch)
end_time = time.time()

print("Time for epoch: ", end_time - start_time)
# bs:  2 -- 112.24s
# bs: 16 -- 120.29s
# bs: 32 -- 123.56s
# bs: 64 -- 125.18s

Second gradient gives NaN in torch functions with `torch.autograd.set_detect_anomaly(True)`

I already encountered this issue during parameter optimization and reimplemented the troubling functions manually (cdist and norm).

The issue is easily reproduced by adding torch.autograd.set_detect_anomaly(True) to the SCF test, which causes test_gradgrad to fail. However, without anomaly detection the test does not throw an error. While it does not appear to cause any harm at the moment, we should remember this issue.

Einsum broadcasting

Während der Singlepoint Berechnung tritt folgender Fehler auf:

File "/home/ch/PhD/projects/xtbML/src/xtbml/coulomb/secondorder.py", line 230, in get_shell_potential
    torch.einsum("...ik,...k->...i", cache.mat, charges)
  File "/home/ch/programs/miniconda3/envs/xtbML/lib/python3.8/site-packages/torch/functional.py", line 330, in einsum
    return _VF.einsum(equation, operands)  # type: ignore[attr-defined]
RuntimeError: einsum(): operands do not broadcast with remapped shapes [original->remapped]: [6, 53, 53]->[6, 53, 53] [6, 52]->[6, 1, 52]

Gegeben für folgenden batch:

batch: Sample(PTB:HCNO/adamantane+PTB:HCNO/ADD2_11+PTB:HCNO/ADD2_12+PTB:HCNO/ADD2_2+PTB:HCNO/ADD2_3+PTB:HCNO/ADD2_6)

@marvinfriede Ist das bereits bekannt?

Code style

What kind of code style do we want to adapt?

indentation: spaces (2 or 4) vs. tabs
formatter (autopep8 or black)
linter (pylint or flake8)

As @hoelzerC and I use VSCode, I prepared a settings.json file with my usual settings here.

Code cleanup

To prepare for first release, cleanup our repository

remove external libraries (#77)
remove dead code (#77)
include parametrisation scripts and driver

Backward gradient tests fail if `use_potential=False`

For example, this test

pytest test/test_scf/test_grad.py::test_grad_backwards[LiH-dtype0]

fails if we use iterate_charges instead of iterate_potential. This can be changed by giving opts = {"use_potential": False} to the Calculator.

Collection of nitpicks and fixes

This issue should serve as a collection of nitpicks and things to fix before it is released. Note that this should only include minor issue and no new features.

Dispersion energy contribution

Since DFTB/xTB are derived from semi-local DFT they don't include dispersion interactions by default. However, those are crucial for any larger systems, especially intramolecular NCIs like present in conformers.

Fortran reference implementations:

DFT-D4: dftd4/dftd4
DFT-D3: awvwgk/simple-dftd3

Pytorch implementations

DFT-D3: pfnet-research/torch-dftd

Support electronic temperature

Currently we use the aufbau principle, i.e. a heaviside function for determining the occupation. Having a Fermi distribution for occupation numbers would be preferable to allow handling static correlation cases better.

D3 dtype error

Test suit is failing on a024de0

FAILED test/test_singlepoint/test_energy.py::test_single[implicit-H2-dtype0] - TypeError: Reference.to() got an unexpected keyword argument 'dtype'
stemming from
dxtb/src/dxtb/components/classicals/dispersion/d3.py", line 109, in get_cache

Using tad-dftd3 0.1.3. newer version required?

Batched SCF

Replace unreliable convergence check of xitorch.
Individual checks in batched calculation? Culling?

Overlap autograd requires high tolerances

Gradcheck for mmd routine seems to work fine, however, the general setup of overlap requires large tolerances [1]. Furthermore, PyTorch Jacobian and numerical Jacobian do not match correspondingly [2].

See required tolerances in:

One possible approach is to test OS overlap scheme for a better autograd agreement.

Minimal single point calculation

This is our first milestone for getting a (minimal) working tight binding implementation:

Input
- loading of input data (batch of structures/geometries)
- representation of parametrization data (Pydantic dataclasses)
Calculator
- initialize from parametrization data
Single point (non-selfconsistent, see src/tblite/xtb/singlepoint.f90)
- compute overlap integrals and store in dense format
- transform overlap to Hamiltonian: #2
- solve electronic eigenvalue problem
- compute density matrix and populations/partial charges
- return electronic energy

Cr mismatch fortran xtb -- pytorch xtb

For xtb version 6.5.1 (579679a) and pytorch xtb there is a significant difference in e.g. calculated gradient values.

Compare xtb coord --gfn 1 --grad and singlepoint on the sample given below.

Sample:

$coord
    0.03579425573719     -0.09112374056391      0.89335307999458      cr
    5.42783204665614     -2.17294661121026      0.65916514368075      o
    3.40375271909106     -1.38928570065300      0.82378719806096      c
   -0.04074270812561     -0.17596705876656     -2.57265467283161      c
   -0.08912415394703     -0.22969604472894     -4.75994430975133      o
   -3.33299931144879      1.20626988769390      0.90894089019304      c
   -5.36291624030506      1.98396405987989      0.79581613233819      o
    1.33083886695733      3.27583741049000      0.75516281864174      c
    2.10688687476585      5.29904198098028      0.54889443619263      o
   -1.26102459345897     -3.45925462128772      0.97643580779788      c
   -2.04398697392212     -5.48898687883367      0.90363185868314      o
$end

Repulsion energy contribution

Pure geometric contribution required to calculate total energy from electronic energy (see #3). Can be evaluated independently from Hamiltonian and SCF.

Fortran reference implementation:

damped nuclear–nuclear repulsion: src/tblite/repulsion/effective.f90

Analytical SCF gradient

Since we want to use the SCF gradient as part of our loss function, it would be preferable if it could be calculated analytically with the forward pass rather than using the backwards propagation.

Fix xitorch adaptation

For convergence of test/test_scf/test_scf_elements.py, an adaptation of the xitorch module is required.
See:
#72 (comment)

For portability reasons, this should be fixed for production.

Refactor repulsion

The repulsion code is rather ugly and anything but straightforward.

Performance Improvements

profile code and identify slow parts
improve bottlenecks

Some problems and potential code pieces that can be improved were already found:

loop over unique pair to find correct position in full overlap matrix (#107)
loop over primitives in overlap gradient (#108 (comment))
loop over position vector in overlap gradient (#108 (comment))

Slow tests go out-of-memory

Some tests have a quite a huge memory footprint of >4G when running with the whole suite with pytest (they start swapping on my laptop). We should try to keep the things we are testing reasonably sized.

Simplify integral code

do not store integral matrices internally in integral classes
pass overlap matrix to higher-order multipole moments instead of accessing internal storage
simplify integral class vs. integral implementation (should improve device moving via .to as well)
store matrices via caching in calculator
fix test_hamiltonian/test_grad

Hamiltonian matrix elements

Calculation of Hamiltonian integrals for the xTB methods. The aim is to calculate the Hamiltonian in diatomic blocks of matrix elements, each diatomic block can be calculated as function of the overlap matrix elements (see #1), angular momenta of the shells and the self-energy of the shells.

Fortran reference implementation

self-energy is dependent on environment via coordination number (get_selfenergy): src/tblite/xtb/h0.f90
different coordination number schemes available: src/dftd3/ncoord.f90, src/tblite/ncoord/gfn.f90
representation of function to calculate Hamiltonian elements from overlap: src/tblite/xtb/h0.f90
atomic block wise calculation of matrix elements (buildDiatomicBlocks): src/dftbp/extlibs/tblite.F90

Note:

ignore multipole integrals (only GFN2-xTB)

Overlap matrix elements

Calculation of overlap integral related terms in xTB Hamiltonian. The aim is to calculate diatomic blocks (off-site elements) of matrix elements, usually containing up to three shells with angular momenta up to l=2 (d-orbitals).

Fortran implementation reference

storage type for basis set used in tblite (array of structures style via cgto_type): src/tblite/basis/type.f90
expansion of Slater type orbitals to contracted Gaussian type orbitals: src/tblite/basis/slater.f90
horizontal Obara–Saika recursion for overlap integral evaluation in Cartesian atomic orbitals: src/tblite/integral/overlap.f90
transformation from Cartesian atomic orbitals to spherical atomic orbitals: src/tblite/integral/trafo.f90
Gram-Schmidt orthonormalization (mainly for GFN1-xTB 1s2s basis on H): src/tblite/basis/ortho.f90
atomic block wise calculation of matrix elements (buildDiatomicBlocks): src/dftbp/extlibs/tblite.F90

Relevant test cases:

self-overlap of contracted Gaussian type orbitals should be unity

Refactor src directory

Was there a specific reason to have ./src/dxtb instead of ./dxtb in the root directory of the package?

Pytorch version

Could we fix the issue with pytorch 1.11 vs 1.12
Especially for the torch.scatter_reduce signature, a workaround could be needed (Docs v1.11 vs. v1.12).

Charge dependent Hamiltonian

Electrostatic interactions enter DFTB/xTB Hamiltonian as charge dependent energy expression. Allows evaluation of electronic energy, which is calculated from core Hamiltonian, H0 (see #2) and charge-dependent Hamiltonian, H1.

Workflow:

solve eigenvalue problem with Hamiltonian
obtain density matrix
calculate populations / partial charges
evaluate (charge-dependent) energy expression
obtain potential / field, i.e. derivative of energy expression w.r.t. populations / partial charges (sign convention!)
create charge-dependent Hamiltonian
repeat until convergence is reached

Fortran reference implementation:

storage for density matrix, populations / partial charges: src/tblite/wavefunction/type.f90
population analysis / partial charge calculation from density matrix: src/tblite/wavefunction/mulliken.f90
container prototype for evaluating (charge-dependent) energy expression: src/tblite/container/type.f90
- actual implementation of second-order electrostatic energy: src/tblite/coulomb/charge.f90
- actual implementation of third-order electrostatic energy: src/tblite/coulomb/thirdorder.f90
evaluation of charge-dependent Hamiltonian: src/tblite/scf/potential.f90
iterator for converging self-consistent field problem: src/tblite/scf/iterator.f90
- requires convergence accelerator, usually Broyden mixing: src/tblite/scf/broyden.f90

Note:

ignore multipole electrostatics (only GFN2-xTB)

SCF Memory remnants

Somehow during singlepoint calculation, remnants remain in memory (decreasing with higher epochs).

Any ideas?

Pseudocode:

for i in range(10):
  calc = Calculator(sample.numbers, sample.positions, GFN1_XTB)
  results = calc.singlepoint(sample.numbers, sample.positions, sample.charges, verbosity=0)

Output:

Sample(PTB:HCNO/bisacridin3O+)

epoch 0
memory In: 0.007266998291015625 <-- memory increase w.r.t. to previous epoch (in Gb)

epoch 1
memory In: 0.01451873779296875

epoch 2
memory In: 0.000274658203125

epoch 3
memory In: 0.0

...

epoch 9
memory In: 0.0

NOTE: at this point no tensor requires grad.

Wrappers for gradient utility

Provide a unique interface to a jacobian function using the signature of torch.func.jacref. This should cover also cover torch.autograd.functional.jacobian in case the user does not have PyTorch 2.0.0+ and no functorch installation.

Add Citation

Once the paper is published, the citation should be added in the following places:

CITATION.bib file
README.md
docs/source/04_about/literature
https://github.com/grimme-lab/dxtb-data/tree/main/supporting-information

Cutoff for integral calculation

Currently, we are also calculating the overlap of all pairs, which is obviously inefficient for large systems with distant pairs.

Additionally, very small numbers close to zero appear to slow down the SCF. In particular, these Einstein summations become slow, and the diagonalization as well. This became apparent in my timing tests, where the SCF with the old overlap was significantly faster (~500 atoms: ~5s per iteration faster for a 1-2s iteration).

Geometry representation

Discussion on implementing a suitable geometry representation

Options:

Requirements:

non-sparse representation of all geometric information of the system (atomic coordinates etc.)
suitable handling of nearest neighbor listings in terms of usability and computational efficiency
usability for larger systems (maybe compatible for parallelization)
extensibility for further Hamiltonian and optimization implementations

Torch Dftd3

Combination of basis ase.Atoms and pytorch.Tensor representation used
neighboring calculation (https://github.com/pfnet-research/torch-dftd/blob/master/torch_dftd/functions/edge_extraction.py)
pbc apparently available (low prio)

Tbmalt

purely pytorch based, conversion from hdf5 and ase possible
sophisticated basic class, usability with non-sparse representations should be possible(?)
interoperability with rest of tbmalt given
no pbc available (low prio)