luigibonati / mlcolvar
A unified framework for machine learning collective variables for enhanced sampling simulations
License: MIT License
https://groups.google.com/g/plumed-users/c/LQJiFVrYdzE
We have successfully trained a Deep-TICA model using the .rbias column, which is suitable for our protein-ligand system. We selected 4536 descriptors, representing the interatomic distances between heavy atoms. However, during simulations with these 4536 descriptors we observed low computational efficiency. As in the chignolin folding case you mention in your article "Deep learning the slow modes for rare events sampling", we read your suggestions on reducing the descriptor set.
In the article, you mention reducing the number of descriptors to 210 by selecting the most relevant ones through a sensitivity analysis of the primary CVs. We followed your article and the code at https://colab.research.google.com/drive/1dG0ohT75R-UZAFMf_cbYPNQwBaOsVaAA#scrollTo=05ARhiNhSI_D and encountered some issues during testing. We hope you can help:
1. We had trouble with the variance-calculation part and are unsure whether the script is suitable for Deep-TICA data. How should we modify it to adapt it to Deep-TICA data?
standardize_inputs = True  #@param {type:"boolean"}
if multiply_by_stddev:
    if standardize_inputs:
        dist2 = (dist - Mean) / Range
    else:
        dist2 = dist
    in_std = np.std(dist2, axis=0)
2. We encountered a problem in the weight-summation part of the function: the expression model.nn[0].weight[:,i].abs().sum().item() throws "TypeError: 'FeedForward' object is not subscriptable". Could you guide us on how to resolve this issue? (A sketch of what we are trying to access follows this message.)
3. Could you share the code used in the chignolin folding case of "Deep learning the slow modes for rare events sampling" for reducing the descriptor set with Deep-TICA data?
We appreciate your assistance and look forward to your guidance. Meanwhile, we are sharing our test code with you so you can better understand our issues.
test-code.zip
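For reference, a hedged guess at the access pattern behind question 2, assuming mlcolvar's FeedForward wrapper stores its layers in an inner torch.nn.Sequential (the attribute path is our assumption, not confirmed API):

    # guess: index the inner Sequential instead of the FeedForward wrapper itself
    first_linear = model.nn.nn[0]   # hypothetical attribute path
    relevance = first_linear.weight[:, i].abs().sum().item()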
A local run of codecov returns the correct test coverage, while the run on CI reports 0%.
Btw, the fact that partial results are hidden inside training_step() seems a common problem with inheritance. Because of this, we need to evaluate the encoder twice in each step here. In the future, we might think of a way to make this easier, like having the CVs implement an evaluate_loss() method that takes a bunch of variables that currently have scope only within training_step() (e.g., the result of the encoder).
Originally posted by @andrrizzi in #62 (comment)
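A hypothetical sketch of that evaluate_loss() idea (names are not mlcolvar API): the base class computes shared intermediates once and hands them to a hook that subclasses override, so nothing is recomputed inside training_step().

    import torch

    class BaseCV(torch.nn.Module):
        def training_step(self, batch, batch_idx):
            z = self.encoder(batch["data"])       # encoder evaluated only once
            return self.evaluate_loss(batch, z)   # subclasses implement the loss here

        def evaluate_loss(self, batch, z):
            raise NotImplementedError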
Currently the auxiliary loss functions are saved in a normal list, but we could save them in a ModuleList. I think this way, if a loss function has trainable parameters (e.g., a decoder for reconstruction losses), they will be optimized as well.
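A minimal sketch of why this works: a ModuleList registers the losses as submodules, so any trainable parameters they hold appear in model.parameters() and are picked up by the optimizer, whereas a plain Python list hides them. Class and argument names are hypothetical.

    import torch

    class CVWithAuxLosses(torch.nn.Module):
        def __init__(self, aux_loss_fns):
            super().__init__()
            # instead of: self.loss_fns = list(aux_loss_fns)
            self.loss_fns = torch.nn.ModuleList(aux_loss_fns)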
CI is failing due to an error in compute_fes, which is thrown by FFTKDE but originates in the scipy.brentq function.
Since this started appearing with scipy=1.11.*, I would pin an earlier version (e.g., scipy<1.11) in the requirements until this is fixed in either KDEpy or scipy.
@andrrizzi and I thought of this to improve clarity.
This is because this method is the one which:
Some things that @andrrizzi suggested to clean up:
Breaking changes:
Internal changes:
To improve clarity, also within training_step where we combine options with other args from data (e.g. weights)
LabeledDataset --> TensorDataset
Dataloader --> FastTensorDataloader
I created a transform branch and moved there the transform functions that act directly on the atomic positions, pending a more stable API.
Pytorch==1.13.1
Initially, I tried to compile the torch module of PLUMED with libtorch-cxx11-abi-shared-with-deps-1.13.1+cpu, but I got the error "undefined reference to 'powf@GLIBC_2.27'".
Since it is not wise to update GLIBC, I then tried libtorch-1.13.1 without the cxx11 ABI, but another error occurred:
"""collect2: error: ld returned 1 exit status
Makefile:499: recipe for target 'plumed' failed
make[4]: *** [plumed] Error 1
make[4]: Leaving directory '/home/jyzha/tmp/plumed-2.9.0/src/lib'
Makefile:110: recipe for target 'all' failed
make[3]: *** [all] Error 2
make[3]: Leaving directory '/home/jyzha/tmp/plumed-2.9.0/src/lib'
Makefile:8: recipe for target 'lib' failed
make[2]: *** [lib] Error 2
make[2]: Leaving directory '/home/jyzha/tmp/plumed-2.9.0/src'
Makefile:33: recipe for target 'lib' failed
make[1]: *** [lib] Error 2
make[1]: Leaving directory '/home/jyzha/tmp/plumed-2.9.0'
Makefile:21: recipe for target 'all' failed
make: *** [all] Error 2"""
In fact, I tried to install GLIBC-2.27 under my own path. However, it threw a weird error:
"""make -r PARALLELMFLAGS="" -C .. objdir=`pwd` all
make[1]: Entering directory `/home/jyzha/tmp/glibc-2.27'
make subdir=csu -C csu ..=../ subdir_lib
make[2]: Entering directory `/home/jyzha/tmp/glibc-2.27/csu'
/usr/bin/install -c -m 644 /home/jyzha/tmp/glibc-2.27/build/cstdlibT
/usr/bin/install: missing destination file operand after ‘/home/jyzha/tmp/glibc-2.27/build/cstdlibT’
Try '/usr/bin/install --help' for more information."""
I look forward to your advice on installing the torch module. I really need this tool for my work, and I am also willing to contribute to mlcolvar. Thank you!
Calculate averages and std dev for input/output normalization at the batch level
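A minimal sketch of the idea (not mlcolvar API): compute the statistics over the batch dimension of a (n_samples, n_features) tensor.

    import torch

    def batch_mean_std(x: torch.Tensor):
        # mean and standard deviation per feature, computed over the batch dimension
        return x.mean(dim=0), x.std(dim=0)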
At the moment, the training crashes when DictModule does not split the dataset into training and validation, because Lightning always calls val_dataloader().
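A possible workaround on the Trainer side (untested here, but these are standard Lightning flags): skip validation entirely when no validation split exists.

    import lightning.pytorch as pl

    # limit_val_batches=0 disables the validation loop;
    # num_sanity_val_steps=0 disables the pre-fit sanity check
    trainer = pl.Trainer(limit_val_batches=0, num_sanity_val_steps=0)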
The function model.to_torchscript is difficult to find in the tutorial notebooks.
They should be improved.
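A usage sketch of Lightning's to_torchscript for exporting a trained CV; the file name is arbitrary, and method="trace" assumes the model defines an example input (otherwise pass example_inputs explicitly).

    # export the model as TorchScript for deployment (e.g., in PLUMED)
    traced = model.to_torchscript(file_path="model.ptc", method="trace")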
Currently, FastDictionaryLoader supports multiple datasets only if they have the same number of samples. We should look into whether it's possible to remove this limitation, as differing sample counts are probably the most common case.
Up to now, some tests are at the end of the specific files and some are defined in the test folder.
We should move all of them to the test folder and use the pytest parametrize functions wherever needed.
What is the use of the walker column in utils.io.load_dataframe? Is it needed?
# check if file is in PLUMED format
if is_plumed_file(filename):
    df_tmp = plumed_to_pandas(filename)
    df_tmp['walker'] = [i for _ in range(len(df_tmp))]  # i = index of the file in the loop
    df_tmp = df_tmp.iloc[start:stop:stride, :]
    df_list.append(df_tmp)
# else use read_csv with optional kwargs
else:
    df_tmp = pd.read_csv(filename, **kwargs)
    df_tmp['walker'] = [i for _ in range(len(df_tmp))]
    df_tmp = df_tmp.iloc[start:stop:stride, :]
    df_list.append(df_tmp)
I tried it, and it seems like it only returns a column of zeros (with a single file, i is always 0).
In case it should be kept, create_dataset_from_files should be modified to automatically exclude that column by default, as it does with time and labels. Otherwise, when filter_args = None, it loads the (useless) walker column.
We should make a release to be uploaded to PyPI.
pyproject.toml
Some notebook tests fail on macOS because some cells take more than 300 s to finish.
It happens randomly (sometimes it does, sometimes it doesn't), but it's still annoying.
Should we check this in the tutorials and maybe use fake values, commenting out the real ones?
The LR scheduler is already in utils.optim, but it is not implemented in the NNCV base class.
Loading a DeepTDA CV from a checkpoint does not work:
Minimal (non)working example:
from lightning.pytorch.callbacks.model_checkpoint import ModelCheckpoint

checkpoint = ModelCheckpoint(save_top_k=1, monitor="valid_loss")
trainer = pl.Trainer(callbacks=[checkpoint], enable_checkpointing=True)
trainer.fit(model, datamodule)
best_model = DeepTDA.load_from_checkpoint(checkpoint.best_model_path)
which gives an error during initialization:
File ~/software/mambaforge/envs/mlcolvar/lib/python3.10/site-packages/mlcolvar/cvs/supervised/deeptda.py:67, in DeepTDA.__init__(self, n_states, n_cvs, target_centers, target_sigmas, layers, options, **kwargs)
35 def __init__(
36 self,
37 n_states: int,
(...)
43 **kwargs,
44 ):
45 """
46 Define Deep Targeted Discriminant Analysis (Deep-TDA) CV composed by a neural network module.
47 By default a module standardizing the inputs is also used.
(...)
64 Set 'block_name' = None or False to turn off that block
65 """
---> 67 super().__init__(in_features=layers[0], out_features=layers[-1], **kwargs)
69 # ======= LOSS =======
70 self.loss_fn = TDALoss(
71 n_states=n_states,
72 target_centers=target_centers,
73 target_sigmas=target_sigmas,
74 )
TypeError: mlcolvar.cvs.cv.BaseCV.__init__() got multiple values for keyword argument 'in_features'
We should also check the other CVs and add regtests for this feature (as of now, only the regression CV was tested in this notebook: https://mlcolvar.readthedocs.io/en/stable/notebooks/tutorials/intro_3_loss_optim.html#Model-checkpointing).
If the validation set is disabled, the metrics are not returned anymore.
Metrics are updated at the end of the validation epoch, as implemented in mlcolvar.utils.trainer.MetricsCallback:
class MetricsCallback(Callback):
    """Lightning callback which saves logged metrics into a dictionary.
    The metrics are recorded at the end of each validation epoch."""

    def __init__(self):
        super().__init__()
        self.metrics = {"epoch": []}

    def on_validation_epoch_end(self, trainer, pl_module):
        metrics = trainer.callback_metrics
        if not trainer.sanity_checking:
            self.metrics["epoch"].append(trainer.current_epoch)
            for key, val in metrics.items():
                val = val.item()
                if key in self.metrics:
                    self.metrics[key].append(val)
                else:
                    self.metrics[key] = [val]
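A usage sketch, assuming the class above is importable from mlcolvar.utils.trainer as stated:

    import lightning.pytorch as pl
    from mlcolvar.utils.trainer import MetricsCallback

    metrics = MetricsCallback()
    trainer = pl.Trainer(callbacks=[metrics], max_epochs=10)
    # after trainer.fit(model, datamodule), metrics.metrics maps each logged
    # metric name to the list of its values, one per validation epoch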
pytorch_lightning has been renamed to just lightning
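A migration sketch: only the import changes, the API stays the same.

    # old:
    #   import pytorch_lightning as pl
    # new:
    import lightning.pytorch as pl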
Dear Developers,
I tried to run the tutorial provided at this link: https://mlcvs.readthedocs.io/en/latest/notebooks/ala2_deeplda.html
After training the model, I tried to access the attribute loss_train.
Then the code raised an error:
... \py39mlcvs\lib\site-packages\torch\nn\modules\module.py:1269, in Module.__getattr__(self, name)
   1267 if name in modules:
   1268     return modules[name]
-> 1269 raise AttributeError("'{}' object has no attribute '{}'".format(
   1270     type(self).__name__, name))
AttributeError: 'DeepLDA_CV' object has no attribute 'loss_train'
Maybe that attribute has not been implemented yet.
The implementation of the configure_optimizers method of BaseCV apparently doesn't allow the use of an lr_scheduler.
This could be included in the 'optimizer' options, for example, in a quick and dirty way:
def configure_optimizers(self):
    """
    Initialize the optimizer based on self._optimizer_name and self.optimizer_kwargs.

    Returns
    -------
    torch.optim
        Torch optimizer
    """
    # pop the (optional) scheduler settings from the optimizer kwargs
    lr_scheduler_dict = self.optimizer_kwargs.pop('lr_scheduler', None)

    optimizer = getattr(torch.optim, self._optimizer_name)(
        self.parameters(), **self.optimizer_kwargs
    )

    if lr_scheduler_dict is not None:
        lr_scheduler_name = lr_scheduler_dict.pop('scheduler')
        lr_scheduler = {
            'scheduler': lr_scheduler_name(optimizer, **lr_scheduler_dict),
        }
        # restore the popped entries so repeated calls see the same kwargs
        lr_scheduler_dict['scheduler'] = lr_scheduler_name
        self.optimizer_kwargs['lr_scheduler'] = lr_scheduler_dict
        return [optimizer], [lr_scheduler]
    else:
        return optimizer
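A hypothetical usage of the proposed 'lr_scheduler' entry (the layout of optimizer_kwargs is an assumption tied to the sketch above, not current mlcolvar API):

    import torch

    optimizer_kwargs = {
        "lr": 1e-3,
        "lr_scheduler": {
            "scheduler": torch.optim.lr_scheduler.ExponentialLR,  # scheduler class
            "gamma": 0.999,                                       # its keyword arguments
        },
    }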
When using old versions of PyTorch (e.g., 1.10), building a mlcolvar.data.DictModule may cause the following error:

raise ValueError("Sum of input lengths does not equal the length of the input dataset!")

This is caused by a change in the torch.utils.data.random_split method.

random_split in PyTorch 2.1:
def random_split(dataset: Dataset[T], lengths: Sequence[Union[int, float]],
                 generator: Optional[Generator] = default_generator) -> List[Subset[T]]:
    r"""
    Randomly split a dataset into non-overlapping new datasets of given lengths.

    If a list of fractions that sum up to 1 is given,
    the lengths will be computed automatically as
    floor(frac * len(dataset)) for each fraction provided.

    After computing the lengths, if there are any remainders, 1 count will be
    distributed in round-robin fashion to the lengths
    until there are no remainders left.

    Optionally fix the generator for reproducible results, e.g.:

    Example:
        >>> # xdoctest: +SKIP
        >>> generator1 = torch.Generator().manual_seed(42)
        >>> generator2 = torch.Generator().manual_seed(42)
        >>> random_split(range(10), [3, 7], generator=generator1)
        >>> random_split(range(30), [0.3, 0.3, 0.4], generator=generator2)

    Args:
        dataset (Dataset): Dataset to be split
        lengths (sequence): lengths or fractions of splits to be produced
        generator (Generator): Generator used for the random permutation.
    """
random_split in PyTorch 1.10:
def random_split(dataset: Dataset[T], lengths: Sequence[int],
                 generator: Optional[Generator] = default_generator) -> List[Subset[T]]:
    r"""
    Randomly split a dataset into non-overlapping new datasets of given lengths.
    Optionally fix the generator for reproducible results, e.g.:

    >>> random_split(range(10), [3, 7], generator=torch.Generator().manual_seed(42))

    Args:
        dataset (Dataset): Dataset to be split
        lengths (sequence): lengths of splits to be produced
        generator (Generator): Generator used for the random permutation.
    """
This method is invoked at mlcolvar/mlcolvar/data/datamodule.py, line 211 (commit e356f24): the _split method passes dataset-length fractions to random_split, but the old random_split method only accepts explicit dataset lengths as parameters. Thus, it may be reasonable to modify the code to pass actual data lengths, e.g. by converting the fractions as in the sketch below.
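A hedged sketch of such a conversion (the function name is hypothetical); it mirrors the round-robin remainder rule of the newer PyTorch implementation:

    import math

    def fractions_to_lengths(n_samples: int, fractions):
        # floor each fraction, then distribute the remainder round-robin
        lengths = [math.floor(f * n_samples) for f in fractions]
        for i in range(n_samples - sum(lengths)):
            lengths[i % len(lengths)] += 1
        return lengths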
Dear @luigibonati,
May I ask a (perhaps silly) question about the linear model implemented in your code at https://github.com/luigibonati/mlcvs/blob/main/mlcvs/models/linear.py?
You used

    s = torch.matmul(X - self.b, self.w)

instead of the standard form y = weight * X + bias.
Is there a specific purpose for that?
Right now, loss_options is
so we need to change the order between 2 and 3.
Handle both labeled and unlabeled data.
On the main page of the repository, the installation info is missing.
We should add:
Dear Developers,
Do you plan to update MLCVS so that it can be applied to per-atom descriptors? I mean descriptors computed as per-atom vectors rather than a single global vector for the whole system.
Thank you so much.
lgtm.com is being shut down; we need to migrate to GitHub's code scanning feature.
When preprocessing is included in the model, the preprocessing-related variables, which should be buffers of the model, are not passed as they should be. This causes problems when we try to do model.to(dtype/device).
We could change the way preprocessing is passed to the model: instead of as an attribute, we could use a property and define a set_preprocessing method which automatically loads the buffers of the preprocessing module into the model (and removes them if preprocessing is set to None again).
We should also modify the Transform classes in the transform branch accordingly to have the needed variables saved as buffers, of course.
What do you think, @andrrizzi?
Maybe we can also use the new approach to set the in/out_features and the example_input in the CV model in case of pre/postprocessing.
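A hypothetical sketch of the set_preprocessing idea (names and mechanics are our assumptions, not current mlcolvar API):

    from typing import Optional
    import torch

    class CVModel(torch.nn.Module):
        def set_preprocessing(self, module: Optional[torch.nn.Module]):
            # store as a plain attribute (bypassing submodule registration,
            # matching the current behaviour described above) ...
            object.__setattr__(self, "_preprocessing", module)
            if module is not None:
                # ... and mirror its buffers on the model so that
                # model.to(device/dtype) keeps them consistent
                for name, buf in module.named_buffers():
                    # buffer names cannot contain dots, so flatten them
                    self.register_buffer("preproc_" + name.replace(".", "_"), buf)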
CI is failing because provision-with-micromamba has been migrated to setup-micromamba, so the tests are not running.
Add support for numpy arrays and/or dataframes.
There are two issues in the reduced_rank algorithm.
1. In least_squares, I think that as it is written now it just skips the calculation: once it enters the first if statement, it never also enters the following elif.
2. An error is raised in the test_reduced_rank_tica function:

Traceback (most recent call last):
  File "/home/lbonati@iit.local/work/code/mlcvs/mlcolvar/tests/test_core_stats_tica.py", line 7, in <module>
    test_reduced_rank_tica()
  File "/home/lbonati@iit.local/work/code/mlcvs/mlcolvar/core/stats/tica.py", line 190, in test_reduced_rank_tica
    tica.compute([x_t, x_lag], [w_t, w_lag], save_params=True, algorithm='reduced_rank')
  File "/home/lbonati@iit.local/work/code/mlcvs/mlcolvar/core/stats/tica.py", line 89, in compute
    evals, evecs = reduced_rank_eig(C_0, C_lag, self.reg_C_0, rank=self.out_features)
  File "/home/lbonati@iit.local/work/code/mlcvs/mlcolvar/core/stats/utils.py", line 82, in reduced_rank_eig
    _, idxs = torch.topk(vectors.values, rank)
TypeError: topk(): argument 'input' (position 1) must be Tensor, not builtin_function_or_method
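A hedged guess at the fix (we have not inspected reduced_rank_eig): the error suggests vectors is a plain tensor, so vectors.values resolves to the Tensor.values method rather than to data, and torch.topk should receive the eigenvalue tensor itself. All names below are hypothetical.

    import torch

    def top_rank_components(C: torch.Tensor, rank: int):
        evals, evecs = torch.linalg.eigh(C)      # eigendecomposition of a symmetric matrix
        _, idxs = torch.topk(evals.abs(), rank)  # pass the tensor itself, not .values
        return evals[idxs], evecs[:, idxs]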
@pietronvll can you check this? thanks!