lab-cosmo / kernel-tutorials
A set of utilities and pedagogic notebooks for the use of linear and kernel methods in atomistic modeling
License: GNU Lesser General Public License v3.0
currently under tests/ etc. working off of CSD
By default, no random seed is set for the train/test partition. To get consistent runs, it would be nice to set a default seed, presumably for numpy.
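One way this could look is sketched below. The helper name split_indices and its parameters are hypothetical (only the names i_train and i_test echo the utilities), and it uses a local numpy Generator rather than the global np.random.seed, so that other random draws in a notebook are not affected. Calling np.random.seed(seed) once at the top of the utilities would be the more literal reading of the suggestion.

```python
import numpy as np


def split_indices(n_samples, train_fraction=0.5, seed=0):
    """Return reproducible (i_train, i_test) index arrays.

    Hypothetical helper: the name and signature are assumptions,
    not part of the kernel-tutorials utilities.
    """
    # A local generator keeps the split reproducible without
    # touching numpy's global random state.
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return permutation[:n_train], permutation[n_train:]


# Two calls with the same seed give the same partition.
i_train, i_test = split_indices(100, train_fraction=0.8, seed=0)
i_train_again, i_test_again = split_indices(100, train_fraction=0.8, seed=0)
```

Passing seed=None to default_rng would restore the current non-deterministic behaviour, so a default seed does not take that option away.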
scikit-learn has a preprocessing class called KernelCenterer that is called in kernel functions. We need to check whether its centering is consistent with ours and, if not, write our own analogous function.
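A quick check along these lines might look like the sketch below. It compares KernelCenterer against an explicit centering formula on a linear kernel built from random data. The function center_kernel is a hypothetical stand-in and is only assumed to resemble the centering used in the tutorials; the actual utility code would need to be substituted in to settle the question.

```python
import numpy as np
from sklearn.preprocessing import KernelCenterer

rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 5))
X_test = rng.normal(size=(7, 5))

# Linear kernels, so centering the kernel can be compared with
# centering the features directly.
K_train = X_train @ X_train.T
K_test = X_test @ X_train.T


def center_kernel(K, K_ref):
    """Center K (n x n_ref) using the training kernel K_ref (n_ref x n_ref).

    Hypothetical reference implementation, assumed for illustration.
    """
    col_mean = K_ref.mean(axis=0)[None, :]  # mean over training rows
    row_mean = K.mean(axis=1)[:, None]      # mean of each row over training columns
    return K - col_mean - row_mean + K_ref.mean()


centerer = KernelCenterer().fit(K_train)
K_test_sklearn = centerer.transform(K_test)
K_test_manual = center_kernel(K_test, K_train)

# For a linear kernel, this should equal the kernel of mean-centered features.
X_mean = X_train.mean(axis=0)
K_test_features = (X_test - X_mean) @ (X_train - X_mean).T
```

If the tutorials' centering also rescales the kernel (for example by its trace), the comparison would differ by that factor, which this sketch does not account for.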
1_LinearMethods.ipynb
Running the following cell raises an error.
The keyword argument should be 'n_to_select', following its renaming in scikit-cosmo.
var_dict = load_variables()
locals().update(var_dict)
--------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-82d4ac1ba4d8> in <module>
----> 1 var_dict = load_variables()
2 locals().update(var_dict)
kernel-tutorials/utilities/general.py in load_variables(cache_name, **kwargs)
93 X, y = load_csd_1000r(return_X_y=True)
94 data = dict(X=X, Y=y, indices=np.array([]))
---> 95 return calculate_variables(**dict(data), **kwargs)
96
97
kernel-tutorials/utilities/general.py in calculate_variables(X, Y, indices, n_atoms, N, n_FPS, kernel_func, i_train, i_test, n_train, K_train, K_test)
115
116 if n_FPS is not None and n_FPS < X.shape[1]:
--> 117 fps_idxs = FPS(n_features_to_select=n_FPS).fit(X).selected_idx_
118 print("Taking a subsampling of ", n_FPS, "features")
119 X = X[:, fps_idxs]
miniconda3/lib/python3.9/site-packages/skcosmo/feature_selection/_base.py in __init__(self, **kwargs)
15
16 def __init__(self, **kwargs):
---> 17 super().__init__(selection_type="feature", **kwargs)
18
19
miniconda3/lib/python3.9/site-packages/skcosmo/_selection.py in __init__(self, initialize, **kwargs)
817 self.initialize = initialize
818
--> 819 super().__init__(
820 **kwargs,
821 )
TypeError: __init__() got an unexpected keyword argument 'n_features_to_select'
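Until the call in utilities/general.py is updated to use n_to_select, a version-tolerant wrapper is one possible workaround. The sketch below is only an illustration: make_selector is a hypothetical helper, and the two small classes are stand-ins for the old and new constructor signatures so that the example runs without scikit-cosmo installed.

```python
def make_selector(selector_cls, n_select, **kwargs):
    """Build a selector with whichever keyword the installed version accepts.

    Hypothetical helper: tries the newer 'n_to_select' keyword first and
    falls back to the older 'n_features_to_select' on a TypeError.
    """
    try:
        return selector_cls(n_to_select=n_select, **kwargs)
    except TypeError:
        return selector_cls(n_features_to_select=n_select, **kwargs)


# Stand-in classes mimicking the two constructor signatures (assumptions,
# not the real scikit-cosmo classes).
class NewStyleSelector:
    def __init__(self, n_to_select=None):
        self.n = n_to_select


class OldStyleSelector:
    def __init__(self, n_features_to_select=None):
        self.n = n_features_to_select


new_selector = make_selector(NewStyleSelector, 3)
old_selector = make_selector(OldStyleSelector, 3)
```

Catching TypeError this broadly can hide unrelated constructor errors, so simply renaming the keyword at the call site is likely the cleaner fix once a minimum scikit-cosmo version is pinned.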
Centering class that does pre-processing and uncentering
(This issue was found by Aditi)
The tutorial requires the librascal package, so it would be helpful to have installation instructions in the foreword notebook. The most obvious attempts, pip install librascal and pip install rascal, don't work; in fact, the latter installs a completely different package that just happens to have the same name. The condensed installation instructions for librascal would be:
$ git clone https://github.com/cosmo-epfl/librascal.git
$ cd librascal
$ pip install .
And for installing scikit-cosmo, perhaps also mention that it's called skcosmo on pip?
Thanks!
Hello!
I have been looking into SparseKPCovR from the Utility Class.
It seems that the predicted targets are missing the average of the training set, i.e., the correct prediction of "y" should be "y_pred -> y_pred + Y_train.mean(axis=0)". I can't tell why this is happening from looking at the code.
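The symptom can be reproduced in isolation with any model fitted on centered targets. The sketch below uses plain least squares as a stand-in (it does not use SparseKPCovR and does not diagnose where the utility class drops the mean); all data and variable names other than Y_train are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_weights = np.array([1.0, -2.0, 0.5])

# Synthetic targets with a large constant offset.
X_train = rng.normal(size=(50, 3))
Y_train = X_train @ true_weights + 10.0
X_test = rng.normal(size=(10, 3))
Y_test = X_test @ true_weights + 10.0

# Fit on centered features and centered targets.
X_mean = X_train.mean(axis=0)
Y_mean = Y_train.mean(axis=0)
weights, *_ = np.linalg.lstsq(X_train - X_mean, Y_train - Y_mean, rcond=None)

# Predictions in the centered frame are offset by the training mean...
y_pred_centered = (X_test - X_mean) @ weights
# ...and adding it back recovers the targets.
y_pred = y_pred_centered + Y_mean
```

If predictions from the utility class show the same constant offset as y_pred_centered here, a missing uncentering step after prediction would be a plausible place to look.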
I noticed that in the 4th notebook, the correlation matrix C_pca is normalized. However, this is not the case in the Utility Class.
it's both faster and more stable.
In notebook 5_CUR, section 5.4, the table_from_dict() function has a T_test2 argument that is not defined beforehand.