ddbourgin / numpy-ml
Machine learning, in numpy
Home Page: https://numpy-ml.readthedocs.io/
License: GNU General Public License v3.0
I was trying to run the decision tree model on the very simple Iris dataset and found a possible bug:
The problem is that when the array "levels" has only one element, the thresholds array will be empty, and the code will crash on line 164 when we call "gain.max()".
Please correct me if I'm wrong. Thank you for creating this useful repo!
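A minimal sketch of the failure mode described above, assuming candidate thresholds are the midpoints between consecutive unique feature values. The helper names `candidate_thresholds` and `best_gain_or_none` are hypothetical and are not part of numpy-ml.

```python
import numpy as np


def candidate_thresholds(values):
    """Midpoints between consecutive unique values (assumed split rule)."""
    levels = np.unique(values)
    # With a single level, both slices are empty, so the result is empty too.
    return (levels[:-1] + levels[1:]) / 2


def best_gain_or_none(gains):
    """Guard the reduction: calling max() on an empty array raises ValueError."""
    gains = np.asarray(gains, dtype=float)
    return gains.max() if gains.size > 0 else None


constant_column = np.array([2.0, 2.0, 2.0])  # a feature with only one level
thresholds = candidate_thresholds(constant_column)
```

A guard of this kind would let the split search skip features that are constant within the current node instead of crashing.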
I want to try replacing loops with vectorized operations, which should give a performance boost. Are such improvements accepted? For example, the function 'likelihood_lower_bound' in the file 'gmm.py'. If so, is it possible to get a sample of the data that is fed into the function?
System information
Describe the current behavior
AttributeError: module 'numpy_ml' has no attribute 'load_dataset'
Describe the expected behavior
Code to reproduce the issue
import numpy_ml as npml
data = npml.load_dataset("data.csv")
Other info / logs
AttributeError Traceback (most recent call last)
Cell In [8], line 4
1 import numpy_ml as npml
3 # load dataset
----> 4 data = npml.load_dataset("data.csv")
AttributeError: module 'numpy_ml' has no attribute 'load_dataset'
So maybe the attribute 'load_dataset' was discontinued in some version of numpy-ml? If so, what should I use instead of this function to load data, and what are the requirements for the dataset?
Many thanks.
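The traceback suggests that the installed numpy_ml module exposes no `load_dataset` function. One possible workaround, sketched here under the assumption that the file is a purely numeric CSV with a header row and the target in the last column, is to load it with NumPy directly. The in-memory CSV and its column layout are made up for illustration.

```python
import io

import numpy as np

# Stand-in for open("data.csv"); the contents are invented for this example.
csv_text = "x1,x2,y\n1.0,2.0,0\n3.0,4.0,1\n5.0,6.0,0\n"

data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
X, y = data[:, :-1], data[:, -1]  # features, then the assumed target column
```

The resulting `X` and `y` arrays are plain ndarrays, which is the input type the models in this thread are called with.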
To easily use these models in a robust non-version-dependent way, an option to save/load model to/from json (either string or file) would be helpful.
E.g.
# save to json (can be stored anywhere, not just in file)
model_json = model.save_to_json()
# save to json file
with open(...) as f:
    model.save_to_json_file(f) # create folder and parent folders if not exist
# load from json
model = Model.load_from_json(model_json)
# load from json file
with open(...) as f:
    model = Model.load_from_json_file(f)
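A rough sketch of what such a round trip could look like, assuming the model state is a handful of scalars plus ndarray parameters. `TinyModel` and its method names are hypothetical and only mirror the proposal above; nothing here is an existing numpy-ml API.

```python
import json

import numpy as np


class TinyModel:
    """Hypothetical stand-in for a model with ndarray parameters."""

    def __init__(self, beta=None, fit_intercept=True):
        self.beta = beta
        self.fit_intercept = fit_intercept

    def save_to_json(self):
        # ndarrays are not JSON serializable, so convert them to nested lists
        payload = {
            "fit_intercept": self.fit_intercept,
            "beta": None if self.beta is None else self.beta.tolist(),
        }
        return json.dumps(payload)

    @classmethod
    def load_from_json(cls, text):
        payload = json.loads(text)
        beta = payload["beta"]
        beta = None if beta is None else np.array(beta)
        return cls(beta=beta, fit_intercept=payload["fit_intercept"])


model = TinyModel(beta=np.array([0.5, -1.25, 3.0]))
restored = TinyModel.load_from_json(model.save_to_json())
```

Storing only plain lists and scalars keeps the file independent of the library version, which is the motivation given above.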
Could you update the requirements.txt to specify the version of each module?
like node2vec, HOPE, graph2vec
This is an awesome library, thanks @ddbourgin!!
Users might not know the best way to install this package and try it out. (I didn't, so I eventually just copied the source files.)
Neither the readme nor readthedocs have install instructions.
I couldn't find it on PyPi or Anaconda, and there doesn't appear to be a pyproject.toml, setup.cfg, setup.py, or conda recipe.
Moreover, the tests aren't in a standard path like tests/.
This is uncommon and therefore confusing, and it makes it harder to run them.
Edit: I wasn't expecting them under the source, so I initially wrote that I couldn't find them.
I think it would be great to document how to install numpy-ml and how to run its tests, and to point readers to the tests as a way to clarify the behavior of some of the functions.
There are some great build and CI tools for Python available, which I recently learned how to use effectively. I'm happy to make a pull request if it would be helpful.
G'day, how's it going?
I've just started looking into machine learning stuff, and stumbled upon this, looks awesome!
I just want to know what kind of methods I should use for the following:
Kind regards,
Machine-Learning newbie, Mitch!
System information
Describe the current behavior
When calling the init_from_dict() method from numpy_ml/neural_nets/initializers, from both the SchedulerInitializer and OptimizerInitializer classes, the returned object is None, rather than a proper object.
This is caused by assigning the return value of the set_params() method to the returned object. That method does not return an object but modifies the instance itself.
numpy-ml/numpy_ml/neural_nets/initializers/initializers.py
Lines 176 to 194 in b537fac
Describe the expected behavior
init_from_dict() should return a proper object that will be assigned to an attribute of a NN layer.
Code to reproduce the issue
from numpy_ml.neural_nets.layers import *
c1 = Conv2D(6, (3,3))
opt = c1.hyperparameters['optimizer'] # dict
c2=Conv2D(6, (3,3), optimizer=opt) # The optimizer is set to None
c2.hyperparameters
This raises AttributeError: 'NoneType' object has no attribute 'cache'.
The same happens with Scheduler.
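A toy reproduction of the pattern described in this report, using a hypothetical `ToyOptimizer` rather than the real numpy-ml classes: a setter that mutates the instance in place returns None, so assigning its return value loses the object.

```python
class ToyOptimizer:
    """Hypothetical optimizer whose setter mutates in place."""

    def __init__(self):
        self.lr = 0.01
        self.cache = {}

    def set_params(self, hparams):
        # Modifies the instance and implicitly returns None
        self.lr = hparams.get("lr", self.lr)


def init_from_dict_buggy(hparams):
    # The return value of set_params() is None, so None is returned
    return ToyOptimizer().set_params(hparams)


def init_from_dict_fixed(hparams):
    # Build the object first, mutate it, then return the object itself
    opt = ToyOptimizer()
    opt.set_params(hparams)
    return opt


buggy = init_from_dict_buggy({"lr": 0.1})
fixed = init_from_dict_fixed({"lr": 0.1})
```

An alternative with the same effect would be to have `set_params()` end with `return self`.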
I can't find a K-means model, so I think I could code one. Thanks!
System information
Describe the current behavior
import of numpy_ml fails due to an ImportError with the collections module.
Describe the expected behavior
Importing just the module should not generate an ImportError
Code to reproduce the issue
# Python 3.10 or newer
import numpy_ml
Other info / logs
In Python 3.10 the deprecated aliases were removed.
Remove deprecated aliases to Collections Abstract Base Classes from the collections module. (Contributed by Victor Stinner in bpo-37324.)
from What’s New In Python 3.10
fix
To fix the bug, in /numpy_ml/utils/data_structures.py change
from collections import Hashable
to
from collections.abc import Hashable
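If the same source has to run on very old interpreters as well, a guarded import is one option. This is only a sketch; the `is_hashable` helper is a made-up name used to show the imported class in action.

```python
try:
    # Location of the abstract base classes since Python 3.3
    from collections.abc import Hashable
except ImportError:
    # Fallback for interpreters that predate collections.abc
    from collections import Hashable


def is_hashable(obj):
    """Return True if obj can be used as a dict key or set member."""
    return isinstance(obj, Hashable)
```

On Python 3.10 and newer only the first branch can succeed, which matches the fix proposed above.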
I write in Spanish, as it is more convenient for me. Nowadays everybody has an online translator in the browser.
I have reviewed the code, not just the documentation. Your work is very valuable and of high quality, congratulations. Regarding neural networks, the lack of examples has been a difficulty. The only examples I could use were the ones in numpy_ml/neural_nets/models.
On neural networks, the main shortcoming is automatic differentiation, whose absence forces you to code a backpropagation method for each type of layer or module. It also prevents the use of custom cost functions whose derivative has no known analytic expression.
I think you are already working on other things, and I think the repository is not maintained. If this repository ever becomes active again, or someone forks it, I suggest that the neural network part include automatic differentiation.
Taking a quick look, some of the grad and grad2 functions might benefit from some optimizations. Here's one example:
numpy-ml/numpy_ml/neural_nets/activations/activations.py
Lines 38 to 39 in fce2acf
Here the function could be changed such that fn(x) is only computed once:
def grad(self, x):
    fn_x = self.fn(x)
    return fn_x * (1 - fn_x)
The extra memory used to store the intermediate result should be freed as soon as the function returns, so that shouldn't be a problem. Would love a second opinion @ddbourgin before making a PR with the necessary changes.
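A self-contained sketch of the proposed change, assuming the activation in question is the logistic sigmoid (the `fn * (1 - fn)` form suggests this, but the referenced lines are not shown here). The class below is a stand-in and not the numpy-ml implementation; a central finite difference is included as a sanity check.

```python
import numpy as np


class Sigmoid:
    """Stand-in activation; assumes the logistic sigmoid."""

    def fn(self, z):
        return 1.0 / (1.0 + np.exp(-z))

    def grad(self, x):
        fn_x = self.fn(x)  # evaluate the activation only once
        return fn_x * (1.0 - fn_x)


sigmoid = Sigmoid()
x = np.linspace(-3, 3, 7)
eps = 1e-6
# Central finite difference approximation of the derivative
numeric = (sigmoid.fn(x + eps) - sigmoid.fn(x - eps)) / (2 * eps)
```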
Sorry for the newbie question, but I'm having a bit of trouble in trying to use the library.
Specifically, what I'm trying to do is embedding training, like word2vec. So I am trying to set up an embedding matrix, a loss, and optimization using Adam.
Any pointers would be greatly appreciated.
I don't see any lasso regression model in linear models. Can I implement the lasso regression model?
Hi, I think neural_nets/utils/utils.py line 797 has a bug! The current code is
i0, i1 = i * s, (i * s) + fr * (d + 1) - d
j0, j1 = j * s, (j * s) + fc * (d + 1) - d
and I think it should be
i0, i1 = i * s, (i * s) + fr
j0, j1 = j * s, (j * s) + fc
because fr and fc are already the dilated sizes!
Hi, I just found this repo and I found that the network implementation is weak. Here I provide some examples that could be used to expand the neural network implementation, with tests covered:
Detailed explanation with Conv, BatchNorm, Relu, FullyConnected, UpSampling:
https://github.com/yiakwy/yiakwy.github.io/blob/master/Correlation%20Metrics/ConvNet/costImpl/ConvNet/layers.py
Support keras alike layers stacking syntax:
Unit test covered:
https://github.com/yiakwy/yiakwy.github.io/blob/master/Correlation%20Metrics/ConvNet/READMME.md
This implementation is aimed at giving people the best understanding of some operators like Upsampling.
Nowadays, many people already understand how to implement operators on CPU devices. We call it the vanilla implementation. It is still challenging to understand how to implement and optimize them on different devices.
When we talk about a neural network as a computing graph, we are actually interested in how to implement the planned operations on different devices. At least, I need to provide the FLOPs needed by each operator as an edge weight and understand how to implement them and distribute them across different devices.
Hi, I am thinking about implementing Naive Bayes methods and making a pull request. But I am unsure about the unit testing part: should I compare the performance with the scikit-learn library?
If the conv layer is not trainable, you must check for that and not update "W".
numpy-ml/numpy_ml/neural_nets/layers/layers.py
Line 2488 in 4f37707
eom
I think this line:
outer = np.zeros((2, 2))
should change to
outer = np.zeros((self.d, self.d))
System information
Describe the current behavior
Describe the expected behavior
Code to reproduce the issue
Other info / logs
Based on my understanding, the l1 and l2 loss should be:
d_penalty = gamma * np.square(beta).sum() if p == "l2" else gamma * np.abs(beta).sum()
instead of:
d_penalty = gamma * beta if p == "l2" else gamma * np.sign(beta)
Line 171 in 165ad88
source: https://towardsdatascience.com/l1-and-l2-regularization-methods-ce25e7fc831c
Please double check!
Thanks
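One way to check this is to compare the two expressions numerically. The sketch below assumes that `d_penalty` is meant to be the derivative of the penalty with respect to `beta` (suggested by the `d_` prefix) and that the L2 penalty carries a conventional factor of 1/2; both are assumptions, since the surrounding library code is not shown here. All function names are illustrative.

```python
import numpy as np


def penalty(beta, gamma, p):
    """Penalty value; the 1/2 factor on the L2 term is an assumption."""
    if p == "l2":
        return 0.5 * gamma * np.sum(beta ** 2)
    return gamma * np.sum(np.abs(beta))


def d_penalty(beta, gamma, p):
    """Expression quoted from the current code."""
    return gamma * beta if p == "l2" else gamma * np.sign(beta)


def numeric_grad(f, beta, eps=1e-6):
    """Central finite-difference gradient of a scalar function f."""
    grad = np.zeros_like(beta)
    for i in range(beta.size):
        step = np.zeros_like(beta)
        step[i] = eps
        grad[i] = (f(beta + step) - f(beta - step)) / (2 * eps)
    return grad


beta = np.array([1.5, -0.7, 2.0])
gamma = 0.3
grad_l2 = numeric_grad(lambda b: penalty(b, gamma, "l2"), beta)
grad_l1 = numeric_grad(lambda b: penalty(b, gamma, "l1"), beta)
```

Under these assumptions the quoted `gamma * beta` and `gamma * np.sign(beta)` agree with the numerical gradients, so the sums in the proposal would be the penalty values themselves rather than their derivatives.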
I will try to implement the rest of the activation functions. Any ideas for other activation functions to add?
Hello, this is a great project. I am learning how to implement models without sklearn/tensorflow, and it really helps me a lot.
I have a question on
numpy-ml/numpy_ml/linear_models/lm.py
Line 252 in 4f37707
l1norms(self.beta) means the sum of the absolute values of the elements in self.beta. I don't quite understand why the simple gamma * beta stands for L2?
PS: May I ask what IDE and code documentation plugin you are using? I see some annotations that don't belong to LaTeX; it would be nicer to see beautiful math symbols than raw LaTeX :)
It would be good to have HTMM implementation here like https://github.com/Charleo85/pyhtmm
System information
Describe the current behavior
There is a problem when trying to perform simple transfer learning techniques (loading the same parameters from another trained layers/models).
When setting layer params with a layer summary dictionary (generated with the summary() method), the activation function can be overridden with a string due to the non-exclusive if-clauses:
numpy-ml/numpy_ml/neural_nets/layers/layers.py
Lines 119 to 127 in b537fac
This later causes an error in the forward() method.
Describe the expected behavior
Layers that get their parameters with the set_params() method should behave without errors.
Code to reproduce the issue
>>> import numpy as np
>>> from numpy_ml.neural_nets.layers import *
>>>
>>> c1 = Conv2D(6, (3,3))
>>> c2 = Conv2D(6, (3,3))
>>> x = np.random.randn(1, 32, 32, 3)
>>>
>>> y1 = c1.forward(x)
>>> y2 = c2.forward(x) # No problem here
>>>
>>> c2.set_params(c1.summary()) # The act_fn of c2 is overridden as a str
<numpy_ml.neural_nets.layers.layers.Conv2D object at 0x7f0bf9d405f8>
>>> y3 = c2.forward(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ramuri01/numpy-ml/numpy_ml/neural_nets/layers/layers.py", line 2822, in forward
Y = self.act_fn(Z)
TypeError: 'str' object is not callable
Hello! I have a question regarding the implementation of gradient function of WGAN-GP (https://github.com/ddbourgin/numpy-ml/blob/master/numpy_ml/neural_nets/losses/losses.py#L497). I'm not sure why epsilon
is added to X_interp_norm
. I'm getting the same gradient except for the epsilon term. Also, the gradient is computed with respect to GradInterp
, shouldn't the gradient be computed with respect to the mixed image x'
?
I could be wrong about some of this. Looking forward to hearing from you.
System information
Describe the current behavior
I have copy-pasted the code for numpy_ml.linear_model.LinearRegression from GitHub and ran .fit() and .predict() on some dummy data. I got a ValueError on .predict() like this:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-10-4be896198177> in <module>()
----> 1 npml_lin_reg2_preds = npml_lin_reg2.predict(X_val)
2 npml_lin_reg2_preds[:10]
<ipython-input-8-fc521849e158> in predict(self, X)
206 if self.fit_intercept:
207 X = np.c_[np.ones(X.shape[0]), X]
--> 208 return X @ self.beta
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 11)
Describe the expected behavior
Expected behaviour is that .predict() doesn't generate a ValueError.
Code to reproduce the issue
No code; here is the link to the notebook: https://colab.research.google.com/drive/12q9r2j4-UpUrPnzvMiPC6rxafa73cY5L?usp=sharing
Other info / logs
Is there any documentation on how to build a network?
I want to try to implement a simple network, for example on the MNIST dataset.
If there is no documentation, I think we can write one. For example, in Keras, we can build a model like this:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
flake8 testing of https://github.com/ddbourgin/numpy-ml on Python 3.7.1
$ flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
./numpy_ml/rl_models/trainer.py:75:61: F821 undefined name 'smooth_tot'
smooth_tot = tot_rwd if ep == 0 else (1 - sf) * smooth_tot + sf * tot_rwd
^
./numpy_ml/neural_nets/models/w2v.py:303:38: F821 undefined name 'smooth_loss'
smooth_loss = 0.99 * smooth_loss + 0.01 * loss if ix > 0 else loss
^
./numpy_ml/neural_nets/layers/layers.py:1845:14: F632 use ==/!= to compare str, bytes, and int literals
elif self.pool is "sum":
^
./numpy_ml/neural_nets/layers/layers.py:1848:14: F632 use ==/!= to compare str, bytes, and int literals
elif self.pool is "mean":
^
2 F632 use ==/!= to compare str, bytes, and int literals
2 F821 undefined name 'smooth_loss'
4
E901, E999, F821, F822, and F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues, which are merely "style violations" -- useful for readability but they do not affect runtime safety.
F821: undefined name 'name'
F822: undefined name 'name' in __all__
From this link:
https://stackoverflow.com/questions/56085669/convolutional-layer-in-python-using-numpy
and
https://numpy.org/doc/stable/reference/generated/numpy.tensordot.html
Z = np.tensordot(X_pad, weights, axes=3) + self.bias
Is this function more suitable than using im2col?
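For comparison, here is a small sketch of a tensordot-based forward pass, assuming stride 1, no padding, no dilation, and the channels-last layout `(n, rows, cols, channels)`. It uses `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) to build the patches; the function names are illustrative and this is not how numpy-ml implements Conv2D. A naive loop is included as a reference.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_tensordot(X, W, b):
    """Cross-correlate X (n, rows, cols, in_ch) with W (fr, fc, in_ch, out_ch)."""
    fr, fc = W.shape[:2]
    # windows has shape (n, out_rows, out_cols, in_ch, fr, fc)
    windows = sliding_window_view(X, (fr, fc), axis=(1, 2))
    # Contract (fr, fc, in_ch) of the windows with the matching axes of W
    return np.tensordot(windows, W, axes=([4, 5, 3], [0, 1, 2])) + b


def conv2d_naive(X, W, b):
    """Reference implementation with explicit loops over output positions."""
    n, rows, cols, _ = X.shape
    fr, fc, _, out_ch = W.shape
    out = np.zeros((n, rows - fr + 1, cols - fc + 1, out_ch))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = X[:, i:i + fr, j:j + fc, :]
            out[:, i, j, :] = np.tensordot(patch, W, axes=3)
    return out + b


rng = np.random.default_rng(0)
X = rng.normal(size=(2, 6, 6, 3))
W = rng.normal(size=(3, 3, 3, 4))
b = rng.normal(size=4)
Z_fast = conv2d_tensordot(X, W, b)
Z_ref = conv2d_naive(X, W, b)
```

Note that `np.tensordot(patch, W, axes=3)` on its own only handles one patch position, so some patch extraction step (sliding windows here, im2col elsewhere) is still needed.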
Thank you for this package. I'm looking for some example on how to implement simple MLP (Multi Layer Perceptron) with this package. Any code snippets or tutorials are welcome.
Below is some code that I glued together, but I have no idea how to do backpropagation. I would like to have a fit() method implemented.
Thank you!
from collections import OrderedDict

from numpy_ml.neural_nets.losses import CrossEntropy, SquaredError
from numpy_ml.neural_nets.utils import minibatch
from numpy_ml.neural_nets.activations import ReLU, Sigmoid
from numpy_ml.neural_nets.layers import FullyConnected
from numpy_ml.neural_nets.optimizers.optimizers import SGD

optimizer = SGD()
loss = SquaredError()

class MLP:
    def __init__(self):
        # layers are applied in insertion order
        self.nn = OrderedDict()
        self.nn["L1"] = FullyConnected(
            10, act_fn="ReLU", optimizer=optimizer
        )
        self.nn["L2"] = FullyConnected(
            1, act_fn="Sigmoid", optimizer=optimizer
        )

    def forward(self, X, retain_derived=True):
        # Xs records the input to each layer
        Xs = {}
        out, rd = X, retain_derived
        for k, v in self.nn.items():
            Xs[k] = out
            out = v.forward(out, retain_derived=rd)
        return out, Xs

    def backward(self, grad, retain_grads=True):
        # dXs records the gradient fed into each layer, last layer first
        dXs = {}
        out, rg = grad, retain_grads
        for k, v in reversed(list(self.nn.items())):
            dXs[k] = out
            out = v.backward(out, retain_grads=rg)
        return out, dXs
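To illustrate the backpropagation and fit() steps being asked about, here is a minimal sketch in plain NumPy rather than with the numpy-ml layer classes. It assumes the same architecture as above (one ReLU hidden layer of width 10, one sigmoid output), a squared-error loss, and full-batch gradient descent. The toy data, parameter names, and learning rate are made up for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(rng, n_in, n_hidden):
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(p, X):
    Z1 = X @ p["W1"] + p["b1"]
    A1 = np.maximum(Z1, 0.0)                 # ReLU
    Y_hat = sigmoid(A1 @ p["W2"] + p["b2"])  # sigmoid output
    return Z1, A1, Y_hat


def fit(p, X, y, lr=0.5, n_epochs=200):
    losses = []
    for _ in range(n_epochs):
        Z1, A1, Y_hat = forward(p, X)
        losses.append(0.5 * np.mean((Y_hat - y) ** 2))
        # Backward pass: chain rule from the loss back to each parameter
        dZ2 = (Y_hat - y) * Y_hat * (1.0 - Y_hat) / len(X)
        grads = {"W2": A1.T @ dZ2, "b2": dZ2.sum(axis=0)}
        dZ1 = (dZ2 @ p["W2"].T) * (Z1 > 0)
        grads["W1"] = X.T @ dZ1
        grads["b1"] = dZ1.sum(axis=0)
        # Gradient descent step on every parameter
        for name, g in grads.items():
            p[name] -= lr * g
    return losses


rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)  # toy binary target
params = init_params(rng, n_in=3, n_hidden=10)
losses = fit(params, X, y)
```

With the library's layers, the same loop would presumably call each layer's forward, backward, and update methods in turn, but the exact call sequence is not confirmed in this thread.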
import pandas as pd
from sklearn.model_selection import train_test_split
from numpy_ml.trees import DecisionTree  # import path assumed; the report does not show it

data = pd.read_csv('Data/Bankloan.csv', sep=';')
# these columns use a decimal comma, so convert them to floats
for i in ['debtinc', 'creddebt', 'othdebt']:
    data[i] = data[i].str.replace(',', '.').astype('float')
train, test, y_train, y_test = train_test_split(data.drop('default', axis=1),
                                                data['default'],
                                                test_size=0.3,
                                                stratify=data['default'],
                                                random_state=42)
X_train = pd.get_dummies(train)
X_test = pd.get_dummies(test)
tree = DecisionTree(seed=42, max_depth=4, n_feats=2)
tree.fit(X_train.values, y_train.values)
ValueError Traceback (most recent call last)
in
1 tree = DecisionTree(seed=42, max_depth=4, n_feats=2)
----> 2 tree.fit(X_train.values, y_train.values)
in fit(self, X, Y)
78 self.n_classes = max(Y) + 1 if self.classifier else None
79 self.n_feats = X.shape[1] if not self.n_feats else min(self.n_feats, X.shape[1])
---> 80 self.root = self._grow(X, Y)
81
82 def predict(self, X):
in _grow(self, X, Y, cur_depth)
138
139 # grow the children that result from the split
--> 140 left = self._grow(X[l, :], Y[l], cur_depth)
141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))
in _grow(self, X, Y, cur_depth)
139 # grow the children that result from the split
140 left = self._grow(X[l, :], Y[l], cur_depth)
--> 141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))
143
in _grow(self, X, Y, cur_depth)
139 # grow the children that result from the split
140 left = self._grow(X[l, :], Y[l], cur_depth)
--> 141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))
143
in _grow(self, X, Y, cur_depth)
133
134 # greedily select the best split according to criterion
--> 135 feat, thresh = self._segment(X, Y, feat_idxs)
136 l = np.argwhere(X[:, feat] <= thresh).flatten()
137 r = np.argwhere(X[:, feat] > thresh).flatten()
in _segment(self, X, Y, feat_idxs)
155 gains = np.array([self._impurity_gain(Y, t, vals) for t in thresholds])
156
--> 157 if gains.max() > best_gain:
158 split_idx = i
159 best_gain = gains.max()
/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py in _amax(a, axis, out, keepdims, initial, where)
28 def _amax(a, axis=None, out=None, keepdims=False,
29 initial=_NoValue, where=True):
---> 30 return umr_maximum(a, axis, None, out, keepdims, initial, where)
31
32 def _amin(a, axis=None, out=None, keepdims=False,
ValueError: zero-size array to reduction operation maximum which has no identity
Link to dataset https://drive.google.com/file/d/1lj7qUyG7BOV6cAGm8-tDNUqS62IEgk5p/view?usp=sharing
Thanks for running this project! I am a beginner, and there is something I didn't understand: the _backward_naive function in Conv2D does not flip the weights. Am I missing something?
def _backward_naive(self, dLdy, retain_grads=True):
    assert self.trainable, "Layer is frozen"
    if not isinstance(dLdy, list):
        dLdy = [dLdy]
    W = self.parameters["W"]
    b = self.parameters["b"]
    Zs = self.derived_variables["Z"]
    Xs, d = self.X, self.dilation
    (fr, fc), s, p = self.kernel_shape, self.stride, self.pad
    dXs = []
    for X, Z, dy in zip(Xs, Zs, dLdy):
        n_ex, out_rows, out_cols, out_ch = dy.shape
        X_pad, (pr1, pr2, pc1, pc2) = pad2D(X, p, self.kernel_shape, s, d)
        dZ = dLdy * self.act_fn.grad(Z)
        dX = np.zeros_like(X_pad)
        dW, dB = np.zeros_like(W), np.zeros_like(b)
        for m in range(n_ex):
            for i in range(out_rows):
                for j in range(out_cols):
                    for c in range(out_ch):
                        # compute window boundaries w. stride and dilation
                        i0, i1 = i * s, (i * s) + fr * (d + 1) - d
                        j0, j1 = j * s, (j * s) + fc * (d + 1) - d
                        wc = W[:, :, :, c]
                        kernel = dZ[m, i, j, c]
                        window = X_pad[m, i0 : i1 : (d + 1), j0 : j1 : (d + 1), :]
                        dB[:, :, :, c] += kernel
                        dW[:, :, :, c] += window * kernel
                        dX[m, i0 : i1 : (d + 1), j0 : j1 : (d + 1), :] += (
                            wc * kernel
                        )
        if retain_grads:
            self.gradients["W"] += dW
            self.gradients["b"] += dB
        pr2 = None if pr2 == 0 else -pr2
        pc2 = None if pc2 == 0 else -pc2
        dXs.append(dX[:, pr1:pr2, pc1:pc2, :])
    return dXs[0] if len(Xs) == 1 else dXs
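A possible explanation, sketched in one dimension: when the input gradient is built by scatter-adding `w * dy[i]` into each window (as the loop above does with `wc * kernel`), the flip may already be implicit. The toy example below assumes stride 1, no padding, and no dilation, and compares the scatter-add result with `np.convolve`, which performs a true (flipped) convolution. The helper names are illustrative.

```python
import numpy as np


def corr1d_forward(x, w):
    """Forward pass as cross-correlation (no flip), as in most DL layers."""
    n_out = len(x) - len(w) + 1
    return np.array([x[i:i + len(w)] @ w for i in range(n_out)])


def corr1d_backward_scatter(dy, w, n_in):
    """Input gradient by scatter-add, mirroring the naive loop structure."""
    dx = np.zeros(n_in)
    for i, g in enumerate(dy):
        dx[i:i + len(w)] += w * g
    return dx


x = np.array([1.0, 2.0, -1.0, 0.5, 3.0])
w = np.array([0.2, -0.5, 1.0])
dy = np.array([1.0, -2.0, 0.5])  # upstream gradient, same length as the output

dx_scatter = corr1d_backward_scatter(dy, w, len(x))
# "full" convolution flips w internally
dx_flipped = np.convolve(dy, w, mode="full")
```

In this toy case the two results coincide, which suggests that an explicit flip is only needed when the backward pass is written as a correlation over the padded upstream gradient.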
There is no clustering in the project apart from EM for Gaussian mixtures. Hence, I would like to implement the k-means algorithm, both the common hard-clustering version and the soft-clustering derivation. Once I get a go-ahead, I will raise a PR within the next few days.
The hard version of K-means will follow the implementation in this slide
The soft version of K-means will also follow the implementation in this slide
I had written up efficient implementations of both before checking the contribution guide, which specifies that an issue must be opened first. Please give your approval and I will raise the PR right away.
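For readers unfamiliar with the hard-clustering variant, here is a minimal sketch of Lloyd's algorithm. It is not the implementation from the slides referenced above (which are not available here); the function name, the explicit `init_centers` argument, and the toy data are assumptions made for illustration.

```python
import numpy as np


def kmeans_hard(X, init_centers, n_iters=100):
    """Lloyd's algorithm: alternate hard assignments and mean updates."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iters):
        # Distance from every point to every center: shape (n_points, k)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
# Deliberately start both centers inside the same blob
centers, labels = kmeans_hard(X, init_centers=X[[0, 1]])
```

The soft variant would replace the argmin assignment with per-cluster responsibilities, in the same spirit as the existing EM code for Gaussian mixtures.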
The current update method for the LinearRegression estimator relies on the Sherman-Morrison formula to update the (inverse) covariance matrix for a single new example. This is computationally efficient, but limits the use of the method to single-example updates.
It would be great to also accept multiple examples (e.g., arrays of dimension NxM rather than just 1xM) using the Woodbury matrix identity. This will be computationally more expensive as it will require a matrix inversion, but potentially more valuable for certain use-cases.
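A sketch of the batch update via the Woodbury matrix identity, assuming the quantity being maintained is the inverse of `X.T @ X`. The function name `woodbury_update` is hypothetical; the result is checked against a direct inversion.

```python
import numpy as np


def woodbury_update(A_inv, X_new):
    """Return inv(A + X_new.T @ X_new) given A_inv = inv(A).

    Only an (n_new, n_new) system is solved, where n_new is the number
    of new examples.
    """
    n_new = X_new.shape[0]
    S = np.eye(n_new) + X_new @ A_inv @ X_new.T
    return A_inv - A_inv @ X_new.T @ np.linalg.solve(S, X_new @ A_inv)


rng = np.random.default_rng(0)
X_old = rng.normal(size=(20, 4))
X_new = rng.normal(size=(5, 4))   # a batch of 5 new examples

A = X_old.T @ X_old
A_inv = np.linalg.inv(A)
updated = woodbury_update(A_inv, X_new)
direct = np.linalg.inv(A + X_new.T @ X_new)
```

With a single new row, `S` is 1x1 and the expression reduces to the Sherman-Morrison update mentioned above.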
Many linear regression implementations operate in a static mode, which means that we have to retrain to get the parameters after receiving a new data point. This limits the functionality of linear regression and makes working in real time difficult.
I will implement this online formulation using ideas from the paper attached to this issue. It is based on the recursive least squares formulation. I have implemented it in one of my past projects. If I get approval, I will clean it up and create a new PR.
@ddbourgin I have an issue where updates to gradients cannot be performed because shapes conflict during backprop, specifically in the optimizer file.
Error reads:
C[param_name]["mean"] = d1 * mean + (1 - d1) * param_grad
ValueError: operands could not be broadcast together with shapes (100,10) (3072,100)
Model architecture is as follows:
Input -> n_samples, 3072
FC1 -> 3072, 100
FC2 -> 100, 10
The model code is as follows:
def _build_model(self):
    self.model = OrderedDict()
    self.model['fc1'] = FullyConnected(n_out=self.layers[0],
                                       act_fn=ReLU(),
                                       init=self.initializer,
                                       optimizer=self.optimizer)
    self.model['fc2'] = FullyConnected(n_out=self.layers[1],
                                       act_fn=Affine(slope=1, intercept=0),
                                       init=self.initializer,
                                       optimizer=self.optimizer)
    self.model['out'] = Softmax(dim=-1,
                                optimizer=self.optimizer)

@property
def parameters(self):
    return {k: v.parameters for k, v in self.model.items()}

@property
def hyperparameters(self):
    return {k: v.hyperparameters for k, v in self.model.items()}

@property
def derived_variables(self):
    return {k: v.derived_variables for k, v in self.model.items()}

@property
def gradients(self):
    return {k: v.gradients for k, v in self.model.items()}

def forward(self, x):
    out = x
    for k, v in self.model.items():
        out = v.forward(out)
    return out

def backward(self, y, y_pred):
    """Compute dLdy and then backprop through the layers in self.model"""
    dY_pred = self.loss.grad(y, y_pred)
    for k, v in reversed(list(self.model.items())):
        dY_pred = v.backward(dY_pred)
        self._dv['d' + k] = dY_pred
    return dY_pred

def update(self, cur_loss):
    """Perform gradient updates"""
    for k, v in reversed(list(self.model.items())):
        v.update(cur_loss)
    self.flush_gradients()
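One possible cause, not confirmed in this thread, is that the same optimizer instance is passed to every layer, so that per-parameter state stored under a name such as "W" is shared between layers with different weight shapes. The toy below reproduces that kind of broadcast error with a hypothetical `ToyMomentum` optimizer; it is not the numpy-ml optimizer, and the small shapes stand in for (3072, 100) and (100, 10).

```python
import numpy as np


class ToyMomentum:
    """Hypothetical optimizer that caches state per parameter name."""

    def __init__(self, lr=0.1, momentum=0.9):
        self.lr, self.momentum = lr, momentum
        self.cache = {}

    def update(self, param, grad, name):
        v = self.cache.get(name, np.zeros_like(grad))
        v = self.momentum * v + self.lr * grad  # fails if shapes differ
        self.cache[name] = v
        return param - v


W1, W2 = np.zeros((6, 4)), np.zeros((4, 2))

# Shared instance: both layers write to cache["W"]
shared = ToyMomentum()
shared.update(W1, np.ones_like(W1), "W")
try:
    shared.update(W2, np.ones_like(W2), "W")
    shared_ok = True
except ValueError:
    shared_ok = False

# Separate instances: each layer keeps its own cache
opt1, opt2 = ToyMomentum(), ToyMomentum()
new_W1 = opt1.update(W1, np.ones_like(W1), "W")
new_W2 = opt2.update(W2, np.ones_like(W2), "W")
```

If this is what happens in the model above, giving each layer its own optimizer object (or a copy made with `copy.deepcopy`) would be one thing to try.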
Hoping we can fix this and also create an example for people to follow. Thanks
Hello!
I found an AI-Specific Code smell in your project.
The smell is called: Columns and DataType Not Explicitly Set
You can find more information about it in this paper: https://dl.acm.org/doi/abs/10.1145/3522664.3528620.
According to the paper, the smell is described as follows:
Problem | If the columns are not selected explicitly, it is not easy for developers to know what to expect in the downstream data schema. If the datatype is not set explicitly, it may silently continue the next step even though the input is unexpected, which may cause errors later. The same applies to other data importing scenarios. |
---|---|
Solution | It is recommended to set the columns and DataType explicitly in data processing. |
Impact | Readability |
Example:
### Pandas Column Selection
import pandas as pd
df = pd.read_csv('data.csv')
+ df = df[['col1', 'col2', 'col3']]
### Pandas Set DataType
import pandas as pd
- df = pd.read_csv('data.csv')
+ df = pd.read_csv('data.csv', dtype={'col1': 'str', 'col2': 'int', 'col3': 'float'})
You can find the code related to this smell in this link:
numpy-ml/numpy_ml/rl_models/rl_utils.py
Lines 218 to 238 in 4f37707
I also found instances of this smell in other files, such as:
.
I hope this information is helpful!
I try to implement the loss function, check the pull request for details.
Please check the comment of class.
TODO.
Compare with scipy.cosine
as gold.
TODO.
Hi ddbourgin, I learned a lot from your repo, thanks.
When I read the tree/dt.py script, I had some questions for you.
Code:
if self.depth >= self.max_depth:
    v = np.mean(Y, axis=0)
    if self.classifier:
        v = np.bincount(Y, minlength=self.n_classes) / len(Y)
    return Leaf(v)
N, M = X.shape
self.depth += 1
feat_idxs = np.random.choice(M, self.n_feats, replace=False)
# greedily select the best split according to `criterion`
feat, thresh = self._segment(X, Y, feat_idxs)
l = np.argwhere(X[:, feat] <= thresh).flatten()
r = np.argwhere(X[:, feat] > thresh).flatten()
# grow the children that result from the split
left = self._grow(X[l, :], Y[l])
right = self._grow(X[r, :], Y[r])
return Node(left, right, (feat, thresh))
When left meets the termination condition and becomes a Leaf, self.depth will have reached self.max_depth, so right will always be a Leaf. In this tree, the right node of each layer is a Leaf. Is that correct?
In this case, there are still many samples on the right node that will no longer be split.
I think this is a problem and will lead to less precise results.
Looking forward to your reply.
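The effect described here can be seen in a toy recursion that only counts leaves, assuming every node could be split if the depth limit allowed it. `SharedDepthTree` mimics the shared `self.depth` counter from the snippet above, while `grow_with_arg` passes the current depth down as an argument. Both names are illustrative.

```python
class SharedDepthTree:
    """Counts leaves when depth is tracked on the instance (shared state)."""

    def __init__(self, max_depth):
        self.max_depth = max_depth
        self.depth = 0

    def grow(self):
        if self.depth >= self.max_depth:
            return 1  # leaf
        self.depth += 1  # never decremented, so siblings see the new value
        return self.grow() + self.grow()


def grow_with_arg(cur_depth, max_depth):
    """Counts leaves when each call receives its own depth."""
    if cur_depth >= max_depth:
        return 1  # leaf
    return (grow_with_arg(cur_depth + 1, max_depth)
            + grow_with_arg(cur_depth + 1, max_depth))


leaves_shared = SharedDepthTree(max_depth=3).grow()
leaves_per_call = grow_with_arg(0, max_depth=3)
```

With a limit of 3, the shared counter produces a chain-shaped tree with 4 leaves, whereas the per-call depth produces the full 8 leaves.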
I am learning the implementation of BayesianLinearRegression through the numpy-ml project.
I copied the code here:
class BayesianLinearRegressionKnownVariance:
def __init__(self, b_mean=0, b_sigma=1, b_V=None, fit_intercept=True):
r"""
Bayesian linear regression model with known error variance and
conjugate Gaussian prior on model parameters.
Notes
-----
Uses a conjugate Gaussian prior on the model coefficients. The
posterior over model parameters is
.. math::
b \mid b_{mean}, \sigma^2, b_V \sim \mathcal{N}(b_{mean}, \sigma^2 b_V)
Ridge regression is a special case of this model where :math:`b_{mean}`
= 0, :math:`\sigma` = 1 and `b_V` = I (ie., the prior on `b` is a
zero-mean, unit covariance Gaussian).
Parameters
----------
b_mean : :py:class:`ndarray <numpy.ndarray>` of shape `(M,)` or float
The mean of the Gaussian prior on `b`. If a float, assume `b_mean` is
``np.ones(M) * b_mean``. Default is 0.
b_sigma : float
A scaling term for covariance of the Gaussian prior on `b`. Default
is 1.
b_V : :py:class:`ndarray <numpy.ndarray>` of shape `(N,N)` or `(N,)` or None
A symmetric positive definite matrix that when multiplied
element-wise by `b_sigma^2` gives the covariance matrix for the
Gaussian prior on `b`. If a list, assume ``b_V = diag(b_V)``. If None,
assume `b_V` is the identity matrix. Default is None.
fit_intercept : bool
Whether to fit an intercept term in addition to the coefficients in
b. If True, the estimates for b will have `M + 1` dimensions, where
the first dimension corresponds to the intercept. Default is True.
"""
# this is a placeholder until we know the dimensions of X
b_V = 1.0 if b_V is None else b_V
if isinstance(b_V, list):
b_V = np.array(b_V)
if isinstance(b_V, np.ndarray):
if b_V.ndim == 1:
b_V = np.diag(b_V)
elif b_V.ndim == 2:
fstr = "b_V must be symmetric positive definite"
assert is_symmetric_positive_definite(b_V), fstr
self.posterior = {}
self.posterior_predictive = {}
self.b_V = b_V
self.b_mean = b_mean
self.b_sigma = b_sigma
self.fit_intercept = fit_intercept
def fit(self, X, y):
"""
Compute the posterior over model parameters using the data in `X` and
`y`.
Parameters
----------
X : :py:class:`ndarray <numpy.ndarray>` of shape `(N, M)`
A dataset consisting of `N` examples, each of dimension `M`.
y : :py:class:`ndarray <numpy.ndarray>` of shape `(N, K)`
The targets for each of the `N` examples in `X`, where each target
has dimension `K`.
"""
# convert X to a design matrix if we're fitting an intercept
if self.fit_intercept:
X = np.c_[np.ones(X.shape[0]), X]
N, M = X.shape
self.X, self.y = X, y
if is_number(self.b_V):
self.b_V *= np.eye(M)
if is_number(self.b_mean):
self.b_mean *= np.ones(M)
b_V = self.b_V
b_mean = self.b_mean
b_sigma = self.b_sigma
b_V_inv = np.linalg.inv(b_V)
L = np.linalg.inv(b_V_inv + X.T @ X)
R = b_V_inv @ b_mean + X.T @ y
# (b_v^{-1} + X^{\top}X)^{-1} @ (b_v^{-1}@b_mean + X^{\top}y)
mu = L @ R
cov = L * b_sigma ** 2
# posterior distribution over b conditioned on b_sigma
self.posterior["b"] = {"dist": "Gaussian", "mu": mu, "cov": cov}
LaTeX doesn't display well on GitHub, so I pasted a picture.
You can also look at this: https://stats.stackexchange.com/questions/477618/understand-the-implement-detail-of-bayesianlinearregression-in-python
We can implement a Regularizer for each layer. The regularizers are applied on a per-layer basis. Here are 3 things that we can do:
I will start a PR to solve 1 and 3. The rest should be discussed before coding.
System information
Describe the current behavior
Forward pass implementation of LSTM is incorrect
Describe the expected behavior
The code is currently as below. The forget and update gate biases have typos:
# compute the input to the gate functions at timestep t
_Go = Zt @ Wo + bo
_Gf = Zt @ Wf + bo
_Gu = Zt @ Wu + bo
_Gc = Zt @ Wc + bc
Code to reproduce the issue
Other info / logs
Could we have a faster implementation of max-pooling? The naive implementation is just not cutting it.
I even tried to reuse im2col to improve performance but didn't have any luck. The only option I had was to jit it.
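For the common special case where the stride equals the pool size and the spatial dimensions divide evenly, a reshape-based approach avoids Python loops entirely. The sketch below assumes the channels-last layout `(n, rows, cols, channels)`; the function names are illustrative, and overlapping windows or padding would need a different approach.

```python
import numpy as np


def max_pool_reshape(X, p):
    """Non-overlapping p x p max pooling via reshape (stride == p)."""
    n, rows, cols, ch = X.shape
    assert rows % p == 0 and cols % p == 0, "dims must be divisible by p"
    # Split each spatial axis into (number of blocks, position within block)
    blocks = X.reshape(n, rows // p, p, cols // p, p, ch)
    return blocks.max(axis=(2, 4))


def max_pool_naive(X, p):
    """Reference implementation with explicit loops."""
    n, rows, cols, ch = X.shape
    out = np.zeros((n, rows // p, cols // p, ch))
    for i in range(rows // p):
        for j in range(cols // p):
            window = X[:, i * p:(i + 1) * p, j * p:(j + 1) * p, :]
            out[:, i, j, :] = window.max(axis=(1, 2))
    return out


rng = np.random.default_rng(0)
X = rng.normal(size=(2, 8, 8, 3))
pooled_fast = max_pool_reshape(X, 2)
pooled_ref = max_pool_naive(X, 2)
```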
In the class "DotProductAttention(LayerBase)", I found this code:
def _fwd(self, Q, K, V):
    """Actual computation of forward pass"""
    scale = 1 / np.sqrt(Q.shape[-1]) if self.scale else 1
    scores = Q @ K.swapaxes(-2, -1) * scale  # attention scores
    weights = self.softmax.forward(scores)  # attention weights
    Y = weights @ V
    return Y, weights
Is "@" an arithmetic operator in Python?
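For context, `@` is the matrix multiplication operator introduced in Python 3.5 (PEP 465); for NumPy arrays it behaves like `np.matmul`. A short sketch with made-up shapes, mirroring the score computation quoted above:

```python
import numpy as np

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4, 8))  # (batch, n_queries, d_k), shapes are made up
K = rng.normal(size=(2, 5, 8))  # (batch, n_keys, d_k)
scale = 1 / np.sqrt(Q.shape[-1])

# swapaxes(-2, -1) transposes the last two axes of each batch element
scores_at = Q @ K.swapaxes(-2, -1) * scale
scores_matmul = np.matmul(Q, K.swapaxes(-2, -1)) * scale
```

Because `@` and `*` share the same precedence and group left to right, the expression is evaluated as `(Q @ K.swapaxes(-2, -1)) * scale`.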
Hi, I think there is a little bug at numpy-ml/numpy_ml/neural_nets/activations/activations.py Line 64.
Your code is
fn_x = self.fn_x
but self.fn_x is never defined.