ddbourgin / numpy-ml

Machine learning, in numpy

Home Page: https://numpy-ml.readthedocs.io/

License: GNU General Public License v3.0

Python 100.00%

machine-learning neural-networks topic-modeling gaussian-mixture-models hidden-markov-models gradient-boosting bayesian-inference wavenet vae resnet


numpy-ml's Issues

Possible bug in decision tree

I was trying to run decision tree model using the very simple Iris dataset and I found a possible bug:

In trees/dt.py, line 160:

The problem is that when the array "levels" has only one element, the thresholds array will be empty, and the code will crash on line 164 when it calls "gain.max()".

Please correct me if I'm wrong. Thank you for creating this useful repo!
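The failure mode described above can be reproduced in isolation. The sketch below assumes that candidate thresholds are the midpoints between consecutive unique feature values, which is an assumption about how dt.py builds them; `best_threshold` and `toy_gain` are hypothetical names used only for illustration, not numpy-ml functions.

```python
import numpy as np


def best_threshold(values, gain_fn):
    """Return (threshold, gain), or (None, -inf) when no split is possible."""
    levels = np.unique(values)
    # Midpoints between consecutive unique values; empty if only one level.
    thresholds = (levels[:-1] + levels[1:]) / 2
    if thresholds.size == 0:
        # A constant feature cannot be split, so skip it instead of calling
        # .max() on an empty array (which raises ValueError).
        return None, -np.inf
    gains = np.array([gain_fn(t) for t in thresholds])
    best = gains.argmax()
    return thresholds[best], gains[best]


def toy_gain(t):
    # Toy gain function: prefers thresholds close to 3.
    return -abs(t - 3.0)


no_split = best_threshold(np.array([3.0, 3.0, 3.0]), toy_gain)
split = best_threshold(np.array([1.0, 2.0, 4.0]), toy_gain)
```

With a guard like this, a feature with a single level is simply ignored when choosing the split, rather than aborting the whole fit.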

I would like to replace loops with vectorized operations, which should give a performance boost. Are such improvements accepted? One example is the function 'likelihood_lower_bound' in 'gmm.py'. If so, is it possible to get a sample of the data that is fed into the function?

no 'load_dataset' in numpy_mL

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 11
Python version: 3.10
NumPy version: 1.22

Describe the current behavior
AttributeError: module 'numpy_ml' has no attribute 'load_dataset'
Describe the expected behavior

Code to reproduce the issue

import numpy_ml as npml

# load dataset

data = npml.load_dataset("data.csv")

Other info / logs

AttributeError Traceback (most recent call last)
Cell In [8], line 4
1 import numpy_ml as npml
3 # load dataset
----> 4 data = npml.load_dataset("data.csv")

AttributeError: module 'numpy_ml' has no attribute 'load_dataset'

So maybe the attribute 'load_dataset' was discontinued in some version of numpy-ml? How should I replace this function to load data, and what are the requirements for the dataset?

many thanks.
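Since the traceback above shows that this numpy_ml install has no `load_dataset`, one option is to load the CSV with plain NumPy. The sketch below uses `np.genfromtxt`; the column layout (a header row, feature columns, then a target column in last place) is an assumption made for the example, and an in-memory string stands in for "data.csv".

```python
import io

import numpy as np

# Stand-in for "data.csv": a header row, two feature columns, one target column.
csv_text = "x1,x2,y\n1.0,2.0,0\n3.0,4.0,1\n5.0,6.0,1\n"

# For a real file, pass the path instead of the StringIO object.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# Split into a 2D feature array and a 1D integer label array.
X, y = data[:, :-1], data[:, -1].astype(int)
```

Arrays shaped like `X` (examples by features) and `y` (one label per example) are the usual inputs to a `fit(X, y)` style method.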

Feature request: save/load model to/from json

To easily use these models in a robust non-version-dependent way, an option to save/load model to/from json (either string or file) would be helpful.

E.g.

# save to json (can be stored anywhere, not just in file)
model_json = model.save_to_json()

# save to json file
with open(...) as f:
    model.save_to_json_file(f)   # create folder and parent folders if not exist

# load from json
model = Model.load_from_json(json)

# load from json file
with open(...) as f:
    Model.load_from_json_file(f)
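A minimal sketch of what the requested interface could look like. `TinyLinearModel` is a hypothetical stand-in, not a numpy-ml class, and the JSON layout (class name, hyperparameters, parameters) is an assumption chosen for the example.

```python
import json

import numpy as np


class TinyLinearModel:
    """Hypothetical stand-in for a fitted model with array parameters."""

    def __init__(self, beta=None, fit_intercept=True):
        self.beta = beta
        self.fit_intercept = fit_intercept

    def save_to_json(self):
        # ndarrays are not JSON serializable, so store them as nested lists.
        state = {
            "class": type(self).__name__,
            "hyperparameters": {"fit_intercept": self.fit_intercept},
            "parameters": {
                "beta": None if self.beta is None else self.beta.tolist()
            },
        }
        return json.dumps(state)

    @classmethod
    def load_from_json(cls, text):
        state = json.loads(text)
        if state["class"] != cls.__name__:
            raise ValueError("JSON describes a different model class")
        beta = state["parameters"]["beta"]
        return cls(
            beta=None if beta is None else np.asarray(beta),
            **state["hyperparameters"],
        )


model = TinyLinearModel(beta=np.array([0.5, -1.25, 2.0]))
restored = TinyLinearModel.load_from_json(model.save_to_json())
```

Storing only plain lists, numbers, and strings keeps the file independent of the NumPy and Python versions, which is the main point of the request.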

Declare your version of modules

Could you update the requirements.txt to specify the version of each module?

Do we have any graph embedding models?

like node2vec, HOPE, graph2vec

Installation not documented; couldn't find PyPi package or run tests

This is an awesome library, thanks @ddbourgin!!

Users might not know the best way to install this package and try it out. (I didn't, so I eventually just copied the source files.)
Neither the readme nor readthedocs have install instructions.

I couldn't find it on PyPi or Anaconda, and there doesn't appear to be a pyproject.toml, setup.cfg, setup.py, or conda recipe.

Moreover, the tests aren't in a standard path like tests/.
This is uncommon and therefore confusing, and it makes it harder to run them.
Edit: I wasn't expecting them under the source, so I initially wrote that I couldn't find them.

I think it would be great to document how to install numpy-ml and how to run its tests, since reading the tests helps clarify the behavior of some of the functions.

There are some great build and CI tools for Python available, which I recently learned how to use effectively. I'm happy to make a pull request if it would be helpful.

Best choice for my use case?

G'day, how's it going?

I've just started looking into machine learning stuff, and stumbled upon this, looks awesome!

I just want to know what kind of methods I should use for the following:

Text identification (Spam checker for example)
Image analysis (Detects whether the image given after training is a male or female)

Kind regards,

Machine-Learning newbie, Mitch!

Bug in initializers init_from_dict()

System information

OS Platform and Distribution: Linux Ubuntu 18.04
Python version: 3.6.9
NumPy version: 1.19.5

Describe the current behavior
When calling the init_from_dict() method from numpy_ml/neural_nets/initializers, in both the SchedulerInitializer and OptimizerInitializer classes, the returned object is None rather than a proper object.
This is caused by assigning the result of the set_params() method to the returned object. That method does not return an object; it modifies the instance itself.

numpy-ml/numpy_ml/neural_nets/initializers/initializers.py

Lines 176 to 194 in b537fac

def init_from_dict(self):
    O = self.param
    cc = O["cache"] if "cache" in O else None
    op = O["hyperparameters"] if "hyperparameters" in O else None

    if op is None:
        raise ValueError("Must have `hyperparemeters` key: {}".format(O))

    if op and op["id"] == "SGD":
        optimizer = SGD().set_params(op, cc)
    elif op and op["id"] == "RMSProp":
        optimizer = RMSProp().set_params(op, cc)
    elif op and op["id"] == "AdaGrad":
        optimizer = AdaGrad().set_params(op, cc)
    elif op and op["id"] == "Adam":
        optimizer = Adam().set_params(op, cc)
    elif op:
        raise NotImplementedError("{}".format(op["id"]))
    return optimizer

Describe the expected behavior
init_from_dict() should return a proper object that will be assigned to an attribute of a NN layer.

Code to reproduce the issue

from numpy_ml.neural_nets.layers import *

c1 = Conv2D(6, (3,3))
opt = c1.hyperparameters['optimizer'] # dict

c2=Conv2D(6, (3,3), optimizer=opt) # The optimizer is set to None
c2.hyperparameters

This raises AttributeError: 'NoneType' object has no attribute 'cache'.
The same happens with Scheduler.
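The pattern behind this report can be shown without numpy-ml. In the sketch below, `DemoOptimizer` is a hypothetical class whose `set_params` mutates the instance and returns nothing, which is the behavior the report describes.

```python
class DemoOptimizer:
    """Hypothetical optimizer whose set_params mutates in place."""

    def __init__(self):
        self.hyperparameters = {"lr": 0.01}
        self.cache = {}

    def set_params(self, hparam_dict=None, cache_dict=None):
        if hparam_dict is not None:
            self.hyperparameters.update(hparam_dict)
        if cache_dict is not None:
            self.cache.update(cache_dict)
        # No return statement: the call evaluates to None.


# Pattern from the report: the result of set_params() is kept.
broken = DemoOptimizer().set_params({"lr": 0.1}, {})

# One possible fix: build the object first, then configure it.
fixed = DemoOptimizer()
fixed.set_params({"lr": 0.1}, {})
```

An alternative with the same effect would be to end `set_params` with `return self`, so that the chained form keeps working.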

Can I write a K-means model and then open a pull request?

I can't find a K-means model, so I think I could code one. Thanks!

Import of collections.Hashable fails in Python 3.10

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 12.6
Python version: 3.10.7
NumPy version: 1.23.3

Describe the current behavior
import of numpy_ml fails due to an ImportError with the collections module.

Describe the expected behavior
Importing just the module should not generate an ImportError

Code to reproduce the issue

# Python 3.10 or newer
import numpy_ml

Other info / logs
In Python 3.10 the deprecated aliases were removed.

Remove deprecated aliases to Collections Abstract Base Classes from the collections module. (Contributed by Victor Stinner in bpo-37324.)

from What’s New In Python 3.10

Fix
To fix the bug, in /numpy_ml/utils/data_structures.py change

from collections import Hashable

to

from collections.abc import Hashable
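If the same file also had to import on very old interpreters, a guarded import is one way to write it. This is a sketch, not the change that was merged; on any Python 3.3 or newer the first branch is the one that runs.

```python
try:
    # Location of the abstract base classes since Python 3.3.
    from collections.abc import Hashable
except ImportError:
    # Fallback for interpreters that predate collections.abc.
    from collections import Hashable

string_is_hashable = isinstance("key", Hashable)
list_is_hashable = isinstance([1, 2, 3], Hashable)
```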

Automatic differentiation for neural networks

I originally wrote this in Spanish, which is easier for me; these days everybody has an online translator in the browser.

I have reviewed the code, not just the documentation. Your work is very valuable and of high quality, congratulations. Regarding neural networks, the lack of examples has been a difficulty. The only examples I could use were the ones in numpy_ml/neural_nets/models.

For neural networks, the main shortcoming is the absence of automatic differentiation, which forces you to code a backpropagation method for every type of layer or module. It also prevents the use of custom cost functions whose derivative has no known analytic expression.

I believe you are now working on other things, and I believe the repository is not maintained. If this repository is ever reactivated, or if someone forks it, I suggest that the neural network part include automatic differentiation.

activations.py optimizations

Taking a quick look, some of the grad and grad2 functions might benefit from some optimizations. Here's one example:

numpy-ml/numpy_ml/neural_nets/activations/activations.py

Lines 38 to 39 in fce2acf

def grad(self, x):
    return self.fn(x) * (1 - self.fn(x))

Here the function could be changed such that fn(x) is only computed once:

def grad(self, x):
    fn_x = self.fn(x)
    return fn_x * (1 - fn_x)

The extra mem used to store the calculation should be immediately collected after the function ends so that shouldn't be a problem. Would love a second opinion @ddbourgin before making a PR with the necessary changes.

How to use for word2vec training?

Sorry for the newbie question, but I'm having a bit of trouble in trying to use the library.

Specifically, what I'm trying to do is embedding training, like word2vec. So I am trying to set up an embedding matrix, a loss, and optimization using Adam.

Any pointers would be greatly appreciated.

implementation of Lasso regression

I don't see any lasso regression model in linear models. Can I implement the lasso regression model?

neural_nets/utils/utils.py line 797 has a bug!

hi, I think neural_nets/utils/utils.py line 797 has a bug!

your code

i0, i1 = i * s, (i * s) + fr * (d + 1) - d
j0, j1 = j * s, (j * s) + fc * (d + 1) - d

right code

i0, i1 = i * s, (i * s) + fr
j0, j1 = j * s, (j * s) + fc

because fr and fc are already the dilated sizes!
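Whether `fr` and `fc` are already dilated at that point depends on surrounding code that is not quoted here, so the following only works through the arithmetic of the two variants. It assumes `d` counts the positions skipped between kernel taps, and `window_rows` is a hypothetical helper that mirrors the slice `i0 : i1 : (d + 1)`.

```python
def window_rows(i, s, fr, d):
    """Row indices read by the slice i0:i1:(d + 1) from the quoted code."""
    i0, i1 = i * s, (i * s) + fr * (d + 1) - d
    return list(range(i0, i1, d + 1))


# Undilated kernel height 3, d = 2 skipped positions between taps, stride 1.
fr, d = 3, 2
effective_size = fr * (d + 1) - d  # span of the dilated kernel in rows

# If fr is the undilated size, the slice picks exactly fr rows.
rows = window_rows(i=0, s=1, fr=fr, d=d)

# If fr were already the dilated size, the same formula would dilate twice.
rows_if_already_dilated = window_rows(i=0, s=1, fr=effective_size, d=d)
```

Here `rows` is `[0, 3, 6]`, three taps spanning seven rows, while the second call yields seven taps spanning nineteen rows. So the original expression is consistent only if `fr` is the undilated kernel height, and the proposed one only if it is the dilated height.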

Yet another approach to implement Net module with numpy for vanilla neural network

Strengthen your network layers operator implementation

Hi, I just found this repo and I found that the network implementation is weak. Here I provide some examples showing how you could expand the neural network implementation, with tests covered:

Detailed explanation with Conv, BatchNorm, Relu, FullyConnected, UpSampling:
https://github.com/yiakwy/yiakwy.github.io/blob/master/Correlation%20Metrics/ConvNet/costImpl/ConvNet/layers.py

Support for Keras-like layer stacking syntax:

Unit test covered:
https://github.com/yiakwy/yiakwy.github.io/blob/master/Correlation%20Metrics/ConvNet/READMME.md

This implementation aims to give people the best understanding of some operators, such as upsampling.

Nowadays, many people already understand how to implement operators on CPU devices. We call this the vanilla implementation. It is still challenging to understand how to implement and optimize them on different devices.

When we talk about a neural network as a computing graph, we are actually interested in how to implement the planned operations on different devices. At the least, I need to provide the FLOPs needed by each operator as edge weights, and understand how to implement them and distribute them across different devices.

Naive Bayes

Hi, I am thinking about implementing Naive Bayes methods and making a pull request. But I am unsure about the unit testing part. Should I compare the performance with the scikit-learn library?

Check if trainable

If the conv layer is not trainable, the code must check for that and not update "W".

numpy-ml/numpy_ml/neural_nets/layers/layers.py

Line 2488 in 4f37707

self.parameters = {"W": W, "b": b}

this repo is too necessary / too good

eom

covariance update for multi dimension capability

I think this line:
outer = np.zeros((2, 2))
should change to
outer = np.zeros((self.d, self.d))

There is no CRF here? Why

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
Python version:
NumPy version:

Describe the current behavior

Describe the expected behavior

Code to reproduce the issue

Other info / logs

LogisticRegression regularization loss not right

Based on my understanding, the l1 and l2 losses should be:
d_penalty = gamma * np.square(beta).sum() if p == "l2" else gamma * np.abs(beta).sum()
instead of :
d_penalty = gamma * beta if p == "l2" else gamma * np.sign(beta)

numpy-ml/linear_models/lm.py

Line 171 in 165ad88

d_penalty = gamma * beta if p == "l2" else gamma * np.sign(beta)

Source: https://towardsdatascience.com/l1-and-l2-regularization-methods-ce25e7fc831c
Please double check!
Thanks
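One way to check the report is to compare the quoted `d_penalty` expression against a numerical gradient of the penalty itself. The sketch below assumes the L2 penalty uses the common `gamma / 2` scaling, which is an assumption and not something the quoted line confirms. `penalty` and `numerical_grad` are helper names introduced here for illustration.

```python
import numpy as np


def penalty(beta, gamma, p):
    # L2 uses the common gamma / 2 scaling (an assumption); L1 uses gamma.
    if p == "l2":
        return 0.5 * gamma * np.square(beta).sum()
    return gamma * np.abs(beta).sum()


def d_penalty(beta, gamma, p):
    # Same form as the line quoted from lm.py.
    return gamma * beta if p == "l2" else gamma * np.sign(beta)


def numerical_grad(f, beta, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(beta)
    for i in range(beta.size):
        step = np.zeros_like(beta)
        step[i] = eps
        grad[i] = (f(beta + step) - f(beta - step)) / (2 * eps)
    return grad


beta = np.array([0.7, -1.3, 2.1])
gamma = 0.5

l2_matches = np.allclose(
    numerical_grad(lambda b: penalty(b, gamma, "l2"), beta),
    d_penalty(beta, gamma, "l2"),
)
l1_matches = np.allclose(
    numerical_grad(lambda b: penalty(b, gamma, "l1"), beta),
    d_penalty(beta, gamma, "l1"),
)
```

Under that assumption both checks pass, which suggests the quoted line is the gradient of the penalty (as the `d_` prefix hints) rather than the penalty value. The expressions proposed in the report are the penalty values themselves.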

More advanced activation functions.

I will try to implement rest part of activation functions. Any better ideas for activation function?

[Question] Why does `gamma * beta` stand for L2 in LogisticRegression._NLL_grad?

Hello, this is a great project. I am learning how to implement models without sklearn/tensorflow, and it really helps me a lot.

I have a question on

numpy-ml/numpy_ml/linear_models/lm.py

Line 252 in 4f37707

d_penalty = gamma * beta if p == "l2" else gamma * l1norm(beta) * np.sign(beta)

Since the p-norm is defined as $\|x\|_p = \left(\sum_i |x_i|^p\right)^{1/p}$,

l1norm(self.beta) is the sum of the absolute values of the elements of self.beta. I don't quite understand why the simple gamma * beta stands for L2?

PS: May I ask what IDE and code documentation plugin you are using? I see some annotations that don't seem to belong to LaTeX; it would be nice to see beautiful math symbols rather than raw LaTeX :)
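A short derivation may help with the question above. It assumes the L2 penalty is written as $\frac{\gamma}{2}\|\beta\|_2^2$ and the L1 penalty as $\gamma\|\beta\|_1$; that scaling is an assumption, not something the quoted line confirms.

```latex
\frac{\partial}{\partial \beta_j}\left[\frac{\gamma}{2}\,\|\beta\|_2^2\right]
  = \frac{\partial}{\partial \beta_j}\left[\frac{\gamma}{2}\sum_i \beta_i^2\right]
  = \gamma\,\beta_j,
\qquad
\frac{\partial}{\partial \beta_j}\Bigl[\gamma\,\|\beta\|_1\Bigr]
  = \frac{\partial}{\partial \beta_j}\Bigl[\gamma \sum_i |\beta_i|\Bigr]
  = \gamma\,\operatorname{sign}(\beta_j) \quad (\beta_j \neq 0).
```

Under that assumption, `gamma * beta` is the gradient of the L2 penalty, not the penalty itself, which is why no norm appears in it. The extra `l1norm(beta)` factor in the L1 branch of the quoted line is not explained by this derivation.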

Hidden Topic Markov Model

It would be good to have HTMM implementation here like https://github.com/Charleo85/pyhtmm

Bug in transfer learning

System information

OS Platform and Distribution: Linux Ubuntu 18.04
Python version: 3.6.9
NumPy version: 1.19.5

Describe the current behavior
There is a problem when trying to perform simple transfer learning techniques (loading the same parameters from another trained layers/models).
When setting layer params with a layer summary dictionary (generated with the summary() method), the activation function can be overridden with a string due to the non-exclusive if-clauses:

numpy-ml/numpy_ml/neural_nets/layers/layers.py

Lines 119 to 127 in b537fac

if k in self.hyperparameters:
    if k == "act_fn":
        layer.act_fn = ActivationInitializer(v)()
    if k == "optimizer":
        layer.optimizer = OptimizerInitializer(sd[k])()
    if k not in ["wrappers", "optimizer"]:
        setattr(layer, k, v)
    if k == "wrappers":
        layer = init_wrappers(layer, sd[k])

This causes an error when trying to call the layer activation function in the forward() method.

Describe the expected behavior
Layers that get their parameters with the set_params() method should behave without errors.

Code to reproduce the issue

>>> import numpy as np
>>> from numpy_ml.neural_nets.layers import *
>>>
>>> c1 = Conv2D(6, (3,3))
>>> c2 = Conv2D(6, (3,3))
>>> x = np.random.randn(1, 32, 32, 3)
>>>
>>> y1 = c1.forward(x)
>>> y2 = c2.forward(x) # No problem here
>>>
>>> c2.set_params(c1.summary()) # The act_fn of c2 is overridden as a str
<numpy_ml.neural_nets.layers.layers.Conv2D object at 0x7f0bf9d405f8>
>>> y3 = c2.forward(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ramuri01/numpy-ml/numpy_ml/neural_nets/layers/layers.py", line 2822, in forward
    Y = self.act_fn(Z)
TypeError: 'str' object is not callable
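The overwrite happens because the `act_fn` branch and the generic `setattr` branch both run for the same key. A sketch of an exclusive `if`/`elif` chain follows; `DemoLayer`, `make_activation`, and `set_params_exclusive` are hypothetical stand-ins used for illustration, not numpy-ml code.

```python
class DemoLayer:
    """Hypothetical layer; only the attribute-setting logic matters here."""

    act_fn = None
    optimizer = None


def make_activation(name):
    # Stand-in for ActivationInitializer(name)(): returns a callable.
    table = {"ReLU": lambda z: max(z, 0.0), "Identity": lambda z: z}
    return table[name]


def set_params_exclusive(layer, summary):
    for k, v in summary.items():
        if k == "act_fn":
            layer.act_fn = make_activation(v)
        elif k == "optimizer":
            layer.optimizer = dict(v)  # placeholder for OptimizerInitializer
        elif k != "wrappers":
            # Reached only for keys not handled above, so the callable
            # stored for "act_fn" is never replaced by its string name.
            setattr(layer, k, v)
    return layer


layer = set_params_exclusive(
    DemoLayer(), {"act_fn": "ReLU", "n_out": 6, "optimizer": {"id": "SGD"}}
)
```

Adding `"act_fn"` to the exclusion list of the third condition in the quoted code would be a smaller change with the same effect.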

[Question] Gradient of Gradient Penalty in WGAN-GP.

Hello! I have a question regarding the implementation of gradient function of WGAN-GP (https://github.com/ddbourgin/numpy-ml/blob/master/numpy_ml/neural_nets/losses/losses.py#L497). I'm not sure why epsilon is added to X_interp_norm. I'm getting the same gradient except for the epsilon term. Also, the gradient is computed with respect to GradInterp , shouldn't the gradient be computed with respect to the mixed image x'?

I could be wrong about some of this. Looking forward to hearing from you.

`numpy_ml.linear_model.LinearRegression.predict()` generates `ValueError` when used with copy-pasted code, but pip installed version works as expected!!

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
Python version: 3.7.12
NumPy version: 1.21.5
(environment is Google Colab on 20-Mar, 2022.)

Describe the current behavior
I have copy-pasted the code for numpy_ml.linear_model.LinearRegression from github and did .fit() and .predict() on some dummy data. I got ValueError on .predict() like this:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-4be896198177> in <module>()
----> 1 npml_lin_reg2_preds = npml_lin_reg2.predict(X_val)
      2 npml_lin_reg2_preds[:10]

<ipython-input-8-fc521849e158> in predict(self, X)
    206         if self.fit_intercept:
    207             X = np.c_[np.ones(X.shape[0]), X]
--> 208         return X @ self.beta

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 11)

Describe the expected behavior
Expected behaviour is that .predict() doesn't generate ValueError.

Code to reproduce the issue
not code, here is the link to the notebook: https://colab.research.google.com/drive/12q9r2j4-UpUrPnzvMiPC6rxafa73cY5L?usp=sharing

Other info / logs

Docs: add Community profile

Issue Template
Pull Request Template
Code of Conduct

Usage to build CNN Network

Is there any documentation for usage to build a network?

I want to try to implement some simple network based on for example MNIST dataset.

If there is no documentation, i think we can write one. For example, in keras, we can have model built like this:

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

Undefined names in Python code

flake8 testing of https://github.com/ddbourgin/numpy-ml on Python 3.7.1

$ flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics

./numpy_ml/rl_models/trainer.py:75:61: F821 undefined name 'smooth_tot'
            smooth_tot = tot_rwd if ep == 0 else (1 - sf) * smooth_tot + sf * tot_rwd
                                                            ^
./numpy_ml/neural_nets/models/w2v.py:303:38: F821 undefined name 'smooth_loss'
                smooth_loss = 0.99 * smooth_loss + 0.01 * loss if ix > 0 else loss
                                     ^
./numpy_ml/neural_nets/layers/layers.py:1845:14: F632 use ==/!= to compare str, bytes, and int literals
        elif self.pool is "sum":
             ^
./numpy_ml/neural_nets/layers/layers.py:1848:14: F632 use ==/!= to compare str, bytes, and int literals
        elif self.pool is "mean":
             ^
2     F632 use ==/!= to compare str, bytes, and int literals
2     F821 undefined name 'smooth_loss'
4

E901, E999, F821, F822, F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues, which are merely "style violations": useful for readability, but they do not affect runtime safety.

F821: undefined name name
F822: undefined name name in __all__
F823: local variable name referenced before assignment
E901: SyntaxError or IndentationError
E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree

Using numpy.tensordot for Conv2D

From this link:
https://stackoverflow.com/questions/56085669/convolutional-layer-in-python-using-numpy

and

https://numpy.org/doc/stable/reference/generated/numpy.tensordot.html

Z = np.tensordot(X_pad, weights, axes=3) + self.bias

Is this approach more suitable than using im2col?
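For comparison, here is a sketch of a `tensordot`-based forward pass checked against an explicit loop. It covers only stride 1 with no padding and no dilation, and it assumes the channels-last layout `(n_ex, rows, cols, in_ch)`. `conv2d_tensordot` and `conv2d_naive` are names introduced for this example, and `sliding_window_view` requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20


def conv2d_tensordot(X, W, b):
    """Stride-1, unpadded, undilated 2D cross-correlation.

    X: (n_ex, rows, cols, in_ch), W: (fr, fc, in_ch, out_ch), b: (out_ch,)
    """
    fr, fc = W.shape[:2]
    # windows has shape (n_ex, out_rows, out_cols, in_ch, fr, fc).
    windows = sliding_window_view(X, (fr, fc), axis=(1, 2))
    # Contract (in_ch, fr, fc) of the windows with the matching axes of W.
    return np.tensordot(windows, W, axes=([3, 4, 5], [2, 0, 1])) + b


def conv2d_naive(X, W, b):
    n_ex, rows, cols, _ = X.shape
    fr, fc, _, out_ch = W.shape
    out = np.zeros((n_ex, rows - fr + 1, cols - fc + 1, out_ch))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            window = X[:, i:i + fr, j:j + fc, :]  # (n_ex, fr, fc, in_ch)
            # axes=3 contracts the last three axes of the window with the
            # first three axes of W, one output position at a time.
            out[:, i, j, :] = np.tensordot(window, W, axes=3) + b
    return out


rng = np.random.default_rng(0)
X = rng.normal(size=(2, 6, 5, 3))
W = rng.normal(size=(3, 2, 3, 4))
b = rng.normal(size=4)

Z_fast = conv2d_tensordot(X, W, b)
Z_slow = conv2d_naive(X, W, b)
```

The `axes=3` form from the question is what the inner loop uses: it applies to one window at a time. Computing every window at once needs some way of gathering the windows first, which is also what im2col does.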

Example of MLP architecture

Thank you for this package. I'm looking for some example on how to implement simple MLP (Multi Layer Perceptron) with this package. Any code snippets or tutorials are welcome.

Below is some code that I glued together, but I have no idea how to do backpropagation. I would like to have a fit() method implemented.

Thank you!

from collections import OrderedDict

from numpy_ml.neural_nets.losses import CrossEntropy, SquaredError
from numpy_ml.neural_nets.utils import minibatch
from numpy_ml.neural_nets.activations import ReLU, Sigmoid
from numpy_ml.neural_nets.layers import FullyConnected
from numpy_ml.neural_nets.optimizers.optimizers import SGD

optimizer = SGD()
loss = SquaredError()

class MLP:

    def __init__(self):
        self.nn = OrderedDict()
        self.nn["L1"] = FullyConnected(
            10, act_fn="ReLU", optimizer=optimizer
        )
        self.nn["L2"] = FullyConnected(
            1, act_fn="Sigmoid", optimizer=optimizer
        )

    def forward(self, X, retain_derived=True):
        Xs = {}
        out, rd = X, retain_derived
        for k, v in self.nn.items():
            Xs[k] = out
            out = v.forward(out, retain_derived=rd)
        return out, Xs
        
    def backward(self, grad, retain_grads=True):
        dXs = {}
        out, rg = grad, retain_grads
        for k, v in reversed(list(self.nn.items())):
            dXs[k] = out
            out = v.backward(out, retain_grads=rg)
        return out, dXs

error in DecisionTree

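import pandas as pd
from sklearn.model_selection import train_test_split

# DecisionTree is the numpy-ml class from trees/dt.py (its import is not shown in the report)
data = pd.read_csv('Data/Bankloan.csv', sep=';')
# convert decimal commas to decimal points before casting to float
for i in ['debtinc', 'creddebt', 'othdebt']:
    data[i] = data[i].str.replace(',', '.').astype('float')
train, test, y_train, y_test = train_test_split(data.drop('default', axis=1),
                                                data['default'],
                                                test_size=0.3,
                                                stratify=data['default'],
                                                random_state=42)
X_train = pd.get_dummies(train)
X_test = pd.get_dummies(test)
tree = DecisionTree(seed=42, max_depth=4, n_feats=2)
tree.fit(X_train.values, y_train.values)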

ValueError Traceback (most recent call last)
in
1 tree = DecisionTree(seed=42, max_depth=4, n_feats=2)
----> 2 tree.fit(X_train.values, y_train.values)

in fit(self, X, Y)
78 self.n_classes = max(Y) + 1 if self.classifier else None
79 self.n_feats = X.shape[1] if not self.n_feats else min(self.n_feats, X.shape[1])
---> 80 self.root = self._grow(X, Y)
81
82 def predict(self, X):

in _grow(self, X, Y, cur_depth)
138
139 # grow the children that result from the split
--> 140 left = self._grow(X[l, :], Y[l], cur_depth)
141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))

in _grow(self, X, Y, cur_depth)
139 # grow the children that result from the split
140 left = self._grow(X[l, :], Y[l], cur_depth)
--> 141 right = self._grow(X[r, :], Y[r], cur_depth)
142 return Node(left, right, (feat, thresh))
143

in _grow(self, X, Y, cur_depth)
133
134 # greedily select the best split according to criterion
--> 135 feat, thresh = self._segment(X, Y, feat_idxs)
136 l = np.argwhere(X[:, feat] <= thresh).flatten()
137 r = np.argwhere(X[:, feat] > thresh).flatten()

in _segment(self, X, Y, feat_idxs)
155 gains = np.array([self._impurity_gain(Y, t, vals) for t in thresholds])
156
--> 157 if gains.max() > best_gain:
158 split_idx = i
159 best_gain = gains.max()

/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py in _amax(a, axis, out, keepdims, initial, where)
28 def _amax(a, axis=None, out=None, keepdims=False,
29 initial=_NoValue, where=True):
---> 30 return umr_maximum(a, axis, None, out, keepdims, initial, where)
31
32 def _amin(a, axis=None, out=None, keepdims=False,

ValueError: zero-size array to reduction operation maximum which has no identity

Link to dataset https://drive.google.com/file/d/1lj7qUyG7BOV6cAGm8-tDNUqS62IEgk5p/view?usp=sharing

Do the weights need to be rotated?

Thanks for running this project! I am a beginner, and there is something I don't understand: the _backward_naive function in Conv2D does not flip the weights. Am I missing something?

def _backward_naive(self, dLdy, retain_grads=True):
        assert self.trainable, "Layer is frozen"
        if not isinstance(dLdy, list):
            dLdy = [dLdy]

        W = self.parameters["W"]
        b = self.parameters["b"]
        Zs = self.derived_variables["Z"]

        Xs, d = self.X, self.dilation
        (fr, fc), s, p = self.kernel_shape, self.stride, self.pad

        dXs = []
        for X, Z, dy in zip(Xs, Zs, dLdy):
            n_ex, out_rows, out_cols, out_ch = dy.shape
            X_pad, (pr1, pr2, pc1, pc2) = pad2D(X, p, self.kernel_shape, s, d)

            dZ = dLdy * self.act_fn.grad(Z)

            dX = np.zeros_like(X_pad)
            dW, dB = np.zeros_like(W), np.zeros_like(b)
            for m in range(n_ex):
                for i in range(out_rows):
                    for j in range(out_cols):
                        for c in range(out_ch):
                            # compute window boundaries w. stride and dilation
                            i0, i1 = i * s, (i * s) + fr * (d + 1) - d
                            j0, j1 = j * s, (j * s) + fc * (d + 1) - d

                            wc = W[:, :, :, c]
                            kernel = dZ[m, i, j, c]
                            window = X_pad[m, i0 : i1 : (d + 1), j0 : j1 : (d + 1), :]

                            dB[:, :, :, c] += kernel
                            dW[:, :, :, c] += window * kernel
                            dX[m, i0 : i1 : (d + 1), j0 : j1 : (d + 1), :] += (
                                wc * kernel
                            )

            if retain_grads:
                self.gradients["W"] += dW
                self.gradients["b"] += dB

            pr2 = None if pr2 == 0 else -pr2
            pc2 = None if pc2 == 0 else -pc2
            dXs.append(dX[:, pr1:pr2, pc1:pc2, :])
        return dXs[0] if len(Xs) == 1 else dXs
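A 1D analogue may help with the question. If the forward pass is a cross-correlation, the scatter-add pattern used in the loop above already produces the same input gradient as a "full" convolution, in which the kernel is flipped implicitly. The sketch below checks this numerically; it is an illustration of the idea, not a statement about every code path in the layer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=8)
w = rng.normal(size=3)
dy = rng.normal(size=x.size - w.size + 1)  # upstream gradient, one per output

# Forward pass (cross-correlation): y[i] = sum_k x[i + k] * w[k].
# Backward pass written the way the naive loop does it: scatter-add.
dx_scatter = np.zeros_like(x)
for i in range(dy.size):
    dx_scatter[i:i + w.size] += w * dy[i]

# The same gradient as a "full" convolution; np.convolve flips w internally.
dx_conv = np.convolve(dy, w, mode="full")
```

The two arrays agree, so no explicit rotation of the weights is needed when the gradient is accumulated window by window.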

Feature Request: Clustering Kmeans (hard and soft version)

There is no clustering apart from the EM for Gaussian mixtures already in the project. Hence, I would like to implement a kmeans algorithm both the hard clustering version which is common and the soft clustering derivation of the kmeans algorithm. Once I get a go-ahead, then I will proceed to raising a PR within the next few days.

The hard version of K-means will follow the implementation in this slide

The soft version of K-means will also follow the implementation in this slide

I have written up both efficient implementations before checking the contribution guide that specifies that there must be an issue opened. Please give your approval and I will raise the PR right away

Feature request: Accept multiple samples for online least squares

The current update method for the LinearRegression estimator relies on the Sherman-Morrison formula to update the (inverse) covariance matrix for a single new example. This is computationally provident, but limits the use of the method to single-example updates.

It would be great to also accept multiple examples (e.g., arrays of dimension NxM rather than just 1xM) using the Woodbury matrix identity. This will be computationally more expensive as it will require a matrix inversion, but potentially more valuable for certain use-cases.
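A sketch of the block update described above, using the Woodbury identity. It assumes no intercept and no regularization, and that `S` holds the inverse of `X.T @ X` for the data seen so far; `woodbury_update` is a hypothetical function name, not part of numpy-ml.

```python
import numpy as np


def woodbury_update(beta, S, X_new, y_new):
    """Update OLS coefficients and S = (X^T X)^{-1} with a block of new rows."""
    k = X_new.shape[0]
    # Only a k x k system is solved, where k is the number of new rows.
    core = np.eye(k) + X_new @ S @ X_new.T
    S_new = S - S @ X_new.T @ np.linalg.solve(core, X_new @ S)
    beta_new = beta + S_new @ X_new.T @ (y_new - X_new @ beta)
    return beta_new, S_new


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=50)

# Fit on the first 40 rows, then fold in the remaining 10 in one step.
X_old, y_old = X[:40], y[:40]
S = np.linalg.inv(X_old.T @ X_old)
beta = S @ X_old.T @ y_old

beta_online, S_online = woodbury_update(beta, S, X[40:], y[40:])
beta_batch = np.linalg.lstsq(X, y, rcond=None)[0]
```

With a single new row, `core` is 1 x 1 and the update reduces to the Sherman-Morrison form. The cost of the general case grows with the size of the new block rather than with the number of rows already seen.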

Feature Request: Online Linear Regression

Many linear regression implementations operate in static mode, which means we have to retrain to get the parameters after receiving a new data point. This limits the functionality of linear regression and makes working in real time difficult.
I will implement this online formulation using ideas from the paper attached to this issue. It is based on the recursive least squares formulation. I have implemented it in one of my past projects. If I get approval, I will clean it up and create a new PR.

rls.pdf

neural nets optimizer shape mismatch during backward pass

@ddbourgin Have an issue where updates to gradients cannot be performed since shapes conflict during backprop... specifically in the optimizer file.

Error reads:

C[param_name]["mean"] = d1 * mean + (1 - d1) * param_grad
ValueError: operands could not be broadcast together with shapes (100,10) (3072,100)

Model architecture is as follows:

Input -> n_samples, 3072
FC1 -> 3072, 100
FC2 -> 100, 10

The model code is as follows:

def _build_model(self):
    self.model = OrderedDict()
    self.model['fc1'] = FullyConnected(n_out=self.layers[0],
                                       act_fn=ReLU(),
                                       init=self.initializer,
                                       optimizer=self.optimizer)


    self.model['fc2'] = FullyConnected(n_out=self.layers[1],
                                       act_fn=Affine(slope=1, intercept=0),
                                       init=self.initializer,
                                       optimizer=self.optimizer)


    self.model['out'] = Softmax(dim=-1,
                                optimizer=self.optimizer)

@property
def parameters(self):
    return {k: v.parameters for k, v in self.model.items()}

@property
def hyperparameters(self):
    return {k: v.hyperparameters for k, v in self.model.items()}

@property
def derived_variables(self):
    return {k: v.derived_variables for k, v in self.model.items()}

@property
def gradients(self):
    return {k: v.gradients for k, v in self.model.items()}

def forward(self, x):
    out = x
    for k, v in self.model.items():
        out = v.forward(out)
    return out

def backward(self, y, y_pred):
    """Compute dLdy and then backprop through the layers in self.model"""
    dY_pred = self.loss.grad(y, y_pred)
    for k, v in reversed(list(self.model.items())):
        dY_pred = v.backward(dY_pred)
        self._dv['d' + k] = dY_pred
    return dY_pred

def update(self, cur_loss):
    """Perform gradient updates"""
    for k, v in reversed(list(self.model.items())):
        v.update(cur_loss)
    self.flush_gradients()

Hoping we can fix this and also create an example for people to follow. Thanks
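One plausible cause, not confirmed in this thread, is that the same optimizer instance is passed to every layer while its cache is keyed only by parameter name. The running mean stored for one layer's "W" then collides with another layer's "W". The sketch below reproduces that situation with a hypothetical `DemoMomentum` class; it makes no claim about how numpy-ml's optimizers are implemented.

```python
import numpy as np


class DemoMomentum:
    """Hypothetical optimizer that caches a running mean per parameter name."""

    def __init__(self, lr=0.1, decay=0.9):
        self.lr, self.decay = lr, decay
        self.cache = {}

    def update(self, param, grad, name):
        mean = self.cache.get(name, np.zeros_like(grad))
        # Fails to broadcast if `mean` was cached for a differently shaped "W".
        mean = self.decay * mean + (1 - self.decay) * grad
        self.cache[name] = mean
        return param - self.lr * mean


W1, W2 = np.zeros((3072, 100)), np.zeros((100, 10))

# Shared instance: the backward pass visits fc2 first, then fc1.
shared = DemoMomentum()
shared.update(W2, np.ones_like(W2), "W")
try:
    shared.update(W1, np.ones_like(W1), "W")
    shared_failed = False
except ValueError:
    shared_failed = True

# One optimizer instance per layer keeps the caches separate.
opt1, opt2 = DemoMomentum(), DemoMomentum()
W2_new = opt2.update(W2, np.ones_like(W2), "W")
W1_new = opt1.update(W1, np.ones_like(W1), "W")
```

If this is what is happening, building a separate optimizer for each layer (or passing an optimizer name instead of an instance, if the layer accepts one) would avoid the collision.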

Columns and DataType Not Explicitly Set on line 228 of rl_utils.py

Hello!

I found an AI-Specific Code smell in your project.
The smell is called: Columns and DataType Not Explicitly Set

You can find more information about it in this paper: https://dl.acm.org/doi/abs/10.1145/3522664.3528620.

According to the paper, the smell is described as follows:

Problem: If the columns are not selected explicitly, it is not easy for developers to know what to expect in the downstream data schema. If the datatype is not set explicitly, it may silently continue the next step even though the input is unexpected, which may cause errors later. The same applies to other data importing scenarios.
Solution: It is recommended to set the columns and DataType explicitly in data processing.
Impact: Readability

Example:

### Pandas Column Selection
import pandas as pd
df = pd.read_csv('data.csv')
+ df = df[['col1', 'col2', 'col3']]

### Pandas Set DataType
import pandas as pd
- df = pd.read_csv('data.csv')
+ df = pd.read_csv('data.csv', dtype={'col1': 'str', 'col2': 'int', 'col3': 'float'})

You can find the code related to this smell in this link:

numpy-ml/numpy_ml/rl_models/rl_utils.py

Lines 218 to 238 in 4f37707

        "multidim_actions",
        "multidim_observations",
        "n_actions_per_dim",
        "n_obs_per_dim",
        "obs_dim",
        # "obs_ids",
        "seed",
        "tuple_actions",
        "tuple_observations",
    ]
    return df if NO_PD else pd.DataFrame(df)[cols]


def is_tuple(env):
    """
    Check if the action and observation spaces for `env` are instances of
    ``gym.spaces.Tuple`` or ``gym.spaces.Dict``.

    Notes
    -----
    A tuple space is a tuple of *several* (possibly multidimensional)

I also found instances of this smell in other files, such as:

I hope this information is helpful!

Feature: Cosine Proximity Loss Function

I tried to implement this loss function; see the pull request for details.

Loss Function

Please see the class docstring.

Loss Function Grad

TODO.

Loss Function Test

Compare against SciPy's cosine distance as the gold standard.

Loss Function Grad Test

TODO.
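For the items still marked TODO, here is a minimal NumPy sketch of one common definition of cosine proximity (the negative cosine similarity, averaged over rows) together with its gradient with respect to the predictions. The function names, the `eps` stabilizer, and the averaging over rows are assumptions made for illustration; the pull request may define things differently. Under this convention the per-row loss equals SciPy's cosine distance minus 1.

```python
import numpy as np


def cosine_proximity(y_true, y_pred, eps=1e-12):
    """Negative cosine similarity, averaged over the rows of 2D inputs."""
    dot = np.sum(y_true * y_pred, axis=-1)
    norms = np.linalg.norm(y_true, axis=-1) * np.linalg.norm(y_pred, axis=-1)
    return -np.mean(dot / (norms + eps))


def cosine_proximity_grad(y_true, y_pred, eps=1e-12):
    """Gradient of `cosine_proximity` with respect to `y_pred`."""
    n_true = np.linalg.norm(y_true, axis=-1, keepdims=True)
    n_pred = np.linalg.norm(y_pred, axis=-1, keepdims=True)
    cos = np.sum(y_true * y_pred, axis=-1, keepdims=True) / (n_true * n_pred + eps)
    # d(cos)/d(y_pred) = y_true / (|y_true||y_pred|) - cos * y_pred / |y_pred|^2
    d_cos = y_true / (n_true * n_pred + eps) - cos * y_pred / (n_pred ** 2 + eps)
    # the loss is the mean of -cos over the rows
    return -d_cos / y_true.shape[0]
```

A finite-difference comparison against `cosine_proximity` is one way to cover the gradient test.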

Question about dt script

Hi ddbourgin, I learned a lot from your repo, thanks.
While reading the trees/dt.py script, I came across something I would like to ask about.

Code:

if self.depth >= self.max_depth:
    v = np.mean(Y, axis=0)
    if self.classifier:
        v = np.bincount(Y, minlength=self.n_classes) / len(Y)
    return Leaf(v)

N, M = X.shape
self.depth += 1
feat_idxs = np.random.choice(M, self.n_feats, replace=False)

# greedily select the best split according to `criterion`
feat, thresh = self._segment(X, Y, feat_idxs)
l = np.argwhere(X[:, feat] <= thresh).flatten()
r = np.argwhere(X[:, feat] > thresh).flatten()

# grow the children that result from the split
left = self._grow(X[l, :], Y[l])
right = self._grow(X[r, :], Y[r])
return Node(left, right, (feat, thresh))

Once the left branch meets the termination condition and becomes a Leaf, self.depth has reached self.max_depth, so the right branch will always be a Leaf as well. In this tree, the right node of each layer is a Leaf. Is that correct?
In that case, many samples in the right node will never be split again.
I think this is a problem and will lead to less precise results.

Looking forward to your reply
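One way to avoid the shared counter described above is to pass the current depth down the recursion, so each branch tracks its own depth. The sketch below is illustrative only: `grow`, `height`, and the median split on the highest-variance feature are hypothetical stand-ins (the real code selects splits with `_segment`), and this is not the repository's implementation.

```python
import numpy as np


class Leaf:
    def __init__(self, value):
        self.value = value


class Node:
    def __init__(self, left, right, rule):
        self.left, self.right, self.rule = left, right, rule


def grow(X, Y, n_classes, max_depth, cur_depth=0):
    """Grow a classification tree, tracking depth separately per branch."""
    probs = np.bincount(Y, minlength=n_classes) / len(Y)
    if cur_depth >= max_depth or np.unique(Y).size == 1:
        return Leaf(probs)

    # hypothetical stand-in for `_segment`: median split on the
    # feature with the highest variance
    feat = int(np.argmax(X.var(axis=0)))
    thresh = np.median(X[:, feat])
    l = np.flatnonzero(X[:, feat] <= thresh)
    r = np.flatnonzero(X[:, feat] > thresh)
    if len(l) == 0 or len(r) == 0:
        return Leaf(probs)

    # both children start from the same depth, so finishing the left
    # subtree does not affect how deep the right subtree may grow
    left = grow(X[l], Y[l], n_classes, max_depth, cur_depth + 1)
    right = grow(X[r], Y[r], n_classes, max_depth, cur_depth + 1)
    return Node(left, right, (feat, thresh))


def height(node):
    """Number of splits on the longest root-to-leaf path."""
    if isinstance(node, Leaf):
        return 0
    return 1 + max(height(node.left), height(node.right))
```

With depth passed as an argument, the left and right subtrees can both reach `max_depth`.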

[Question] How to understand the implementation details of BayesianLinearRegression? (LaTeX updated)

I am studying the implementation of BayesianLinearRegression in the numpy-ml project.

I have copied the code here:

class BayesianLinearRegressionKnownVariance:
    def __init__(self, b_mean=0, b_sigma=1, b_V=None, fit_intercept=True):
        r"""
        Bayesian linear regression model with known error variance and
        conjugate Gaussian prior on model parameters.

        Notes
        -----
        Uses a conjugate Gaussian prior on the model coefficients. The
        posterior over model parameters is

        .. math::

            b \mid b_{mean}, \sigma^2, b_V \sim \mathcal{N}(b_{mean}, \sigma^2 b_V)

        Ridge regression is a special case of this model where :math:`b_{mean}`
        = 0, :math:`\sigma` = 1 and `b_V` = I (ie., the prior on `b` is a
        zero-mean, unit covariance Gaussian).

        Parameters
        ----------
        b_mean : :py:class:`ndarray <numpy.ndarray>` of shape `(M,)` or float
            The mean of the Gaussian prior on `b`. If a float, assume `b_mean` is
            ``np.ones(M) * b_mean``. Default is 0.
        b_sigma : float
            A scaling term for covariance of the Gaussian prior on `b`. Default
            is 1.
        b_V : :py:class:`ndarray <numpy.ndarray>` of shape `(N,N)` or `(N,)` or None
            A symmetric positive definite matrix that when multiplied
            element-wise by `b_sigma^2` gives the covariance matrix for the
            Gaussian prior on `b`. If a list, assume ``b_V = diag(b_V)``. If None,
            assume `b_V` is the identity matrix. Default is None.
        fit_intercept : bool
            Whether to fit an intercept term in addition to the coefficients in
            b. If True, the estimates for b will have `M + 1` dimensions, where
            the first dimension corresponds to the intercept. Default is True.
        """
        # this is a placeholder until we know the dimensions of X
        b_V = 1.0 if b_V is None else b_V

        if isinstance(b_V, list):
            b_V = np.array(b_V)

        if isinstance(b_V, np.ndarray):
            if b_V.ndim == 1:
                b_V = np.diag(b_V)
            elif b_V.ndim == 2:
                fstr = "b_V must be symmetric positive definite"
                assert is_symmetric_positive_definite(b_V), fstr

        self.posterior = {}
        self.posterior_predictive = {}

        self.b_V = b_V
        self.b_mean = b_mean
        self.b_sigma = b_sigma
        self.fit_intercept = fit_intercept

    def fit(self, X, y):
        """
        Compute the posterior over model parameters using the data in `X` and
        `y`.

        Parameters
        ----------
        X : :py:class:`ndarray <numpy.ndarray>` of shape `(N, M)`
            A dataset consisting of `N` examples, each of dimension `M`.
        y : :py:class:`ndarray <numpy.ndarray>` of shape `(N, K)`
            The targets for each of the `N` examples in `X`, where each target
            has dimension `K`.
        """
        # convert X to a design matrix if we're fitting an intercept
        if self.fit_intercept:
            X = np.c_[np.ones(X.shape[0]), X]

        N, M = X.shape
        self.X, self.y = X, y

        if is_number(self.b_V):
            self.b_V *= np.eye(M)

        if is_number(self.b_mean):
            self.b_mean *= np.ones(M)

        b_V = self.b_V
        b_mean = self.b_mean
        b_sigma = self.b_sigma

        b_V_inv = np.linalg.inv(b_V)
        L = np.linalg.inv(b_V_inv + X.T @ X)
        R = b_V_inv @ b_mean + X.T @ y

        # (b_v^{-1} + X^{\top}X)^{-1} @ (b_v^{-1}@b_mean + X^{\top}y)

        mu = L @ R
        cov = L * b_sigma ** 2

        # posterior distribution over b conditioned on b_sigma
        self.posterior["b"] = {"dist": "Gaussian", "mu": mu, "cov": cov}

LaTeX doesn't display well on GitHub, so I pasted a picture.

See also https://stats.stackexchange.com/questions/477618/understand-the-implement-detail-of-bayesianlinearregression-in-python
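As a numerical sanity check on the update in `fit`, the sketch below reproduces the `L`, `R`, `mu`, and `cov` computations on synthetic data and compares the posterior mean with the closed-form ridge solution, which the docstring describes as the special case `b_mean = 0`, `b_sigma = 1`, `b_V = I`. The data, the seed, and the omission of the intercept column are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 50, 3
X = rng.normal(size=(N, M))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=N)

# prior: b ~ N(b_mean, b_sigma^2 * b_V)
b_mean = np.zeros(M)
b_V = np.eye(M)
b_sigma = 1.0

# same computations as in `fit` (no intercept column here)
b_V_inv = np.linalg.inv(b_V)
L = np.linalg.inv(b_V_inv + X.T @ X)
R = b_V_inv @ b_mean + X.T @ y
mu = L @ R               # posterior mean
cov = L * b_sigma ** 2   # posterior covariance

# ridge regression with unit penalty: (X^T X + I)^{-1} X^T y
ridge = np.linalg.solve(X.T @ X + np.eye(M), X.T @ y)
print(np.allclose(mu, ridge))
```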

Feature: L1, L2 Regularizers

We can implement a Regularizer for each layer, applied on a per-layer basis. Here are 4 things that we can do:

1. Regularizer class.
2. Layer classes that use the regularizer (we can add a new parameter).
3. Regularizer documentation.
4. Regularizer tests.

I will start a PR to address 1 and 3. The remaining items should be discussed before coding.
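To make the proposal concrete, a per-layer regularizer could expose a penalty and its gradient with respect to the weights. The class names, the `lambd` parameter, and the `loss`/`grad` method names below are hypothetical and only sketch one possible interface; the factor of 0.5 in the L2 penalty is a convention chosen here so the gradient is simply `lambd * W`.

```python
import numpy as np


class L1Regularizer:
    """Hypothetical L1 penalty: lambd * sum(|W|)."""

    def __init__(self, lambd=0.01):
        self.lambd = lambd

    def loss(self, W):
        return self.lambd * np.sum(np.abs(W))

    def grad(self, W):
        # subgradient; np.sign returns 0 where W == 0
        return self.lambd * np.sign(W)


class L2Regularizer:
    """Hypothetical L2 penalty: 0.5 * lambd * sum(W ** 2)."""

    def __init__(self, lambd=0.01):
        self.lambd = lambd

    def loss(self, W):
        return 0.5 * self.lambd * np.sum(W ** 2)

    def grad(self, W):
        return self.lambd * W
```

A layer could then add `regularizer.grad(W)` to its weight gradient before the optimizer step, which is the part that would need discussion.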

Error in LSTM Implementation

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
MacOS
Python version:
3.7
NumPy version:
1.17

Describe the current behavior
Forward pass implementation of LSTM is incorrect

Describe the expected behavior
The code currently reads as below. The forget and update gate lines both add the output-gate bias `bo`, which looks like a typo.

        # compute the input to the gate functions at timestep t
        _Go = Zt @ Wo + bo
        _Gf = Zt @ Wf + bo
        _Gu = Zt @ Wu + bo
        _Gc = Zt @ Wc + bc
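A minimal sketch of what the corrected lines might look like, assuming the forget and update gates have their own bias vectors. The names `bf` and `bu` are assumed here by analogy with `bo` and `bc`, and `gate_inputs` is a hypothetical wrapper written only to make the snippet runnable.

```python
import numpy as np


def gate_inputs(Zt, Wo, bo, Wf, bf, Wu, bu, Wc, bc):
    """Compute the gate pre-activations at timestep t, one bias per gate."""
    _Go = Zt @ Wo + bo  # output gate
    _Gf = Zt @ Wf + bf  # forget gate (assumed bias name `bf`)
    _Gu = Zt @ Wu + bu  # update gate (assumed bias name `bu`)
    _Gc = Zt @ Wc + bc  # candidate cell state
    return _Go, _Gf, _Gu, _Gc
```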

Code to reproduce the issue

Other info / logs

max-pooling too slow

Could we have a faster implementation of max-pooling? The naive implementation is just not cutting it.
I even tried reusing im2col to improve performance but didn't have any luck. The only option I had was to JIT it.
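One loop-free option is to build a strided view of all pooling windows and reduce over the window axes. The sketch below assumes inputs of shape `(n_ex, rows, cols, channels)`, no padding, and NumPy 1.20 or later (for `sliding_window_view`); `max_pool2d` is a hypothetical helper, not the library's layer, and covers the forward pass only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def max_pool2d(X, kernel=(2, 2), stride=2):
    """Max-pool X of shape (n_ex, rows, cols, channels) without padding."""
    kr, kc = kernel
    # view of shape (n_ex, rows - kr + 1, cols - kc + 1, channels, kr, kc)
    windows = sliding_window_view(X, (kr, kc), axis=(1, 2))
    # keep only the windows that start on a multiple of the stride
    windows = windows[:, ::stride, ::stride]
    # reduce over the two trailing window axes
    return windows.max(axis=(-2, -1))
```

Because the windows are a view rather than a copy, no `im2col`-style buffer is materialized before the reduction.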

What does the arithmetic symbol '@' mean?

In the class `DotProductAttention(LayerBase)`, I found this code:

def _fwd(self, Q, K, V):
        """Actual computation of forward pass"""
        scale = 1 / np.sqrt(Q.shape[-1]) if self.scale else 1
        scores = Q @ K.swapaxes(-2, -1) * scale  # attention scores
        weights = self.softmax.forward(scores)  # attention weights
        Y = weights @ V
        return Y, weights

Is "@" an arithmetic operator in Python?
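For reference, `@` is Python's matrix multiplication operator (added in Python 3.5 by PEP 465); for NumPy arrays it behaves like `np.matmul`, including broadcasting over leading batch dimensions. The array shapes below are made up for illustration.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)

# `A @ B` is matrix multiplication, equivalent to np.matmul(A, B)
C = A @ B

# with stacked matrices, `@` multiplies the last two axes and treats
# the leading axis as a batch dimension
Q = np.ones((5, 2, 3))
K = np.ones((5, 4, 3))
scores = Q @ K.swapaxes(-2, -1)  # shape (5, 2, 4)
```

Since `@` and `*` share the same precedence and associate left to right, `Q @ K.swapaxes(-2, -1) * scale` multiplies the matrix product by `scale`.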

A little bug!

Hi, I think there is a little bug at numpy-ml/numpy_ml/neural_nets/activations/activations.py Line 64.

Your code:

fn_x = self.fn_x

but `self.fn_x` is never defined.

def init_from_dict(self):
    O = self.param
    cc = O["cache"] if "cache" in O else None
    op = O["hyperparameters"] if "hyperparameters" in O else None

    if op is None:
        raise ValueError("Must have `hyperparemeters` key: {}".format(O))

    if op and op["id"] == "SGD":
        optimizer = SGD().set_params(op, cc)
    elif op and op["id"] == "RMSProp":
        optimizer = RMSProp().set_params(op, cc)
    elif op and op["id"] == "AdaGrad":
        optimizer = AdaGrad().set_params(op, cc)
    elif op and op["id"] == "Adam":
        optimizer = Adam().set_params(op, cc)
    elif op:
        raise NotImplementedError("{}".format(op["id"]))
    return optimizer

if k in self.hyperparameters:
    if k == "act_fn":
        layer.act_fn = ActivationInitializer(v)()
    if k == "optimizer":
        layer.optimizer = OptimizerInitializer(sd[k])()
    if k not in ["wrappers", "optimizer"]:
        setattr(layer, k, v)
    if k == "wrappers":
        layer = init_wrappers(layer, sd[k])


load dataset

your code

right code

Strengthen your network layers operator implementation
