
deeplearning's Introduction

Deep Learning (Python, C, C++, Java, Scala, Go)

Classes :

  • DBN: Deep Belief Nets

  • CDBN: Deep Belief Nets w/ continuous-valued inputs

  • RBM: Restricted Boltzmann Machine

  • CRBM: Restricted Boltzmann Machine w/ continuous-valued inputs

  • dA: Denoising Autoencoders

  • SdA: Stacked Denoising Autoencoders

  • LogisticRegression: Logistic Regression

  • HiddenLayer: Hidden Layer of Neural Networks

  • MLP: Multilayer Perceptron

  • Dropout: Dropout MLP

  • CNN: Convolutional Neural Networks (See dev branch.)
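As a quick orientation, below is a minimal sketch of how the Python RBM class might be driven; the constructor arguments and method names (input, n_visible, n_hidden, numpy_rng, contrastive_divergence, reconstruct) are assumptions based on the Python sources and should be checked against RBM.py.

    # Hypothetical usage sketch; argument and method names are assumptions (check RBM.py).
    import numpy
    from RBM import RBM

    rng = numpy.random.RandomState(123)

    # Six binary visible units, two hidden units.
    data = numpy.array([[1, 1, 1, 0, 0, 0],
                        [1, 0, 1, 0, 0, 0],
                        [0, 0, 1, 1, 1, 0],
                        [0, 0, 1, 1, 0, 0]])

    rbm = RBM(input=data, n_visible=6, n_hidden=2, numpy_rng=rng)

    for epoch in range(1000):
        rbm.contrastive_divergence(lr=0.1, k=1)  # one CD-k training step

    print(rbm.reconstruct(data))  # reconstruction of the training patterns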

References :

  • Y. Bengio, P. Lamblin, D. Popovici, H. Larochelle: Greedy Layer-Wise Training of Deep Networks, Advances in Neural Information Processing Systems 19, 2007

  • P. Vincent, H. Larochelle, Y. Bengio, P.-A. Manzagol: Extracting and Composing Robust Features with Denoising Autoencoders, ICML '08, 1096-1103, 2008

  • DeepLearningTutorials https://github.com/lisa-lab/DeepLearningTutorials

  • Yusuke Sugomori: Stochastic Gradient Descent for Denoising Autoencoders, http://yusugomori.com/docs/SGD_DA.pdf

Publication :

  • More detailed Java implementations are introduced in my book, Java Deep Learning Essentials.

    The book is available from Packt Publishing or Amazon.

Bug reports / contributions / donations are deeply welcome.

Bitcoin wallet address: 34kZarc2uBU6BMCouUp2iudvZtbmZMPqrA

deeplearning's People

Contributors

yusugomori


deeplearning's Issues

A Problem with the Deep Belief Network C++ codes

I applied your dbn.cpp code, but for all of the test data the prediction results are identical, which is unexpected (even when a training sample is fed back in as a test sample, the true class is not found). What is the problem? I would appreciate your help.

bug in DBN.cpp

DBN::DBN(int size, int n_i, int *hls, int n_o, int n_l) {
  // construct rbm_layer
  // DBN-RBM destructor bug fixed
  rbm_layers[i] = new RBM(N, input_size, hidden_layer_sizes[i],
                          NULL, NULL, NULL);
  // rbm_layers[i] = new RBM(N, input_size, hidden_layer_sizes[i],
  //                         sigmoid_layers[i]->W, sigmoid_layers[i]->b, NULL);
  }
}

sigmoid_layers[i]->W and sigmoid_layers[i]->b should be passed as NULL in the DBN constructor.
If not, the RBM's W member will point at sigmoid_layers[i]->W, and there will be a problem when the RBM is destructed.

Overfitting in Stacked Denoising Autoencoders

In the finetune phase the whole dataset is used, which will cause the model to overfit.
So I think this can be solved by:

  • dividing the dataset into training, test, and validation sets (a minimal split is sketched below)
  • or using k-fold cross-validation
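A minimal sketch of such a hold-out split in plain NumPy (the fractions, the helper name, and the seed are illustrative assumptions, not part of this repository):

import numpy

def train_valid_test_split(X, y, valid_frac=0.1, test_frac=0.1, seed=123):
    # Shuffle the examples, then carve off validation and test portions.
    rng = numpy.random.RandomState(seed)
    idx = rng.permutation(len(X))
    n_valid = int(len(X) * valid_frac)
    n_test = int(len(X) * test_frac)
    valid = idx[:n_valid]
    test = idx[n_valid:n_valid + n_test]
    train = idx[n_valid + n_test:]
    return (X[train], y[train]), (X[valid], y[valid]), (X[test], y[test])

# Pretrain and finetune on the training portion only, monitor the validation
# error to decide when to stop, and report the final score on the test portion.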

CDBN: dataset with about 8000 instances runs into failure

Hi,
I am trying to feed in a dataset with about 8000 instances (8391x8), but I get an exception in the LogisticRegression.train function when it computes d_y = self.y - p_y_given_x;
the error is:
ValueError: operands could not be broadcast together with shapes (8391,2) (8,2)
I assume that with the CDBN/CRBM classes we can do a regression-like prediction, but I don't know why the size of p_y_given_x does not match self.y, as it should.
I am using Python.

thanks in advance
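For reference, the ValueError only says that the two operands have different numbers of rows; both self.y and p_y_given_x should carry one row per training instance. A tiny NumPy reproduction (the shapes are copied from the report, the rest is illustrative):

import numpy

y = numpy.zeros((8391, 2))      # one one-hot label row per instance
p_bad = numpy.zeros((8, 2))     # only 8 prediction rows -> broadcast error
p_ok = numpy.zeros((8391, 2))   # one softmax row per instance

try:
    d_y = y - p_bad
except ValueError as err:
    print(err)                  # operands could not be broadcast together ...

d_y = y - p_ok                  # fine: row counts match

So the real question is why only 8 rows reach LogisticRegression.train; checking the shape of the layer input that is passed down (for example, whether the 8391x8 matrix was transposed or sliced column-wise somewhere) would be the first thing to verify.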

RBM Weight Update - Python

self.W += lr * ((numpy.dot(self.input.T, ph_sample) - numpy.dot(nv_samples.T, nh_means)))

I think this code should be self.W += lr * ((numpy.dot(self.input.T, ph_sample) - numpy.dot(nv_samples.T, nh_means)) / len(self.input))

license?

Do you have a license for this?

dbn.pretraining generates W with NaN

I am testing the code using as input X a matrix of values between -3 and 3 (23 columns), and pretraining generates a W matrix for the RBM layer containing NaN.

Reading the docs from DeepLearning (http://deeplearning.net/tutorial/rbm.html#rbm) the following: "Note that we also return the pre-sigmoid activation. To understand why this is so you need to understand a bit about how Theano works. Whenever you compile a Theano function, the computational graph that you pass as input gets optimized for speed and stability. This is done by changing several parts of the subgraphs with others. One such optimization expresses terms of the form log(sigmoid(x)) in terms of softplus. We need this optimization for the cross-entropy since sigmoid of numbers larger than 30. (or even less then that) turn to 1. and numbers smaller than -30. turn to 0 which in terms will force theano to compute log(0) and therefore we will get either -inf or NaN as cost. If the value is expressed in terms of softplus we do not get this undesirable behaviour. This optimization usually works fine, but here we have a special case. The sigmoid is applied inside the scan op, while the log is outside. Therefore Theano will only see log(scan(..)) instead of log(sigmoid(..)) and will not apply the wanted optimization. We can not go and replace the sigmoid in scan with something else also, because this only needs to be done on the last step. Therefore the easiest and more efficient way is to get also the pre-sigmoid activation as an output of scan, and apply both the log and sigmoid outside scan such that Theano can catch and optimize the expression"

However, the implementation in this code differs in some respects. Any idea how to overcome this issue?

Thank you
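Outside Theano, the same softplus trick can be applied by hand wherever log(sigmoid(x)) is computed (for example in a reconstruction cross-entropy); a small NumPy sketch, with the helper name being ours rather than the repository's:

import numpy

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -softplus(-x) = -log(1 + exp(-x)).
    return -numpy.logaddexp(0.0, -x)

x = numpy.array([-800.0, 0.0, 800.0])
print(numpy.log(1.0 / (1.0 + numpy.exp(-x))))  # naive form overflows: [-inf, -0.693..., 0.]
print(log_sigmoid(x))                          # stable: [-800., -0.693..., -0.]

If the NaNs appear in W itself rather than in the cost, it is also worth checking the learning rate and whether the continuous-valued inputs should be going to CDBN/CRBM (listed in the Classes section above) rather than to the binary RBM.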

ImportError: No module named logistic_sgd

I'm trying to use the Python version of SdA,
but I don't know where I need to get LogisticRegression from.
I have Anaconda and Theano, but still, whenever I write:
from LogisticRegression import LogisticRegression

I receive this error:


ImportError Traceback (most recent call last)
in ()
----> 1 from logistic_sgd import LogisticRegression, load_data

ImportError: No module named logistic_sgd

RBM Weight Updates Issue

There is an inconsistency between your different RBM implementations.

[Python version]
self.W += lr * (numpy.dot(self.input.T, ph_sample)
                - numpy.dot(nv_samples.T, nh_means))

[C++ version]
W[i][j] += lr * (ph_mean[i] * input[j] - nh_means[i] * nv_samples[j]) / N;

Between these two versions the weight update methods are inconsistent. Actually, I think the right version should be

self.W += lr * (numpy.dot(self.input.T, ph_means)
                - numpy.dot(nv_means.T, nh_means))

Could you please help me confirm this? I am not quite sure whether it is actually an issue; I am new to deep learning.

Best Regards
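For comparison, here is a minimal NumPy sketch of one CD-1 update that follows the C++ convention (mean activations in both phases, averaged over the N training cases). It is only a sketch of that convention, not a verified patch for RBM.py, and the function and variable names are ours:

import numpy

def sigmoid(x):
    return 1.0 / (1.0 + numpy.exp(-x))

def cd1_update(W, hbias, vbias, v0, lr=0.1, rng=numpy.random):
    # One CD-1 step; W has shape (n_visible, n_hidden), v0 has shape (N, n_visible).
    N = len(v0)
    ph_mean = sigmoid(numpy.dot(v0, W) + hbias)              # positive phase: p(h=1 | v0)
    h0 = (rng.uniform(size=ph_mean.shape) < ph_mean) * 1.0   # sampled hidden states
    nv_mean = sigmoid(numpy.dot(h0, W.T) + vbias)            # negative phase: reconstruction
    nh_mean = sigmoid(numpy.dot(nv_mean, W) + hbias)

    W += lr * (numpy.dot(v0.T, ph_mean) - numpy.dot(nv_mean.T, nh_mean)) / N
    hbias += lr * numpy.mean(ph_mean - nh_mean, axis=0)
    vbias += lr * numpy.mean(v0 - nv_mean, axis=0)

Whether the positive statistics use ph_mean or ph_sample is a modelling choice (both appear in the literature), but the Python and C++ versions should at least agree with each other, and dividing by N keeps the step size independent of the batch size.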

About Finetune

I am using the Scala version of the code, but I don't know how the fine-tuning of the stacked autoencoder works. All I can see is that the logistic regression is made to learn the last layer's pattern.
Please clarify.

bug report

Dear Yusuke Sugomori,

In the C version of your deep learning program, I found a minor bug in DBN_predict (lines 193-200 of the DBN.c file). The initialization of "linear_output" should be moved inside the k-loop, as follows:

for(k=0; k<this->sigmoid_layers[i].n_out; k++) {
  linear_output = 0.0;
  for(j=0; j<this->sigmoid_layers[i].n_in; j++) {
    linear_output += this->sigmoid_layers[i].W[k][j] * prev_layer_input[j];
  }
  linear_output += this->sigmoid_layers[i].b[k];
  layer_input[k] = sigmoid(linear_output);
}

I guess the same problem may appear in the other versions.

Actually, I would like to get in touch with you about more details of the deep learning algorithm for specific applications. May I know your email address?

peghoty

Is there this type of deep learning model?

Is there this type of deep learning model?
There are two labeled folders for binary classification,
e.g. men and women, cats and dogs, etc.
You insert the images into each folder as training data,
and then just run a simple command to train.
That's all.
I need such a simple training network model. Is there any?

About fine-tuning in the source code of Stacked Denoising Autoencoders

I'm using the Java version of the code. Why is the current functionality of the finetune function regarded as fine-tuning of the network? The finetune function does not update the weights of the entire network, only those of the last layer (log_layer).

The versions of the code in the other languages use the same trick. Can anyone help me understand why this kind of fine-tuning works? Any references?

Thanks.

DBN_finetune: layer_input used before set

I converted the C code for DBN to Component Pascal and ran its analyzer over the code. It found in DBN_finetune that the variable "layer_input" was used before any values were set for it. Going back to the C code I see that is indeed true. In fact the array is accessed even before it is allocated.

That has got to be a bug.

Here is the C code:

void DBN_finetune(DBN* this, int *input, int *label, double lr, int epochs) {
  int i, j, m, n, epoch;

  int *layer_input;
  // int prev_layer_input_size;
  int *prev_layer_input;

  int *train_X = (int *)malloc(sizeof(int) * this->n_ins);
  int *train_Y = (int *)malloc(sizeof(int) * this->n_outs);

  for(epoch=0; epoch<epochs; epoch++) {
    for(n=0; n<this->N; n++) { // input x1...xN
      // initial input
      for(m=0; m<this->n_ins; m++)  train_X[m] = input[n * this->n_ins + m];
      for(m=0; m<this->n_outs; m++) train_Y[m] = label[n * this->n_outs + m];

      // layer input
      for(i=0; i<this->n_layers; i++) {
        if(i == 0) {
          prev_layer_input = (int *)malloc(sizeof(int) * this->n_ins);
          for(j=0; j<this->n_ins; j++) prev_layer_input[j] = train_X[j];
        } else {
          prev_layer_input = (int *)malloc(sizeof(int) * this->hidden_layer_sizes[i-1]);
          for(j=0; j<this->hidden_layer_sizes[i-1]; j++) prev_layer_input[j] = layer_input[j];
          free(layer_input);
        }

        layer_input = (int *)malloc(sizeof(int) * this->hidden_layer_sizes[i]);
        HiddenLayer_sample_h_given_v(&(this->sigmoid_layers[i]), \
                                     prev_layer_input, layer_input);
        free(prev_layer_input);
      }

      LogisticRegression_train(&(this->log_layer), layer_input, train_Y, lr);
    }
    // lr *= 0.95;
  }

  free(layer_input);
  free(train_X);
  free(train_Y);
}

Question about the update of the weight matrix

Recently I have been studying your RBM Python code. Everything is fine except the parameter update formula used in your contrastive_divergence function. I found that the bias vectors are updated using the mean gradient over a batch of training examples, while the weight matrix is not. Could you kindly explain the equation (Lines 61-62) in detail? I am looking forward to your early reply.

Maybe an error

RBM.cpp

for(int i=0; i<n_hidden; i++) {
  for(int j=0; j<n_visible; j++) {
    W[i][j] += lr * (ph_mean[i] * input[j] - nh_means[i] * nv_samples[j]) / N;
  }

  // hbias[i] += lr * (ph_sample[i] - nh_means[i]) / N;   // maybe error
  hbias[i] += lr * (ph_mean[i] - nh_means[i]) / N;         // correct
}
