
deeplearning's Introduction

Deep Learning (Python, C, C++, Java, Scala, Go)

Classes :

  • DBN: Deep Belief Nets

  • CDBN: Deep Belief Nets w/ continuous-valued inputs

  • RBM: Restricted Boltzmann Machine

  • CRBM: Restricted Boltzmann Machine w/ continuous-valued inputs

  • dA: Denoising Autoencoders

  • SdA: Stacked Denoising Autoencoders

  • LogisticRegression: Logistic Regression

  • HiddenLayer: Hidden Layer of Neural Networks

  • MLP: Multilayer Perceptron

  • Dropout: Dropout MLP

  • CNN: Convolutional Neural Networks (See dev branch.)
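As a quick orientation, below is a minimal sketch of how the Python RBM class might be driven; the constructor arguments and method names (input, n_visible, n_hidden, numpy_rng, contrastive_divergence, reconstruct) are assumptions based on the Python sources and should be checked against RBM.py.

    # Hypothetical usage sketch; argument and method names are assumptions (check RBM.py).
    import numpy
    from RBM import RBM

    rng = numpy.random.RandomState(123)

    # Six binary visible units, two hidden units.
    data = numpy.array([[1, 1, 1, 0, 0, 0],
                        [1, 0, 1, 0, 0, 0],
                        [0, 0, 1, 1, 1, 0],
                        [0, 0, 1, 1, 0, 0]])

    rbm = RBM(input=data, n_visible=6, n_hidden=2, numpy_rng=rng)

    for epoch in range(1000):
        rbm.contrastive_divergence(lr=0.1, k=1)  # one CD-k training step

    print(rbm.reconstruct(data))  # reconstruction of the training patterns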

References :

  • Y. Bengio, P. Lamblin, D. Popovici, H. Larochelle: Greedy Layer-Wise Training of Deep Networks, Advances in Neural Information Processing Systems 19, 2007

  • P. Vincent, H. Larochelle, Y. Bengio, P.-A. Manzagol: Extracting and Composing Robust Features with Denoising Autoencoders, ICML '08, 1096-1103, 2008

  • DeepLearningTutorials https://github.com/lisa-lab/DeepLearningTutorials

  • Yusuke Sugomori: Stochastic Gradient Descent for Denoising Autoencoders, http://yusugomori.com/docs/SGD_DA.pdf

Publication :

  • More detailed Java implementations are introduced in my book, Java Deep Learning Essentials.

    The book is available from Packt Publishing or Amazon.

Bug reports / contributions / donations are deeply welcome.

Bitcoin wallet address: 34kZarc2uBU6BMCouUp2iudvZtbmZMPqrA

deeplearning's People

Contributors

yusugomori


deeplearning's Issues

A Problem with the Deep Belief Network C++ codes

I applied your dbn.cpp code, but for all of the test data the prediction results are identical, which is unexpected (even when a training sample is fed back in as a test sample, the true class is not found). What is the problem? I would appreciate your help.

bug in DBN.cpp

DBN::DBN(int size, int n_i, int *hls, int n_o, int n_l) {
  // construct rbm_layer
  // DBN-RBM destructor bug fixed
  rbm_layers[i] = new RBM(N, input_size, hidden_layer_sizes[i],
                          NULL, NULL, NULL);
  // rbm_layers[i] = new RBM(N, input_size, hidden_layer_sizes[i],
  //                         sigmoid_layers[i]->W, sigmoid_layers[i]->b, NULL);
  }
}

sigmoid_layers[i]->W and sigmoid_layers[i]->b should be passed as NULL in the DBN constructor.
If not, the RBM's W member will point at sigmoid_layers[i]->W, and there will be a problem when the RBM is destructed.

Overfitting in Stacked Denoising Autoencoders

In the finetune phase the whole dataset is used, which will cause the model to overfit.
So I think this can be solved by:

  • dividing the dataset into training, test, and validation sets (a minimal split is sketched below)
  • or using k-fold cross-validation
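A minimal sketch of such a hold-out split in plain NumPy (the fractions, the helper name, and the seed are illustrative assumptions, not part of this repository):

import numpy

def train_valid_test_split(X, y, valid_frac=0.1, test_frac=0.1, seed=123):
    # Shuffle the examples, then carve off validation and test portions.
    rng = numpy.random.RandomState(seed)
    idx = rng.permutation(len(X))
    n_valid = int(len(X) * valid_frac)
    n_test = int(len(X) * test_frac)
    valid = idx[:n_valid]
    test = idx[n_valid:n_valid + n_test]
    train = idx[n_valid + n_test:]
    return (X[train], y[train]), (X[valid], y[valid]), (X[test], y[test])

# Pretrain and finetune on the training portion only, monitor the validation
# error to decide when to stop, and report the final score on the test portion.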

CDBN: dataset with about 8000 instances runs into failure

Hi,
I am trying to feed in a dataset with about 8000 instances (8391x8), but I get an exception in the LogisticRegression.train function when it computes d_y = self.y - p_y_given_x;
the error is:
ValueError: operands could not be broadcast together with shapes (8391,2) (8,2)
I assume that with the CDBN/CRBM classes we can do a regression-like prediction, but I don't know why the size of p_y_given_x does not match self.y, as it should.
I am using Python.

thanks in advance
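For reference, the ValueError only says that the two operands have different numbers of rows; both self.y and p_y_given_x should carry one row per training instance. A tiny NumPy reproduction (the shapes are copied from the report, the rest is illustrative):

import numpy

y = numpy.zeros((8391, 2))      # one one-hot label row per instance
p_bad = numpy.zeros((8, 2))     # only 8 prediction rows -> broadcast error
p_ok = numpy.zeros((8391, 2))   # one softmax row per instance

try:
    d_y = y - p_bad
except ValueError as err:
    print(err)                  # operands could not be broadcast together ...

d_y = y - p_ok                  # fine: row counts match

So the real question is why only 8 rows reach LogisticRegression.train; checking the shape of the layer input that is passed down (for example, whether the 8391x8 matrix was transposed or sliced column-wise somewhere) would be the first thing to verify.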

RBM Weight Update - Python

self.W += lr * ((numpy.dot(self.input.T, ph_sample) - numpy.dot(nv_samples.T, nh_means)))

I think this code should be self.W += lr * ((numpy.dot(self.input.T, ph_sample) - numpy.dot(nv_samples.T, nh_means)) / len(self.input))

license?

Do you have a license for this?

dbn.pretraining generates W with NaN

I am testing the code using as input X a matrix of values between -3 and 3 (23 columns), and pretraining generates a W matrix for the RBM layer containing NaN.

Reading the docs from DeepLearning (http://deeplearning.net/tutorial/rbm.html#rbm) the following: "Note that we also return the pre-sigmoid activation. To understand why this is so you need to understand a bit about how Theano works. Whenever you compile a Theano function, the computational graph that you pass as input gets optimized for speed and stability. This is done by changing several parts of the subgraphs with others. One such optimization expresses terms of the form log(sigmoid(x)) in terms of softplus. We need this optimization for the cross-entropy since sigmoid of numbers larger than 30. (or even less then that) turn to 1. and numbers smaller than -30. turn to 0 which in terms will force theano to compute log(0) and therefore we will get either -inf or NaN as cost. If the value is expressed in terms of softplus we do not get this undesirable behaviour. This optimization usually works fine, but here we have a special case. The sigmoid is applied inside the scan op, while the log is outside. Therefore Theano will only see log(scan(..)) instead of log(sigmoid(..)) and will not apply the wanted optimization. We can not go and replace the sigmoid in scan with something else also, because this only needs to be done on the last step. Therefore the easiest and more efficient way is to get also the pre-sigmoid activation as an output of scan, and apply both the log and sigmoid outside scan such that Theano can catch and optimize the expression"

However, the implementation in this code differs in some respects. Any idea how to overcome this issue?

Thank you
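Outside Theano, the same softplus trick can be applied by hand wherever log(sigmoid(x)) is computed (for example in a reconstruction cross-entropy); a small NumPy sketch, with the helper name being ours rather than the repository's:

import numpy

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -softplus(-x) = -log(1 + exp(-x)).
    return -numpy.logaddexp(0.0, -x)

x = numpy.array([-800.0, 0.0, 800.0])
print(numpy.log(1.0 / (1.0 + numpy.exp(-x))))  # naive form overflows: [-inf, -0.693..., 0.]
print(log_sigmoid(x))                          # stable: [-800., -0.693..., -0.]

If the NaNs appear in W itself rather than in the cost, it is also worth checking the learning rate and whether the continuous-valued inputs should be going to CDBN/CRBM (listed in the Classes section above) rather than to the binary RBM.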

ImportError: No module named logistic_sgd

I'm trying to use the Python version of SdA,
but I don't know where I need to get LogisticRegression from.
I have Anaconda and Theano, but still, whenever I write:
from LogisticRegression import LogisticRegression

I receive this error:


ImportError Traceback (most recent call last)
in ()
----> 1 from logistic_sgd import LogisticRegression, load_data

ImportError: No module named logistic_sgd

RBM Weight Updates Issue

There is an inconsistency between your different RBM implementations.

[Python version]
self.W += lr * (numpy.dot(self.input.T, ph_sample)
                - numpy.dot(nv_samples.T, nh_means))

[C++ version]
W[i][j] += lr * (ph_mean[i] * input[j] - nh_means[i] * nv_samples[j]) / N;

Between these two versions the weight update methods are inconsistent. Actually, I think the right version should be

self.W += lr * (numpy.dot(self.input.T, ph_means)
                - numpy.dot(nv_means.T, nh_means))

Could you please help me confirm this? I am not quite sure whether it is actually an issue; I am new to deep learning.

Best Regards
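For comparison, here is a minimal NumPy sketch of one CD-1 update that follows the C++ convention (mean activations in both phases, averaged over the N training cases). It is only a sketch of that convention, not a verified patch for RBM.py, and the function and variable names are ours:

import numpy

def sigmoid(x):
    return 1.0 / (1.0 + numpy.exp(-x))

def cd1_update(W, hbias, vbias, v0, lr=0.1, rng=numpy.random):
    # One CD-1 step; W has shape (n_visible, n_hidden), v0 has shape (N, n_visible).
    N = len(v0)
    ph_mean = sigmoid(numpy.dot(v0, W) + hbias)              # positive phase: p(h=1 | v0)
    h0 = (rng.uniform(size=ph_mean.shape) < ph_mean) * 1.0   # sampled hidden states
    nv_mean = sigmoid(numpy.dot(h0, W.T) + vbias)            # negative phase: reconstruction
    nh_mean = sigmoid(numpy.dot(nv_mean, W) + hbias)

    W += lr * (numpy.dot(v0.T, ph_mean) - numpy.dot(nv_mean.T, nh_mean)) / N
    hbias += lr * numpy.mean(ph_mean - nh_mean, axis=0)
    vbias += lr * numpy.mean(v0 - nv_mean, axis=0)

Whether the positive statistics use ph_mean or ph_sample is a modelling choice (both appear in the literature), but the Python and C++ versions should at least agree with each other, and dividing by N keeps the step size independent of the batch size.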

About Finetune

I am using the Scala version of the code, but I don't know how the fine-tuning of the stacked autoencoder works. All I can see is that the logistic regression is made to learn the last layer's pattern.
Please clarify.

bug report

Dear Yusuke Sugomori,

In the C version of your deep learning program, I found a minor bug in DBN_predict (lines 193-200 of the DBN.c file). The initialization of "linear_output" should be moved inside the k-loop, as follows:

for(k=0; k<this->sigmoid_layers[i].n_out; k++) {
  linear_output = 0.0;
  for(j=0; j<this->sigmoid_layers[i].n_in; j++) {
    linear_output += this->sigmoid_layers[i].W[k][j] * prev_layer_input[j];
  }
  linear_output += this->sigmoid_layers[i].b[k];
  layer_input[k] = sigmoid(linear_output);
}

I guess the same problem may appear in the other versions.

Actually, I would like to get in touch with you about more details of the deep learning algorithm for specific applications. May I know your email address?

peghoty

Is there this type of deep learning model?

Is there this type of deep learning model?
There are two labeled folders for binary classification,
e.g. men and women, cats and dogs, etc.
You insert the images into each folder as training data,
and then just run a simple command to train.
That's all.
I need such a simple training network model. Is there any?

About fine-tuning in the source code of Stacked Denoising Autoencoders

I'm using the Java version of the code. Why is the current functionality of the finetune function regarded as fine-tuning of the network? The finetune function does not update the weights of the entire network, only those of the last layer (log_layer).

The versions of the code in the other languages use the same trick. Can anyone help me understand why this kind of fine-tuning works? Any references?

Thanks.

DBN_finetune: layer_input used before set

I converted the C code for DBN to Component Pascal and ran its analyzer over the code. It found in DBN_finetune that the variable "layer_input" was used before any values were set for it. Going back to the C code I see that is indeed true. In fact the array is accessed even before it is allocated.

That has got to be a bug.

Here is the C code:

void DBN_finetune(DBN* this, int *input, int *label, double lr, int epochs) {
  int i, j, m, n, epoch;

  int *layer_input;
  // int prev_layer_input_size;
  int *prev_layer_input;

  int *train_X = (int *)malloc(sizeof(int) * this->n_ins);
  int *train_Y = (int *)malloc(sizeof(int) * this->n_outs);

  for(epoch=0; epoch<epochs; epoch++) {
    for(n=0; n<this->N; n++) { // input x1...xN
      // initial input
      for(m=0; m<this->n_ins; m++)  train_X[m] = input[n * this->n_ins + m];
      for(m=0; m<this->n_outs; m++) train_Y[m] = label[n * this->n_outs + m];

      // layer input
      for(i=0; i<this->n_layers; i++) {
        if(i == 0) {
          prev_layer_input = (int *)malloc(sizeof(int) * this->n_ins);
          for(j=0; j<this->n_ins; j++) prev_layer_input[j] = train_X[j];
        } else {
          prev_layer_input = (int *)malloc(sizeof(int) * this->hidden_layer_sizes[i-1]);
          for(j=0; j<this->hidden_layer_sizes[i-1]; j++) prev_layer_input[j] = layer_input[j];
          free(layer_input);
        }

        layer_input = (int *)malloc(sizeof(int) * this->hidden_layer_sizes[i]);
        HiddenLayer_sample_h_given_v(&(this->sigmoid_layers[i]), \
                                     prev_layer_input, layer_input);
        free(prev_layer_input);
      }

      LogisticRegression_train(&(this->log_layer), layer_input, train_Y, lr);
    }
    // lr *= 0.95;
  }

  free(layer_input);
  free(train_X);
  free(train_Y);
}

Question about the update of the weight matrix

Recently I have been studying your RBM Python code. Everything is fine except the parameter update formula used in your contrastive_divergence function. I found that the bias vectors are updated using the mean gradient over a batch of training examples, while the weight matrix is not. Could you kindly explain the equation (Lines 61-62) in detail? I am looking forward to your early reply.

Maybe an error

RBM.cpp

for(int i=0; i<n_hidden; i++) {
  for(int j=0; j<n_visible; j++) {
    W[i][j] += lr * (ph_mean[i] * input[j] - nh_means[i] * nv_samples[j]) / N;
  }

  // hbias[i] += lr * (ph_sample[i] - nh_means[i]) / N;   // maybe error
  hbias[i] += lr * (ph_mean[i] - nh_means[i]) / N;         // correct
}
