Comments (15)
@dnouri I am sorry if I didn't describe my question clearly. Let me try again. Suppose I use the example code you provided above, that is:
net = nolearn.lasagne.NeuralNet(..., eval_size = 0.2, verbose = 1)
Xt, Xv, yt, yv = net.train_test_split(X, y, net.eval_size)
net.fit(Xt, yt)
My question: while the net loops over the training examples (i.e. Xt), it prints the training loss, validation loss, and validation accuracy, since verbose is set to 1. Is this validation accuracy calculated on Xv, yv? Or, put another way: does the net.fit() call split off another 20% of Xt as validation data and use that to calculate the accuracy? In that scenario, the actual training set size would be X.shape[0] * 0.8 * 0.8. Please clarify. Thank you.
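To make the arithmetic in the question concrete, here is a small sketch of the sizes that would result if a 20% hold-out were applied twice. It only illustrates the scenario being asked about and makes no claim about what nolearn actually does; the helper name split_sizes and the sample count of 1000 are assumptions for illustration.

```python
def split_sizes(n_samples, eval_size):
    """Return (n_train, n_valid) for a single hold-out split."""
    n_valid = round(n_samples * eval_size)
    return n_samples - n_valid, n_valid


n_total = 1000  # hypothetical dataset size
eval_size = 0.2

# First split: done by hand before calling fit().
n_train_outer, n_valid_outer = split_sizes(n_total, eval_size)

# Second split: the scenario where fit() holds out another 20% of Xt.
n_train_inner, n_valid_inner = split_sizes(n_train_outer, eval_size)

print(n_train_outer, n_valid_outer)
print(n_train_inner, n_valid_inner)
```

If the second split did happen, only 64% of the original samples would be used for weight updates.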
from nolearn.
@dnouri Also, what if I would like to use a validation dataset that I generated myself beforehand, instead of splitting one off from X_train?
from nolearn.
The way that the train_test_split method is set up, you can just call it again and get the same split. So to get the loss on the training set after training finished, you could do something like this:
Xt, Xv, yt, yv = net.train_test_split(X, y, net.eval_size)
yp = net.predict_proba(Xt)
Please reopen if you think there's an issue, or if I've misunderstood. Note it's also possible to override with your own train_test_split entirely, to do whatever you want.
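The point about calling the split again and getting the same result can be illustrated with a minimal sketch. This is not nolearn's implementation, whose split logic may differ (for example, it may rely on cross-validation utilities); the function deterministic_split below is a hypothetical stand-in that only shows why a split without fresh randomness is reproducible.

```python
def deterministic_split(X, y, eval_size):
    """Hold out the last `eval_size` fraction of the samples, with no shuffling."""
    n_valid = round(len(X) * eval_size)
    cut = len(X) - n_valid
    # Same return order as in the thread: Xt, Xv, yt, yv.
    return X[:cut], X[cut:], y[:cut], y[cut:]


X = list(range(10))
y = [v % 2 for v in X]

first = deterministic_split(X, y, 0.2)
second = deterministic_split(X, y, 0.2)

# Both calls produce identical training and validation parts.
print(first == second)
```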
from nolearn.
Also, I'm noticing that the issue title seemingly has nothing to do with the issue that you describe. What's going on?
from nolearn.
The validation loss will be calculated with the same Xv and yv.
from nolearn.
@dnouri, then here comes my confusion. Why not just explicitly pass Xt, Xv, yt, yv to the fit() function?
Oh, I see. Maybe we should keep the same style that the sklearn pipeline provides?
from nolearn.
Yes, same style as sklearn.
from nolearn.
In that case, you might want to subclass NeuralNet and implement your own train_test_split.
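A rough sketch of that subclassing pattern follows. To keep it runnable without nolearn installed, BaseNet is a hypothetical stand-in for nolearn.lasagne.NeuralNet; only the train_test_split(X, y, eval_size) call signature and the Xt, Xv, yt, yv return order are taken from the calls shown earlier in this thread. The class name FixedValidationNet and its X_valid / y_valid arguments are assumptions for illustration.

```python
class BaseNet(object):
    """Stand-in for nolearn.lasagne.NeuralNet so this sketch runs on its own."""

    def __init__(self, eval_size=0.2):
        self.eval_size = eval_size

    def train_test_split(self, X, y, eval_size):
        n_valid = round(len(X) * eval_size)
        cut = len(X) - n_valid
        return X[:cut], X[cut:], y[:cut], y[cut:]


class FixedValidationNet(BaseNet):
    """Uses a validation set supplied by the caller instead of splitting X."""

    def __init__(self, X_valid, y_valid, **kwargs):
        super(FixedValidationNet, self).__init__(**kwargs)
        self.X_valid = X_valid
        self.y_valid = y_valid

    def train_test_split(self, X, y, eval_size):
        # Ignore eval_size: train on everything passed in and validate
        # on the fixed set. Return order follows the thread: Xt, Xv, yt, yv.
        return X, self.X_valid, y, self.y_valid


net = FixedValidationNet(X_valid=[100, 101], y_valid=[0, 1], eval_size=None)
Xt, Xv, yt, yv = net.train_test_split([1, 2, 3], [1, 0, 1], net.eval_size)
print(Xt, Xv, yt, yv)
```

With the real NeuralNet, the way extra constructor arguments such as X_valid are accepted may differ, so check the version you have installed before adapting this.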
from nolearn.
@dnouri I see. I have also noticed that you have updated the train_test_split() function on GitHub, but the version installed by pip install nolearn has stayed the same. In the GitHub version, there is a check for eval_size. As I understand it, we can pass eval_size = None in the GitHub version. In that case, will the validation accuracy actually be calculated on X_train?
from nolearn.
There will be no meaningful validation accuracy checks then. To use the Git version, download from GitHub and run python setup.py develop.
from nolearn.
@dnouri Not meaningful! No wonder that when I pass eval_size = None it prints out NaN. Also, thanks for the tip about using the Git version.
from nolearn.
@dnouri Daniel, there is one question that has been bothering me for days. Inspired by your Kaggle facial detection blog post, I participated in the Kaggle MNIST competition and got 99.4% accuracy on my submission using the same CNN architecture you suggested in the blog. Now I am trying to classify a Tiny ImageNet dataset with 200 classes. I am still using the same network, and I have defined a subclass named myNeuralNet that inherits from nolearn.lasagne.NeuralNet, with a slight modification based on our previous discussion. That is, when eval_size = None, we pass X_val and y_val to myNeuralNet so that I can set my own validation dataset in the train_test_split function. Then here comes my confusion. I use just 500 data examples (X_train) for training, and deliberately set X_val = X_train. In that case, I expect the network to overfit these 500 training examples after a few epochs. If everything goes right, it should print a higher and higher validation accuracy over the epochs. However, this is what I get:
In case the image doesn't load, the URL is: http://imgur.com/mYlrS8i
My code snippet for defining the CNN looks like this:
conv = myNeuralNet(
    layers = [
        ('input', layers.InputLayer),
        ('conv1', Conv2DLayer),
        ('pool1', MaxPool2DLayer),
        ('dropout1', layers.DropoutLayer),
        ('conv2', Conv2DLayer),
        ('pool2', MaxPool2DLayer),
        ('dropout2', layers.DropoutLayer),
        ('conv3', Conv2DLayer),
        ('pool3', MaxPool2DLayer),
        ('dropout3', layers.DropoutLayer),
        ('hidden4', layers.DenseLayer),
        ('dropout4', layers.DropoutLayer),
        ('hidden5', layers.DenseLayer),
        ('output', layers.DenseLayer),
    ],
    # layer parameters:
    input_shape = (None, 3, 64, 64), # 64x64 input pixels per batch
    conv1_num_filters = 32, conv1_filter_size = (3, 3), pool1_ds = (2, 2), dropout1_p = 0.1,
    conv2_num_filters = 64, conv2_filter_size = (2, 2), pool2_ds = (2, 2), dropout2_p = 0.2,
    conv3_num_filters = 128, conv3_filter_size = (2, 2), pool3_ds = (2, 2), dropout3_p = 0.3,
    hidden4_num_units = 500, dropout4_p = 0.5,
    hidden5_num_units = 500,
    output_num_units = 200, # 200 labels
    conv1_nonlinearity = rectify, conv2_nonlinearity = rectify, conv3_nonlinearity = rectify,
    hidden4_nonlinearity = rectify, hidden5_nonlinearity = rectify,
    output_nonlinearity = softmax, # output layer uses softmax function
    # optimization method:
    #update = nesterov_momentum,
    update = rmsprop,
    update_learning_rate = 0.001,
    #update_momentum = 0.9,
    eval_size = None,
    X_valid = X_train,
    y_valid = y_train,
    on_epoch_finished = [early_stopping],
    on_training_finished = [early_stopping.load_best_weights],
    batch_iterator_train = BatchIterator(batch_size = 20),
    batch_iterator_test = BatchIterator(batch_size = 20),
    max_epochs = MAX_EPOCHS, # we want to train this many epochs
    verbose = 1,
)
conv.fit(X_val, y_val)
So, do you see anything wrong? Any suggestions are welcome. Thank you.
from nolearn.
What is your on_training_finished = [early_stopping.load_best_weights] doing? I assume you are just loading the best weights after all the training epochs are finished. Can you post the code for that? Maybe it is doing something weird.
from nolearn.
@msegala Thank you for your reply and for trying to help. Actually, I am using the early_stopping code provided by @cancan101 in previous issues. Here is the code:
import numpy as np

class EarlyStopping(object):
    """
    Early stopping strategy, used to prevent overfitting the training data.
    """
    def __init__(self, patience = 100):
        self.patience = patience
        self.best_valid = np.inf
        self.best_valid_epoch = 0
        self.best_valid_accuracy = 0
        self.best_weights = None

    def __call__(self, nn, train_history):
        current_valid = train_history[-1]['valid_loss']
        current_epoch = train_history[-1]['epoch']
        current_valid_accuracy = train_history[-1]['valid_accuracy']
        # Use validation loss or validation accuracy
        if current_valid_accuracy > self.best_valid_accuracy:
        #if current_valid < self.best_valid:
            self.best_valid = current_valid
            self.best_valid_epoch = current_epoch
            self.best_valid_accuracy = current_valid_accuracy
            self.best_weights = [w.get_value() for w in nn.get_all_params()]
        elif self.best_valid_epoch + self.patience < current_epoch:
            if nn.verbose:
                print("Early stopping.")
                print("Best valid loss was {:.6f} at epoch {} with accuracy {}.".format(
                    self.best_valid, self.best_valid_epoch, self.best_valid_accuracy))
            nn.load_weights_from(self.best_weights)
            if nn.verbose:
                print("Weights set.")
            raise StopIteration()

    def load_best_weights(self, nn, train_history):
        print("Training stage finishes. Best valid loss was {:.6f} at epoch {} with accuracy {}.".format(
            self.best_valid, self.best_valid_epoch, self.best_valid_accuracy))
        nn.load_weights_from(self.best_weights)
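To see how the patience rule in the class above behaves, the following sketch replays the same accuracy-based comparison over a list of per-epoch validation accuracies, without any network or weights involved. The function name simulate_early_stopping is hypothetical, and the sketch assumes that epochs are numbered from 1; it mirrors only the branching logic, not the weight handling.

```python
def simulate_early_stopping(valid_accuracies, patience):
    """Replay the accuracy-based patience rule over per-epoch accuracies.

    Returns (stop_epoch, best_epoch, best_accuracy); stop_epoch is None
    if the rule never triggers. Epochs are numbered from 1 (an assumption).
    """
    best_accuracy = 0
    best_epoch = 0
    for epoch, accuracy in enumerate(valid_accuracies, start=1):
        if accuracy > best_accuracy:
            # Improvement: remember this epoch as the best so far.
            best_accuracy = accuracy
            best_epoch = epoch
        elif best_epoch + patience < epoch:
            # No improvement for more than `patience` epochs: stop here.
            return epoch, best_epoch, best_accuracy
    return None, best_epoch, best_accuracy


print(simulate_early_stopping([0.1, 0.2, 0.15, 0.18, 0.19, 0.17], patience=2))
print(simulate_early_stopping([0.0, 0.0, 0.0], patience=1))
```

The second call shows a case where the accuracy never rises above the initial value of 0: the rule stops with best_epoch still at 0, and in the class above best_weights would still be None at that point, since it is only assigned in the improvement branch.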
from nolearn.
@dnouri Daniel, I am still waiting for your advice and suggestions.
from nolearn.