keras's People

Contributors

chenmoneygithub, divyashreepathihalli, edersantana, faisal-alsrheed, farizrahman4u, fchollet, frightera, gabrieldemarmiesse, grasskin, haifeng-jin, hertschuh, james77777778, mattdangerw, maxpumperla, nkovela1, nzw0301, old-school-kid, ozabluda, pavithrasv, phreeza, qlzh727, rchao, sachinprasadhs, sampathweb, samuelmarks, soumik12345, taehoonlee, tensorflower-gardener, the-moliver, wxs

keras's Issues

Setting up tests

One of our goals for the v1 release is to have full unit test coverage. Let's discuss tests!

We want tests to be:

  • modular (for maintainability); essentially each module should have an independent test file, with independent test functions for each feature of the module.
  • fast. It should take a few seconds to test the entirety of the library. Otherwise tests would probably not be run often enough, or would result in a significant waste of time, which is very contrary to the Keras philosophy.

What are some best practices that you know of for unit-testing a ML library? I am not a big fan of the way tests are handled in Torch7 (one large file concatenating all test functions).

Fix batch normalization during test time

This was discussed in #79, but since that issue is closed I figured it would make sense to open a new one. We need to fix the batch normalization layer so that it:

  1. Measures the mean and variance of the activations of each batch it sees and stores them, and
  2. Uses those stored statistics, instead of the mean and variance of the current batch, during testing.

Regarding 1, I think maybe it's not good to measure the activation statistics during training because they will be changing over time. Maybe a safer way is to wait until training is over, and then measure these over a single epoch, with all network parameters static.
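
The two requirements above can be sketched in plain Python (not Theano). The class name `RunningBatchStats`, the `momentum` parameter, and the exponential-moving-average update are assumptions made for illustration, not the layer's actual implementation. The alternative suggested above, measuring the statistics over one extra epoch with frozen weights, would replace the moving average with a plain average.

```python
import math


class RunningBatchStats:
    """Tracks per-feature mean/variance across batches (hypothetical helper)."""

    def __init__(self, n_features, momentum=0.9, eps=1e-6):
        self.momentum = momentum
        self.eps = eps
        self.mean = [0.0] * n_features
        self.var = [1.0] * n_features

    @staticmethod
    def batch_stats(batch):
        """Return per-feature mean and (biased) variance of a list of rows."""
        n = len(batch)
        columns = list(zip(*batch))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((x - m) ** 2 for x in col) / n for col, m in zip(columns, means)
        ]
        return means, variances

    def normalize(self, batch, train):
        if train:
            mean, var = self.batch_stats(batch)
            # Fold the current batch into the stored running statistics.
            m = self.momentum
            self.mean = [m * old + (1 - m) * new for old, new in zip(self.mean, mean)]
            self.var = [m * old + (1 - m) * new for old, new in zip(self.var, var)]
        else:
            # Test time: use the stored statistics, not those of the current batch.
            mean, var = self.mean, self.var
        return [
            [(x - mu) / math.sqrt(v + self.eps) for x, mu, v in zip(row, mean, var)]
            for row in batch
        ]
```

With `train=False` a batch of a single sample can still be normalized, which is not possible when the statistics come from the batch itself.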

early stopping

Does Keras support early stopping right now? I have tried to implement this feature myself, but I would like to know whether the library supports it natively.
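
Whether or not the library provides it, the idea itself fits in a few lines. This is a minimal sketch in plain Python. `EarlyStopper`, `patience`, and `min_delta` are hypothetical names chosen for illustration and are not taken from the Keras code base.

```python
class EarlyStopper:
    """Signals a stop once the monitored loss stops improving (hypothetical helper)."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best = val_loss
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


def run_with_early_stopping(val_losses, patience=2):
    """Return the index of the epoch at which training would stop."""
    stopper = EarlyStopper(patience=patience)
    for epoch, loss in enumerate(val_losses):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses) - 1
```

In a real training loop, `val_losses` would be replaced by the validation loss measured after each epoch.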

Initiate a ToDo List

Hey, I'm very interested in contributing to the project. Can you share a list of things to be done, probably a roadmap?

Which version of Python is used?

It seems to be Python 2 (I found print without (…)), but I couldn't find anything confirming it, either in README.md or in setup.py.

Model serialization

This discussion started in #51, but as things can get complicated I decided to start another issue.

It seems to be a good idea to store the weights for a model separately in an HDF5 file (or even a NumPy npy file, but HDF5 would be more portable). I wanted to compare how large a serialized model is with and without the weights, so I did the following test:

    model = Sequential()
    model.add(Dense(n_input, 2048, init='uniform', activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2048, 2048, init='uniform', activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2048, 2048, init='uniform', activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2048, 2048, init='uniform', activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2048, n_output, init='uniform', activation='linear'))

(the model is intentionally large!)

I then compiled the model, serialized it after compilation, and removed the weights and post-compilation Theano objects/functions as follows:

    weights = []  # per-layer weights, kept so they can be stored separately
    for l in model.layers:
        weights.append(l.get_weights())
        l.params = []
        try:
            l.W = None
            l.b = None
        except ValueError:
            pass
    # drop the data placeholders and the compiled Theano functions
    model.X = None
    model.y = None
    model.y_test = None
    model.y_train = None
    model._train = None
    model._train_with_acc = None
    model._test = None
    model._test_with_acc = None
    model._predict = None
The full compiled model ends up at 243 MB, and the cleaned-up model at 120 MB (which is exactly the same as we would get from pickling the non-compiled model with the weight matrices deleted). Is there anything else we could remove to make the serialized model smaller?
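
The "architecture in one file, weights in another" idea can be sketched with the standard library only. This is an illustration under stated assumptions: the function names and the file names `architecture.json` and `weights.pkl` are made up, the architecture is represented as a plain list of dictionaries, and `pickle` stands in for the HDF5 file proposed above.

```python
import json
import os
import pickle
import tempfile


def save_model_parts(config, weights, directory):
    """Write the architecture as JSON and the weights to a separate file."""
    config_path = os.path.join(directory, "architecture.json")
    weights_path = os.path.join(directory, "weights.pkl")
    with open(config_path, "w") as f:
        json.dump(config, f)
    with open(weights_path, "wb") as f:
        pickle.dump(weights, f, protocol=pickle.HIGHEST_PROTOCOL)
    return config_path, weights_path


def load_model_parts(directory):
    """Read back the architecture description and the weights."""
    with open(os.path.join(directory, "architecture.json")) as f:
        config = json.load(f)
    with open(os.path.join(directory, "weights.pkl"), "rb") as f:
        weights = pickle.load(f)
    return config, weights


# Round trip with a tiny, made-up model description.
config = [{"layer": "Dense", "input_dim": 4, "output_dim": 2, "activation": "relu"}]
weights = [[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]], [0.0, 0.0]]
with tempfile.TemporaryDirectory() as directory:
    save_model_parts(config, weights, directory)
    restored_config, restored_weights = load_model_parts(directory)
```

Keeping the architecture as small, human-readable text means the compiled functions never need to be serialized; they can be rebuilt by compiling again after loading.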

Requirements for 1D convolution

What exactly is required to provide the 1D convolution and 1D pooling layers? Aren't both special cases of 2D convolution and 2D pooling?
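
The "special case" reading of the question can be checked on nested lists in plain Python: a 1D convolution is a 2D convolution over an image of height 1. The function names are hypothetical, and the sketch uses a "valid" cross-correlation with no padding or channels, which is a simplification rather than what the library's layers do.

```python
def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv1d_valid(signal, kernel):
    """1D case expressed as a 2D convolution over a height-1 image."""
    return conv2d_valid([signal], [kernel])[0]


def maxpool1d(signal, size, stride):
    """1D max pooling; windows overlap when size > stride."""
    return [
        max(signal[start:start + size])
        for start in range(0, len(signal) - size + 1, stride)
    ]
```

For example, `conv1d_valid([1, 2, 3, 4], [1, 0, -1])` slides the three-tap kernel over the signal exactly as a 1x3 kernel would slide over a 1x4 image.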

Accessing internal states

Hey guys, cool project. The Theano interface itself was really horrific and off-putting.

Maybe I'm doing it wrong, but is there any way to access the activations of different layers? Similar to predict, but only computed halfway. It would be really useful for analysis and the like.
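
What "predict, but only computed halfway" means can be sketched in plain Python by modelling each layer as a callable and recording every intermediate output. The names below are hypothetical and this is not the Keras API. In a Theano-based model the analogous step would presumably be compiling a function from the model input to the chosen layer's output expression.

```python
def forward_with_activations(layers, x):
    """Run x through each layer in turn and keep every intermediate output."""
    activations = []
    for layer in layers:
        x = layer(x)
        activations.append(x)
    return activations


def relu(values):
    """Element-wise rectifier on a flat list."""
    return [max(0.0, v) for v in values]


def scale(factor):
    """Return a toy 'layer' that multiplies every value by factor."""
    return lambda values: [factor * v for v in values]


# activations[k] is the output of layer k; the last entry is the usual prediction.
activations = forward_with_activations([scale(2.0), relu, scale(0.5)], [-1.0, 3.0])
```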

cifar10.py - imports cPickle error

Line 6:

 import six.moves.cPickle

The following code changes fixed the issue for me:
Code change at Line 6:

  from six.moves import cPickle

and at Line 20:

  d = cPickle.load(f)

pooling size > stride

Hey,

Last I checked, Theano did not support a max-pooling op with size > stride. For example:

    MaxPooling2D(poolsize=(3, 3), poolstride=(2,2))

Does Keras support it using the cuDNN backend?

btw, great work guys!

SimpleRNN Error

The SimpleRNN layer seems to require a tensor3 input of shape (batch_size, time_steps, features). However, Keras as a whole seems to expect only matrix inputs/outputs.

model = Sequential()
#model.add(Dense(20, 5, init='uniform', activation='tanh'))
model.add(SimpleRNN(5, 20, activation='sigmoid'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mse', optimizer='sgd')
/usr/local/lib/python2.7/dist-packages/keras/layers/recurrent.pyc in output(self, train)
     47         X = self.get_input(train) # shape: (nb_samples, time (padded with zeros at the end), input_dim)
     48         # new shape: (time, nb_samples, input_dim) -> because theano.scan iterates over main dimension
---> 49         X = X.dimshuffle((1,0,2))
     50 
     51         x = T.dot(X, self.W) + self.b

/usr/local/lib/python2.7/dist-packages/theano/tensor/var.pyc in dimshuffle(self, *pattern)
    332             pattern = pattern[0]
    333         op = theano.tensor.basic.DimShuffle(list(self.type.broadcastable),
--> 334                                             pattern)
    335         return op(self)
    336 

/usr/local/lib/python2.7/dist-packages/theano/tensor/elemwise.pyc in __init__(self, input_broadcastable, new_order, inplace)
    139                     raise ValueError(("new_order[%d] is %d, but the input "
    140                         "only has %d axes.") %
--> 141                         (i, j, len(input_broadcastable)))
    142                 if j in new_order[(i + 1):]:
    143                     raise ValueError((

ValueError: new_order[2] is 2, but the input only has 2 axes.

preprocessing utils would greatly benefit from sklearn

These preprocessing utils would greatly benefit from a fast Cython rewrite.

Preprocessing utils would greatly benefit from sklearn.feature_extraction.text, no?
Or do you want to keep dependencies low and have more fine-grained vectorization?

Recurrent Models with sequences of mixed length

The training process for LSTM only supports tensor3. If the sequences are of different length, then X must be a list, however models.py:90 does not support lists as input. I think a quick fix would be to cast X_batch to tensor3 if batch_size=1, and also fix y_batch accordingly.

General questions

Not an issue per se, but it is good to see a Theano-based deep learning library, as Theano can be pretty difficult to understand when all that is needed is plug-and-play functionality. Are there plans to support word2vec and sentence2vec? Anything else planned?

Add interrupt handlers

Since many experiments can take a while and will be running in an environment where the user does not have a lot of control (e.g., a shared cluster), it would be interesting to have interrupt handlers that do something if the operating system sends a signal to kill the process while the fit method is executing. Blocks does this by using the signal module (which is part of the standard library). That way, you can save the current model state (using pickle, for example) before letting the OS kill your process. Would you be interested in adding something similar to the fit method in model.py?
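
A minimal sketch of such a handler, using only the standard `signal` and `pickle` modules. The class `CheckpointOnSignal` and its `get_state` callback are hypothetical names; how the model state is obtained and how `fit` would check the flag are assumptions, not existing Keras behaviour.

```python
import os
import pickle
import signal
import tempfile


class CheckpointOnSignal:
    """Saves state when the process is asked to terminate (hypothetical helper)."""

    def __init__(self, get_state, path, signals=(signal.SIGTERM, signal.SIGINT)):
        self.get_state = get_state
        self.path = path
        self.signals = signals
        self.triggered = False

    def install(self):
        """Register the handler; signal.signal must be called from the main thread."""
        for sig in self.signals:
            signal.signal(sig, self.handle)

    def handle(self, signum, frame):
        # Keep the handler short: persist the state and set a flag that the
        # training loop can check between batches before exiting.
        with open(self.path, "wb") as f:
            pickle.dump(self.get_state(), f)
        self.triggered = True


# Demo: call the handler directly instead of sending a real signal.
state = {"epoch": 3, "weights": [0.1, 0.2]}
checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
saver = CheckpointOnSignal(lambda: state, checkpoint_path)
saver.handle(signal.SIGTERM, None)
with open(checkpoint_path, "rb") as f:
    restored = pickle.load(f)
```

In a real run, `saver.install()` would be called once before training starts, and the training loop would stop cleanly when `saver.triggered` becomes true.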

How to save and load model?

Hello,
Thank you for this module, it looks awesome.
I am using the cifar10_cnn example. Is there any efficient way to save and load the trained network?

I tried to use cPickle on model but I hit the "Maximum recursion depth" error...

Fix in cifar example

Hi,

I had an error on the cifar10 example.
Traceback (most recent call last):
  File "cifar10_cnn.py", line 49, in <module>
    model.add(Flatten(64*8*8))
TypeError: __init__() takes exactly 1 argument (2 given)

You should remove the argument to the Flatten layer, and then it works!

Adding Batch Size as explicit parameter for Batch Normalization layer

I think it would be much clearer and easier to have a separate "batch size" parameter for the Batch Normalization layer. We could then pass the outputs of our convolution or pooling layers to it directly. The layers as a whole would be more coherent.
(As an aside, did anyone have any luck with batch normalization? I tried many times, but actually got worse results most of the time.)

How to get the output of Conv layer and FC layer?

Hi, fchollet

I just spent several hours reading the documentation and looking through the example cifar10_cnn.py, and I find Keras really easy to use. But as the title says, I am confused about these two questions:

  • How do I get the output of a convolution layer? I want to visualize the feature map after each convolution layer. It is not critical, but I need it when writing a paper. Is there any other way to do this in the framework?
  • I want to use a CNN as a feature extractor, so the output of the fully connected layer should be saved. It seems that Keras does not support this?

Thanks!

standardize_y does not support using alternative classes as datasets

I implemented a class to be able to use slices of an HDF5 dataset as data matrices/vectors in Keras. Even though the class emulates the ndarray API (at least for len and __getitem__), you can't call np.asarray on it. Since I am not sure how this could be fixed, I preferred to post this as an issue to ask for advice. Maybe we could change it so that it only calls np.asarray when that makes sense (i.e., when its input is a list, list of tuples, tuple, tuple of tuples, tuple of lists, or ndarray)?

The class in question is in the following gist: https://gist.github.com/jfsantos/14ae9631716a2aa328c4.

Issues loading sub-modules

First off, great work. I've been looking for something like this for a while now: powerful yet simple to use.

I am trying to import keras from outside the module. I added the parent folder to my PYTHONPATH variable, but when I run the scripts, I'm getting errors loading the modules below the root:

e.g.
/Users/simon.hughes/GitHub/keras/activations.py in <module>()
     24     return x
     25
---> 26 from utils.generic_utils import get_from_module
     27 def get(identifier):
     28     return get_from_module(identifier, globals(), 'activation function')

ImportError: No module named generic_utils

I've tried adding some of the subfolders to the python path:

sys.path.insert(0, "/Users/simon.hughes/GitHub")
sys.path.insert(0, "/Users/simon.hughes/GitHub/keras")
sys.path.insert(0, "/Users/simon.hughes/GitHub/keras/utils")

But that prevents loading modules like utils.generic_utils. It sees generic_utils as a module, but not utils.generic_utils.

Would it be possible to create a setup.py script to install an egg file? Or is there something simple I can do to make this work? The only way I can run code successfully is from the examples folder.

Working with large datasets like Imagenet

Hi Guys,

First and foremost, I think Keras is quite amazing !!

So far, I see that the largest dataset has about 50,000 images. I was wondering if it is possible to work on ImageNet-scale datasets (around 1,000,000 images, which are too big to fit in memory) by pre-processing the data (i.e., splitting it into, say, 1000 containers of 1000 images each) and feeding one container at a time to the model.fit() function. Or do I have to call save_weights() and load_weights() after each container?

Thanks for reading.
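
The container-at-a-time loop described above can be sketched as follows. `train_in_containers`, `load_container`, and `train_step` are hypothetical names. The sketch assumes that the training call continues from the model's current weights rather than reinitialising them, in which case no save/load round trip is needed between containers.

```python
def train_in_containers(train_step, load_container, n_containers, nb_epoch=1):
    """Train on one container at a time so the full dataset never sits in memory."""
    calls = 0
    for _ in range(nb_epoch):
        for index in range(n_containers):
            X, y = load_container(index)  # only this container is held in memory
            train_step(X, y)
            calls += 1
    return calls


# Demo with stand-ins: a fake loader and a step that only records what it saw.
seen = []


def fake_load(index):
    return [index] * 2, [index]


def fake_step(X, y):
    seen.append((X, y))


n_calls = train_in_containers(fake_step, fake_load, n_containers=3, nb_epoch=2)
```

In practice `load_container` would read one pre-processed file from disk, and `train_step` would wrap a call that trains the model on that container.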

no pip yet?

I tried to install Keras with pip install keras, but this is what I got:

Collecting keras
  Could not find a version that satisfies the requirement keras (from versions: )
  No matching distribution found for keras

Reconfiguring a model after training

Certain layers, such as Dropout and BatchNormalization, are supposed to be used only during training. It would be nice if we could disable them during testing or for using a model in production. This could be done by changing the connections between layers or replacing a layer by an Identity layer. Any ideas?
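
One alternative to rewiring the model is to give such layers a train flag and have them act as the identity at test time. This is a sketch under stated assumptions: the class below is a plain-Python stand-in, not the Keras Dropout layer, and it uses "inverted" dropout (rescaling the kept units during training), which is one common variant rather than necessarily the one the library implements.

```python
import random


class Dropout:
    """Dropout with a train flag; at test time it is the identity (sketch)."""

    def __init__(self, p, seed=None):
        if not 0.0 <= p < 1.0:
            raise ValueError("p must be in [0, 1)")
        self.p = p
        self.rng = random.Random(seed)

    def output(self, values, train):
        if not train or self.p == 0.0:
            return list(values)
        keep = 1.0 - self.p
        # Inverted dropout: rescale kept units so the expected value is unchanged.
        return [v / keep if self.rng.random() < keep else 0.0 for v in values]
```

With this design the same model object can serve training and production; only the flag passed down the layer stack changes.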

Problem with return_sequences=True

Hey guys so I'm trying to feed a bunch of 2d (batches,seq_len) indices of text sequences into the model in an attempt to try to predict the next word. This leads to a 3d output of (batches,seq_len,vocab_size) time distributed softmax.

model = Sequential()
model.add(Embedding(vocabsize,128)) 
model.add(GRU(128, 128, return_sequences=True))
model.add(Reshape(seqlen,128))
model.add(Dense(128,vocabsize))
model.add(Reshape(seqlen,vocabsize))
model.add(Activation('time_distributed_softmax'))

So far so good. Running the model.predict_probas(sequences) yields a (batches,seq_len,vocab_size) output matrix. Problem is that doing the model.fit(sequences, bin_sequences, batch_size= 4) gives me the theano error of:

('Bad input argument to theano function with name "build/bdist.linux-x86_64/egg/keras/models.py:66" at index 1(0-based)', 'Wrong number of dimensions: expected 2, got 3 with shape (4, 16, 15423).')

Even though the model does output a 3d array, theano still expects a 2d array. Or am I overlooking something?
Is there any way to deal with multiple multi-dimensional time sequences then?

Activation penalties

I am just starting to explore Keras, and if I understand the layout, it seems that penalties/constraints are not really abstracted to the extent that other concepts are. Is there some obvious reason this would not work or would be dangerous?

For example, I could imagine applying generic penalties to either weights or the activations. Like a sparsity inducing KL penalty that I typically want to apply to activations. If it was fully abstracted, I could try to apply it to the weights of some layer. This would be strange but it seems like it would maximize modularity and separation of concepts.

It seems like PR #77 is moving toward a kind of specialized penalty, and there is already an L2/L1 penalty in the optimizers.

Batch Embedding

Not sure if this is how it's supposed to work (it seems to be a Theano issue), but Embedding layers do not work in batch mode. They work fine as first layers in recurrent nets, or with batch size = 1 in pure feedforward ones, but setting the batch size to 16 with an embedding layer included yields the following error:

ValueError: Input dimension mis-match. (input[0].shape[1] = 1, input[2].shape[1] = 16)
Apply node that caused the error: Elemwise{Composite{((i0 + i1) - i2)}}[(0, 0)](Reshape{3}.0, InplaceDimShuffle{x,x,0}.0, InplaceDimShuffle{x,0,1}.0)
Inputs types: [TensorType(float64, 3D), TensorType(float64, (True, True, False)), TensorType(float64, (True, False, False))]
Inputs shapes: [(16, 1, 1), (1, 1, 1), (1, 16, 1)]
Inputs strides: [(8, 8, 8), (8, 8, 8), (128, 8, 8)]
Inputs values: ['not shown', array([[[ 0.]]]), 'not shown']

importing package fails

When I try to run the code on the GPU, I get the following error:

from keras.models import Sequential
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 8, in <module>
  File "build/bdist.linux-x86_64/egg/keras/objectives.py", line 5, in <module>
ImportError: cannot import name range

It works fine with the CPU version.

LSTM - Sequences with different num of time steps

Hi,

Could you explain how this library handles sequences with different numbers of time steps? Specifically, can we have sequences with different numbers of time steps, and if so, where can one supply the length of each sequence?

Thank you!

Autoencoder Architecture

Given the discussion of weight initializations, any opinions on how an autoencoder architecture should be added?

  • Stopping conditions for each in a pre-created set of layers [vs.] manual saving -> layer addition -> compile -> start next level of training?
  • Noise addition (perhaps as a layer) over a distribution?
  • Used only as a pretraining device [vs.] allowing backpropagation to create an encoder + decoder?

Move regularizers to layer definitions?

Hello,
Great job with keras! I wanted to see what you thought about this before I began hacking on it since it would involve some breaking changes.
It seems to me that the regularizers, i.e. maxnorm, L1 and L2, would be more flexible if they were incorporated into the layer definitions, so that different regularization and/or constraints could be applied at each layer if desired. The reason I bring this up is that I wanted to add a non-negativity constraint at a particular layer, but there didn't seem to be a straightforward way to do so.
Let me know any thoughts.
Best,
Mike
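
What per-layer regularizers and constraints could look like is sketched below in plain Python. The `Layer` class and its `regularizer` and `constraint` arguments are a hypothetical API made up for illustration, not the existing Keras interface. Weights are flat lists to keep the example self-contained.

```python
import math


def nonneg(weights):
    """Constraint: clamp negative weights to zero."""
    return [max(0.0, w) for w in weights]


def maxnorm(limit):
    """Return a constraint that rescales a weight vector whose L2 norm exceeds limit."""
    def constrain(weights):
        norm = math.sqrt(sum(w * w for w in weights))
        if norm <= limit:
            return list(weights)
        return [w * limit / norm for w in weights]
    return constrain


def l2_penalty(coefficient):
    """Return a regularizer computing coefficient * sum(w^2)."""
    return lambda weights: coefficient * sum(w * w for w in weights)


class Layer:
    """Toy layer holding its own regularizer and constraint (hypothetical API)."""

    def __init__(self, weights, regularizer=None, constraint=None):
        self.weights = list(weights)
        self.regularizer = regularizer
        self.constraint = constraint

    def penalty(self):
        """Term this layer would add to the loss."""
        return self.regularizer(self.weights) if self.regularizer else 0.0

    def apply_update(self, new_weights):
        # The constraint is applied after each optimizer update.
        if self.constraint:
            self.weights = self.constraint(new_weights)
        else:
            self.weights = list(new_weights)
```

Because each layer carries its own callables, a non-negativity constraint on one layer and a max-norm constraint on another can coexist without touching the optimizer.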

Multiple sequences

Hey guys,

Is this actually supported?
" Eats inputs with shape:
(nb_samples, max_sample_length (samples shorter than this are padded with zeros at the end), input_dim)
"
The recurrent models only take (input_length, input_dim) sized inputs. Perhaps change the comments to remove this part of the description.

To deal with multiple sequences currently implies merging them into one big sequence and padding them to alignment and I see no other way around it.
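
The padding scheme quoted above (shorter samples padded with zeros at the end) can be sketched in plain Python. `pad_sequences` here is a hypothetical helper written for this example, and the returned mask is an added assumption: it is one way to tell real time steps from padding and is not something the quoted docstring mentions.

```python
def pad_sequences(sequences, pad_value=0.0):
    """Pad variable-length sequences of feature vectors at the end.

    Returns (padded, mask). padded has shape
    (nb_samples, max_sample_length, input_dim) as nested lists; mask marks
    real time steps with 1 and padding with 0.
    """
    max_len = max(len(seq) for seq in sequences)
    input_dim = len(sequences[0][0])
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padding = [[pad_value] * input_dim for _ in range(n_pad)]
        padded.append([list(step) for step in seq] + padding)
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask
```

The sketch assumes every sequence is non-empty and that all time steps share the same `input_dim`.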

l1, l2 regularization

I tried to use l1 and l2 regularization in the optimizer (Adam), but the optimization seems to be the same as without using regularization.

Create a setup.py

Awesome library. I've been looking for something with an LSTM that's this simple for some time. I can only seem to run the scripts from within the keras folder. I added the location for the keras directory, downloaded from git, to my sys.path and I can't import the keras module.

Model training diverges after some point?

Here is my training output with a Sequential model. As you can see, the model diverges after epoch 10. Any ideas about the reason?

Epoch 0
61878/61878 [==============================] - 5s - loss: 1.1788
Epoch 1
61878/61878 [==============================] - 5s - loss: 1.0403
Epoch 2
61878/61878 [==============================] - 5s - loss: 0.9919
Epoch 3
61878/61878 [==============================] - 5s - loss: 0.9397
Epoch 4
61878/61878 [==============================] - 5s - loss: 0.8915
Epoch 5
61878/61878 [==============================] - 4s - loss: 0.8484
Epoch 6
61878/61878 [==============================] - 5s - loss: 0.8145
Epoch 7
61878/61878 [==============================] - 5s - loss: 0.7909
Epoch 8
61878/61878 [==============================] - 4s - loss: 0.7627
Epoch 9
61878/61878 [==============================] - 5s - loss: 0.7407
Epoch 10
61878/61878 [==============================] - 6s - loss: 13.3614
Epoch 11
61878/61878 [==============================] - 5s - loss: 26.6396
Epoch 12
61878/61878 [==============================] - 5s - loss: 26.6453
Epoch 13
61878/61878 [==============================] - 5s - loss: 26.6462
Epoch 14
61878/61878 [==============================] - 5s - loss: 26.6461
Epoch 15
61878/61878 [==============================] - 5s - loss: 26.6470
Epoch 16
61878/61878 [==============================] - 5s - loss: 26.6468
Epoch 17
61878/61878 [==============================] - 5s - loss: 26.6465
Epoch 18
61878/61878 [==============================] - 3s - loss: 26.6468
Epoch 19
61878/61878 [==============================] - 3s - loss: 26.6469

Connecting one layer with two other layers

Hi,

Can one create a layer with this library that is connected to two other layers and not only to one?

For example - one can apply a conv and then max pooling on an image and call this layer 1 and then apply only a conv on the original image and call this layer 2. Now we can create a fully connected layer that will be connected to layer1 and layer2. Therefore the network is not linear and can be any kind of a directed acyclic graph.

Thank you!
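
The two-branch example above can be sketched in plain Python by concatenating the branch outputs before the fully connected layer. All names are hypothetical, and the two branches are trivial stand-ins (a max over the input in place of conv + pooling, the identity in place of conv alone), so this only illustrates the wiring, not the library's API.

```python
def dense(weights, bias):
    """Return a fully connected layer as a function of a flat input vector."""
    def layer(x):
        return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return layer


def merge_concat(*branch_outputs):
    """Concatenate the flat outputs of several branches into one vector."""
    merged = []
    for out in branch_outputs:
        merged.extend(out)
    return merged


def branch1(x):
    """Stand-in for 'conv then max pooling'."""
    return [max(x)]


def branch2(x):
    """Stand-in for 'conv only'."""
    return list(x)


fully_connected = dense(weights=[[1.0, 1.0, 1.0]], bias=[0.0])
image = [1.0, 2.0]
output = fully_connected(merge_concat(branch1(image), branch2(image)))
```

Both branches read the same input, so the computation forms a small directed acyclic graph rather than a linear stack.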

New datasets and application examples

We're very interested in adding new datasets and new example scripts.

If you've used Keras to do something neat with open data, we would love to check it out, and possibly include your script and/or add support for the dataset.
