dbln / stochastic_depth_keras

139.0 139.0 28.0 17 KB

Keras implementation for "Deep Networks with Stochastic Depth" http://arxiv.org/abs/1603.09382

License: MIT License

Python 100.00%

stochastic_depth_keras's Issues

Maximum recursion depth exceeded in cmp

Hi, I got RuntimeError: maximum recursion depth exceeded in cmp when running it in a virtualenv. It seems related to Theano Issue #689.

I have the following libraries installed in my virtual environment (Keras is installed from the keras-1 branch):

Keras==1.0.0
numpy==1.11.0
PyYAML==3.11
scipy==0.17.0
six==1.10.0
Theano==0.8.1
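
A workaround often tried for this error is to raise Python's recursion limit before the model is built. The sketch below is only an illustration: the helper name raise_recursion_limit and the value 10000 are assumptions, not taken from the repository, and a limit that is too high can crash the interpreter with a stack overflow instead of raising a clean error.

```python
import sys

# Assumed value for illustration only; deeper networks may need a different limit.
NEW_LIMIT = 10000


def raise_recursion_limit(limit=NEW_LIMIT):
    """Raise the interpreter's recursion limit, never lowering it."""
    if limit > sys.getrecursionlimit():
        sys.setrecursionlimit(limit)
    return sys.getrecursionlimit()


# Call this before building or compiling a very deep model.
raise_recursion_limit()
```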

test-time scale according to local death rate

https://github.com/dblN/stochastic_depth_keras/blob/d26c492/train.py#L86-L87

When doing lin_decay, these two lines set the scaling to one minus the maximal value of the death rate. I believe they should be set to one minus the death rate of the current residual block. Am I misunderstanding something?

Dealing with memory limitation

Hi,
Following your advice about setting the recursion limit, I managed to get this to run with N = 17. My Windows machine has 16 GB of RAM, and I am using the Theano backend. With N higher than that, Python crashes.

Here are my questions:

Is this a result of the GPU or CPU RAM limit?
Is there a way to effectively deal with it, besides getting more RAM?
How much RAM do you have on machines that allow reaching N = 50?
Does it matter whether you use the TensorFlow or the Theano backend?

Thank you!

Adding the figure of training results

Incorrect use of BatchNorm

You are using image tensors with the Theano dimension ordering conventions: (samples, channels, width, height). You want to do BatchNorm on the channels, therefore you should use:

BatchNormalization(axis=1)

instead of:

BatchNormalization()  # default for axis is -1
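
The effect of the axis argument can be illustrated without any framework. The sketch below only reasons about shapes: it reports which axes the statistics are averaged over and how many separate means and variances result. The helper name batchnorm_stat_shape and the example input shape (32, 3, 32, 32) are assumptions made for illustration.

```python
def batchnorm_stat_shape(input_shape, axis):
    """Return (axes averaged over, shape of the per-feature statistics)."""
    axis = axis % len(input_shape)  # turn a negative axis into a positive index
    reduced = tuple(i for i in range(len(input_shape)) if i != axis)
    return reduced, (input_shape[axis],)


# Channels-first batch: 32 samples, 3 channels, 32 x 32 pixels (assumed sizes).
shape = (32, 3, 32, 32)

per_channel = batchnorm_stat_shape(shape, axis=1)
last_axis = batchnorm_stat_shape(shape, axis=-1)
```

With axis=1 there is one mean and variance per channel (3 in total), averaged over samples and both spatial axes. With axis=-1 there is one per position along the last spatial axis (32 in total), which mixes the channels together.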

How is the gate updated during training?

Thanks a lot for doing this. I might be misunderstanding something, but I can't understand how the gate is updated during training. I can't understand the following two pieces of code...

gate = K.variable(1, dtype="uint8")
add_tables += [{"death_rate": _death_rate, "gate": gate}]
return Lambda(lambda tensors: K.switch(gate, tensors[0], tensors[1]),
              output_shape=output_shape)([out, x])

Is this 'gate' always equal to 1 during training?...

class GatesUpdate(Callback):
    def on_batch_begin(self, batch, logs={}):
        open_all_gates()

        rands = np.random.uniform(size=len(add_tables))
        for t, rand in zip(add_tables, rands):
            if rand < K.get_value(t["death_rate"]):
                K.set_value(t["gate"], 0)

Does this 'GatesUpdate' callback act on the 'Lambda' layer during training?
Thank you.
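
The mechanism asked about can be sketched without Keras. In the sketch below, the Gate class is a hypothetical stand-in for a backend variable, and make_block and update_gates are illustrative names, not the repository's code. The point it shows is that the block keeps a reference to the gate rather than a copy of its initial value, so a per-batch update that writes to the gate changes what the block returns.

```python
import random


class Gate:
    """Hypothetical stand-in for a mutable backend variable, initially open (1)."""

    def __init__(self, value=1):
        self.value = value


def make_block(gate):
    # The closure holds a reference to the gate object, so later writes to
    # gate.value change which branch is returned.
    def block(out, x):
        return out if gate.value else x

    return block


def update_gates(tables, rng):
    # Same logic as the callback quoted above: reopen every gate, then close
    # each one with probability equal to its death rate.
    for t in tables:
        t["gate"].value = 1
    for t in tables:
        if rng.random() < t["death_rate"]:
            t["gate"].value = 0


always_kept, always_dropped = Gate(), Gate()
tables = [
    {"death_rate": 0.0, "gate": always_kept},
    {"death_rate": 1.0, "gate": always_dropped},
]
kept_block = make_block(always_kept)
dropped_block = make_block(always_dropped)

before = dropped_block("residual_output", "shortcut")
update_gates(tables, random.Random(0))
after = dropped_block("residual_output", "shortcut")
```

Under this reading, the gate starts at 1 but does not stay there: each call to the per-batch update may set it to 0, and the block then passes the shortcut input through unchanged.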

What's missing?

Thanks a lot for doing this. I might misunderstand something, but the chart seems to say 15% eventual validation error on CIFAR-10. The original yueatsprograms Torch implementation has 5.23% validation error. I believe this difference is too large to be attributed to the different number of resnet blocks, or the lack of augmentation. Is there some missing functionality or known bug responsible for the discrepancy?
