keskarnitish / large-batch-training

Code to reproduce some of the figures in the paper "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"

License: MIT License

Python 100.00%

batch-training deep-learning keras minima theano

large-batch-training's Issues

computing sharpness of a minima

Hi Nitish,
Could you release the code for computing sharpness? I would like to use the metric in my paper.

Is there a Caffe implementation?

Thanks!

UnboundLocalError: local variable 'out' referenced before assignment

I ran `python plot_parametric_plot.py -n C1` and got the following error:

Traceback (most recent call last):
  File "plot_parametric_plot.py", line 64, in <module>
    model = network_zoo.shallownet(nb_classes)
  File "/home//github/users/wenwei202/large-batch-training/network_zoo.py", line 37, in shallownet
    model.add(BatchNormalization(mode=2,axis=1))
  File "/home//anaconda2/lib/python2.7/site-packages/Keras-1.0.0-py2.7.egg/keras/models.py", line 139, in add
    output_tensor = layer(self.outputs[0])
  File "/home//anaconda2/lib/python2.7/site-packages/Keras-1.0.0-py2.7.egg/keras/engine/topology.py", line 485, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/home//anaconda2/lib/python2.7/site-packages/Keras-1.0.0-py2.7.egg/keras/engine/topology.py", line 543, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/home//anaconda2/lib/python2.7/site-packages/Keras-1.0.0-py2.7.egg/keras/engine/topology.py", line 148, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/home//anaconda2/lib/python2.7/site-packages/Keras-1.0.0-py2.7.egg/keras/layers/normalization.py", line 118, in call
    return out
UnboundLocalError: local variable 'out' referenced before assignment

Keras version: 1 (the traceback paths show Keras-1.0.0)
TensorFlow version: 1.4.0

Code for sharpness

Hi Nitish,

Could you include the code for computing sharpness as well?

Thanks,
Neelesh
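
For reference, the paper defines sharpness as the largest increase of the loss inside a small box around the solution, divided by 1 + f(x) and multiplied by 100. Below is a minimal sketch of that quantity, not the author's code. It assumes the full-space case (no projection matrix) and swaps the paper's constrained inner maximization for plain random sampling, so it can only under-estimate the true value. The names `sharpness_estimate`, `eps`, `n_samples` and `seed` are hypothetical, and `f` stands for any function mapping a flat list of weights to a scalar loss.

```python
import random


def sharpness_estimate(f, x, eps=1e-3, n_samples=1000, seed=0):
    """Crude lower bound on the sharpness of f at x (full-space case).

    Samples perturbations y with |y_i| <= eps * (|x_i| + 1) and reports
    100 * (max f(x + y) - f(x)) / (1 + f(x)).
    Assumption: random search replaces a proper constrained maximizer.
    """
    rng = random.Random(seed)
    fx = f(x)
    best = fx  # y = 0 lies inside the box, so the maximum is at least f(x)
    for _ in range(n_samples):
        perturbed = []
        for xi in x:
            radius = eps * (abs(xi) + 1.0)
            perturbed.append(xi + rng.uniform(-radius, radius))
        best = max(best, f(perturbed))
    return 100.0 * (best - fx) / (1.0 + fx)
```

For a real network, `f` would evaluate the training loss with the flattened weights loaded into the model; a gradient-based maximizer over the same box should give a tighter estimate than sampling.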

How much is small and how much is large in real problem?

Hi @keskarnitish, how small is a "small" batch and how large is a "large" batch in real problems? For example, in object detection, segmentation, and pose estimation, the mini-batch size is sometimes as small as 2 and sometimes 32. How should the batch size be chosen for these problems? Thanks.

Problems when running `plot_parametric_plot.py`

It seems that Keras has changed the parameters of `BatchNormalization`. The error message:

Traceback (most recent call last):
  File "plot_parametric_plot.py", line 64, in <module>
    model = network_zoo.shallownet(nb_classes)
  File "/home/nqluo/experiement/large-batch-training-master/network_zoo.py", line 37, in shallownet
    model.add(BatchNormalization(mode=2,axis=1))
  File "/home/nqluo/anaconda3/envs/tf14-gpu/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 34, in wrapper
    args, kwargs, converted = preprocessor(args, kwargs)
  File "/home/nqluo/anaconda3/envs/tf14-gpu/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 451, in batchnorm_args_preprocessor
    raise TypeError('The `mode` argument of `BatchNormalization` '
TypeError: The `mode` argument of `BatchNormalization` no longer exists. `mode=1` and `mode=2` are no longer supported.
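
One possible workaround, sketched under assumptions rather than taken from the repository: drop the `mode` keyword before the layer is constructed. The helper name `strip_legacy_batchnorm_args` is hypothetical. Note that in Keras 1, `mode=2` normalized with per-batch statistics at test time as well as at training time, so removing the argument lets the code run but may not reproduce the original inference behaviour.

```python
def strip_legacy_batchnorm_args(kwargs):
    """Return a copy of kwargs without the Keras 1 `mode` argument.

    Assumption: every other keyword (e.g. `axis`) is still accepted by the
    installed Keras version; only `mode` is removed here.
    """
    cleaned = dict(kwargs)
    cleaned.pop("mode", None)
    return cleaned


# Hypothetical usage in network_zoo.py (needs Keras, so left as a comment):
# model.add(BatchNormalization(**strip_legacy_batchnorm_args({"mode": 2, "axis": 1})))
```

The simpler manual edit is to change `BatchNormalization(mode=2,axis=1)` to `BatchNormalization(axis=1)` wherever it appears; the same caveat about test-time statistics applies.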

pytorch gpu

Is there anything that makes a GPU implementation difficult? I tried to form a linear combination of the SB (small-batch) and LB (large-batch) weights in GPU mode and ran into weird issues. Have you seen similar problems?
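
The thread does not say what the weird issues were, so the sketch below only pins down the arithmetic of the linear combination: alpha * LB + (1 - alpha) * SB, computed per parameter into a fresh container so neither input is modified. It uses plain Python lists to stay framework-neutral. The names `interpolate_weights` and `alpha_grid` are hypothetical; the range [-1, 2] for alpha follows the paper's parametric plots, while the number of steps is an assumption.

```python
def interpolate_weights(w_small, w_large, alpha):
    """Return alpha * w_large + (1 - alpha) * w_small for each parameter.

    Both arguments map parameter names to flat lists of floats.
    """
    if w_small.keys() != w_large.keys():
        raise ValueError("parameter names differ between the two solutions")
    mixed = {}
    for name in w_small:
        small, large = w_small[name], w_large[name]
        if len(small) != len(large):
            raise ValueError("size mismatch for parameter %r" % name)
        mixed[name] = [alpha * l + (1.0 - alpha) * s for s, l in zip(small, large)]
    return mixed


def alpha_grid(lo=-1.0, hi=2.0, steps=25):
    """Evenly spaced alpha values from lo to hi, inclusive."""
    return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
```

With framework tensors the same formula applies; it is worth checking that both weight sets live on the same device and that the result is written to a new object rather than into either input.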
