
keras-squeeze-excite-network's Introduction

Squeeze and Excitation Networks in Keras

Implementation of Squeeze and Excitation Networks in Keras 2.0.3+.

(Figure: the squeeze-excite block)

Models

Currently supported models:

  • SE-ResNet. Custom ResNets can be built using the SEResNet model builder, whereas prebuilt ResNet models such as SEResNet50, SEResNet101 and SEResNet154 can be used directly.
  • SE-InceptionV3
  • SE-Inception-ResNet-v2
  • SE-ResNeXt

Additional models (not from the paper; not verified to improve performance):

  • SE-MobileNets
  • SE-DenseNet. Custom SE-DenseNets can be built using the SEDenseNet model builder, whereas the prebuilt models SEDenseNetImageNet121, SEDenseNetImageNet169, SEDenseNetImageNet161, SEDenseNetImageNet201 and SEDenseNetImageNet264 build DenseNets in the ImageNet configuration. To use SEDenseNet in CIFAR mode, use the SEDenseNet model builder (see the usage sketch below).
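For example, the prebuilt variants can be instantiated directly. A minimal usage sketch (the se_densenet module path and the exact keyword arguments are assumptions based on the naming above; check the repository modules for the actual signatures):

from keras_squeeze_excite_network.se_resnet import SEResNet50
from keras_squeeze_excite_network.se_densenet import SEDenseNetImageNet121

# weights=None -> random initialization (pretrained weights do not appear
# to be bundled; see the weights='imagenet' issue below)
resnet = SEResNet50(weights=None)
densenet = SEDenseNetImageNet121(weights=None)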

Squeeze and Excitation block

The block is simple to implement in Keras. It consists of a GlobalAveragePooling2D layer, two Dense layers, and an element-wise multiplication; shape inference is handled automatically by Keras. It can be imported from se.py.

from tensorflow.keras.layers import GlobalAveragePooling2D, Reshape, Dense, Permute, multiply
import tensorflow.keras.backend as K


def squeeze_excite_block(tensor, ratio=16):
    init = tensor
    channel_axis = 1 if K.image_data_format() == "channels_first" else -1
    filters = init.shape[channel_axis]  # tf.keras tensors expose their static shape directly
    se_shape = (1, 1, filters)

    # Squeeze: one global average per channel
    se = GlobalAveragePooling2D()(init)
    se = Reshape(se_shape)(se)
    # Excite: bottleneck MLP producing per-channel gates in [0, 1]
    se = Dense(filters // ratio, activation='relu', kernel_initializer='he_normal', use_bias=False)(se)
    se = Dense(filters, activation='sigmoid', kernel_initializer='he_normal', use_bias=False)(se)

    if K.image_data_format() == 'channels_first':
        se = Permute((3, 1, 2))(se)  # (1, 1, C) -> (C, 1, 1)

    # Scale: reweight the input feature maps channel-wise
    x = multiply([init, se])
    return x
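As a quick usage sketch (the layer sizes here are arbitrary), the block drops into any functional-API model after a convolution:

from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

inp = Input(shape=(32, 32, 3))
x = Conv2D(64, (3, 3), padding='same', activation='relu')(inp)
x = squeeze_excite_block(x, ratio=16)  # channel recalibration
model = Model(inp, x)
model.summary()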

Addition of Squeeze and Excitation blocks to Inception and ResNet blocks

(Figures: SE-Inception module and SE-ResNet module architectures)

keras-squeeze-excite-network's People

Contributors

samuelmarks, titu1994


keras-squeeze-excite-network's Issues

Function _tensor_shape is unnecessary

Hello, to whom it may concern:
While reading the code, I found that the function _tensor_shape is redundant.
One can simply write tensor.shape instead of calling _tensor_shape.
The TensorFlow version I use is 2.7.0.
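(A minimal illustration of the suggestion, using a plain Keras input tensor:)

import tensorflow as tf

# In TF 2.x, Keras tensors expose their static shape directly,
# so a helper like _tensor_shape is not needed:
x = tf.keras.Input(shape=(32, 32, 64))
channels = x.shape[-1]  # 64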

Add a License?

"When you make a creative work (which includes code), the work is under exclusive copyright by default. Unless you include a license that specifies otherwise, nobody else can copy, distribute, or modify your work without being at risk of take-downs, shake-downs, or litigation. Once the work has other contributors (each a copyright holder), “nobody” starts including you. ..."

From: https://choosealicense.com/no-permission/

OOM error

I get this error while using SE-ResNeXt as a drop-in replacement for the Keras ResNet50:

2018-12-13 21:39:05.642530: W tensorflow/core/framework/op_kernel.cc:1275] OP_REQUIRES failed at conv_ops.cc:398 : Resource exhausted: OOM when allocating tensor with shape[12,512,512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "rnxt50.py", line 262, in <module>
    callbacks=[checkpointer])
  File "/home/m/.local/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/m/.local/lib/python3.5/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/m/.local/lib/python3.5/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/home/m/.local/lib/python3.5/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/m/.local/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/m/.local/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/m/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1382, in __call__
    run_metadata_ptr)
  File "/home/m/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[12,1024,512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[Node: resneXt50_mask/conv2d_2/convolution = Conv2D[T=DT_FLOAT, _class=["loc:@training/Adam/gradients/AddN_201"], data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resneXt50_mask/activation_1/Relu, conv2d_2/kernel/read)]]

While this is a fairly large input compared to yours (512x512, versus the 32x32 default), the Keras implementation of ResNet has no problem with it. It looks like there is an extra 512-sized dimension in the tensor, which causes it to eat all the memory.

The code I use is:
    pretrain_model_mask = ResNeXt.resnext.ResNext(input_shape = (512,512,3),
        include_top=False,
        weights=None,
        pooling='avg')
    pretrain_model_mask.name='resneXt50_mask'
    x = pretrain_model_mask(inp_mask)
    out = Dense(n_out, activation='sigmoid')(x)
    model = Model(inputs=inp_mask, outputs=[out])

    return model

model = create_model(
    input_shape=(512,512,3),
    n_out=28)

model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['acc', f1])

train_generator = train_datagen.create_train(
    train_dataset_info, batch_size, (512,512,3))
validation_generator = train_datagen.create_train(
    valid_dataset_info, batch_size, (512,512,3))
K.set_value(model.optimizer.lr, 0.0001)
# train model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=len(train_df)//batch_size,
    validation_data=validation_generator,
    validation_steps=len(valid_df)//batch_size//10,
    epochs=epochs,
    verbose=1,
    callbacks=[checkpointer])

Any help will be greatly appreciated!

Regards,
Moshe

The network is inconsistent with hujie-frank/SENet

Thank you for your Keras implementation, which is very helpful.
However, I found several inconsistencies.
For example, the first conv layer should use a stride of (2, 2).
Also, the residual block should have THREE rather than TWO conv layers (see the sketch below).
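A minimal sketch of the bottleneck layout described here, assuming the standard ResNet bottleneck (1x1 reduce, 3x3, 1x1 expand) and the squeeze_excite_block from se.py above; the stride-(2, 2) stem conv would be a separate fix at the network input:

from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, Activation, add
from tensorflow.keras.models import Model

def se_bottleneck_block(x, filters):
    # Three conv layers, as in the original SENet: 1x1 reduce, 3x3, 1x1 expand
    shortcut = x
    y = Conv2D(filters, (1, 1), use_bias=False)(x)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(filters, (3, 3), padding='same', use_bias=False)(y)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(filters * 4, (1, 1), use_bias=False)(y)
    y = BatchNormalization()(y)
    y = squeeze_excite_block(y)  # SE recalibration before the residual add
    # A projection conv on the shortcut would be needed when channel counts differ
    return Activation('relu')(add([shortcut, y]))

inp = Input(shape=(56, 56, 256))  # shortcut already has filters * 4 channels
out = se_bottleneck_block(inp, filters=64)
model = Model(inp, out)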

I am getting a NoneType error (import issue?)

Traceback (most recent call last):
File "se_resnet.py", line 408, in
model = SEResNet50()
File "se_resnet.py", line 213, in SEResNet50
classes=classes)
File "se_resnet.py", line 138, in SEResNet
filters, depth, width, bottleneck, weight_decay, pooling)
File "se_resnet.py", line 376, in _create_se_resnet
x = _resnet_bottleneck_block(x, filters[0], width)
File "se_resnet.py", line 330, in _resnet_bottleneck_block
x = squeeze_excite_block(x)
File "/home/centos/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/keras_squeeze_excite_network-0.0.4-py3.6.egg/keras_squeeze_excite_network/se.py", line 26, in squeeze_excite_block
filters = _tensor_shape(init)[channel_axis]
TypeError: 'NoneType' object is not subscriptable

I don't understand why this is not working.
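Judging from the traceback, _tensor_shape returned None. A minimal workaround sketch, in line with the first issue above, is to read the static shape directly from the Keras tensor:

import tensorflow as tf
import tensorflow.keras.backend as K

x = tf.keras.Input(shape=(224, 224, 3))
channel_axis = 1 if K.image_data_format() == "channels_first" else -1

# Reading the static shape directly avoids helpers that can return None
# under TF 2.x:
filters = x.shape[channel_axis]  # 3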

Error in se.py, line 23: x = multiply([init, se])

I get an error when the line x = multiply([init, se]) at line 23 of se.py runs.
Could you tell me how to solve it? Thanks.

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/b3432/anaconda2/envs/Keras/lib/python2.7/site-packages/keras/layers/merge.py", line 468, in multiply
return Multiply(**kwargs)(inputs)
File "/home/b3432/anaconda2/envs/Keras/lib/python2.7/site-packages/keras/engine/topology.py", line 571, in __call__
self.build(input_shapes)
File "/home/b3432/anaconda2/envs/Keras/lib/python2.7/site-packages/keras/layers/merge.py", line 84, in build
output_shape = self._compute_elemwise_op_output_shape(output_shape, shape)
File "/home/b3432/anaconda2/envs/Keras/lib/python2.7/site-packages/keras/layers/merge.py", line 55, in _compute_elemwise_op_output_shape
str(shape1) + ' ' + str(shape2))
ValueError: Operands could not be broadcast together with shapes (64, 48, 48) (64, 1, 64)
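The reported shapes ((64, 48, 48) vs. (64, 1, 64)) point at channels_first mode without the Permute step shown in the se.py snippet above; without it, the last dimensions (48 vs. 64) genuinely cannot broadcast. A minimal sketch of the fix:

import tensorflow.keras.backend as K
from tensorflow.keras.layers import Input, Reshape, Permute, multiply

K.set_image_data_format('channels_first')

inp = Input(shape=(64, 48, 48))   # (C, H, W), matching the reported shapes
gates = Input(shape=(64,))        # one gate per channel
se = Reshape((1, 1, 64))(gates)   # (1, 1, C)
se = Permute((3, 1, 2))(se)       # -> (C, 1, 1), broadcastable over H and W
out = multiply([inp, se])         # (64, 48, 48)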

SEInceptionResNetV2 is not working; I encountered two errors

I tried to use the function SEInceptionResNetV2() within model.py as shown below. The import was successful, but the call raises a TypeError: <lambda>() got an unexpected keyword argument 'scale'. The traceback points to the Lambda call inside inception_resnet_block(), shown below.

I then hypothesized that it might be related to my local installation of TensorFlow being 2.2.0, so I ran model.py within Google Colab, which has TensorFlow 2.3.0. There I encountered a different error, still a TypeError but with the message 'NoneType' object is not subscriptable, as shown below.

Inside model.py on the local machine:

from keras_squeeze_excite_network.se_inception_resnet_v2 import SEInceptionResNetV2

model = SEInceptionResNetV2()

Error encountered locally:

File "model.py", line 3, in <module>
    model = SEInceptionResNetV2()
  File "C:\Users\acer\Anaconda3\python_scripts\keras-squeeze-excite-network\keras_squeeze_excite_network\se_inception_resnet_v2.py", line 280, in SEInceptionResNetV2
    block_idx=block_idx)
  File "C:\Users\acer\Anaconda3\python_scripts\keras-squeeze-excite-network\keras_squeeze_excite_network\se_inception_resnet_v2.py", line 161, in inception_resnet_block
    name=block_name)([x, up])
  File "C:\Users\acer\Anaconda3\envs\Resnet\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 922, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "C:\Users\acer\Anaconda3\envs\Resnet\lib\site-packages\tensorflow\python\keras\layers\core.py", line 888, in call
    result = self.function(inputs, **kwargs)
TypeError: <lambda>() got an unexpected keyword argument 'scale'

The code within inception_resnet_block that caused the error:

x = Lambda(lambda inputs, scale_: inputs[0] + inputs[1] * scale_,
               output_shape=K.int_shape(x)[1:],
               arguments={'scale': scale},
               name=block_name)([x, up])
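Judging from the quoted snippet alone, the lambda's parameter is named scale_ while the arguments dict passes the key 'scale', so the keyword never matches the signature. A minimal sketch with the names aligned (a hypothetical fix, not code from the repository):

import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda

x = Input(shape=(8, 8, 16))
up = Input(shape=(8, 8, 16))
scale = 0.17

# The key in `arguments` must match the lambda's parameter name ('scale_');
# passing 'scale' instead produces the "unexpected keyword argument" error.
out = Lambda(lambda inputs, scale_: inputs[0] + inputs[1] * scale_,
             arguments={'scale_': scale},
             name='residual_scale')([x, up])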

Inside model.py in Google Colab:

from keras_squeeze_excite_network.se_inception_resnet_v2 import SEInceptionResNetV2

model = SEInceptionResNetV2()

Error encountered in Colab:

TypeError                                 Traceback (most recent call last)
<ipython-input-4-858f80a24425> in <module>()
----> 1 model = SEInceptionResNetV2()

1 frames
/content/drive/My Drive/test/keras_squeeze_excite_network/se.py in squeeze_excite_block(input_tensor, ratio)
     24     init = input_tensor
     25     channel_axis = 1 if K.image_data_format() == "channels_first" else -1
---> 26     filters = _tensor_shape(init)[channel_axis]
     27     se_shape = (1, 1, filters)
     28 

TypeError: 'NoneType' object is not subscriptable

I want to point out that this is the only model that raised an error; the other models worked with both TensorFlow 2.2.0 and 2.3.0.

A trivial typo of spatial SE block

Hello, I just found a tiny typo in the spatial_squeeze_excite_block function of se.py.
It should be:

se = Conv2D(1, (1, 1), activation='sigmoid', use_bias=False, kernel_initializer='he_normal')(input)

The (input) call should be appended at the end of the line above.
I know it is trivial; I just hope it doesn't confuse anyone who wants to call the function. Thanks.

Is weights='imagenet' option currently available?

I am a little confused regarding the weights argument in SEResNet().

According to the source code of se_resnet.py, weights='imagenet' is an option, but I cannot see any weights actually being loaded (there is a comment, # load weights, at line 143, but no code after it).

Should I assume that this option is there only for future-proofing, i.e. for when pre-trained weights become available for loading, and is not currently functional? Or am I missing something, and I can indeed use a pretrained SEResNet() with ImageNet weights?

Many thanks in advance

Element-wise multiplication isn't working

Hello, the element-wise multiplication (model = multiply([init, se])) throws the error below. Should the dimensions of se and init always be the same?

ValueError: Only tensors of same shape can be merged by layer multiply_1. Got input shapes: [(None, 128, 128, 64), (None, 1, 1, 64)]

No Bias in Dense Layers

I could not find this in the paper. Is there a reason we don't use a bias in the Dense layers within the SE block?

How to combine TimeDistributed with SENet?

Thank you for sharing the code. There is a question that may need your generous help. I want to input images from different timesteps to a CNN with SE, i.e. the input shape may be (batch_size, timesteps, height, width, channel), with the weights of each layer shared across timesteps, so the TimeDistributed wrapper is a good choice. Could you give code that works in this situation? I found the implementation below, but it doesn't seem to work.

class SeBlock(keras.layers.Layer):
    def __init__(self, reduction=4, **kwargs):
        super(SeBlock, self).__init__(**kwargs)
        self.reduction = reduction

    def build(self, input_shape):
        pass

    def call(self, inputs):
        x = keras.layers.GlobalAveragePooling2D()(inputs)
        x = keras.layers.Dense(int(x.shape[-1]) // self.reduction, use_bias=False, activation=keras.activations.relu)(x)
        x = keras.layers.Dense(int(inputs.shape[-1]), use_bias=False, activation=keras.activations.hard_sigmoid)(x)
        return keras.layers.Multiply()([inputs, x])

Thank You!
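One approach (a minimal sketch, not from the repository; it assumes the squeeze_excite_block from se.py above, and the frame size and timestep count are arbitrary) is to build the per-frame SE-CNN as a standalone Model and wrap that whole Model in TimeDistributed:

from tensorflow.keras.layers import Input, Conv2D, GlobalAveragePooling2D, TimeDistributed
from tensorflow.keras.models import Model

# Per-frame CNN built as a Model so TimeDistributed can share its weights
# across timesteps (uses the squeeze_excite_block from se.py above).
frame = Input(shape=(64, 64, 3))
x = Conv2D(32, (3, 3), padding='same', activation='relu')(frame)
x = squeeze_excite_block(x)
x = GlobalAveragePooling2D()(x)
frame_encoder = Model(frame, x, name='se_cnn')

# Apply the same encoder to every timestep of a clip.
clips = Input(shape=(8, 64, 64, 3))               # (timesteps, H, W, C)
features = TimeDistributed(frame_encoder)(clips)  # (batch, 8, 32)
model = Model(clips, features)

Wrapping the Model (rather than individual layers) keeps the SE block's functional-API calls intact and guarantees a single set of weights applied at every timestep.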
