
tiny_model_4_cd's People

Contributors

andreacodegoni, kevintherainmaker, likyoo


tiny_model_4_cd's Issues

Improvement with weighted loss

My own dataset is very imbalanced, and I achieved superior results using:

        weight_for_positive_class = 5
        pos_weight = torch.tensor([weight_for_positive_class])
        self.criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)
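For reference, a common heuristic is to set pos_weight to the ratio of negative to positive pixels in the training masks. A minimal sketch of that estimate follows; the train_dataset name and the (reference, test, mask) tuple format are assumptions, not something this repo prescribes:

import torch

# Hypothetical: estimate pos_weight from the training masks,
# assuming each mask is a {0, 1} tensor of change labels.
positives, negatives = 0, 0
for _reference, _test, mask in train_dataset:
    positives += mask.sum().item()
    negatives += mask.numel() - mask.sum().item()

pos_weight = torch.tensor([negatives / max(positives, 1)])
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)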

Indoor Change Detection?

While Tiny_CD is designed for remote sensing, my question is: does it also work for indoor change detection?
For instance, I want to detect the changes between these two images (a door being opened). Does it work for such cases?

[attached images: d1, d2]

Request for WHU-CD dataset Baiduyun link

Hi there,

I'm currently working on a research project that requires the WHU-CD dataset. I noticed that a Dropbox link is provided for this dataset, and I was wondering if you have any other links available, such as a Baiduyun link?

Unfortunately, I am located in a region where Dropbox is not accessible, and I am unable to download the dataset. It would be greatly appreciated if you could provide a Baiduyun link or any alternative means of accessing the dataset.

Thank you for your time.

SyntaxError: not a PNG file

Hello, I encountered the following error while training the model:

lib\site-packages\PIL\PngImagePlugin.py", line 712, in _open
raise SyntaxError("not a PNG file")
SyntaxError: not a PNG file

There should be no problem with the data. What is the solution?
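One of the .png files is probably truncated or saved in another format under a .png extension. A small scan like the following can locate the offending file; the dataset root path here is an assumption, adjust it to your layout:

from pathlib import Path
from PIL import Image

# Hypothetical dataset root; adjust to your directory structure.
for path in Path("dataset").rglob("*.png"):
    try:
        with Image.open(path) as img:
            if img.format != "PNG":
                print(path, "is actually", img.format)
            img.verify()  # raises if the file is corrupt
    except Exception as exc:
        print(path, "->", exc)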

How to evaluate the model on a single image pair?

Thank you for the repo. I just want to run the pre-trained models on a single pair of images, say i1 and i2, and get the changes. How can I do that?

Currently this command evaluates an entire dataset directory:
python3 test_ondata.py --modelpath pretrained_models/levir_best.pth
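A minimal single-pair inference sketch along these lines should work. The module path, the checkpoint layout, and the absence of extra normalization are assumptions based on this repo's defaults:

import torch
from PIL import Image
from torchvision.transforms.functional import to_tensor

from models.change_classifier import ChangeClassifier  # assumed module path

model = ChangeClassifier()
state = torch.load("pretrained_models/levir_best.pth", map_location="cpu")
model.load_state_dict(state)  # adapt if the checkpoint stores extra keys
model.eval()

i1 = to_tensor(Image.open("i1.png").convert("RGB")).unsqueeze(0)  # (1, 3, H, W)
i2 = to_tensor(Image.open("i2.png").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    # the final layer is a sigmoid, so these are per-pixel change probabilities
    prob = model(i1, i2).squeeze()

change_mask = prob > 0.5  # binary change map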

How to generate the attention masks and save them

Hello! I have read your paper and am interested in this work, but I cannot find the code that generates the attention masks (at resolutions 256/128/64). How can I generate and save them?
Thanks!
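Not sure what the authors used for their figures, but one way to dump the intermediate masks without modifying the model is a forward hook on each mixing/attention block. This is a sketch: model, reference, and testimg are assumed to be already set up, and the class-name filter is based on the block names in this repo:

import torch
from torchvision.utils import save_image

masks = {}

def make_hook(name):
    def hook(module, inputs, output):
        masks[name] = output.detach()
    return hook

# Hook every mixing/attention block by class name:
for name, module in model.named_modules():
    if type(module).__name__ in ("MixingMaskAttentionBlock", "MixingBlock"):
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    model(reference, testimg)

for name, m in masks.items():
    save_image(m.float(), f"attention_{name}.png", normalize=True)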

out of memory

Hello, in your paper you mention that model training was conducted on an NVIDIA GeForce RTX 2060 6GB GPU. However, when I attempted to run your source code on an NVIDIA GeForce RTX 4060 Ti 8GB GPU, I encountered an "out of memory" error. Even after reducing batch_size to 1, I still had to resort to mixed-precision training.
Moreover, with batch_size set to 1 and mixed-precision training enabled, the loss fails to converge.
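For reference, the standard torch.cuda.amp pattern looks like the sketch below; if the loss then diverges, keeping the loss computation in float32 and lowering the learning rate are common remedies (my suggestions, not something the authors prescribe). The loader, model, optimizer, and criterion names are assumptions:

import torch

scaler = torch.cuda.amp.GradScaler()

for reference, testimg, mask in loader:  # assumed (reference, test, mask) batches
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        pred = model(reference.cuda(), testimg.cuda()).squeeze(1)
    # compute the loss in float32 for numerical stability
    loss = criterion(pred.float(), mask.cuda().float())
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()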

Pip install fails for many libs

When I tried to install the requirements through conda, many libraries were not present in any of the conda channels, so I tried installing them with pip. Even then, many libraries were mismatched.
Python 3.8.10

ERROR: Could not find a version that satisfies the requirement antlr-python-runtime==4.9.3 (from versions: none)
ERROR: No matching distribution found for antlr-python-runtime==4.9.3

ERROR: Could not find a version that satisfies the requirement blas==1.0 (from versions: none)
ERROR: No matching distribution found for blas==1.0

ERROR: Could not find a version that satisfies the requirement bzip2==1.0.8 (from versions: none)
ERROR: No matching distribution found for bzip2==1.0.8

Am I missing any other specifications, or do these versions or library names need to be changed?

Recall for detected change (class 1) is very low

Hi,
After running the code on the WHU-CD dataset with the pretrained model, these are the test results:

{'acc': 0.9484738175586964, 'miou': 0.522093921610696, 'mf1': 0.5742922406664711,
 'iou_0': 0.9481903331958307, 'iou_1': 0.0959975100255613,
 'F1_0': 0.9734062005987101, 'F1_1': 0.17517828073423214,
 'precision_0': 0.9649982845855988, 'precision_1': 0.24004840719627082,
 'recall_0': 0.9819620400564614, 'recall_1': 0.13790991201484162}

The recall for the change class seems very low, at roughly 0.14. Is this the expected value, or should something be altered?
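Two things may be worth checking before retraining: whether the checkpoint was trained on a different dataset (a LEVIR-trained model evaluated on WHU-CD could plausibly show low change-class recall), and whether the default 0.5 binarization threshold suits this data. A quick threshold sweep, assuming the flattened sigmoid outputs (probs) and ground-truth labels (gts) have already been collected as NumPy arrays:

import numpy as np

for t in np.linspace(0.1, 0.9, 9):
    pred = probs > t
    tp = np.logical_and(pred, gts == 1).sum()
    fn = np.logical_and(~pred, gts == 1).sum()
    fp = np.logical_and(pred, gts == 0).sum()
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    print(f"threshold={t:.1f} precision={precision:.3f} recall={recall:.3f}")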

RuntimeError: The size of tensor a (64) must match the size of tensor b (256) at non-singleton dimension 3.

The problem info is as follows:
Traceback (most recent call last):
File "training.py", line 223, in
run()
File "training.py", line 217, in run
save_after=1,
File "training.py", line 143, in train
training_phase(epc)
File "training.py", line 106, in training_phase
it_loss = evaluate(reference, testimg, mask)
File "training.py", line 84, in evaluate
generated_mask = model(reference, testimg).squeeze(1)
File "/home//anaconda3/envs/py370/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home//work/Tiny_model_4_CD-main/models/change_classifier.py", line 48, in forward
latents = self._decode(features)
File "/home//work/Tiny_model_4_CD-main/models/change_classifier.py", line 65, in _decode
upping = self._up[i](upping, features[j])
File "/home//anaconda3/envs/py370/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home//work/Tiny_model_4_CD-main/models/layers.py", line 121, in forward
x = x * y
RuntimeError: The size of tensor a (64) must match the size of tensor b (256) at non-singleton dimension 3
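For what it's worth, the failing line multiplies an upsampled tensor by a skip feature (x = x * y in layers.py), so the two spatial sizes must match, and that only holds when the inputs have the resolution the model was configured for. A guard like the following avoids the crash; the 256x256 target is an assumption taken from the released setup:

import torch.nn.functional as F

# Hypothetical guard: resize both inputs so the decoder's
# skip connections align spatially.
expected = 256
if reference.shape[-2:] != (expected, expected):
    reference = F.interpolate(reference, size=(expected, expected), mode="bilinear", align_corners=False)
    testimg = F.interpolate(testimg, size=(expected, expected), mode="bilinear", align_corners=False)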

Reproducing results on LEVIR and logs

Using the supplied training scripts and dataset, here are the logs of my training run on LEVIR. 100 epochs appears to be well chosen, as learning plateaus by then.

[training curve screenshot]

I trained on a V100 Lightning.ai instance in around 4 hours:

Validation phase summary
Loss for epoch 99 is 0.0209393590421314
IoU class change for epoch 99 is 0.8350830680701221
F1 class change for epoch 99 is 0.9101310212289551

Not an issue, just asking for help

Hi, I want to rewrite this using TensorFlow, but there are some problems: the loss only drops to around 0.34 and the output is all zeros. Here is the model code:

import tensorflow as tf
from typing import List
import tensorflow_addons as tfa
from tensorflow.keras.layers import Conv2D, PReLU, UpSampling2D, Activation
from tensorflow.keras import Sequential, Model
import tensorflow.keras.applications as app


class PixelwiseLinear(Model):
    def __init__(
        self,
        fout: List[int],
        last_activation: Model = None,
    ) -> None:
        super().__init__()
        n = len(fout)
        self._linears = Sequential(
            [
                Sequential(
                    [Conv2D(fout[i], kernel_size=1, use_bias=True, kernel_initializer="he_normal"),
                    PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25)) if i < n - 1 or last_activation is None else last_activation
                     ]
                )
                for i in range(n)
            ]
        )

    def call(self, x):
        # Processing the tensor:
        return self._linears(x)


class MixingBlock(Model):
    def __init__(
        self,
        ch_out: int,
    ):
        super().__init__()
        self._convmix = Sequential(
            [Conv2D(ch_out, 3, groups=ch_out, padding="SAME", kernel_initializer="he_normal"),
            PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25)),
            tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5)]
        )

    def call(self, x, y):
        # Packing the tensors and interleaving the channels:
        # stack on the last (channel) axis so the reshape yields
        # [x_0, y_0, x_1, y_1, ...] per pixel, the NHWC analogue of the
        # original PyTorch stack(dim=2) + reshape. Stacking on axis=1
        # scrambles the spatial layout after the reshape.
        mixed = tf.stack([x, y], axis=-1)
        mixed = tf.reshape(mixed, (tf.shape(x)[0], tf.shape(x)[1], tf.shape(x)[2], -1))

        # Mixing:
        return self._convmix(mixed)


class MixingMaskAttentionBlock(Model):
    """use the grouped convolution to make a sort of attention"""

    def __init__(
        self,
        ch_out: int,
        fout: List[int],
        generate_masked: bool = False,
    ):
        super().__init__()
        self._mixing = MixingBlock(ch_out)
        self._linear = PixelwiseLinear(fout)
        # self._final_normalization = tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5) if generate_masked else None
        # self._mixing_out = MixingBlock(ch_out) if generate_masked else None

    def call(self, x, y):
        z_mix = self._mixing(x, y)
        z = self._linear(z_mix)
        return z
        # z_mix_out = 0 if self._mixing_out is None else self._mixing_out(x, y)

        # return (
        #     z
        #     if self._final_normalization is None
        #     else self._final_normalization(z_mix_out * z)
        # )


class UpMask(Model):
    def __init__(
        self,
        up_dimension: int,
        nin: int,
        nout: int,
    ):
        super().__init__()
        self._upsample = UpSampling2D(size=(up_dimension, up_dimension), interpolation="bilinear")
        self._convolution = Sequential(
            [Conv2D(nin, 3, 1, groups=nin, padding="SAME", kernel_initializer="he_normal"),
            PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25)),
            tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5),
            Conv2D(nout, kernel_size=1, strides=1, kernel_initializer="he_normal"),
            PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25)),
            tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5)]
        )

    def call(self, x, y=None):
        x = self._upsample(x)
        if y is not None:
            x = x * y
        return self._convolution(x)


class Eb4TinyCd(Model):
    def __init__(self):
        super().__init__()

        efficientb4 = getattr(app, 'EfficientNetB4')(include_top=False)
        outputs = [
            efficientb4.get_layer(ln).output
            for ln in ["stem_activation", "block1b_add", "block2d_add", "block3d_add"]
        ]
        # outputs = [
        #     efficientb4.get_layer(ln).output
        #     for ln in ["stem_activation", "block1b_drop", "block2d_drop", "block3d_drop"]
        # ]
        self._backbones = []
        inputs = [efficientb4.get_layer(ln).input for ln in
                  ['input_1', 'block1a_dwconv', 'block2a_expand_conv', 'block3a_expand_conv']]
        for inx, inout in enumerate(zip(inputs, outputs)):
            inl, out = inout
            self._backbones.append(Model(inputs=inl, outputs=out, name=f"backbone_{inx}"))

        # Initialize mixing blocks:
        self._first_mix = MixingMaskAttentionBlock(3, [10, 5, 1])
        self._mixing_mask1 = MixingMaskAttentionBlock(24, [12, 6, 1])
        self._mixing_mask2 = MixingMaskAttentionBlock(32, [16, 8, 1])
        self._mixing_mask3 = MixingBlock(56)
        self._mixing_mask = []
        self._mixing_mask.append(self._mixing_mask1)
        self._mixing_mask.append(self._mixing_mask2)
        self._mixing_mask.append(self._mixing_mask3)
        # Initialize Upsampling blocks:
        self._up1 = UpMask(2, 56, 64)
        self._up2 = UpMask(2, 64, 64)
        self._up3 = UpMask(2, 64, 32)
        self._up = []
        self._up.append(self._up1)
        self._up.append(self._up2)
        self._up.append(self._up3)

        # Final classification layer:
        self._classify = PixelwiseLinear([16, 8, 1], Activation(tf.nn.sigmoid))

    def call(self, ref, test):
        features = self._encode(ref, test)
        latents = self._decode(features)
        return self._classify(latents)

    def _encode(self, ref, test):
        features = [self._first_mix(ref, test)]
        for num, layer in enumerate(self._backbones):
            ref, test = layer(ref), layer(test)
            if num != 0:
                features.append(self._mixing_mask[num - 1](ref, test))
        return features

    def _decode(self, features):
        upping = features[-1]
        for i, j in enumerate(range(-2, -5, -1)):
            upping = self._up[i](upping, features[j])
        return upping


if __name__ == '__main__':
    inputs = tf.random.normal(shape=(1, 256, 256, 3), dtype=tf.float32)
    model = Eb4TinyCd()
    output = model(inputs, inputs)
    weights = model.trainable_weights
    for w in weights:
        wshape = w.shape
        sumtmp = 1
        for d in wshape:
            sumtmp *= d
        print(wshape, '  ',  w.name, '  ', sumtmp)

    model.summary()

The loss function:

self.loss = tf.keras.losses.BinaryCrossentropy(
    from_logits=False,  # the model ends in a sigmoid, so it outputs probabilities
    reduction=tf.keras.losses.Reduction.NONE,  # fully qualified; bare Reduction is not imported
)
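Not the poster, but one more thing worth checking in a port like this: tf.keras.applications.EfficientNetB4 bakes the input rescaling/normalization into the model and therefore expects raw pixels in [0, 255], whereas the torchvision backbone expects ImageNet-normalized tensors. Feeding already-normalized inputs to the Keras backbone can stall training. A quick sanity check under that assumption:

import tensorflow as tf

# Keras EfficientNet variants preprocess internally, so feed [0, 255] pixels.
ref = tf.random.uniform((1, 256, 256, 3), 0.0, 255.0)
test = tf.random.uniform((1, 256, 256, 3), 0.0, 255.0)
out = Eb4TinyCd()(ref, test)
print(out.shape, float(tf.reduce_min(out)), float(tf.reduce_max(out)))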

Update for multichannel input

Not an issue, but in case anyone else is interested in training on more than 3 bands (e.g. Sentinel-2), here are the changes:

class ChangeClassifier(Module):
    def __init__(
        self,
        input_channels: int = 3, # new arg
        bkbn_name: str = "efficientnet_b4",
        pretrained: bool = True,
        output_layer_bkbn: str = "3",
        freeze_backbone: bool = False,
    ):
        super().__init__()

        # Load the pretrained backbone according to parameters:
        self._backbone = _get_backbone(
            bkbn_name, pretrained, output_layer_bkbn, freeze_backbone, input_channels
        )

        # Initialize mixing blocks passing input_channels
        self._first_mix = MixingMaskAttentionBlock(
            input_channels * 2, input_channels, [input_channels, 10, 5], [10, 5, 1]
        )

...

# pass input_channels to _get_backbone
def _get_backbone(
    bkbn_name, pretrained, output_layer_bkbn, freeze_backbone, input_channels
) -> ModuleList:
    # The whole model:
    entire_model = getattr(torchvision.models, bkbn_name)(
        weights=EfficientNet_B4_Weights.IMAGENET1K_V1 if pretrained else None
    ).features

    # Modify the first conv layer to accept input_channels.
    # Note: this layer's pretrained weights are replaced with random
    # ones, so only the deeper layers remain pretrained.
    first_conv = entire_model[0][0]
    first_conv.in_channels = input_channels

    new_weight = torch.randn(
        first_conv.out_channels, input_channels, *first_conv.kernel_size
    )
    first_conv.weight.data = new_weight

    # Slicing the model
    derived_model = ModuleList([])
    for name, layer in enumerate(entire_model):
        derived_model.append(layer)
        if str(name) == output_layer_bkbn:
            break

    # Freezing the backbone weights
    if freeze_backbone:
        for param in derived_model.parameters():
            param.requires_grad = False

    return derived_model
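As a usage sketch, assuming 13 bands for a full Sentinel-2 stack (adjust to the bands you actually keep):

import torch

model = ChangeClassifier(input_channels=13)  # hypothetical Sentinel-2 setup
ref = torch.randn(1, 13, 256, 256)
test = torch.randn(1, 13, 256, 256)
out = model(ref, test)
print(out.shape)  # expected: (1, 1, 256, 256)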

Cheers

Reproducing results on LEVIR-CD+

I'm attempting to reproduce the paper results on LEVIR-CD+, and I note you trained for 100 epochs, apparently without early stopping. Based on my training run, learning appears to plateau after 30-40 epochs; is this what you saw?

[training curve screenshots]

At 100 epochs the metrics fall short of those in the paper:

 'test_f1': 0.75
 'test_iou': 0.60

Could the dataset used in the paper differ from this version?
