andreacodegoni / tiny_model_4_cd
Official implementation of TinyCD: A (Not So) Deep Learning Model For Change Detection
My own dataset is very imbalanced, and I achieved superior results using:
weight_for_positive_class = 5
pos_weight = torch.tensor([weight_for_positive_class])
self.criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)
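For anyone wondering what `pos_weight` actually changes: PyTorch documents the per-element loss of `BCEWithLogitsLoss` as -[pos_weight * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]. The sketch below reimplements that formula in plain Python so the effect can be seen without a GPU. The helper name `weighted_bce_with_logits` and the example logit are made up for illustration and are not part of PyTorch or this repository.

```python
import math


def weighted_bce_with_logits(logit: float, target: float, pos_weight: float = 1.0) -> float:
    """Per-pixel binary cross-entropy on a raw logit; only the positive term is scaled."""
    # log(sigmoid(x)) written in a numerically stable form for both signs of x.
    if logit >= 0:
        log_sig = -math.log1p(math.exp(-logit))
    else:
        log_sig = logit - math.log1p(math.exp(logit))
    # log(1 - sigmoid(x)) equals log(sigmoid(x)) - x.
    log_one_minus_sig = log_sig - logit
    return -(pos_weight * target * log_sig + (1.0 - target) * log_one_minus_sig)


# A missed "change" pixel (target 1, confident negative logit) costs 5x more with pos_weight=5,
# while an unchanged pixel (target 0) is not affected by pos_weight at all.
missed_change_plain = weighted_bce_with_logits(-2.0, 1.0, pos_weight=1.0)
missed_change_weighted = weighted_bce_with_logits(-2.0, 1.0, pos_weight=5.0)
unchanged_plain = weighted_bce_with_logits(-2.0, 0.0, pos_weight=1.0)
unchanged_weighted = weighted_bce_with_logits(-2.0, 0.0, pos_weight=5.0)
print(missed_change_plain, missed_change_weighted, unchanged_plain, unchanged_weighted)
```

Note that `BCEWithLogitsLoss` expects raw logits, so this only makes sense if the model output is not already passed through a sigmoid.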
Hi there,
I'm currently working on a research project that requires the WHU-CD dataset. I noticed that a Dropbox link is provided for this dataset, and I was wondering if you have any other links available, such as a Baiduyun link?
Unfortunately, I am located in a region where Dropbox is not accessible, and I am unable to download the dataset. It would be greatly appreciated if you could provide a Baiduyun link or any alternative means of accessing the dataset.
Thank you for your time.
Hello, I ran into the following error while training the model:
lib\site-packages\PIL\PngImagePlugin.py", line 712, in _open
raise SyntaxError("not a PNG file")
SyntaxError: not a PNG file
There should be no problem with the data. What is the solution?
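Pillow raises this message when the first 8 bytes of a file do not match the PNG signature, so one thing worth ruling out is a file that carries a `.png` extension but is really something else (a JPEG that was renamed, a truncated download, a placeholder text file). A minimal sketch for scanning a dataset folder, using only the standard library; the function name `find_non_png_files` is hypothetical and not part of this repository:

```python
from pathlib import Path
from typing import List

# The fixed 8-byte signature that starts every valid PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_non_png_files(root: str) -> List[Path]:
    """Return files named *.png under root whose first 8 bytes are not the PNG signature."""
    suspicious = []
    for path in sorted(Path(root).rglob("*.png")):
        with path.open("rb") as handle:
            if handle.read(8) != PNG_SIGNATURE:
                suspicious.append(path)
    return suspicious
```

Calling `find_non_png_files` on the dataset root and printing the result should point at the offending files, if this is indeed the cause.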
Thank you for the repo. I just want to run the pre-trained models on a single pair of images, say i1 and i2 and get the changes. How can I do that?
Currently this command evaluates the whole directory in the dataset.
python3 test_ondata.py --modelpath pretrained_models/levir_best.pth
Hello! I have read your paper and am interested in it, but I cannot find the code that generates the attention masks (resolutions 256, 128 and 64). How can I generate the attention masks and save them?
Thanks!
Hello, in your paper, you mentioned that the model training was conducted on the NVIDIA GeForce RTX 2060 6GB GPU. However, when I attempted to run your source code on an NVIDIA GeForce RTX 4060Ti 8GB GPU, I encountered an "out of memory" issue. Even when I reduced the batch_size to 1, I still had to resort to mixed-precision training.
With batch_size set to 1 and mixed-precision training enabled, the loss no longer converges.
When I tried to install the requirements through conda, many libraries were not available in any conda channel, so I tried installing them with pip.
Even then, many library versions were mismatched.
Python version: 3.8.10
ERROR: Could not find a version that satisfies the requirement antlr-python-runtime==4.9.3 (from versions: none)
ERROR: No matching distribution found for antlr-python-runtime==4.9.3
ERROR: Could not find a version that satisfies the requirement blas==1.0 (from versions: none)
ERROR: No matching distribution found for blas==1.0
ERROR: Could not find a version that satisfies the requirement bzip2==1.0.8 (from versions: none)
ERROR: No matching distribution found for bzip2==1.0.8
Am I missing any other specifications, or do these versions or library names need to be changed?
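The three names in these errors look like conda package names rather than PyPI names: `blas` and `bzip2` are native libraries that conda ships as packages, and the ANTLR runtime is published on PyPI as `antlr4-python3-runtime`. A rough sketch of filtering such a requirements list before handing it to pip is below. The skip set and rename table are assumptions built only from the three errors above, so other entries in the file may need the same treatment; `split_requirements` is a hypothetical helper name.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Assumptions for illustration, based only on the three errors quoted in this issue.
CONDA_ONLY: Set[str] = {"blas", "bzip2"}  # native libraries, not pip-installable packages
RENAMED_ON_PYPI: Dict[str, str] = {"antlr-python-runtime": "antlr4-python3-runtime"}


def split_requirements(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split 'name==version' lines into (lines to give to pip, lines that were skipped)."""
    pip_lines: List[str] = []
    skipped: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        name, sep, version = line.partition("==")
        if name in CONDA_ONLY:
            skipped.append(line)
            continue
        # Swap in the PyPI name when the conda name differs; keep the version pin as-is.
        pip_lines.append(f"{RENAMED_ON_PYPI.get(name, name)}{sep}{version}")
    return pip_lines, skipped
```

Writing the first list to a new file and installing from that file keeps the original requirements untouched.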
Hi,
After running the code on the WHU-CD dataset with the pretrained model, these are the test results:
{'acc': 0.9484738175586964, 'miou': 0.522093921610696, 'mf1': 0.5742922406664711, 'iou_0': 0.9481903331958307, 'iou_1': 0.0959975100255613, 'F1_0': 0.9734062005987101, 'F1_1': 0.17517828073423214, 'precision_0': 0.9649982845855988, 'precision_1': 0.24004840719627082, 'recall_0': 0.9819620400564614, 'recall_1': 0.13790991201484162}
The recall for the change class seems very low at about 0.14. Is this the expected value, or should something be altered?
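As a sanity check on the numbers above, the change-class F1 and IoU can be recomputed from the reported precision and recall using the standard identities F1 = 2PR / (P + R) and IoU = F1 / (2 - F1). This is only a consistency check on the reported values, not a reimplementation of the repository's metric code; the function names are made up for illustration.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def iou_from_f1(f1: float) -> float:
    """IoU (Jaccard) expressed through F1 (Dice): IoU = F1 / (2 - F1)."""
    return f1 / (2 - f1)


# Values copied from the results reported in this issue.
precision_1 = 0.24004840719627082
recall_1 = 0.13790991201484162
iou_0 = 0.9481903331958307

f1_1 = f1_from_precision_recall(precision_1, recall_1)
iou_1 = iou_from_f1(f1_1)
miou = (iou_0 + iou_1) / 2
print(f1_1, iou_1, miou)
```

The recomputed values agree with the reported `F1_1`, `iou_1` and `miou` to within 1e-6, so the metrics are internally consistent. The high `acc` and moderate `miou` are carried almost entirely by the no-change class, which dominates the pixels.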
The error is as follows:
Traceback (most recent call last):
  File "training.py", line 223, in <module>
    run()
  File "training.py", line 217, in run
    save_after=1,
  File "training.py", line 143, in train
    training_phase(epc)
  File "training.py", line 106, in training_phase
    it_loss = evaluate(reference, testimg, mask)
  File "training.py", line 84, in evaluate
    generated_mask = model(reference, testimg).squeeze(1)
  File "/home//anaconda3/envs/py370/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//work/Tiny_model_4_CD-main/models/change_classifier.py", line 48, in forward
    latents = self._decode(features)
  File "/home//work/Tiny_model_4_CD-main/models/change_classifier.py", line 65, in _decode
    upping = self._up[i](upping, features[j])
  File "/home//anaconda3/envs/py370/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home//work/Tiny_model_4_CD-main/models/layers.py", line 121, in forward
    x = x * y
RuntimeError: The size of tensor a (64) must match the size of tensor b (256) at non-singleton dimension 3
There seems to be no inference code in the repository. How can I run inference?
Using the supplied training scripts and dataset, I here share the logs of the training run on LEVIR. 100 epochs appears to be well chosen as learning plateaus here.
I trained on a V100 Lightning.ai instance in around 4 hours:
Validation phase summary
Loss for epoch 99 is 0.0209393590421314
IoU class change for epoch 99 is 0.8350830680701221
F1 class change for epoch 99 is 0.9101310212289551
Hi, I want to reimplement this model in TensorFlow, but I have run into some problems: the loss only drops to around 0.34 and the output is all zeros. Here is the model code:
import tensorflow as tf
from typing import List
import tensorflow_addons as tfa
from tensorflow.keras.layers import Conv2D, PReLU, UpSampling2D, Activation
from tensorflow.keras import Sequential, Model
import tensorflow.keras.applications as app
class PixelwiseLinear(Model):
    def __init__(
        self,
        fout: List[int],
        last_activation: Model = None,
    ) -> None:
        super().__init__()
        n = len(fout)
        self._linears = Sequential(
            [
                Sequential(
                    [
                        Conv2D(fout[i], kernel_size=1, use_bias=True, kernel_initializer="he_normal"),
                        PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25))
                        if i < n - 1 or last_activation is None
                        else last_activation,
                    ]
                )
                for i in range(n)
            ]
        )

    def call(self, x):
        # Processing the tensor:
        return self._linears(x)
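class MixingBlock(Model):
    def __init__(
        self,
        ch_out: int,
    ):
        super().__init__()
        self._convmix = Sequential(
            [
                Conv2D(ch_out, 3, groups=ch_out, padding="SAME", kernel_initializer="he_normal"),
                PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25)),
                tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5),
            ]
        )

    def call(self, x, y):
        # Packing the tensors and interleaving the channels.
        # Inputs are channels-last (B, H, W, C), so stack on a new last axis: after the
        # reshape the channels alternate x_0, y_0, x_1, y_1, ... and each group of the
        # grouped convolution sees one (x_c, y_c) pair. Stacking on axis=1, as in the
        # first version of this code, scrambles the spatial layout in the reshape.
        mixed = tf.stack([x, y], axis=-1)
        mixed = tf.reshape(mixed, (tf.shape(x)[0], tf.shape(x)[1], tf.shape(x)[2], -1))
        # Mixing:
        return self._convmix(mixed)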
class MixingMaskAttentionBlock(Model):
    """use the grouped convolution to make a sort of attention"""

    def __init__(
        self,
        ch_out: int,
        fout: List[int],
        generate_masked: bool = False,
    ):
        super().__init__()
        self._mixing = MixingBlock(ch_out)
        self._linear = PixelwiseLinear(fout)
        # self._final_normalization = tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5) if generate_masked else None
        # self._mixing_out = MixingBlock(ch_out) if generate_masked else None

    def call(self, x, y):
        z_mix = self._mixing(x, y)
        z = self._linear(z_mix)
        return z
        # z_mix_out = 0 if self._mixing_out is None else self._mixing_out(x, y)
        # return (
        #     z
        #     if self._final_normalization is None
        #     else self._final_normalization(z_mix_out * z)
        # )
class UpMask(Model):
    def __init__(
        self,
        up_dimension: int,
        nin: int,
        nout: int,
    ):
        super().__init__()
        self._upsample = UpSampling2D(size=(up_dimension, up_dimension), interpolation="bilinear")
        self._convolution = Sequential(
            [
                Conv2D(nin, 3, 1, groups=nin, padding="SAME", kernel_initializer="he_normal"),
                PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25)),
                tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5),
                Conv2D(nout, kernel_size=1, strides=1, kernel_initializer="he_normal"),
                PReLU(shared_axes=[0, 1, 2, 3], alpha_initializer=tf.initializers.constant(0.25)),
                tfa.layers.InstanceNormalization(center=False, scale=False, epsilon=1e-5),
            ]
        )

    def call(self, x, y=None):
        x = self._upsample(x)
        if y is not None:
            x = x * y
        return self._convolution(x)
class Eb4TinyCd(Model):
    def __init__(self):
        super().__init__()
        efficientb4 = getattr(app, 'EfficientNetB4')(include_top=False)
        outputs = [
            efficientb4.get_layer(ln).output
            for ln in ["stem_activation", "block1b_add", "block2d_add", "block3d_add"]
        ]
        # outputs = [
        #     efficientb4.get_layer(ln).output
        #     for ln in ["stem_activation", "block1b_drop", "block2d_drop", "block3d_drop"]
        # ]
        self._backbones = []
        inputs = [efficientb4.get_layer(ln).input for ln in
                  ['input_1', 'block1a_dwconv', 'block2a_expand_conv', 'block3a_expand_conv']]
        for inx, inout in enumerate(zip(inputs, outputs)):
            inl, out = inout
            self._backbones.append(Model(inputs=inl, outputs=out, name=f"backbone_{inx}"))
        # Initialize mixing blocks:
        self._first_mix = MixingMaskAttentionBlock(3, [10, 5, 1])
        self._mixing_mask1 = MixingMaskAttentionBlock(24, [12, 6, 1])
        self._mixing_mask2 = MixingMaskAttentionBlock(32, [16, 8, 1])
        self._mixing_mask3 = MixingBlock(56)
        self._mixing_mask = []
        self._mixing_mask.append(self._mixing_mask1)
        self._mixing_mask.append(self._mixing_mask2)
        self._mixing_mask.append(self._mixing_mask3)
        # Initialize Upsampling blocks:
        self._up1 = UpMask(2, 56, 64)
        self._up2 = UpMask(2, 64, 64)
        self._up3 = UpMask(2, 64, 32)
        self._up = []
        self._up.append(self._up1)
        self._up.append(self._up2)
        self._up.append(self._up3)
        # Final classification layer:
        self._classify = PixelwiseLinear([16, 8, 1], Activation(tf.nn.sigmoid))

    def call(self, ref, test):
        features = self._encode(ref, test)
        latents = self._decode(features)
        return self._classify(latents)

    def _encode(self, ref, test):
        features = [self._first_mix(ref, test)]
        for num, layer in enumerate(self._backbones):
            ref, test = layer(ref), layer(test)
            if num != 0:
                features.append(self._mixing_mask[num - 1](ref, test))
        return features

    def _decode(self, features):
        upping = features[-1]
        for i, j in enumerate(range(-2, -5, -1)):
            upping = self._up[i](upping, features[j])
        return upping
if __name__ == '__main__':
    inputs = tf.random.normal(shape=(1, 256, 256, 3), dtype=tf.float32)
    model = Eb4TinyCd()
    output = model(inputs, inputs)
    weights = model.trainable_weights
    for w in weights:
        wshape = w.shape
        sumtmp = 1
        for d in wshape:
            sumtmp *= d
        print(wshape, ' ', w.name, ' ', sumtmp)
    model.summary()
Loss function:
self.loss = tf.keras.losses.BinaryCrossentropy(from_logits=False, reduction=tf.keras.losses.Reduction.NONE)
Not an issue, but in case anyone else is interested in training on more than 3 bands (e.g. Sentinel-2), here are the changes:
import torch
import torchvision
from torch.nn import Module, ModuleList
from torchvision.models import EfficientNet_B4_Weights

class ChangeClassifier(Module):
    def __init__(
        self,
        input_channels: int = 3,  # new arg
        bkbn_name: str = "efficientnet_b4",
        pretrained: bool = True,
        output_layer_bkbn: str = "3",
        freeze_backbone: bool = False,
    ):
        super().__init__()
        # Load the pretrained backbone according to parameters:
        self._backbone = _get_backbone(
            bkbn_name, pretrained, output_layer_bkbn, freeze_backbone, input_channels
        )
        # Initialize mixing blocks passing input_channels
        self._first_mix = MixingMaskAttentionBlock(
            input_channels * 2, input_channels, [input_channels, 10, 5], [10, 5, 1]
        )
        ...

# pass input_channels to _get_backbone
def _get_backbone(
    bkbn_name, pretrained, output_layer_bkbn, freeze_backbone, input_channels
) -> ModuleList:
    # The whole model:
    entire_model = getattr(torchvision.models, bkbn_name)(
        weights=EfficientNet_B4_Weights.IMAGENET1K_V1 if pretrained else None
    ).features
    # Modify the first conv layer input_channels.
    # Note: the new weights are randomly initialized, so the pretrained
    # weights of this first layer are discarded.
    first_conv = entire_model[0][0]
    first_conv.in_channels = input_channels
    new_weight = torch.randn(
        first_conv.out_channels, input_channels, *first_conv.kernel_size
    )
    first_conv.weight.data = new_weight
    # Slicing the model
    derived_model = ModuleList([])
    for name, layer in enumerate(entire_model):
        derived_model.append(layer)
        if str(name) == output_layer_bkbn:
            break
    # Freezing the backbone weights
    if freeze_backbone:
        for param in derived_model.parameters():
            param.requires_grad = False
    return derived_model
Cheers
I'm attempting to reproduce the paper results on LEVIR-CD+ & I note you trained for 100 epochs and apparently didn't use early stopping. Based on my training run it appears learning plateaus after 30-40 epochs, is this what you saw?
At 100 epochs the metrics are short of those in the paper:
'test_f1': 0.75
'test_iou': 0.60
Possible diff in the dataset used in the paper vs this version?
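To put a number on where a run plateaus, a patience-based check over the logged validation scores is one option. The sketch below is a generic early-stopping rule, not something taken from this repository's training script; the function name, `patience` and `min_delta` defaults are arbitrary choices for illustration.

```python
from typing import Optional, Sequence


def first_plateau_epoch(
    val_scores: Sequence[float], patience: int = 10, min_delta: float = 1e-3
) -> Optional[int]:
    """Return the epoch index at which early stopping would trigger, or None if it never does.

    A score counts as an improvement only if it beats the best so far by more than min_delta.
    """
    best = float("-inf")
    epochs_without_gain = 0
    for epoch, score in enumerate(val_scores):
        if score > best + min_delta:
            best = score
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return epoch
    return None
```

Feeding it the per-epoch validation F1 or IoU from the training logs gives a reproducible stopping point to compare between runs.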