
cc-ai / climategan


Code and pre-trained model for the algorithm generating visualisations of three climate change-related events: floods, wildfires, and smog.

Home Page: https://thisclimatedoesnotexist.com

License: GNU General Public License v3.0

Languages: Python 1.33%, Shell 0.01%, Jupyter Notebook 98.67%
Topics: climate-change, computer-vision, deep-learning, domain-adaptation, generative-adversarial-network, pytorch

climategan's People

Contributors

51n84d, adrienju, alexhernandezgarcia, alexrey88, dependabot[bot], melisandeteng, tianyu-z, vict0rsch


climategan's Issues

Evaluate simclr performance

So here's the issue: we need a good way to figure out whether pretraining the encoder with simclr is better than using the pretrained Deeplabv2. Several options come to mind...

  1. I would begin by simply comparing the masker's results with the current BaseEncoder pretrained with simclr versus the Deeplabv2 encoder pretrained on Cityscapes. If the results are better with simclr, then it's clear that we should use it and we would have a lighter encoder. If the results are worse, then we could:

  2. Match the number of parameters of the two encoders (BaseEncoder pretrained with simclr and Deeplabv2 pretrained on Cityscapes), which would mean adding layers to the BaseEncoder, then train the masker with both and compare the results. If the results are similar or better with simclr, then we should use it with the BaseEncoder. If the results are worse, then we could:

  3. Pretrain Deeplabv2 using simclr and our data (do we pretrain it from scratch or from the pretrained one on Cityscapes?), and then compare the masker's results with the pretrained Deeplabv2 on Cityscapes only. If the results are better with simclr, then we keep Deeplabv2 but we add the simclr pretraining.

And to compare the masker's results, what metrics should we use? And do we train the masker while freezing the encoder or fine-tuning it?
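
To ground that discussion, here is a generic sketch of one candidate metric, per-image IoU of the predicted mask against the ground-truth mask (the shapes and outputs are assumptions, not code already in the repo: sigmoid masker output, binary targets, (B, 1, H, W) tensors):

import torch

def mask_iou(pred, target, thresh=0.5):
    # pred: sigmoid masker output, target: binary ground-truth mask, both (B, 1, H, W)
    pred_bin = (pred > thresh).float()
    inter = (pred_bin * target).sum(dim=(1, 2, 3))
    union = ((pred_bin + target) > 0).float().sum(dim=(1, 2, 3))
    return (inter / union.clamp(min=1)).mean()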

So, what are your thoughts? :)

ADVENT use target segmentation?

Currently our ADVENT implementation uses target labels from DeepLabv2 as ground truth to train the "segmentation" (i.e. the masker, for us). This is not how they do it in the paper, so we should check with experiments what works best.
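
For reference, the paper's variant discriminates on entropy maps rather than pseudo-labels; a rough sketch of those maps (an assumption about how we might drop the DeepLabv2 pseudo-labels, not our current code):

import torch
import torch.nn.functional as F

def self_information_map(logits):
    # logits: (B, C, H, W) raw masker / segmentation scores
    p = F.softmax(logits, dim=1)
    # weighted self-information, -p * log(p), per class and pixel;
    # a discriminator is then trained to tell source maps from target maps
    return -p * torch.log(p + 1e-30)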

clean up classifier

Move the blocks defined there to blocks.py, or use the ones that already exist there, like Conv2dBlock.

Leverage pairs

With data from the simulated world, we can "mode-collapse" the translator with paired data:

loss = l2(x_aba, x_a) + l2(x_ab, x_b) # x_b is x_a's real transformation, its pair
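
As a minimal PyTorch sketch of that paired supervision (the names x_a, x_b, x_ab, x_aba follow the line above; this is not the repo's actual trainer code):

import torch.nn.functional as F

def paired_loss(x_a, x_b, x_ab, x_aba):
    # x_ab = translator(x_a) should match its simulated pair x_b,
    # x_aba = translator_back(x_ab) should reconstruct x_a
    return F.mse_loss(x_aba, x_a) + F.mse_loss(x_ab, x_b)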

No adaptation head: discriminate on features

Instead of using A to structure the latent space and make it domain-invariant (in the domain-adaptation sense), we could use an output discriminator after S and D, for instance, to make those representations indistinguishable between real and sim.
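
A minimal sketch of such an output/feature discriminator (channel sizes and depth are assumptions, not the repo's architecture):

import torch
import torch.nn as nn

class FeatureDiscriminator(nn.Module):
    # Patch-style discriminator applied to S/D outputs or intermediate
    # features; trained to separate real from sim, while the generator is
    # trained to make them indistinguishable.
    def __init__(self, in_channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 3, padding=1),  # real | sim logits per location
        )

    def forward(self, feats):
        return self.net(feats)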

Image logging error in depth head

In this PR,
there are some issues with logging depth images from the real world (the image is repeated), and there is also a slight issue with the colors of the mask images logged to Comet.
(see this experiment)

How to infer from cropped data

Hey,

when comparing input data and inferred data (for instance the L2 reconstruction real <-> cycle_recon), there's a mismatch in dims:

  • input: resized to 256, random crop to 224
  • inferred: 256

How does MUNIT deal with this, @adrienju? Is the output of the network 224?

[classifier] RuntimeError: The size of tensor a (19) must match the size of tensor b (2) at non-singleton dimension 1

Running test_trainer.py I get

Traceback (most recent call last):
  File "test_trainer.py", line 77, in <module>
    trainer.update_g(domain_batch)
  File "../omnigan/trainer.py", line 269, in update_g
    r_loss = self.get_representation_loss(multi_domain_batch)
  File "../omnigan/trainer.py", line 334, in get_representation_loss
    prediction, update_target
  File "../omnigan/trainer.py", line 128, in <lambda>
    self.losses["G"]["tasks"]["s"] = lambda x, y: (x + y).mean()
RuntimeError: The size of tensor a (19) must match the size of tensor b (2) at non-singleton dimension 1
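
The mismatch can be reproduced outside the trainer: the segmentation prediction apparently has 19 channels (presumably the Cityscapes classes) while the target fed to the loss lambda has 2 (shapes below are assumptions, for illustration only):

import torch

# self.losses["G"]["tasks"]["s"] = lambda x, y: (x + y).mean()
pred = torch.randn(2, 19, 32, 32)   # 19-channel segmentation output
target = torch.randn(2, 2, 32, 32)  # 2-channel target
(pred + target).mean()              # RuntimeError: size 19 vs 2 at dim 1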

Implementation choices

Two things to keep in mind, which might require some refactoring:

  1. it is set up for domain translation, which we might need to reconsider
  2. it is set up to pretrain representations before pretraining translations

Two data loaders for synthetic images

Currently the code is set up so that there are separate data loaders for synthetic-flooded and synthetic-nonflooded images. As a result, the loaded pairs don't correspond to one another.

Is it worth preserving this structure (and ensuring that the two loaders shuffle in the same way), or should we just have one data loader for the synthetic data?

The fact that we're moving away from "image translation" makes me think the latter is fine. What do you think?
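
If we go with a single loader, a rough sketch of a paired dataset could look like this (the directory layout and file naming are assumptions):

from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset

class PairedSimDataset(Dataset):
    # Returns aligned (non-flooded, flooded) pairs from one dataset, so
    # shuffling can never break the correspondence.
    def __init__(self, root, transform=None):
        self.nonflooded = sorted(Path(root, "nonflooded").glob("*.png"))
        self.flooded = sorted(Path(root, "flooded").glob("*.png"))
        assert len(self.nonflooded) == len(self.flooded)
        self.transform = transform

    def __len__(self):
        return len(self.nonflooded)

    def __getitem__(self, i):
        x = Image.open(self.nonflooded[i]).convert("RGB")
        y = Image.open(self.flooded[i]).convert("RGB")
        if self.transform is not None:
            x, y = self.transform(x), self.transform(y)
        return {"nonflooded": x, "flooded": y}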

Some images have weird colors

When doing inference, occasionally the generated image has a weird color scheme (maybe the channels are shuffled).

Example:
AB_410
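
If the channels are indeed shuffled, a quick way to check is to flip the channel order of the generated tensor before saving (assuming a (3, H, W) image tensor; this is a debugging sketch, not a fix in the repo):

import torch

def flip_channels(img):
    # reverse the channel dimension of a (3, H, W) image, i.e. RGB <-> BGR
    return img.flip(0)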

Add Classifier

  • Create real classifier architecture in classifier.py
  • Add / adapt losses in loss.py
  • Add loss computations in trainer.get_representation_loss(...) and trainer.update_c(...)
  • Add appropriate tests in test_classifier.py (for the architecture) and test_trainer.py (for the losses)

Add the WD2 data to the simulated data

We have 500 image pairs (+ masks) that we could be using to supplement our Unity dataset.

The question is now whether this data is similar enough to the Unity dataset to be used in the same domain, or whether it warrants a separate domain.

I feel that given the utter heterogeneity of our real dataset (Mapillary dashcam vs. GSV data vs. user data), it's ok to have a certain amount of heterogeneity in the simulated data too. But this needs to be tested, either by plotting the distribution of Unity vs. WD2 (e.g. embedding them using Inception?) or doing some ablation studies to see how it can complement the Unity data.
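
For the distribution-plotting option, a rough sketch of embedding both sets with a pretrained Inception network (paths and loaders are placeholders; this is not existing repo code):

import torch
from torchvision import models, transforms

inception = models.inception_v3(pretrained=True)
inception.fc = torch.nn.Identity()  # keep the 2048-d pooled features
inception.eval()

preprocess = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(pil_images):
    # pil_images: a list of PIL images from either the Unity or the WD2 set;
    # the (N, 2048) features can then be plotted (e.g. t-SNE) or compared
    # with an FID-style statistic across the two domains
    batch = torch.stack([preprocess(im) for im in pil_images])
    return inception(batch)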

About bit conditioning

So bit conditioning is "a way to share weights. Instead of having domain-specific weights, you share weights and 'choose' a path according to that domain signal, encoded in the 'bit'."

I'm not sure I understand why cond_nc is initialized twice in the SpadeTranslationDecoder, but more importantly, why would it be set to 2 when bit-conditioning and 0 otherwise?

class SpadeTranslationDecoder(SpadeDecoder):
    def __init__(self, latent_shape, opts):
        self.bit = None
        self.use_bit_conditioning = opts.gen.t.use_bit_conditioning
        cond_nc = 4  # 4 domains => 4-channel bitmap
        cond_nc = 2 if self.use_bit_conditioning else 0  # 2 domains => 2-channel bitmap
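
To make the "bit" concrete, here is a rough sketch (an assumption about the intent, not the repo's implementation) of a one-hot domain vector broadcast to cond_nc spatial channels, which SPADE layers could consume in place of a segmentation map:

import torch

def domain_bit_map(domain_idx, cond_nc, h, w):
    # one-hot "bit" over the domains, broadcast to a (1, cond_nc, H, W)
    # conditioning map shared across all spatial locations
    bit = torch.zeros(1, cond_nc, h, w)
    bit[:, domain_idx] = 1.0
    return bit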

Domain adaptation for training simclr

Question: to train the encoder with SimCLR, should we consider real and simulated data as "the same", use the classifier, or use only one of the two (real or simulated)?

refactor code

Generator and Discriminator architectures come mainly from MUNIT and InstaGAN, so there are differences in coding styles; we need to standardize things one day.

Pathlib error in tests

For some reason I'm getting the error ModuleNotFoundError: No module named 'omnigan' when running run.py as is.

Changing sys.path.append(str(Path(__file__).parent.parent.resolve())) to sys.path.append(str(Path(__file__).resolve().parent.parent)) in all the test programs fixes the problem for me. Is this the case for you guys?
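
For context, the difference only shows up when the script is launched through a relative path; here is a small illustration with a hypothetical tests/run.py launched as `cd tests && python run.py`, so that __file__ is just "run.py":

from pathlib import Path

p = Path("run.py")

# taking parents on the relative path first collapses to ".", so resolve()
# points at tests/ (the CWD) and the repo root never lands on sys.path
print(p.parent.parent.resolve())

# resolving first makes the path absolute, so the parents are the actual
# directories on disk: tests/, then the repo root that contains omnigan/
print(p.resolve().parent.parent)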

Complete docstrings

Let's make sure all our utils functions (and others) have docstrings, especially describing expected types. For instance, @tianyu-z, can you complete make_json_file?
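
As an illustration of the style we could aim for (the signature and body below are hypothetical, not make_json_file's actual ones):

def make_json_file(tasks, data_dir, output_path):
    """Write a JSON listing of the dataset files for the given tasks.

    Args:
        tasks (list[str]): task keys to include, e.g. ["x", "s", "d", "m"].
        data_dir (str or pathlib.Path): directory containing the images.
        output_path (str or pathlib.Path): where to write the resulting JSON.

    Returns:
        pathlib.Path: path of the JSON file that was written.
    """
    ...  # hypothetical signature, for illustration only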

update_d

Create multi-scale discriminator architecture from MUNIT in discriminator.py

Fill in update_d() in trainer.py the same way update_g() is defined, i.e. splitting the cases depending on whether we're in representational mode or not.
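
A generic sketch of what that could look like, assuming an LSGAN-style objective; the attribute names (self.D, self.G, self.d_opt) and the batch layout are placeholders, not trainer.py's actual API:

import torch
import torch.nn.functional as F

def update_d(self, batch):
    real = batch["x"]  # placeholder batch key
    with torch.no_grad():
        fake = self.G(real)  # keep the generator out of this update

    d_real = self.D(real)
    d_fake = self.D(fake)

    # LSGAN targets: 1 for real, 0 for fake
    loss = F.mse_loss(d_real, torch.ones_like(d_real)) + F.mse_loss(
        d_fake, torch.zeros_like(d_fake)
    )

    self.d_opt.zero_grad()
    loss.backward()
    self.d_opt.step()
    return loss.item()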
