
PyTorch implementation of HighRes-net, a neural network for multi-frame super-resolution, trained and tested on the European Space Agency’s Kelvin competition. This is a ServiceNow Research project that was started at Element AI.

Home Page: https://www.elementai.com/news/2019/computer-enhance-please

License: Other

Topics: super-resolution, satellite-imagery, proba-v, esa, deep-learning, computer-vision, mfsr, highres-net, multi-frame

highres-net's Introduction

ServiceNow completed its acquisition of Element AI on January 8, 2021. All references to Element AI in the materials that are part of this project should refer to ServiceNow.


HighRes-net: Multi Frame Super-Resolution by Recursive Fusion

PyTorch implementation of HighRes-net, a neural network for multi-frame super-resolution (MFSR), trained and tested on the European Space Agency's Kelvin competition.

Computer, enhance please!

[Figure: HRNet in action 1]

[Figure: HRNet in action 2]

source: ElementAI blog post Computer, enhance please!

credits: ESA Kelvin Competition

A recipe to enhance the vision of the ESA satellite Proba-V

Hardware:

The default config should work on a machine with:

GPU: NVIDIA Tesla V100 with 32 GB of memory

Driver version: CUDA 10.0

CPU: 8 GB of memory, enough to run the Jupyter notebook server and the TensorBoard server

If your available GPU memory is less than 32 GB, try the following to reduce memory usage (a config sketch follows the table below):

(1) Work with smaller batches (batch_size in config.json)

(2) Work with fewer low-res views (n_views and min_L in config.json; min_L is the minimum number of views)

Based on our experiments, the estimated memory consumption (in GB) for a given batch_size and n_views is:

| batch_size \ n_views (and min_L) | 32 | 16 | 4 |
|---|---|---|---|
| 32 | 27 | 15 | 6 |
| 16 | 15 | 8 | 4 |
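
A minimal sketch of one way to apply these reductions programmatically; the exact nesting of batch_size, n_views and min_L inside config/config.json is an assumption, so the helper below simply searches for the keys wherever they appear:

```python
import json

# Hedged sketch: lower the memory-hungry settings before launching training.
with open("config/config.json") as f:
    config = json.load(f)

def set_key(d, key, value):
    """Set `key` wherever it appears in a possibly nested config dict."""
    if key in d:
        d[key] = value
    for v in d.values():
        if isinstance(v, dict):
            set_key(v, key, value)

set_key(config, "batch_size", 16)  # smaller batches (see the table above)
set_key(config, "n_views", 16)     # fewer low-res views per scene
set_key(config, "min_L", 16)       # keep min_L consistent with n_views

with open("config/config.json", "w") as f:
    json.dump(config, f, indent=2)
```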

0. Setup python environment

  • Set up a Python environment and install the dependencies; Python >= 3.6.8 is required:
pip install -r requirements.txt

1. Load data and save clearance

  • Download the data from the Kelvin Competition and unzip it under data/

  • Run the save_clearance script to precompute clearance scores for low-res views

python src/save_clearance.py --prefix /path/to/ESA_data

2. Train model and view logs (with TensorboardX)

  • Train a model with the default config
python src/train.py --config config/config.json
  • View training logs with tensorboardX
tensorboard --logdir='tb_logs/'

3. Test model

  • Open jupyter notebook and run notebooks/test_model.ipynb
  • We assume the Jupyter notebook server runs in the project root directory. If you start it somewhere else, please change the file paths in the notebooks accordingly

You can also use the docker-compose file to start Jupyter notebook and TensorBoard.

Authors

HighRes-net is based on work by team Rarefin, an industrial-academic partnership between the Element AI AI for Good lab in London (Zhichao Lin, Michel Deudon, Alfredo Kalaitzis, Julien Cornebise) and Mila in Montreal (Israel Goytom, Kris Sankaran, Md Rifat Arefin, Samira E. Kahou, Vincent Michalski).

License

This repo is under the Apache-2.0 and No Harm licenses; please refer to our license file.

Acknowledgments

Special thanks to Laure Delisle, Grace Kiser, Alexandre Lacoste, Yoshua Bengio, Peter Henderson, Manon Gruaz, Morgan Guegan and Santiago Salcido for their support.

We are grateful to Marcus Märtens, Dario Izzo, Andrej Krzic and Daniel Cox from the Advanced Concept Team of the ESA for organizing this competition and assembling the dataset — we hope our solution will contribute to your vision for scalable environmental monitoring.

highres-net's People

Contributors

alkalait, isrugeek, jqueguiner, micheldeudon, pilhokim, servicenowresearch


highres-net's Issues

Shared representation in fusion block

Hello,
according to equation 3, the fusion is performed on intermediate views and their shared representation (g0) within an inner residual block. I studied your code and just cannot find such a block anywhere. Is there a bug, or is the actual code different from the one described in your paper? Or am I missing something here?

Thanks in advance!

RuntimeError: CUDA out of memory

Even after reducing the batch size, the fusion model doesn't fit into a 16 GB GPU.

I also tried to split the RecuversiveNet fuse step onto the first 16 GB GPU (GPU ID 0) and run the forward pass on GPU ID 1,

but the tensors need to be collocated.
Can you help me either reduce the memory usage to fit into a 16 GB GPU, or split the computation across multiple devices?

thanks a lot

Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 4, 202, 202]

I want to use your code with the Proba-V dataset, but I'm facing the following error.

$ python src/train.py --config config/config.json
  0%|          | 0/261 [00:00<?, ?it/s]
  0%|          | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "[...]/HighRes-net/src/train.py", line 308, in <module>
    main(config)
  File "[...]/HighRes-net/src/train.py", line 294, in main
    trainAndGetBestModel(fusion_model, regis_model, optimizer, dataloaders, baseline_cpsnrs, config)
  File "[...]/HighRes-net/src/train.py", line 180, in trainAndGetBestModel
    srs_shifted = apply_shifts(regis_model, srs, shifts, device)[:, 0]
  File "[...]/HighRes-net/src/train.py", line 61, in apply_shifts
    new_images = shiftNet.transform(thetas, images, device=device)
  File "[...]/HighRes-net/src/DeepNetworks/ShiftNet.py", line 96, in transform
    new_I = lanczos.lanczos_shift(img=I.transpose(0, 1),
  File "[...]/HighRes-net/src/lanczos.py", line 96, in lanczos_shift
    I_s = torch.conv1d(I_padded,
RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 4, 202, 202]

Here are the values and shapes passed to the conv1d call:
I_padded input shape: torch.Size([1, 4, 202, 202])
k_y.shape[0] and k_x.shape[0] (groups): 4
k_y and k_x weight shapes: torch.Size([4, 1, 7, 1]) and torch.Size([4, 1, 1, 7])
[k_y.shape[2] // 2, 0] and [0, k_x.shape[3] // 2] padding values: [3, 0] and [0, 3]

I used the default config.json, except for the following parameters.

  • "batch_size": 4
  • "min_L": 4
  • "n_views": 16
    But I receive similar errors when keeping the default values.

I tried squeezing the 1st dim of img and the 2nd dim of the weights, and specifying a simple int value for padding to avoid the various error messages, but all I finally got was this new RuntimeError:
'Given groups=4, weight of size [4, 7, 1], expected input[4, 202, 202] to have 28 channels, but got 202 channels instead'

Any clue to help me?
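
For context, the error message itself says that conv1d now only accepts 2-D or 3-D inputs, while these tensors are 4-D. A minimal sketch of one possible workaround, using the shapes reported above with hypothetical random data and assuming the two grouped convolutions are meant as separable 2-D convolutions (whether this matches the intent of lanczos_shift is an assumption, not a verified fix):

```python
import torch

# Shapes copied from the report above (hypothetical random data).
I_padded = torch.randn(1, 4, 202, 202)   # (1, views, H, W)
k_y = torch.randn(4, 1, 7, 1)            # per-view vertical kernel
k_x = torch.randn(4, 1, 1, 7)            # per-view horizontal kernel

# torch.conv2d accepts 4-D inputs, so the same grouped convolutions can be
# written as 2-D convolutions with (7, 1) and (1, 7) kernels and the same padding.
I_s = torch.conv2d(I_padded, k_y, groups=k_y.shape[0], padding=(k_y.shape[2] // 2, 0))
I_s = torch.conv2d(I_s, k_x, groups=k_x.shape[0], padding=(0, k_x.shape[3] // 2))
print(I_s.shape)  # torch.Size([1, 4, 202, 202])
```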

Predict a single image

How can I predict a single image?
Please help me.

Pretrained model

If possible, can you provide a pre-trained model for testing?

Lanczos code crashes

Hi,

Using PROBA-V dataset, the code crashes in lanczos file with this error:

RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 16, 202, 202]

Any insight on what might cause this?

Calculation of val_score

So val_score is calculated using shift_cPSNR.

Some things need explanation:

  1. Why does best_score start at 100, and why is shift_cPSNR subtracted from it?
  2. Why is val_score normalized by the size of the validation dataset after looping over the srs.shape values? What is the effect of this, given that it starts at 100 and is reduced from there?
  3. Why does the final score (after the calculation) end up being negative? What is this metric?

Thanks
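
For reference, a sketch of the competition's clear-sky corrected PSNR (cPSNR), following the Kelvin challenge definition; how train.py folds this into val_score, and why best_score starts at 100, depends on the code itself and is not answered here:

```python
import numpy as np

def cPSNR(sr, hr, clear_mask):
    """Clear-sky corrected PSNR from the Kelvin competition.

    sr, hr: 2-D arrays with intensities in [0, 1];
    clear_mask: boolean array marking the clear (non-concealed) pixels of hr.
    """
    bias = np.mean(hr[clear_mask] - sr[clear_mask])                   # brightness bias b
    cmse = np.mean((hr[clear_mask] - (sr[clear_mask] + bias)) ** 2)   # bias-corrected MSE
    return -10.0 * np.log10(cmse)

# shift_cPSNR presumably takes the maximum cPSNR over the +/- 3 pixel croppings
# allowed by the competition's evaluation (quoted in the torch_mask issue below).
```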

torch_mask 192 vs 384

When the current patch_size of 64 from the config is passed to get_crop_mask, the resulting tensor is 192 x 192 (W, H), which is half the size of the 384 x 384 high-res images (hr_maps) and causes a mismatch error at:

cropped_mask = torch_mask[0] * hr_maps # Compute

The size of tensor a (192) must match the size of tensor b (384)

Is this related to the following from the competition description?

To compensate for pixel-shifts, the submitted images are cropped by a 3 pixel border, resulting in a 378x378 format. These cropped images are then evaluated at the corresponding patches around the center of the ground-truth images, with the highest cPSNR being the score. In the following, HR is the ground-truth image and SR the submitted image, both in 384x384 resolution. We denote the cropped 378x378 images as follows: for all u,v∈{0,…,6}, HR_{u,v} is the subimage of HR with its upper left corner at coordinates (u,v) and its lower right corner at (378+u,378+v).

In this case, shouldn't get_crop_mask have this line

mask = np.ones((1, 1, 6 * patch_size, 6 * patch_size)) # crop_mask for loss (B, C, W, H)

i.e., multiply by 6 instead of 3? This runs, but I'm not sure whether it's correct.
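
For context, a small sketch of the arithmetic involved, assuming the Proba-V scale factor of 3 (128 x 128 low-res frames, 384 x 384 high-res frames); whether hr_maps should be full frames or per-patch crops at this point in train.py is the real question and is not settled here:

```python
import numpy as np

SCALE = 3          # Proba-V HR/LR upscaling factor (384 / 128)
LR_SIZE = 128      # full low-res frame size
patch_size = 64    # LR patch size from config.json

hr_patch = SCALE * patch_size   # 192: HR crop matching one LR patch
hr_full = SCALE * LR_SIZE       # 384: full HR frame, the size of hr_maps here

mask_for_patches = np.ones((1, 1, 3 * patch_size, 3 * patch_size))  # 192 x 192
mask_for_frames = np.ones((1, 1, 6 * patch_size, 6 * patch_size))   # 384 x 384, only because 6 * 64 == SCALE * 128
```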

ShiftNet loss

Hello,
First of all, thank you for publishing the code for this model, it is much appreciated.
I have a question about the ShiftNet loss. According to the paper (page 6, equation 5), the ShiftNet loss should be the L2 norm of the shifts. However, looking at the line in train.py where this is calculated,
https://github.com/ElementAI/HighRes-net/blob/aa022561c55ee3875b7d2e7837d37e620bf19315/src/train.py#L187
the ShiftNet loss is torch.mean(shifts)**2, which is not the same thing. One potential issue with the loss as defined in the code is that shifts can be positive or negative, so they can cancel out across the batch.

Is this a bug or was this intended? Were the results presented in the paper obtained with the loss as it is defined in the code or as it is defined in the paper?

Thank you!
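
A toy illustration of the difference (the exact reduction used for the paper's norm-based loss, sum vs. mean over the batch, is an assumption here):

```python
import torch

# Toy per-image (dy, dx) registration shifts.
shifts = torch.tensor([[ 1.0, -1.0],
                       [-2.0,  2.0]])

loss_in_code = torch.mean(shifts) ** 2                 # 0.0: positive and negative shifts cancel
loss_l2_norm = torch.mean(torch.norm(shifts, dim=1))   # ~2.12: every nonzero shift is penalized
```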

training problem because of the different sizes

Hello!
I have a problem running the script. I use your docker-compose setup and get this:
File "src/train.py", line 309, in
    main(config)
  File "src/train.py", line 294, in main
    trainAndGetBestModel(fusion_model, regis_model, optimizer, dataloaders, baseline_cpsnrs, config)
  File "src/train.py", line 179, in trainAndGetBestModel
    reference=hrs[:, offset:(offset + 128), offset:(offset + 128)].view(-1, 1, 128, 128))
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead

If I change it as the error suggests, I get tensors of incompatible sizes here:
lrs: tensor (batch size, views, W, H), images to shift; reference: tensor (batch size, W, H), reference images to shift
-> they end up as 64, 1, 16, 16
and 1, 1, 128, 128

Thank you for your code!
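
For what it's worth, the first error is about tensor contiguity rather than sizes; a minimal illustration of when .view fails and .reshape does not (it does not resolve the 64, 1, 16, 16 vs. 1, 1, 128, 128 mismatch reported above):

```python
import torch

t = torch.arange(6).reshape(2, 3).t()   # transposed, hence non-contiguous
# t.view(-1)                            # RuntimeError: view size is not compatible with input tensor's size and stride ...
flat = t.reshape(-1)                    # works: copies when a view is impossible
flat = t.contiguous().view(-1)          # equivalent alternative
```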
