
servicenow / highres-net


Pytorch implementation of HighRes-net, a neural network for multi-frame super-resolution, trained and tested on the European Space Agency’s Kelvin competition. This is a ServiceNow Research project that was started at Element AI.

Home Page: https://www.elementai.com/news/2019/computer-enhance-please

License: Other

Languages: Dockerfile 0.04%, Jupyter Notebook 95.48%, Python 4.48%
Topics: super-resolution, satellite-imagery, proba-v, esa, deep-learning, computer-vision, mfsr, highres-net, multi-frame

highres-net's Issues

Calculation of val_score

The val_score is calculated using shift_cPSNR.

A few things need explanation:

  1. Why does best_score start at 100, and why is shift_cPSNR subtracted from that value?
  2. Why is val_score normalized by the size of the validation dataset after looping over the srs.shape values? What is the effect of this, given that the score starts from 100?
  3. Why does the final score end up negative? What does this metric represent?

Thanks
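For reference, a runnable paraphrase of how I read the scoring in src/train.py: val_score accumulates negated shift_cPSNR values and is then divided by the dataset size, so lower is better and the result is negative, while 100 is just a sentinel that any real score beats on the first evaluation. The shift_cPSNR below is a stand-in, not the repo's implementation:

import numpy as np

def shift_cPSNR(sr, hr):                       # stand-in, not src/utils.py
    mse = np.mean((sr - hr) ** 2) + 1e-12
    return -10 * np.log10(mse)

rng = np.random.default_rng(0)
val_set = [(rng.random((384, 384)), rng.random((384, 384))) for _ in range(8)]

best_score = 100.0        # sentinel: any real mean of -cPSNR is far lower
val_score = 0.0
for sr, hr in val_set:
    val_score -= shift_cPSNR(sr, hr)           # negate PSNR => minimize score
val_score /= len(val_set)                      # normalize by dataset size
# The mean of negated PSNRs is negative, hence the final score < 0, and it
# always improves on the initial best_score at the first evaluation.
if val_score < best_score:
    best_score = val_score
print(val_score, best_score)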

RuntimeError: CUDA out of memory

Even after reducing the batch size, the fusion model doesn't fit on a 16 GB GPU.

I also tried splitting the RecuversiveNet fusion onto the first 16 GB GPU (GPU ID 0) and running the rest of the forward pass on GPU ID 1, but the tensors need to be co-located.

Can you help me either reduce the memory usage to fit on a 16 GB GPU, or split the computation across multiple devices?

thanks a lot
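Two generic memory reducers that might help, sketched with a stand-in model (an assumption-level suggestion, not a maintainer-confirmed fix): mixed precision, plus shrinking batch_size, n_views, or the patch size in config.json, since the input tensor (batch, views, H, W) scales with all of them.

import torch
import torch.nn as nn

# Stand-in for the fusion model; the point is the AMP pattern only.
model = nn.Conv2d(1, 1, 3, padding=1).cuda()
optimizer = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler()

lrs = torch.randn(2, 1, 64, 64, device="cuda")   # dummy low-res batch
hrs = torch.randn(2, 1, 64, 64, device="cuda")   # dummy targets

optimizer.zero_grad()
with torch.cuda.amp.autocast():                  # fp16 activations cut memory
    loss = nn.functional.mse_loss(model(lrs), hrs)
scaler.scale(loss).backward()                    # scaled to avoid fp16 underflow
scaler.step(optimizer)
scaler.update()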

ShiftNet loss

Hello,
First of all, thank you for publishing the code for this model, it is much appreciated.
I have a question about the ShiftNet loss. According to the paper (page 6, Equation 5), the ShiftNet loss should be the L2 norm of the shifts. However, looking at the line in train.py where it is calculated,
https://github.com/ElementAI/HighRes-net/blob/aa022561c55ee3875b7d2e7837d37e620bf19315/src/train.py#L187
the ShiftNet loss is torch.mean(shifts)**2, which is not the same thing. One potential issue with the loss as defined is that the shifts can be either positive or negative, so they can cancel out across the batch.
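To make the discrepancy concrete, a quick numerical check with hypothetical shift values:

import torch

shifts = torch.tensor([[ 2.0, -2.0],
                       [-1.0,  1.0]])  # hypothetical batch of (dy, dx) shifts

loss_in_code = torch.mean(shifts) ** 2                      # tensor(0.) -- signs cancel
loss_in_paper = torch.mean(torch.norm(shifts, dim=1) ** 2)  # tensor(5.) -- mean squared L2 norm
print(loss_in_code, loss_in_paper)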

Is this a bug, or was it intended? Were the results presented in the paper obtained with the loss as defined in the code, or as defined in the paper?

Thank you!

Training problem because of different sizes

Hello!
I have a problem running the script. I'm using your docker-compose setup and getting this:
File "src/train.py", line 309, in
    main(config)
  File "src/train.py", line 294, in main
    trainAndGetBestModel(fusion_model, regis_model, optimizer, dataloaders, baseline_cpsnrs, config)
  File "src/train.py", line 179, in trainAndGetBestModel
    reference=hrs[:, offset:(offset + 128), offset:(offset + 128)].view(-1, 1, 128, 128))
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead

If I change it as the error suggests, I get tensors of incompatible sizes here:
lrs: tensor (batch size, views, W, H), the images to shift; reference: tensor (batch size, W, H), the reference images to shift against
-> they end up as (64, 1, 16, 16)
and (1, 1, 128, 128)
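On the first error itself: an offset slice can be non-contiguous, which .view refuses; a minimal sketch (dummy shapes, not the repo's exact tensors) of the two equivalent fixes:

import torch

hrs = torch.randn(4, 256, 256)                             # dummy batch of HR images
offset = 3
patch = hrs[:, offset:offset + 128, offset:offset + 128]   # may be non-contiguous

ref_a = patch.reshape(-1, 1, 128, 128)                     # copies if needed
ref_b = patch.contiguous().view(-1, 1, 128, 128)           # explicit equivalent
assert torch.equal(ref_a, ref_b)

The follow-up mismatch between (64, 1, 16, 16) and (1, 1, 128, 128) looks like a separate problem: the low-res patches and the high-res reference appear to be built with inconsistent batch and patch sizes in the config.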

Thank you for your code!

Predicting a single image

How can I run prediction on a single image? Please help me.
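In general, single-scene inference means wrapping one scene's low-res views in a batch of one and running the trained model in eval mode; the sketch below uses a stand-in network rather than the repo's HRNet, purely to show the shapes:

import torch
import torch.nn as nn

# Stand-in network: fuse 9 low-res views and upsample x3, just to show shapes.
model = nn.Sequential(nn.Conv2d(9, 1, 3, padding=1),
                      nn.Upsample(scale_factor=3, mode="bicubic"))
model.eval()

lrs = torch.randn(1, 9, 64, 64)      # one scene = batch of 1, 9 views, 64x64
with torch.no_grad():
    sr = model(lrs)
print(sr.shape)                      # torch.Size([1, 1, 192, 192])

With the real model, you would load the trained weights via load_state_dict before calling it.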

Shared representation in fusion block

Hello,
According to Equation 3, the fusion is performed on the intermediate views and their shared representation (g0) within an inner residual block. I studied your code and cannot find such a block anywhere. Is this a bug, or is the actual code different from what is described in the paper? Or am I missing something here?

Thanks in advance!
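For reference, a minimal sketch of the kind of inner residual block Equation 3 appears to describe, as I read the paper; the class name, channel count, and layers here are illustrative assumptions, and whether the released code contains an equivalent is exactly the open question:

import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Residual update of each intermediate state conditioned on the
    shared representation g0 (my reading of Eq. 3; sizes are guesses)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.PReLU(),
        )

    def forward(self, s: torch.Tensor, g0: torch.Tensor) -> torch.Tensor:
        # s, g0: (batch * views, C, H, W)
        return s + self.conv(torch.cat([s, g0], dim=1))

# Smoke test with dummy states
block = FusionBlock()
s = torch.randn(4, 64, 32, 32)
g0 = torch.randn(4, 64, 32, 32)
print(block(s, g0).shape)  # torch.Size([4, 64, 32, 32])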

torch_mask 192 vs 384

When the current patch size of 64 from config is passed to get_crop_mask, the resulting tensor has W, H of 192, 192, which is half the 384, 384 high-res image (hr_maps), causing a mismatch error at:

cropped_mask = torch_mask[0] * hr_maps # Compute

The size of tensor a (192) must match the size of tensor b (384)

Is this related to the following from the competition description:

To compensate for pixel-shifts, the submitted images are cropped by a 3 pixel border, resulting in a 378x378 format. These cropped images are then evaluated at the corresponding patches around the center of the ground-truth images, with the highest cPSNR being the score. In the following, HR is the ground-truth image and SR the submitted image, both in 384x384 resolution. We denote the cropped 378x378 images as follows: for all u,v∈{0,…,6}, HRu,v is the subimage of HR with its upper left corner at coordinates (u,v) and its lower right corner at (378+u,378+v).

In this case, shouldn't get_crop_mask have this line

mask = np.ones((1, 1, 6 * patch_size, 6 * patch_size)) # crop_mask for loss (B, C, W, H)

and multiply by 6, not 3? This runs, but I'm not sure whether it's correct.
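The arithmetic behind the mismatch, sketched with illustrative values (this demonstrates the shapes only; whether 6 is the right factor depends on whether the loss is really evaluated on full 384 x 384 scenes, which I have not confirmed):

import numpy as np

patch_size = 64      # low-res patch side from config.json
scale = 3            # Proba-V low-res -> high-res upscaling factor
hr_side = 384        # full ground-truth side

mask_3x = np.ones((1, 1, scale * patch_size, scale * patch_size))  # (1, 1, 192, 192)
mask_6x = np.ones((1, 1, 6 * patch_size, 6 * patch_size))          # (1, 1, 384, 384)

# 64 * 3 = 192 matches an upscaled *patch*, while hr_maps here are full
# 384 x 384 scenes; 6 = hr_side / patch_size restores shape agreement.
print(mask_3x.shape, mask_6x.shape, hr_side // patch_size)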

Pretrained model

If possible, could you provide a pre-trained model for testing?

Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 4, 202, 202]

I want to use your code with the Proba-V dataset, but I'm facing the following error.

$ python src/train.py --config config/config.json
  0%| | 0/261 [00:00<?, ?it/s]
  0%| | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "[...]/HighRes-net/src/train.py", line 308, in <module>
    main(config)
  File "[...]/HighRes-net/src/train.py", line 294, in main
    trainAndGetBestModel(fusion_model, regis_model, optimizer, dataloaders, baseline_cpsnrs, config)
  File "[...]/HighRes-net/src/train.py", line 180, in trainAndGetBestModel
    srs_shifted = apply_shifts(regis_model, srs, shifts, device)[:, 0]
  File "[...]/HighRes-net/src/train.py", line 61, in apply_shifts
    new_images = shiftNet.transform(thetas, images, device=device)
  File "[...]/HighRes-net/src/DeepNetworks/ShiftNet.py", line 96, in transform
    new_I = lanczos.lanczos_shift(img=I.transpose(0, 1),
  File "[...]/HighRes-net/src/lanczos.py", line 96, in lanczos_shift
    I_s = torch.conv1d(I_padded,
RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 4, 202, 202]

Here are the values and shapes passed to the conv1d function:

  • I_padded input shape: torch.Size([1, 4, 202, 202])
  • groups number (k_y.shape[0] and k_x.shape[0]): 4
  • k_y and k_x weight shapes: torch.Size([4, 1, 7, 1]) and torch.Size([4, 1, 1, 7])
  • padding values ([k_y.shape[2] // 2, 0] and [0, k_x.shape[3] // 2]): [3, 0] and [0, 3]

I used the default config.json, except for the following parameters:

  • "batch_size": 4
  • "min_L": 4
  • "n_views": 16

But I get similar errors when keeping the default values.

I tried squeezing the 1st dim of img and the 2nd dim of the weights, and specifying a plain int for padding to get past the various error messages, but all I finally got was this new RuntimeError:

'Given groups=4, weight of size [4, 7, 1], expected input[4, 202, 202] to have 28 channels, but got 202 channels instead'

Any clue to help me?
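For what it's worth, the shapes above look like grouped, separable 2-D convolutions, so one possible workaround (my assumption, not a maintainer-confirmed fix) is to call torch.conv2d, which accepts the 4-D input and weights as they are; recent PyTorch versions enforce 2D/3D inputs for conv1d more strictly than the versions this code was apparently written against:

import torch

I_padded = torch.randn(1, 4, 202, 202)   # input shape from the error message
k_y = torch.randn(4, 1, 7, 1)            # per-group vertical Lanczos kernel

# Same grouped, separable filtering, expressed as a 2-D convolution:
out = torch.conv2d(I_padded, k_y, groups=k_y.shape[0],
                   padding=[k_y.shape[2] // 2, 0])
print(out.shape)                          # torch.Size([1, 4, 202, 202])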

Lanczos code crashes

Hi,

Using the PROBA-V dataset, the code crashes in the lanczos file with this error:

RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 16, 202, 202]

Any insight on what might cause this?
