asapnet's Introduction

Image Translation with ASAPNets

Spatially-Adaptive Pixelwise Networks for Fast Image Translation, CVPR 2021

Installation

Install requirements:

pip install -r requirements.txt

Code Structure

The code is heavily based on the official implementation of SPADE, and therefore has the same structure:

  • train.py, test.py: the entry point for training and testing.
  • trainers/pix2pix_trainer.py: harnesses and reports the progress of training.
  • models/pix2pix_model.py: creates the networks and computes the losses.
  • models/networks/: defines the architecture of all models.
  • options/: creates option lists using the argparse package. More options are dynamically added in other files as well. Please see the section below.
  • data/: defines the class for loading images and label maps.

The ASAPNets generator is implemented in:

  • models/networks/generator: defines the architecture of the ASAPNets generator.

Dataset Preparation

facades

Run:

cd data 
bash facadesHR_download_and_extract.sh

This will extract the full-resolution facades images into datasets/facadesHR.

cityscapes

Download the dataset into datasets/cityscapes and arrange it into the folders train_images, train_labels, val_images, val_labels; a sketch of the expected layout is shown below.
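
A minimal layout sketch, assuming the standard Cityscapes leftImg8bit images and gtFine label maps (the comments are illustrative; only the four folder names above are required):

datasets/cityscapes/
    train_images/    # leftImg8bit training images
    train_labels/    # gtFine label maps for the training split
    val_images/      # leftImg8bit validation images
    val_labels/      # gtFine label maps for the validation split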

Generating Images Using Pretrained Models

Pretrained models can be downloaded from here. Save the models under the checkpoints/ folder. Images can be generated using the following commands:

# Facades 512
bash test_facades512.sh

# Facades 1024
bash test_facades1024.sh

# Cityscapes
bash test_cityscapes.sh

The output images will appear in the ./results/ folder.

Training New Models

New models can be trained with the following commands. Prepare the dataset in the ./datasets/ folder and arrange it in the folders train_images, train_labels, val_images, val_labels. For custom datasets, the easiest way is to use ./data/custom_dataset.py by specifying the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, --contain_dontcare_label to specify whether it has an unknown label, and --no_instance to denote that the dataset doesn't have instance maps.

Run:

python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels]

There are many additional options you can specify; please explore the ./options files. To choose which GPUs to use, pass --gpu_ids (see the example below).
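
For example, a fuller command might look like the following sketch. The bracketed values are placeholders, --no_instance assumes the dataset has no instance maps, and the comma-separated --gpu_ids format follows the SPADE-style option parsing this code inherits:

python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels] --no_instance --gpu_ids 0,1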

Testing

Testing is similar to testing pretrained models.

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

You can load the options used during training by specifying --load_from_opt_file, as in the example below.
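
For example (bracketed values are placeholders, as above):

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset] --load_from_opt_file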

Acknowledgments

This code is heavily based on the official implementation of SPADE. We thank the authors for sharing their code publicly!

License

Attribution-NonCommercial-ShareAlike 4.0 International (see file).

Citation

@inproceedings{RottShaham2020ASAP,
  title={Spatially-Adaptive Pixelwise Networks for Fast Image Translation},
  author={Rott Shaham, Tamar and Gharbi, Michael and Zhang, Richard and Shechtman, Eli and Michaeli, Tomer},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

asapnet's People

Contributors

tamarott

asapnet's Issues

sync_batchnorm missing

Looks like this module might be missing from the source: ModuleNotFoundError: No module named 'models.networks.sync_batchnorm'

Train and Test codes for ASAPNets

Do you have train and test code for ASAPNets? The current train and test code in the repo seems to be written for the Pix2Pix model only.

What are the options to train for depth estimation like Figure 10 in your paper?

Hello, I want to use your wonderful work for depth estimation,
but I could not start training due to some errors.
I tried this command:

python train.py --name depthEstimation --dataset_mode custom --label_dir [monocular_Images_dir] --image_dir [depth_Images_dir] --no_instance_edge --no_instance_dist --no_one_hot

But training failed with this error:

RuntimeError: Given groups=1, weight of size 64 13 3 3, expected input[1, 3, 256, 256] to have 13 channels, but got 3 channels instead

The dataset image size is (512, 512).

So please tell me the options you used when training the depth estimation model on the NYU dataset.

Thank you!

About the comparison with the non-spatially-varying f_p model

Thank you for the awesome work again.
This work is very inspiring.

I have a question about the ablation study on the spatially-variant operation (Figure 9 (c) in the paper).
Does this mean that f(x_p, p, phi_p; phi) is less effective than f(x_p, p; phi_p) (where phi is a spatially-invariant learnable parameter)?
If so, why?

Note 1: In the case of f(x_p, p, phi_p; phi), the dimension of phi_p should be much smaller, since it now works as an input to the network.
Note 2: If we use f(x_p, p, phi_p; phi), I think it would be possible to find an analogy with the LIIF model (which tackles the arbitrary-scale SR problem). In other words, conversely, I think it is also possible to apply this paper's pixelwise MLP method to the arbitrary-scale SR problem if directly predicting the MLP parameters is more efficient than putting the feature as an input to the coordinate-based MLP.

Error on custom dataset

I tried the CelebAMask-HQ dataset (MaskGAN) with --label_nc 19 and got a strange error:
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: device-side assert triggered

After searching for a while, I found that someone posted the same kind of error for the pix2pixHD model, but I am still unable to solve it. Kindly help if anyone knows. BTW, the Facades dataset works fine, so I don't think there is a problem with cuDNN or CUDA.
