asapnet's Introduction

Image Translation with ASAPNets

Spatially-Adaptive Pixelwise Networks for Fast Image Translation, CVPR 2021

Installation

Install requirements:

pip install -r requirements.txt

Code Structure

The code is heavily based on the official implementation of SPADE, and therefore has the same structure:

  • train.py, test.py: the entry point for training and testing.
  • trainers/pix2pix_trainer.py: harnesses and reports the progress of training.
  • models/pix2pix_model.py: creates the networks and computes the losses.
  • models/networks/: defines the architecture of all models.
  • options/: creates option lists using the argparse package. More options are dynamically added in other files as well. Please see the section below.
  • data/: defines the class for loading images and label maps.

The ASAPNets generator is implemented in:

  • models/networks/generator: defines the architecture of the ASAPNets generator.

Dataset Preparation

facades

Run:

cd data 
bash facadesHR_download_and_extract.sh

This will extract the full-resolution facades images into datasets/facadesHR.

cityscapes

Download the dataset into datasets/cityscapes and arrange it into the folders train_images, train_labels, val_images, val_labels; a sketch of the expected layout is shown below.
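
A minimal layout sketch, assuming the standard Cityscapes leftImg8bit images and gtFine label maps (the comments are illustrative; only the four folder names above are required):

datasets/cityscapes/
    train_images/    # leftImg8bit training images
    train_labels/    # gtFine label maps for the training split
    val_images/      # leftImg8bit validation images
    val_labels/      # gtFine label maps for the validation split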

Generating Images Using Pretrained Models

Pretrained models can be downloaded from here. Save the models under the checkpoints/ folder. Images can be generated using the following commands:

# Facades 512
bash test_facades512.sh

# Facades 1024
bash test_facades1024.sh

# Cityscapes
bash test_cityscapes.sh

The output images will appear in the ./results/ folder.

Training New Models

New models can be trained with the following commands. Prepare the dataset in the ./datasets/ folder and arrange it in the folders train_images, train_labels, val_images, val_labels. For custom datasets, the easiest way is to use ./data/custom_dataset.py by specifying the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, --contain_dontcare_label to specify whether it has an unknown label, and --no_instance to denote that the dataset doesn't have instance maps.

Run:

python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels]

There are many additional options you can specify; please explore the ./options files. To choose which GPUs to use, pass --gpu_ids (see the example below).
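
For example, a fuller command might look like the following sketch. The bracketed values are placeholders, --no_instance assumes the dataset has no instance maps, and the comma-separated --gpu_ids format follows the SPADE-style option parsing this code inherits:

python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels] --no_instance --gpu_ids 0,1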

Testing

Testing is similar to testing pretrained models.

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

You can load the options used during training by specifying --load_from_opt_file, as in the example below.
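
For example (bracketed values are placeholders, as above):

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset] --load_from_opt_file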

Acknowledgments

This code is heavily based on the official implementation of SPADE. We thank the authors for sharing their code publicly!

License

Attribution-NonCommercial-ShareAlike 4.0 International (see file).

Citation

@inproceedings{RottShaham2020ASAP,
  title={Spatially-Adaptive Pixelwise Networks for Fast Image Translation},
  author={Rott Shaham, Tamar and Gharbi, Michael and Zhang, Richard and Shechtman, Eli and Michaeli, Tomer},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

asapnet's People

Contributors

tamarott

asapnet's Issues

sync_batchnorm missing

Looks like this module might be missing from the source: ModuleNotFoundError: No module named 'models.networks.sync_batchnorm'

Train and Test codes for ASAPNets

Do you have train and test code for ASAPNets? The current train and test code in the repo seems to be written for the Pix2Pix model only.

What are the options to train for depth estimation like Figure 10 in your paper?

Hello, I want to use your wonderful work for depth estimation,
but I could not start training due to some errors.
I tried this command:

python train.py --name depthEstimation --dataset_mode custom --label_dir [monocular_Images_dir] --image_dir [depth_Images_dir] --no_instance_edge --no_instance_dist --no_one_hot

But training failed with this error:

RuntimeError: Given groups=1, weight of size 64 13 3 3, expected input[1, 3, 256, 256] to have 13 channels, but got 3 channels instead

The dataset image size is (512, 512).

So please tell me the options you used when training the depth estimation model on the NYU dataset.

Thank you!

About the comparison with the non-spatially-varying f_p model

Thank you for the awesome work again.
This work is very inspiring.

I have a question about the ablation study on the spatially-variant operation (Figure 9 (c) in the paper).
Does this mean that f(x_p, p, phi_p; phi) is less effective than f(x_p, p; phi_p) (where phi is a spatially-invariant learnable parameter)?
If so, why?

Note 1: In the case of f(x_p, p, phi_p; phi), the dimension of phi_p should be much smaller, since it now works as an input to the network.
Note 2: If we use f(x_p, p, phi_p; phi), I think it would be possible to find an analogy with the LIIF model (which tackles the arbitrary-scale SR problem). In other words, conversely, I think it is also possible to apply this paper's pixelwise MLP method to the arbitrary-scale SR problem if directly predicting the MLP parameters is more efficient than putting the feature as an input to the coordinate-based MLP.

Error on custom dataset

I tried the CelebAMask-HQ dataset (MaskGAN) with --label_nc 19 and got a strange error:
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: device-side assert triggered

After searching for a while, I found that someone posted the same kind of error for the pix2pixHD model, but I am still unable to solve it. Kindly help if anyone knows. BTW, the Facades dataset works fine, so I don't think there is a problem with cuDNN or CUDA.
