
s2am's Introduction

Spatial-Separated Attention Module (S²AM)

Arxiv | Demo

This repo contains the PyTorch implementation of the following paper:

    Improving the Harmony of the Composite Image by Spatial-Separated Attention Module
    Xiaodong Cun and Chi-Man Pun
    University of Macau
    Trans. on Image Processing, vol. 29, pp. 4759-4771, 2020.

News

  • 2020-12-18 The pretrained model for the iHarmony4 dataset is released.
  • 2020-12-18 SCOCO and SAdobe5K are released.
  • 2020-06-16 Pretrained model (S²AD) and online demo are released.

Abstract

Image composition is one of the most important applications in image processing. However, the inharmonious appearance between the spliced region and the background degrades the quality of the image. Thus, we address the problem of image harmonization: given a spliced image and the mask of the spliced region, we try to harmonize the "style" of the pasted region with the background (non-spliced region). Previous approaches have focused on learning directly with a neural network. In this work, we start from an empirical observation: the spliced image and the harmonized result differ only in the spliced region, while they share the same semantic information and the same appearance in the non-spliced region. Thus, in order to learn the feature maps in the masked region and in the remaining region separately, we propose a novel attention module named Spatial-Separated Attention Module (S²AM). Furthermore, we design a novel image harmonization framework by inserting the S²AM into the coarser low-level features of the Unet structure in two different ways. Besides image harmonization, we take a big step toward harmonizing the composite image without a specific mask, based on the same observation. The experiments show that the proposed S²AM performs better than other state-of-the-art attention modules in our task. Moreover, we demonstrate the advantages of our model against other state-of-the-art image harmonization methods via criteria from multiple points of view.
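To make the idea of learning the masked and unmasked regions separately more concrete, here is a minimal PyTorch sketch of mask-separated channel attention. It is not the authors' implementation, and the module in the paper may differ in structure. The class names (`ChannelGate`, `SeparatedAttentionSketch`), the squeeze-and-excitation style gate, and the `reduction` parameter are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelGate(nn.Module):
    """Squeeze-and-excitation style channel attention (assumed building block)."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.mlp = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
        )

    def forward(self, x):
        # Global average pooling gives one descriptor per channel: (B, C).
        weights = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))))
        # Rescale every channel of the feature map by its weight.
        return x * weights[:, :, None, None]


class SeparatedAttentionSketch(nn.Module):
    """Applies one channel gate inside the mask and another outside it."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.foreground_gate = ChannelGate(channels, reduction)
        self.background_gate = ChannelGate(channels, reduction)

    def forward(self, features, mask):
        # Resize the (B, 1, H, W) mask to the spatial size of the features.
        mask = F.interpolate(mask, size=features.shape[-2:], mode="nearest")
        foreground = self.foreground_gate(features)
        background = self.background_gate(features)
        # Spliced pixels use the foreground branch, the rest the background branch.
        return mask * foreground + (1 - mask) * background
```

With an all-zero mask the output reduces to the background branch alone, which is a quick way to check that the two regions are really handled by separate weights.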

Some Results

results sample

Requirements

The code is tested with Python 3.6 and PyTorch v0.4+ on Ubuntu 18.04.
Install all the requirements with pip.
Anaconda is highly recommended for installing the dependencies.

git clone https://github.com/vinthony/s2am.git
cd s2am
pip install -r requirements.txt

Datasets

We train the network on two different synthesized datasets.

Train

All the training options can be found in options.py.

# train the S2AD method
chmod +x ./example/train_harmorization_s2ad.sh && ./example/train_harmorization_s2ad.sh

# train the S2ASC method
chmod +x ./example/train_harmorization_s2asc.sh && ./example/train_harmorization_s2asc.sh

# train the image harmonization w/o mask task from our paper.
chmod +x ./example/train_harmorization_wo_mask.sh && ./example/train_harmorization_wo_mask.sh

You may also try our new code framework to train S²AM; please refer to this link.

Visualization

We use TensorboardX to monitor the training process. Install it by following the tensorboardX instructions.

Run the monitoring command:

tensorboard --logdir ./checkpoint

Demo

Local machine.

  1. Clone this repo.

  2. Download the pretrained models from Google Drive.

  3. Download some sample validation data from Google Drive.

  4. Configure the paths to the dataset and the pretrained model in visualize.ipynb.

  5. Run the notebook.

Online demo

Just visit our Google Colab notebook.

The pretrained model and results on iHarmony4 Dataset.

We report the MSE and PSNR as in the original iHarmony4 paper. The pretrained model can be downloaded from here. These results were trained and evaluated using the newer version of our code framework with no changes to the algorithm (please refer to our new work here). All the results were evaluated using the Jupyter notebook eval_s2am_iharmony4.ipynb, which is modified from the evaluation code of DoveNet (CVPR 2020). Note that the original DoveNet uses all sub-datasets together for training, whereas the results reported here are trained on each sub-dataset individually.

                   w/o global skip-connection    w/ global skip-connection
dataset \ method   PSNR↑      MSE↓               PSNR↑      MSE↓
HCOCO              37.33      25.59              37.25      26.22
HAdobe5K           34.33      47.49              34.32      51.66
HFlickr            30.71      112.92             31.02      106.21
HDay2night         33.63      70.03              34.28      66.31
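For reference, PSNR is derived from the mean squared error between the harmonized output and the ground-truth image. The following is a minimal sketch of both quantities using only the standard library. It assumes 8-bit pixel values (peak 255), images flattened to equal-length sequences, and per-image averaging; the function names are illustrative, and the evaluation notebook may differ in details such as resizing.

```python
import math


def mean_squared_error(pred, target):
    """MSE between two equally sized flat sequences of pixel values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = mean_squared_error(pred, target)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def evaluate(pairs):
    """Average per-image MSE and PSNR over (prediction, ground truth) pairs."""
    mses = [mean_squared_error(p, t) for p, t in pairs]
    psnrs = [psnr(p, t) for p, t in pairs]
    return sum(mses) / len(mses), sum(psnrs) / len(psnrs)
```

Because PSNR is averaged per image, the mean PSNR of a dataset is generally not the same as the PSNR computed from the dataset's mean MSE.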

The Application of Spatial-Separated Attention Module (S²AM) w/o mask

Image Classification

We compare our method with the baseline attention module CBAM and the original ResNet on CIFAR-10, using the default settings of the code in pytorch_resnet_cifar10.

method      Test err (Original)   Test err (w/ CBAM)   Test err (w/ S²AM)
ResNet20    8.45%                 7.91%                7.60%
ResNet32    7.40%                 7.07%                7.06%
ResNet44    6.96%                 6.92%                6.58%
ResNet56    6.47%                 6.43%                6.41%

Interactive Watermark Removal from a region.

By regarding a region as the mask, our method can be used to remove a visible watermark from an image. We generate the dataset using VOC images as backgrounds and 100 famous logos as watermark regions. The network is trained on 70 of the logos and tested on the rest. Here are some random results (images 1511, 1582, 1654, 1728).
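The README does not describe how the watermark samples are composited. As a hedged illustration of the general idea, the sketch below alpha-blends a logo onto a background and returns the matching region mask. The function name `paste_watermark`, the `alpha` opacity, the placement arguments, and the use of 2-D grayscale lists are all assumptions for illustration, not details taken from the paper or the code.

```python
def paste_watermark(image, logo, top, left, alpha=0.6):
    """Blend `logo` onto a copy of `image` and return (composite, mask).

    `image` and `logo` are 2-D lists of grayscale values. `alpha` is a
    hypothetical opacity, not a value taken from the paper.
    """
    height, width = len(image), len(image[0])
    composite = [row[:] for row in image]          # leave the input untouched
    mask = [[0] * width for _ in range(height)]    # 1 marks watermark pixels
    for i, logo_row in enumerate(logo):
        for j, logo_value in enumerate(logo_row):
            y, x = top + i, left + j
            # Skip logo pixels that fall outside the background image.
            if 0 <= y < height and 0 <= x < width:
                composite[y][x] = alpha * logo_value + (1 - alpha) * image[y][x]
                mask[y][x] = 1
    return composite, mask
```

The composite and the mask form the network input, and the untouched background serves as the target, which mirrors the spliced image, mask, and harmonized result used for harmonization.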

Citation

If you find our work useful in your research, please consider citing:

@misc{cun2019improving,
    title={Improving the Harmony of the Composite Image by Spatial-Separated Attention Module},
    author={Xiaodong Cun and Chi-Man Pun},
    year={2019},
    eprint={1907.06406},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}


s2am's Issues

Checkerboard effects using U-net network

Hi,

I've been analyzing the code for the Unet generator in unet.py and I am able to produce impressive results. However, it seems that the ConvTranspose2d in UnetSkipConnectionBlock is causing a checkerboard effect. Do you have any suggestion for how one should go about using bilinear upsampling instead? I've tried experimenting with this myself, but I keep getting errors with the dimensions of my tensors.

Any assistance will be highly appreciated.
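One common way to reduce checkerboard artifacts is to replace the transposed convolution with bilinear upsampling followed by an ordinary convolution. The sketch below is not taken from this repository. It assumes the transposed convolution in unet.py uses kernel_size=4, stride=2, padding=1 (as in pix2pix-style U-Nets), which doubles the height and width; please check the actual arguments before swapping. The helper name `bilinear_upconv` is hypothetical.

```python
import torch
import torch.nn as nn


def bilinear_upconv(in_channels, out_channels):
    """Upsample by 2x with bilinear interpolation, then apply a 3x3 convolution.

    With kernel_size=3 and padding=1 the convolution keeps the spatial size,
    so the output is exactly twice the input height and width.
    """
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1),
    )


x = torch.randn(1, 64, 16, 16)

# Assumed original layer: output size = (16 - 1) * 2 - 2 * 1 + 4 = 32.
transposed = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
# Drop-in replacement with the same input and output channel counts.
resampled = bilinear_upconv(64, 32)

print(transposed(x).shape)  # torch.Size([1, 32, 32, 32])
print(resampled(x).shape)   # torch.Size([1, 32, 32, 32])
```

Dimension errors usually come from the channel count rather than the spatial size: `in_channels` must equal whatever the original ConvTranspose2d received, which in blocks with skip connections is typically the concatenated channel count.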
