Git Product home page Git Product logo

monet-pytorch's Introduction

MONet in PyTorch

We provide a PyTorch implementation of MONet.

This project is built on top of the CycleGAN/pix2pix code written by Jun-Yan Zhu and Taesung Park, and supported by Tongzhou Wang.

Note: The implementation is developed and tested on Python 3.7 and PyTorch 1.1.

Implementation details

Decoder Negative Log-Likelihood (NLL) loss

Decoder NLL loss

where I is the number of pixels in the image, and K is the number of mixture components. The inner term of the loss function is implemented using torch.logsumexp(). Each pixel of the decoder output is assumed to be iid Gaussian, where sigma (the "component scale") is a fixed scalar (See Section B.1 of the Supplementary Material).

Test Results

CLEVR 64x64 @ 160 epochs

The first three rows correspond to the Attention network outputs (masks), raw Component VAE (CVAE) outputs, and the masked CVAE outputs, respectively. Each column corresponds to one of the K mixture components.

For the fourth row, the first image is the ground truth while the second one is the composite image created by the pixel-wise addition of the K component images (third row).

CLEVR 1

CLEVR 2

Prerequisites

  • Linux or macOS (not tested)
  • Python 3.7
  • CPU or NVIDIA GPU + CUDA 10 + CuDNN

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/baudm/MONet-pytorch.git
cd MONet-pytorch
  • Install [PyTorch](http://pytorch.org and) 1.1+ and other dependencies (e.g., torchvision, visdom and dominate).
    • For pip users, please type the command pip install -r requirements.txt.
    • For Conda users, we provide a installation script ./scripts/conda_deps.sh. Alternatively, you can create a new Conda environment using conda env create -f environment.yml.
    • For Docker users, we provide the pre-built Docker image and Dockerfile. Please refer to our Docker page.

MONet train/test

  • Download a MONet dataset (e.g. CLEVR):
wget -cN https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip
  • To view training results and loss plots, run python -m visdom.server and click the URL http://localhost:8097.
  • Train a model:
python train.py --dataroot ./datasets/CLEVR_v1.0 --name clevr_monet --model monet

To see more intermediate results, check out ./checkpoints/clevr_monet/web/index.html.

To generate a montage of the model outputs like the ones shown above:

./scripts/test_monet.sh
./scripts/generate_monet_montage.sh

Apply a pre-trained model

  • Download pretrained weights for CLEVR 64x64:
./scripts/download_monet_model.sh clevr

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.