baudm / monet-pytorch

Burgess et al. "MONet: Unsupervised Scene Decomposition and Representation"

Home Page: https://arxiv.org/abs/1901.11390

License: Other

Python 94.78% TeX 1.06% Shell 3.04% MATLAB 1.11%

monet-pytorch's Introduction

MONet in PyTorch

We provide a PyTorch implementation of MONet.

This project is built on top of the CycleGAN/pix2pix code written by Jun-Yan Zhu and Taesung Park, and supported by Tongzhou Wang.

Note: The implementation is developed and tested on Python 3.7 and PyTorch 1.1.

Implementation details

Decoder Negative Log-Likelihood (NLL) loss

In this loss, I is the number of pixels in the image and K is the number of mixture components. The inner term of the loss function is implemented using torch.logsumexp(). Each pixel of the decoder output is assumed to be i.i.d. Gaussian, where sigma (the "component scale") is a fixed scalar (see Section B.1 of the Supplementary Material).
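As a rough illustration of the description above, the loss can be read as -sum_i log sum_k exp(log m_{k,i} + log N(x_i | mu_{k,i}, sigma^2)), with the sum over k evaluated by torch.logsumexp(). The sketch below is not taken from the repository: the function name decoder_nll, the tensor shapes, and the choice to treat every channel value as its own "pixel" are assumptions made for the example.

```python
import math

import torch


def decoder_nll(log_m, x, x_mu, sigma):
    """Mixture NLL with a fixed-scale Gaussian per component (illustrative sketch).

    Assumed shapes (not confirmed by the repository):
      log_m: (B, K, 1, H, W)  log mixing probabilities (log masks)
      x:     (B, C, H, W)     target image
      x_mu:  (B, K, C, H, W)  per-component reconstruction means
      sigma: float            fixed component scale
    """
    x = x.unsqueeze(1)  # (B, 1, C, H, W), broadcasts against the K components
    # Element-wise Gaussian log-density with a fixed scalar sigma.
    log_p = (
        -0.5 * ((x - x_mu) / sigma) ** 2
        - math.log(sigma)
        - 0.5 * math.log(2 * math.pi)
    )
    # Inner term: log sum_k exp(log m_k + log p_k), computed stably.
    log_mix = torch.logsumexp(log_m + log_p, dim=1)  # (B, C, H, W)
    # Outer term: sum over pixels, then average over the batch.
    return -log_mix.sum(dim=(1, 2, 3)).mean()


# Example call with random inputs; K = 5 components on 8x8 RGB images.
B, K, C, H, W = 2, 5, 3, 8, 8
log_m = torch.log_softmax(torch.randn(B, K, 1, H, W), dim=1)
loss = decoder_nll(log_m, torch.rand(B, C, H, W), torch.rand(B, K, C, H, W), sigma=0.1)
```

Working in log-space keeps the mixture sum stable when a mask value is close to zero, which is the usual reason for reaching for logsumexp here.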

Test Results

CLEVR 64x64 @ 160 epochs

The first three rows correspond to the Attention network outputs (masks), raw Component VAE (CVAE) outputs, and the masked CVAE outputs, respectively. Each column corresponds to one of the K mixture components.

For the fourth row, the first image is the ground truth while the second one is the composite image created by the pixel-wise addition of the K component images (third row).
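The composite in the fourth row can be sketched as a mask-weighted sum over the K components. The function name composite and the tensor shapes below are assumptions for illustration, not the repository's code.

```python
import torch


def composite(masks, components):
    """Sum the masked component images over the K axis (illustrative sketch).

    Assumed shapes:
      masks:      (B, K, 1, H, W), summing to 1 over K at every pixel
      components: (B, K, C, H, W), raw per-component reconstructions
    """
    masked = masks * components  # third row: masked component images
    return masked.sum(dim=1)     # fourth row: composite, shape (B, C, H, W)


# Example: 4 components on a batch of two 16x16 RGB images.
masks = torch.softmax(torch.randn(2, 4, 1, 16, 16), dim=1)
image = composite(masks, torch.rand(2, 4, 3, 16, 16))
```

Because the masks sum to one at each pixel, the composite stays within the value range of the component images.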

Prerequisites

Linux or macOS (not tested)
Python 3.7
CPU or NVIDIA GPU + CUDA 10 + CuDNN

Getting Started

Installation

Clone this repo:

git clone https://github.com/baudm/MONet-pytorch.git
cd MONet-pytorch

Install [PyTorch](http://pytorch.org) 1.1+ and other dependencies (e.g., torchvision, visdom and dominate).
- For pip users, please type the command pip install -r requirements.txt.
- For Conda users, we provide an installation script ./scripts/conda_deps.sh. Alternatively, you can create a new Conda environment using conda env create -f environment.yml.
- For Docker users, we provide the pre-built Docker image and Dockerfile. Please refer to our Docker page.

MONet train/test

Download a MONet dataset (e.g. CLEVR):

wget -cN https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip

To view training results and loss plots, run python -m visdom.server and click the URL http://localhost:8097.
Train a model:

python train.py --dataroot ./datasets/CLEVR_v1.0 --name clevr_monet --model monet

To see more intermediate results, check out ./checkpoints/clevr_monet/web/index.html.

To generate a montage of the model outputs like the ones shown above:

./scripts/test_monet.sh
./scripts/generate_monet_montage.sh

Apply a pre-trained model

Download pretrained weights for CLEVR 64x64:

./scripts/download_monet_model.sh clevr

monet-pytorch's Issues

question about the scope

https://github.com/baudm/MONet-pytorch/blob/75239c6e74e4947b6728ffb615e8e81b24718a7e/models/monet_model.py#L86C33-L86C33

log_s_k += -alpha_logits_k + log_alpha_k
Why is the scope computed this way? It does not appear to be equivalent to s_k = s_k * (1-alpha_k) in the paper.
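One way to check the question numerically: if alpha_k = sigmoid(alpha_logits_k) and log_alpha_k = log(sigmoid(alpha_logits_k)) (an assumption about what those variables hold), then log(1 - alpha_k) = log_alpha_k - alpha_logits_k, so the quoted line is the log-space form of multiplying the scope by (1 - alpha_k). The sketch below compares both forms for a single pixel; the helper names are hypothetical.

```python
import math


def log_sigmoid(a):
    # Numerically stable log(sigmoid(a)).
    if a >= 0:
        return -math.log1p(math.exp(-a))
    return a - math.log1p(math.exp(a))


def masks_log_form(alpha_logits):
    """Masks from K-1 logits, updating the scope in log-space."""
    log_s = 0.0  # log of the initial scope, s = 1
    masks = []
    for a in alpha_logits:
        log_alpha = log_sigmoid(a)
        masks.append(math.exp(log_s + log_alpha))  # m_k = s_k * alpha_k
        log_s += -a + log_alpha                    # same form as the quoted line
    masks.append(math.exp(log_s))  # last component takes the remaining scope
    return masks


def masks_paper_form(alpha_logits):
    """Masks from K-1 logits, updating the scope as s * (1 - alpha)."""
    s = 1.0
    masks = []
    for a in alpha_logits:
        alpha = 1.0 / (1.0 + math.exp(-a))
        masks.append(s * alpha)
        s *= 1.0 - alpha
    masks.append(s)
    return masks


logits = [0.3, -1.2, 2.0]
print(masks_log_form(logits))
print(masks_paper_form(logits))
```

Under that assumption the two lists agree up to floating-point error, and each sums to one.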

How to change the datasets to train monet?

I want to train MONet on my own dataset. Could you give me some suggestions? Much appreciated!

questions about using code base: Monet-pytorch

Hi Darwin, I hope you are doing well. I have been using your codebase Genesis for several days but I got into some problems when I use my own dataset. In particular, below is the screenshot of my result when I use Monet. The desired behavior is that each object is mapped to one corresponding mask and representation slot, which is indeed the case when I use the dSprite dataset. However, when I use my own dataset, all objects are mapped to the same mask and the same slot. Could you please point out what's going on and which parameters I should change? Thanks in advance.

Loss function of the monet

In the file monet_model.py
I believe line 123 should be
self.loss_mask = self.criterionKL(self.m_tilde_logits.softmax(dim=1), self.m)
instead of
self.loss_mask = self.criterionKL(self.m_tilde_logits.log_softmax(dim=1), self.m)
since self.m is in probability space, m_tilde should also be in probability space rather than log-probability space, right?
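A quick check may help with this question, assuming self.criterionKL is torch.nn.KLDivLoss (the text here does not confirm that). PyTorch's KL divergence expects its first argument to contain log-probabilities and, by default, its second to contain probabilities. The sketch uses the functional form with random stand-ins for self.m and self.m_tilde_logits.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Random stand-ins: m is a probability map over K = 4 components (dim=1).
m = torch.softmax(torch.randn(2, 4, 8, 8, dtype=torch.float64), dim=1)
m_tilde_logits = torch.randn(2, 4, 8, 8, dtype=torch.float64)

log_q = m_tilde_logits.log_softmax(dim=1)

# Built-in KL: input in log-space, target in probability space.
kl_builtin = F.kl_div(log_q, m, reduction="sum")

# Manual KL(m || m_tilde) = sum m * (log m - log m_tilde).
kl_manual = (m * (m.log() - log_q)).sum()

# Passing probabilities as the input gives a different, non-KL quantity.
kl_wrong = F.kl_div(m_tilde_logits.softmax(dim=1), m, reduction="sum")

print(kl_builtin.item(), kl_manual.item(), kl_wrong.item())
```

Under that assumption, the log_softmax variant matches the manual KL divergence, while the softmax variant does not.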

AttributeError: Can't pickle local object 'StringEncoder.<locals>.EncodeField'

Windows 10, torch version 1.5.1

AttributeError Traceback (most recent call last)
in
2 import dill
3 #from pathos.multiprocessing import ProcessingPool as Pool
----> 4 tr_it = iter(train_dataloader)
5 progress_bar = tqdm(range(cfg["train_params"]["max_num_steps"]))
6 losses_train = []

c:\users\dilip\anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
277 return _SingleProcessDataLoaderIter(self)
278 else:
--> 279 return _MultiProcessingDataLoaderIter(self)
280
281 @property

c:\users\dilip\anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
717 # before it starts, and __del__ tries to join but will get:
718 # AssertionError: can only join a started process.
--> 719 w.start()
720 self._index_queues.append(index_queue)
721 self._workers.append(w)

c:\users\dilip\anaconda3\envs\tf-gpu\lib\multiprocessing\process.py in start(self)
110 'daemonic processes are not allowed to have children'
111 _cleanup()
--> 112 self._popen = self._Popen(self)
113 self._sentinel = self._popen.sentinel
114 # Avoid a refcycle if the target function holds an indirect

c:\users\dilip\anaconda3\envs\tf-gpu\lib\multiprocessing\context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):

c:\users\dilip\anaconda3\envs\tf-gpu\lib\multiprocessing\context.py in _Popen(process_obj)
320 def _Popen(process_obj):
321 from .popen_spawn_win32 import Popen
--> 322 return Popen(process_obj)
323
324 class SpawnContext(BaseContext):

c:\users\dilip\anaconda3\envs\tf-gpu\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
87 try:
88 reduction.dump(prep_data, to_child)
---> 89 reduction.dump(process_obj, to_child)
90 finally:
91 set_spawning_popen(None)

c:\users\dilip\anaconda3\envs\tf-gpu\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #

AttributeError: Can't pickle local object 'StringEncoder.<locals>.EncodeField'

Can't pickle local object

Hi,

I am getting the following error when I run it:

WARNING:root:Setting up a new session...
create web directory ./checkpoints\clevr_monet\web...
Traceback (most recent call last):
  File "train.py", line 43, in <module>
    for i, data in enumerate(dataset):  # inner loop within one epoch
  File "F:\Documents\WinPython\MONet-pytorch\data\__init__.py", line 90, in __iter__
    for i, data in enumerate(self.dataloader):
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object '_get_transforms.<locals>.<lambda>'

(tf-gpu) F:\Documents\WinPython\MONet-pytorch>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

It seems like the pickling within the multiprocessing library is causing the crash. I would appreciate any ideas on how to fix it.
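A likely explanation, offered as a sketch rather than a confirmed diagnosis: on Windows, multiprocessing starts workers with the spawn method, which pickles the objects handed to each worker process. A lambda defined inside a function cannot be pickled, which matches the '_get_transforms.<locals>.<lambda>' message. The example below reproduces the failure with a hypothetical stand-in for such a transform and shows a picklable replacement built from functools.partial; the function names are invented for illustration and do not come from the repository.

```python
import functools
import operator
import pickle


def make_transform_with_lambda():
    # Hypothetical stand-in for a transform created inside a function.
    return lambda x: x * 2


def make_picklable_transform():
    # Same behavior for numbers, but built from an importable function,
    # so pickle can serialize it by reference.
    return functools.partial(operator.mul, 2)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


print(can_pickle(make_transform_with_lambda()))
print(can_pickle(make_picklable_transform()))

# Round trip: the restored callable behaves like the original.
restored = pickle.loads(pickle.dumps(make_picklable_transform()))
print(restored(21))
```

Another common workaround is to load data in the main process by constructing the DataLoader with num_workers=0, which avoids pickling the dataset at all, at the cost of slower data loading.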