
DAFT

Code for the NeurIPS 2019 paper: "Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning"


[Figure: Comparison between MAC and DAFT MAC]

Dataset Preparation

For both CLEVR and GQA, we generally followed Hudson et al. However, PyTorch does not provide a thread-safe DataLoader for HDF5 files, so we split each HDF5 file into individual per-sample files for faster training and thread-safe data loading (a minimal loading sketch follows the list below).

  • CLEVR takes about 95 GB after extraction (features 75 GB, images 18 GB, annots+misc <2 GB).
  • GQA takes about 140 GB after extraction (features 115 GB, images 21 GB, annots+misc <4 GB).
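For context, here is a minimal sketch of the kind of per-sample loading this split enables. It is an illustration only: the directory layout and the .npy extension are assumptions, not necessarily what this repository produces.

import os

import numpy as np
import torch
from torch.utils.data import Dataset


class SplitFeatureDataset(Dataset):
    """Loads one pre-extracted feature file per sample.

    Each sample lives in its own file, so DataLoader workers never
    share a single HDF5 handle and reads stay thread-safe.
    """

    def __init__(self, feature_dir):
        self.paths = sorted(
            os.path.join(feature_dir, name)
            for name in os.listdir(feature_dir)
            if name.endswith(".npy")  # assumed extension
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Independent files make concurrent reads from multiple workers safe.
        return torch.from_numpy(np.load(self.paths[idx]))


# Hypothetical usage with parallel workers:
# loader = torch.utils.data.DataLoader(
#     SplitFeatureDataset("/path/to/clevr/features"), batch_size=64, num_workers=4)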

CLEVR

  1. Download CLEVR dataset (skip this step if you don't need visualization)
$ export DATASET_ROOT={Whatever you want}
$ cd $DATASET_ROOT
$ wget https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip
$ unzip CLEVR_v1.0.zip
$ mv CLEVR_v1.0 clevr
  2. Download preprocessed features and annotations
    (follow [Hudson et al] if you want to preprocess the images and annotations yourself)
$ cd $DATASET_ROOT/clevr
$ wget -O features.tar.gz https://www.dropbox.com/s/sis6lmrrx0ze3z1/features.tar.gz?dl=0
$ wget -O annots.tar.gz https://www.dropbox.com/s/5rto93ddayol949/annots.tar.gz?dl=0
$ tar -xvzf features.tar.gz
$ tar -xvzf annots.tar.gz

GQA

  1. Download GQA images (skip this step if you don't need visualization)
$ export DATASET_ROOT={Whatever you want}
$ cd $DATASET_ROOT
$ mkdir gqa
$ cd gqa
$ wget https://nlp.stanford.edu/data/gqa/images.zip
$ unzip images.zip
  2. Download preprocessed features and annotations
    (follow the instructions of [Hudson et al] if you want to preprocess the images and annotations yourself)
$ cd $DATASET_ROOT/gqa
$ wget -O features.tar.gz https://www.dropbox.com/s/ag0te9o56pz30jk/features.tar.gz?dl=0
$ wget -O annots.tar.gz https://www.dropbox.com/s/t6bhts8g3xkslyu/annots.tar.gz?dl=0
$ tar -xvzf features.tar.gz
$ tar -xvzf annots.tar.gz
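Whichever dataset you prepare, a quick sanity check of the extracted layout can save a failed training run. The directory names below (features, images, annots) are assumptions inferred from the size breakdown above; adjust them if the repository expects different paths.

import os

# Assumed layout under $DATASET_ROOT, inferred from the size notes above.
DATASET_ROOT = os.environ["DATASET_ROOT"]

for dataset in ("clevr", "gqa"):
    for subdir in ("features", "images", "annots"):
        path = os.path.join(DATASET_ROOT, dataset, subdir)
        print(f"{path}: {'ok' if os.path.isdir(path) else 'MISSING'}")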

Code

For reproducibility, we employ the Sacred experimentation framework and follow its command-line interface.

Requirements

  • Python 3.6+ (for f-strings)
  • PyTorch 1.2.0+
  • See requirements.txt for the rest.

Training

  • $ python train.py with [dataset_name] root=[dataset_root] use_daft=[True|False] max_step=[step]
  • ex) $ python train.py with clevr root=$DATASET_ROOT use_daft=True max_step=4
  • ex) $ python train.py with gqa root=$DATASET_ROOT use_daft=False max_step=5
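For readers unfamiliar with the with [dataset_name] key=value syntax, the sketch below shows a minimal hypothetical Sacred experiment that accepts commands like the ones above; the config names and defaults are illustrative, not the actual contents of train.py.

from sacred import Experiment

ex = Experiment("daft")


@ex.config
def default_config():
    root = "./data"   # overridden on the CLI as root=...
    use_daft = True   # DAFT MAC vs. vanilla MAC
    max_step = 4      # number of reasoning steps


@ex.named_config
def clevr():
    dataset_name = "clevr"


@ex.named_config
def gqa():
    dataset_name = "gqa"


@ex.automain
def main(root, use_daft, max_step):
    # "python train.py with clevr root=/data use_daft=True max_step=4"
    # applies the clevr named config, then the key=value overrides.
    print(root, use_daft, max_step)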

After every epoch, model weights are saved to result/model/{daftmac|mac}_{clevr|gqa}_step{max_step}_{load_seed}.

Evaluation and Visualization

Put the model weights into result/model/{daftmac|mac}_{clevr|gqa}_step{max_step}_{load_seed} (e.g., result/model/daftmac_gqa_step4_387678158/checkpoint_19.model) and run one of the commands below (a checkpoint-loading sketch follows the list):

  • Evaluation

    • $ python evaluation.py with [dataset_name] root=[dataset_root] use_daft=[True|False] max_step=[step] load_seed=[seed]
    • ex) $ python evaluation.py with clevr root=$DATASET_ROOT use_daft=False max_step=2 load_seed=608494298
    • ex) $ python evaluation.py with gqa root=$DATASET_ROOT use_daft=True max_step=12 load_seed=305083948
  • Visualization

    • $ python visualize.py with [dataset_name] root=[dataset_root] use_daft=[True|False] max_step=[step] load_seed=[seed]
    • ex) $ python visualize.py with clevr root=$DATASET_ROOT use_daft=False max_step=2 load_seed=608494298
    • ex) $ python visualize.py with gqa root=$DATASET_ROOT use_daft=True max_step=12 load_seed=305083948
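For reference, a minimal sketch of assembling a checkpoint path from the naming scheme above and loading its weights; build_model is a hypothetical stand-in for this repository's actual model constructor.

import torch

# Naming scheme: result/model/{daftmac|mac}_{clevr|gqa}_step{max_step}_{load_seed}
use_daft, dataset, max_step, load_seed = True, "gqa", 4, 387678158
prefix = "daftmac" if use_daft else "mac"
checkpoint = f"result/model/{prefix}_{dataset}_step{max_step}_{load_seed}/checkpoint_19.model"

state_dict = torch.load(checkpoint, map_location="cpu")
# model = build_model(use_daft=use_daft, max_step=max_step)  # hypothetical constructor
# model.load_state_dict(state_dict)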

Citation

If you use any part of this code for your research, please cite our paper.

@inproceedings{kim2019learning,
  title={Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning},
  author={Kim, Wonjae and Lee, Yoonho},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2019}
}

Contact for Issues

References & Open-Source Projects

  1. MAC-Network
  2. GQA
  3. Neural ODE & ODE solvers

License

This software is licensed under the Apache 2 license, quoted below.

Copyright 2019 Kakao Corp. http://www.kakaocorp.com

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
