
nmn2's Introduction

Neural module networks

UPDATE 22 Jun 2017: Code for our end-to-end module network framework is available at https://github.com/ronghanghu/n2nmn. The n2nmn code works better and is easier to set up. Use it!

This library provides code for training and evaluating neural module networks (NMNs). An NMN is a neural network that is assembled dynamically by composing shallow network fragments called modules into a deeper structure. These modules are jointly trained to be freely composable. For a general overview of the framework, refer to:

Neural module networks. Jacob Andreas, Marcus Rohrbach, Trevor Darrell and Dan Klein. CVPR 2016.

Learning to compose neural networks for question answering. Jacob Andreas, Marcus Rohrbach, Trevor Darrell and Dan Klein. NAACL 2016.

At present the code supports predicting network layouts from natural-language strings, with end-to-end training of modules. Various extensions should be straightforward to implement—alternative layout predictors, supervised training of specific modules, etc.
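To make the composition idea concrete, here is a toy, self-contained sketch of how a layout such as (is (and red circle)) wires modules together. This is not this repository's API; the module definitions and all names below are illustrative only.

```python
# Toy illustration of dynamic module composition (NOT this repo's API;
# module definitions and names are made up for exposition).
import numpy as np

def find(features, weights):
    # Attention module: a sigmoid score at each location of a 14x14 map.
    return 1.0 / (1.0 + np.exp(-np.tensordot(weights, features, axes=1)))

def combine_and(att_a, att_b):
    # Combination module: intersect two attention maps.
    return att_a * att_b

def measure_is(att):
    # Measurement module: map total attention mass to a yes/no score.
    return att.mean()

# A predicted layout such as (is (and red circle)) determines how the
# modules are composed for this particular question:
features = np.random.randn(512, 14, 14)
w_red, w_circle = np.random.randn(512), np.random.randn(512)
score = measure_is(combine_and(find(features, w_red), find(features, w_circle)))
```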

Please cite the CVPR paper for the general NMN framework, and the NAACL paper for dynamic structure selection. Feel free to email me at [email protected] if you have questions. This code is released under the Apache 2 license, provided in LICENSE.txt.

Installing dependencies

You will need to build my fork of the excellent ApolloCaffe library. This fork may be found at jacobandreas/apollocaffe, and provides support for a few Caffe layers that haven't made it into the main Apollo repository. Ordinary Caffe users: note that you will have to install the runcython Python module in addition to the usual Caffe dependencies.

Once this is done, update APOLLO_ROOT at the top of run.sh to point to your ApolloCaffe installation.

You will also need to install the following packages:

colorlogs, sexpdata

Downloading data

All experiment data should be placed in the data directory.

VQA

In data, create a subdirectory named vqa. Follow the VQA setup instructions to install the data into this directory. (It should have children Annotations, Images, etc.)

We have modified the structure of the VQA Images directory slightly. Images should have two subdirectories, raw and conv. raw contains the original VQA images, while conv contains the result of preprocessing these images with a 16-layer VGGNet as described in the paper. Every file in the conv directory should be of the form COCO_{SETNAME}_{IMAGEID}.jpg.npz, and contain a 512x14x14 image map in zipped numpy format. Here's a gist with the code I use for doing the extraction.
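For orientation, below is a minimal pycaffe sketch of what that preprocessing step looks like. The prototxt/caffemodel paths are hypothetical, and the assumption that the deploy prototxt is truncated after conv5_3 (whose output is 512x14x14 for a 224x224 input) is mine; consult the gist for the exact procedure.

```python
# Minimal sketch of conv-feature extraction (hypothetical paths; see the
# gist above for the actual code). Assumes a VGG-16 deploy prototxt
# truncated after conv5_3, which outputs 512x14x14 for a 224x224 input.
import numpy as np
import caffe

net = caffe.Net("vgg16_conv5_3_deploy.prototxt", "vgg16.caffemodel", caffe.TEST)

transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
transformer.set_transpose("data", (2, 0, 1))    # HWC -> CHW
transformer.set_channel_swap("data", (2, 1, 0)) # RGB -> BGR
transformer.set_raw_scale("data", 255)          # [0,1] -> [0,255]
transformer.set_mean("data", np.array([103.939, 116.779, 123.68]))  # VGG means

image = caffe.io.load_image("raw/COCO_train2014_000000000009.jpg")
net.blobs["data"].data[0, ...] = transformer.preprocess("data", image)
net.forward()

conv = net.blobs["conv5_3"].data[0]  # shape (512, 14, 14)
# np.savez stores this under the default key 'arr_0'; check the gist for
# the key the repo's data loader actually expects.
np.savez("conv/COCO_train2014_000000000009.jpg.npz", conv)
```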

GeoQA

Download the GeoQA dataset from the LSP website, and unpack it into data/geo.

Parsing questions

Every dataset fold should contain a file of parsed questions, one per line, formatted as S-expressions. If multiple parses are provided, they should be semicolon-delimited. As an example, for the question "is the train modern" we might have:

(is modern);(is train);(is (and modern train))
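Lines like this can be read with the sexpdata package listed under the installation dependencies; the snippet below is a sketch, not code from this repository.

```python
# Sketch of reading one line of semicolon-delimited parses with sexpdata
# (listed under the dependencies above); not code from this repo.
import sexpdata

line = "(is modern);(is train);(is (and modern train))"
parses = [sexpdata.loads(p) for p in line.strip().split(";")]
for parse in parses:
    print(parse)  # e.g. [Symbol('is'), Symbol('modern')]
```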

For VQA, these files should be named Questions/{train2014,val2014,...}.sps2. For GeoQA, they should be named environments/{fl,ga,...}/training.sps. Parses used in our papers are provided in extra and should be installed in the appropriate location. The VQA parser script is also located under extra/vqa; instructions for running are provided in the body of the script.

Running experiments

You will first need to create directories vis and logs (which store visualization output and run logs, respectively).

Different experiments can be run by providing an appropriate configuration file on the command line (see the last line of run.sh). Examples for VQA and GeoQA are provided in the config directory.

Looking for SHAPES? I haven't finished integrating it with the rest of the codebase, but check out the shapes branch of this repository for data and code.

TODO

  • Configurable data location
  • Model checkpointing


nmn2's Issues

How to use extra/vqa/parse.py?

I'm trying to get this code running on a new dataset and I can't figure out the expected input format of extra/vqa/parse.py. It looks like it is trying to parse the output of the Stanford Dependency parser, but I'm having a hard time getting nontrivial parses.

I have an input file containing one question per line, including the example question from the README:

> cat questions.txt
Is the train modern?
What color is the man's swimsuit?

I can parse these questions using the Stanford Dependency parser (throwing away stderr for clarity):

> ./lexparser.sh questions.txt 2> /dev/null
(ROOT
  (SQ (VBZ Is)
    (NP (DT the) (NN train))
    (ADJP (JJ modern))
    (. ?)))

cop(modern-4, Is-1)
det(train-3, the-2)
nsubj(modern-4, train-3)
root(ROOT-0, modern-4)

(ROOT
  (SBARQ
    (WHNP (WDT What) (NN color))
    (SQ (VBZ is)
      (NP
        (NP (DT the) (NN man) (POS 's))
        (NN swimsuit)))
    (. ?)))

det(color-2, What-1)
dobj(is-3, color-2)
root(ROOT-0, is-3)
det(man-5, the-4)
nmod:poss(swimsuit-7, man-5)
case(man-5, 's-6)
nsubj(is-3, swimsuit-7)

This looks roughly like the input format that parse.py expects, but I only get trivial parses when piping it directly to parse.py:

> ./lexparser.sh questions.txt 2> /dev/null | python extra/vqa/parse.py
(_what _thing)
(_what _thing)
(_what _thing)

Am I doing something blatantly wrong? Even if you don't have functioning code for this, I'd really appreciate any insight on how to use parse.py.

output json files and html file in ./vis folder are empty

Hi,

I installed the correct version of ApolloCaffe, and the code now runs without any errors, but it does not seem to be generating any output: the val_predictions_828.json files are all empty. I have also attached the geo_nmn.txt file, which contains the output logs. What could be the reason?

geo_nmn.txt

Any way to save the model?

I tried saving it using SnapshotLogger, but when I tried to load it back in the constructor of NMN, it gave me an error saying that I am "loading into empty net".
I figured this is because the layers of the network are not yet defined in the constructor.
But because the layout of the network is dynamic, I am not sure whether the current saving/loading mechanism will work at all. Do you have any suggestions?

Why val and test set acc all zero?

Hi,

I am currently training with your code, and the training loss/accuracy looks good, yet the val and test sets show all-zero loss and accuracy.

[screenshot of training output attached]

I think my paths and other things are configured as they should be. Has anyone met this issue before?

Thanks!

'Tile' is not defined

I do not understand this error; kindly help me out. Also, is the net.f(Tile(...)) call even required? What is it needed for?

```
Traceback (most recent call last):
  File "main.py", line 257, in <module>
    main()
  File "main.py", line 31, in main
    do_iter(task.train, model, config, train=True)
  File "main.py", line 99, in do_iter
    batch_data, model, config, train, vis)
  File "main.py", line 120, in do_batch
    predictions = forward(data, model, config, train, vis)
  File "main.py", line 172, in forward
    dropout=(train and config.opt.dropout), deterministic=not train)
  File "/home/ubuntu/Caffe/nmn2/models/nmn.py", line 443, in forward
    deterministic)
  File "/home/ubuntu/Caffe/nmn2/models/nmn.py", line 557, in forward_layout
    net.f(Tile(tile_question, axis=1, tiles=n_layouts, bottoms=[proj_question]))
NameError: global name 'Tile' is not defined
```

If I comment out the lines in nmn.py that use Tile, I get the error below:

```
Traceback (most recent call last):
  File "main.py", line 257, in <module>
    main()
  File "main.py", line 31, in main
    do_iter(task.train, model, config, train=True)
  File "main.py", line 99, in do_iter
    batch_data, model, config, train, vis)
  File "main.py", line 120, in do_batch
    predictions = forward(data, model, config, train, vis)
  File "main.py", line 172, in forward
    dropout=(train and config.opt.dropout), deterministic=not train)
  File "/home/ubuntu/Caffe/nmn2/models/nmn.py", line 443, in forward
    deterministic)
  File "/home/ubuntu/Caffe/nmn2/models/nmn.py", line 575, in forward_layout
    net.f(Eltwise(sum, "SUM", bottoms=[tile_question, concat_layer]))
  File "python/apollocaffe/cpp/_apollocaffe.pyx", line 287, in apollocaffe.cpp._apollocaffe.ApolloNet.f (python/apollocaffe/cpp/_apollocaffe.cpp:7848)
RuntimeError: src/caffe/apollonet.cpp:209] Could not find bottom: 'LAYOUT_tile2_question' for layer: LAYOUT_sum
```

the output of question parsing

I am not sure, but it seems that the output of question parsing doesn't match the paper.
For example, according to the paper, the question "Is there a red shape above a circle?" should be transformed into:

measure[is]
or is(and(red, above(circle)))

but the output of the parser script is

(is shape);(is circle);(is (and shape circle))

"what is stuffed with toothbrushes wrapped in plastic?" should be transformed into

describe[what]
or what(stuff)

but the output of the parser script is

(what toothbrush);(what wrap);(what (and toothbrush wrap))

It seems the parser script only gets the right answer for simple questions such as "what color is the vase?" or "is this a clock?".

Issues running the code

Hi, Jacob,

Many thanks for releasing the code.

Recently, I tried to run the code. After installing ApolloCaffe, I ran run.sh and got the following errors:

Traceback (most recent call last):
  File "main.py", line 11, in <module>
    import models
  File "/home/jwyang/Researches/nmn2/models/__init__.py", line 3, in <module>
    import apollocaffe
  File "/home/jwyang/Researches/apollocaffe/python/apollocaffe/__init__.py", line 1, in <module>
    from cpp._apollocaffe import Tensor, ApolloNet, CppConfig, make_numpy_data_param, Blob
  File "/home/jwyang/Researches/apollocaffe/python/apollocaffe/cpp/__init__.py", line 1, in <module>
    import _apollocaffe
ImportError: No module named _apollocaffe

It says the module _apollocaffe cannot be found. Do you have any idea about this error?

thanks,
jianwei

Update docs

The docs (README) need some updating with regard to installation. There are some external dependencies not covered:

  • colorlogs
  • sexpdata

ApolloCaffe error

Hi,

I am trying to reproduce the results, but I get the following error when running run.sh on the VQA dataset:

Traceback (most recent call last):
  File "main.py", line 255, in <module>
    main()
  File "main.py", line 31, in main
    do_iter(task.train, model, config, train=True)
  File "main.py", line 97, in do_iter
    batch_data, model, config, train, vis)
  File "main.py", line 118, in do_batch
    predictions = forward(data, model, config, train, vis)
  File "main.py", line 170, in forward
    dropout=(train and config.opt.dropout), deterministic=not train)
  File "/mnt/sdc1/shangxuan/775/nmn2/models/nmn.py", line 440, in forward
    question_hidden = self.forward_question(question_data, dropout)
  File "/mnt/sdc1/shangxuan/775/nmn2/models/nmn.py", line 648, in forward_question
    bottoms=[word], param_names=[wordvec_param]))
  File "/mnt/sdc1/shangxuan/775/apollocaffe/python/apollocaffe/layers/caffe_layers.py", line 203, in __init__
    super(Wordvec, self).__init__(self, name, kwargs)
  File "/mnt/sdc1/shangxuan/775/apollocaffe/python/apollocaffe/layers/layer_headers.py", line 6, in __init__
    self.parse(sublayer, name, kwargs)
  File "/mnt/sdc1/shangxuan/775/apollocaffe/python/apollocaffe/layers/layer_headers.py", line 57, in parse
    raise AttributeError('Layer %s has no keyword argument %s=%s' % (param_type, k, v))
AttributeError: Layer Wordvec has no keyword argument vocab_size=3591

I am using your fork of ApolloCaffe, so I am not sure where the problem lies.

How to generate attention maps?

Hi,

I was hoping to obtain the attention maps generated by the Find Module. Currently, I'm extracting one of the sigmoid blobs from the apollo net using att_data=model.apollo_net.blobs['Find_3_sigmoid'].data[i_datum,...] after the forward pass.

  1. Would this be the right way to get attention? I'm skeptical since the maps I've obtained do not seem to correspond to the image/question.
  2. If it is the right way, how do I choose which "Find_%d_sigmoid" to use?

Thanks a lot!

Inner product size error in image preprocessing by gist code

Hi all,

When I run the gist code (https://gist.github.com/jacobandreas/897987ac03f8d4b9ea4b9e44affa00e7), I get an error:

F0220 13:52:50.173866 14142 inner_product_layer.cpp:61] Check failed: K_ == new_K (25088 vs. 100352) Input size incompatible with inner product parameters.

It seems there is a size mismatch in the inner product. According to the paper, the image features are computed by the fifth group of convolutional layers, so I think the inner-product layers should be removed from the prototxt (https://gist.githubusercontent.com/ksimonyan/211839e770f7b538e2d8/raw/0067c9b32f60362c74f4c445a080beed06b07eb3/VGG_ILSVRC_16_layers_deploy.prototxt). I removed all the ops after "fc6", and the error is gone.

But I'm not sure whether this is right.

raise AttributeError('Layer %s has no keyword argument %s=%s' % (param_type, k, v))

Hi, I'm running into an error:
2017-07-21 12:17:32,576 DEBUG [root] prepared indices
2017-07-21 12:55:22,728 DEBUG [root] computed image feature normalizers
2017-07-21 12:55:22,789 DEBUG [root] using cvpr chooser
2017-07-21 12:57:36,539 INFO [root] TRAIN2014, VAL2014:
2017-07-21 12:57:36,540 INFO [root] 369861 items
2017-07-21 12:57:36,540 INFO [root] 2002 answers
2017-07-21 12:57:36,541 INFO [root] 877 predicates
2017-07-21 12:57:36,541 INFO [root] 3591 words
2017-07-21 12:57:36,541 INFO [root]
2017-07-21 12:57:56,250 INFO [root] TEST-DEV2015:
2017-07-21 12:57:56,250 INFO [root] 60864 items
2017-07-21 12:57:56,250 INFO [root] 2002 answers
2017-07-21 12:57:56,251 INFO [root] 877 predicates
2017-07-21 12:57:56,251 INFO [root] 3591 words
2017-07-21 12:57:56,251 INFO [root]
2017-07-21 12:59:02,813 INFO [root] TEST2015:
2017-07-21 12:59:02,813 INFO [root] 244302 items
2017-07-21 12:59:02,814 INFO [root] 2002 answers
2017-07-21 12:59:02,814 INFO [root] 877 predicates
2017-07-21 12:59:02,814 INFO [root] 3591 words
2017-07-21 12:59:02,814 INFO [root]
Traceback (most recent call last):
  File "main.py", line 255, in <module>
    main()
  File "main.py", line 31, in main
    do_iter(task.train, model, config, train=True)
  File "main.py", line 97, in do_iter
    batch_data, model, config, train, vis)
  File "main.py", line 118, in do_batch
    predictions = forward(data, model, config, train, vis)
  File "main.py", line 170, in forward
    dropout=(train and config.opt.dropout), deterministic=not train)
  File "/disk3/hbliu/vqa/nmn2/models/nmn.py", line 440, in forward
    question_hidden = self.forward_question(question_data, dropout)
  File "/disk3/hbliu/vqa/nmn2/models/nmn.py", line 648, in forward_question
    bottoms=[word], param_names=[wordvec_param]))
  File "/home/hbliu/disk3/vqa/nmn2/apollocaffe/python/apollocaffe/layers/caffe_layers.py", line 203, in __init__
    super(Wordvec, self).__init__(self, name, kwargs)
  File "/home/hbliu/disk3/vqa/nmn2/apollocaffe/python/apollocaffe/layers/layer_headers.py", line 6, in __init__
    self.parse(sublayer, name, kwargs)
  File "/home/hbliu/disk3/vqa/nmn2/apollocaffe/python/apollocaffe/layers/layer_headers.py", line 57, in parse
    raise AttributeError('Layer %s has no keyword argument %s=%s' % (param_type, k, v))
AttributeError: Layer Wordvec has no keyword argument vocab_size=3591

Training.sps file missing from the data set

The training.sps file is missing from the dataset when running the example code for the geographical questions. Below is the error message I get:

IOError: [Errno 2] No such file or directory: 'data/geo/environments/fl/training.sps'

Can you tell me how to generate the file and what it contains?

How to run geo_nmn with CPU-only Caffe?

Hi Jacob,

I executed the following command in a terminal: ./run.sh
The run.sh file contains python main.py -c config/geo_nmn.yml.
I got the following output:

2017-06-22 12:19:02 - GPU device 0
F0622 12:19:02.271721 12462 common.cpp:55] Cannot use GPU in CPU-only Caffe: check mode.
*** Check failure stack trace: ***
    @ 0x7ffa2885a48a  google::LogMessage::Fail()
    @ 0x7ffa2885a3ce  google::LogMessage::SendToLog()
    @ 0x7ffa28859da0  google::LogMessage::Flush()
    @ 0x7ffa2885d121  google::LogMessageFatal::~LogMessageFatal()
    @ 0x7ffa2793ab60  caffe::Caffe::SetDevice()
    @ 0x7ffa28ac2b0d  __pyx_pw_11apollocaffe_3cpp_12_apollocaffe_9CppConfig_5set_device()
    @ 0x4c468a  (unknown)
    @ 0x4c2765  (unknown)
    @ 0x4ca099  (unknown)
    @ 0x4c2765  (unknown)
    @ 0x4c2509  (unknown)
    @ 0x4c061b  (unknown)
    @ 0x4bd6ee  (unknown)
    @ 0x4be9e7  (unknown)
    @ 0x4af215  (unknown)
    @ 0x4b0f78  (unknown)
    @ 0x4b0cb3  (unknown)
    @ 0x4ce5d0  (unknown)
    @ 0x4c6ed6  (unknown)
    @ 0x4c2765  (unknown)
    @ 0x4c2509  (unknown)
    @ 0x4f1def  (unknown)
    @ 0x4ec652  (unknown)
    @ 0x4eae31  (unknown)
    @ 0x49e14a  (unknown)
    @ 0x7ffa35d75830  (unknown)
    @ 0x49d9d9  (unknown)
./run.sh: line 8: 12462 Aborted (core dumped) python main.py -c config/geo_nmn.yml

Could you help me to resolve the issue?

How to generate the content of vqa/conv?

Is there a script for this? The README says that conv contains the result of preprocessing these images with a 16-layer VGGNet as described in the paper, and that every file in the conv directory should be of the form COCO_{SETNAME}_{IMAGEID}.jpg.npz and contain a 512x14x14 image map in zipped numpy format.
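A quick way to sanity-check a generated file against that format (the filename below is hypothetical, and the array key depends on how the file was saved):

```python
# Sanity-check a precomputed feature file against the documented format
# (hypothetical filename; the array key depends on how it was saved).
import numpy as np

archive = np.load("data/vqa/Images/conv/COCO_train2014_000000000009.jpg.npz")
arr = archive[archive.files[0]]
assert arr.shape == (512, 14, 14), arr.shape
```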

VQA Images

Hi,

I am trying to set up your tool for VQA. Is it possible to point it to precomputed image features in the specific format you mentioned in the README?

Thanks

How to visualize the attention map

I am attempting to visualize results, which is mostly handled by main.visualize(). However, the code to get the attention map has been commented out, and replaced with np.zeros.

My general question is: what is the intuition behind the commented-out code? Some specifics:

  • What is i_datum?
  • What is mod_layout_choice?
  • Why is att_blob_name created the way it is?

This will be helpful to understand, as we are also attempting to connect an additional model to the final attention map, before the softmax activation.
Thanks.
