davidmascharka / tbd-nets

348.0 15.0 74.0 22.37 MB

PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

Home Page: https://arxiv.org/abs/1803.05268

License: MIT License

Jupyter Notebook 94.58% Python 5.42%
machine-learning pytorch visualization deep-learning visual-question-answering vqa neural-networks

tbd-nets's People

Contributors

arjunmajum, davidmascharka, rsokl

tbd-nets's Issues

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

While trying to reproduce the experiments in train-model.ipynb using the proposed environment with PyTorch 0.4.1, the code produced the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-14-82ec354902a5> in <module>()
      6     epoch += 1
      7     print('starting epoch', epoch)
----> 8     train_epoch()
      9 
     10 save_checkpoint(epoch, 'example-{:02d}.pt'.format(epoch))

<ipython-input-13-2216c33e0bef> in train_epoch()
     33 
     34         loss_file.write('Loss: {}\n'.format(loss.item()))
---> 35         loss.backward()
     36         optimizer.step()
     37         break

~/anaconda2/envs/tbd-env/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
     91                 products. Defaults to ``False``.
     92         """
---> 93         torch.autograd.backward(self, gradient, retain_graph, create_graph)
     94 
     95     def register_hook(self, hook):

~/anaconda2/envs/tbd-env/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     88     Variable._execution_engine.run_backward(
     89         tensors, grad_tensors, retain_graph, create_graph,
---> 90         allow_unreachable=True)  # allow_unreachable flag
     91 
     92 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

PyTorch is trying to backpropagate through a tensor with no grad_fn, but I haven't been able to find the cause yet.
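
For readers who hit the same message, here is a minimal sketch of what it means, independent of train-model.ipynb. `backward()` raises this error when nothing that produced the loss tracks gradients, and it succeeds once at least one input has `requires_grad=True`. The tensors `frozen` and `weights` are made up for illustration; the sketch does not identify the cause in the notebook.

```python
import torch

# Hypothetical tensors for illustration; not taken from train-model.ipynb.
frozen = torch.ones(3)                    # requires_grad defaults to False
loss = (frozen * 2).sum()
print(loss.requires_grad, loss.grad_fn)   # no autograd graph was recorded

try:
    loss.backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# Same computation, but the input now tracks gradients.
weights = torch.ones(3, requires_grad=True)
loss = (weights * 2).sum()
loss.backward()
print(weights.grad)
```

Common ways to end up in the first situation include computing the loss inside `torch.no_grad()`, calling `.detach()` on an intermediate result, or having `requires_grad` set to False on every parameter that feeds the loss.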

No longer runs on mybinder.org

Hey,

I was wondering if you have tried running this lately and have any idea why it no longer runs successfully on mybinder.org. I don't know much about the code in the repo or about PyTorch. From looking at environment.yml and the errors I get, my guess is that a newer version of PyTorch has changed some conventions.

I've used this repository before as an example in talks about Binder and wanted to do so again, but during my run-through I noticed that it doesn't work anymore. If you don't have time to fix this, that is totally fine; I'll find a different repo for demo purposes.

evaluate error on val

Hello, when I evaluate on the val dataset, the following error appears. What is wrong?

Traceback (most recent call last):
  File "eval.py", line 17, in <module>
    dest_dir=Path('/data'), batch_size=128)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 256, in generate_programs
    programs_pred = program_generator.reinforce_sample(questions_var)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 121, in reinforce_sample
    encoded = self.encoder(x)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 91, in encoder
    embed = self.encoder_embed(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/sparse.py", line 103, in forward
    self.scale_grad_by_freq, self.sparse
RuntimeError: save_for_backward can only save input or output tensors, but argument 0 doesn't satisfy this condition

This is the part I changed in test-eval:

vocab_path = Path('data/vocab.json')
model_path = Path('models/clevr-reg-hres.pt')
tbd_net = load_tbd_net(model_path, load_vocab(vocab_path))

program_generator = load_program_generator(Path('models/program_generator.pt'))
generate_programs(Path('data/val_questions.h5'), program_generator, 
                  dest_dir=Path('/data'), batch_size=128)
                  
use_np_features = False
if use_np_features:
    features = np.load(str(Path('data/test/test_features.npy')), mmap_mode='r')
else:
    features = h5py.File(Path('data/val_features.h5'))['features']

question_np = np.load(Path('data/val_questions.npy'))
image_idx_np = np.load(Path('data/val_image_idxs.npy'))
programs_np = np.load(Path('data/val_programs.npy'))

can not find file scripts/extract_features.py

Thanks for your great work. I have a small question about running this code.

"python scripts/extract_features.py
--input_image_dir </path/to/CLEVR/images/train>
--output_h5_file </path/to/train_features.h5>
--model_stage 2"

I cannot find the file scripts/extract_features.py.
Could you help me?

tensor matches error

My eval.py file is copied from test-eval.ipynb:

import torch

from pathlib import Path
import numpy as np
import h5py

from tbd.module_net import load_tbd_net
from utils.clevr import load_vocab
from utils.generate_programs import load_program_generator, generate_programs


vocab_path = Path('data/vocab.json')
model_path = Path('models/clevr-reg-hres.pt')
tbd_net = load_tbd_net(model_path, load_vocab(vocab_path))


program_generator = load_program_generator(Path('models/program_generator.pt'))
generate_programs(Path('data/val_questions.h5'), program_generator, 
                  dest_dir=Path('data/val/'), batch_size=128)


use_np_features = False
if use_np_features:
    features = np.load(str(Path('data/val/val_features.npy')), mmap_mode='r')
else:
    features = h5py.File(Path('data/val_features.h5'))['features']

question_np = np.load(Path('data/val/questions.npy'))
image_idx_np = np.load(Path('data/val/image_idxs.npy'))
programs_np = np.load(Path('data/val/programs.npy'))


answers = ['blue', 'brown', 'cyan', 'gray', 'green', 'purple', 'red', 'yellow',
           'cube', 'cylinder', 'sphere',
           'large', 'small',
           'metal', 'rubber',
           'no', 'yes',
           '0', '1', '10', '2', '3', '4', '5', '6', '7', '8', '9']

pred_idx_to_token = dict(zip(range(len(answers)), answers))


f = open('predicted_answers.txt', 'w')
def write_preds(preds):
    for pred in preds:
        f.write(pred)
        f.write('\n')



device = 'cuda' if torch.cuda.is_available() else 'cpu'



batch_size = 128
for batch in range(0, len(programs_np), batch_size):
    image_idx = image_idx_np[batch:batch+batch_size]
    programs = torch.LongTensor(programs_np[batch:batch+batch_size]).to(device)
    
    if use_np_features:
        feats = torch.FloatTensor(np.asarray(features[image_idx])).to(device)
    else:
        # Using HDF5 files requires some overhead due to constraints on how those may
        # be accessed. We cannot index into the file using a numpy array. We also cannot 
        # access the same element multiple times (e.g. we cannot index into an h5py.File 
        # with [1,1,1]) because we are constrained to increasing sequences
        feats = []
        for idx in image_idx:
            feats.append(np.asarray(features[idx]))
        feats = torch.FloatTensor(np.asarray(feats)).to(device)

    outputs = tbd_net(feats, programs)
    _, preds = outputs.max(1)
    preds = [pred_idx_to_token[pred] for pred in preds.detach().to('cpu').numpy()]
    write_preds(preds)
f.close()

and the error is:

Traceback (most recent call last):
  File "eval.py", line 72, in <module>
    outputs = tbd_net(feats, programs)
  File "/home/dengwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dengwei/tbd-nets/tbd/module_net.py", line 195, in forward
    output = module(feat_input, output)
  File "/home/dengwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dengwei/tbd-nets/tbd/modules.py", line 92, in forward
    attended_feats = torch.mul(feats, attn.repeat(1, self.dim, 1, 1))
RuntimeError: The size of tensor a (128) must match the size of tensor b (16384) at non-singleton dimension 1

Maybe I should use a NumPy file rather than an HDF5 file? I extracted the features using the master branch of this repository.
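
A small shape check may help narrow this down. The traceback shows `attn.repeat(1, self.dim, 1, 1)` producing 16384 channels, and 16384 = 128 × 128. If `self.dim` is 128 (an assumption based on "tensor a (128)"), then `attn` arrived with 128 channels rather than 1. The helper `repeated_shape` below is hypothetical and only mirrors how `repeat` multiplies each dimension; the 14 × 14 spatial size is a placeholder.

```python
def repeated_shape(shape, repeats):
    """Shape after tensor.repeat(*repeats) when len(shape) == len(repeats)."""
    if len(shape) != len(repeats):
        raise ValueError("this sketch only handles equal-length shape and repeats")
    return tuple(size * times for size, times in zip(shape, repeats))

dim = 128                            # assumed value of self.dim
batch, height, width = 128, 14, 14   # placeholder sizes

# An attention map with one channel repeats to match the features.
one_channel = repeated_shape((batch, 1, height, width), (1, dim, 1, 1))

# An input that already has `dim` channels blows up to dim * dim.
many_channel = repeated_shape((batch, dim, height, width), (1, dim, 1, 1))

print(one_channel)
print(many_channel)
```

If that reading is right, the module at modules.py line 92 received a feature map where it expected a one-channel attention map, which would point at the program sequence fed to the network rather than at the choice between HDF5 and NumPy. This is an inference from the error message, not something confirmed in the thread.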

The environment setting

I am stuck on the environment setup. My system is Ubuntu 16.04 with NVIDIA driver 384.111, CUDA 9.1, and a GTX 1080 Ti.
The error at step 2 is "Cuda runtime error(25):CUDA driver version is insufficient for CUDA runtime version".
With the NVIDIA driver upgraded to 387.26 or 390.42, Ubuntu cannot identify the NVIDIA driver.
With CUDA downgraded to version 8, I get a different error instead: "ImportError libcudart.so.9.1: cannot open shared object files".
What environment settings are appropriate for reproducing the results?

Version bumps

I am able to run our demo on PyTorch 1.1, and I am sure we are Python 3.7 compatible. Can we bump our documented versions?

About converting HDF5 to npy

Hi, what is the advantage of converting HDF5 to npy first? Does it improve training speed or accuracy?
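
For context, a comment in the evaluation code quoted elsewhere on this page notes that an HDF5 dataset cannot be indexed with repeated or non-increasing indices, so features are read one image at a time. A memory-mapped `.npy` file does not have that restriction. The sketch below uses made-up data; the file name and array shape are placeholders, and it makes no claim about training speed or accuracy.

```python
import os
import tempfile

import numpy as np

# Made-up stand-in for extracted image features: 5 "images", 4 values each.
features = np.arange(20, dtype=np.float32).reshape(5, 4)

path = os.path.join(tempfile.mkdtemp(), "example_features.npy")  # placeholder name
np.save(path, features)

# mmap_mode="r" maps the file instead of reading it all into memory.
mapped = np.load(path, mmap_mode="r")

# Repeated and unsorted indices are accepted in a single indexing call.
image_idx = np.array([3, 1, 1, 0])
batch = np.asarray(mapped[image_idx])
print(batch.shape)
```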

Efficiency question about the model

Hey, I haven't run the code yet, but I noticed that module_net.py processes the questions in a batch one by one; only the stem and classifier modules are shared across the batch. This design is reasonable, since different questions need different modules, but I am still worried about the efficiency of the training phase. What was your training setup (number of GPUs, batch size, training time, etc.)? Do you have any advice on speeding this up? Thanks!

How to evaluate test results?

Hi,
After getting predicted answers for the test data, how can I evaluate the results?
Since your paper reports results for different question types (e.g., Count, Compare, Exist, and so on), could you provide a code snippet that makes this convenient?
Thanks
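
Here is a minimal sketch of one way to break accuracy down by question type. It assumes ground-truth answers and a question-type label are available for each question (for instance on the validation split). The function `accuracy_by_type` and the example labels are hypothetical and not part of this repository.

```python
from collections import defaultdict

def accuracy_by_type(predictions, answers, question_types):
    """Return overall accuracy and a dict of accuracy per question type."""
    if not (len(predictions) == len(answers) == len(question_types)):
        raise ValueError("inputs must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, truth, qtype in zip(predictions, answers, question_types):
        total[qtype] += 1
        correct[qtype] += int(pred == truth)
    per_type = {qtype: correct[qtype] / total[qtype] for qtype in total}
    overall = sum(correct.values()) / max(sum(total.values()), 1)
    return overall, per_type

# Made-up example data.
preds = ["yes", "2", "red", "no"]
truth = ["yes", "3", "red", "no"]
types = ["exist", "count", "query_attribute", "exist"]

overall, per_type = accuracy_by_type(preds, truth, types)
print(overall, per_type)
```

The predictions could be read line by line from a file such as the predicted_answers.txt written by the evaluation script quoted on this page.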

Properties not specified in modules?

Hey, I've read the code for the different modules. It seems that the modules do not contain any mechanism for encoding property values (e.g. red or blue for the color property). Take the attention module as an example: if it does not know which color it should attend to, how can it attend to the right locations? Please correct me if I missed something, thanks!
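
One design that would answer this is sketched here as an assumption, not as a description of the repository's code. Each program token, including its property value (for example `filter_color[red]`), is the key for a separately parameterised module, so the property is encoded in which module gets selected rather than passed in as an input. The class `ToyAttentionModule` and the token list are illustrative only.

```python
class ToyAttentionModule:
    """Stand-in for a learned module; holds its own (fake) parameters."""

    def __init__(self, name):
        self.name = name
        self.weights = [0.0]  # each instance would learn separate weights

# Illustrative program tokens; one module instance per token.
program_tokens = ["filter_color[red]", "filter_color[blue]", "filter_shape[cube]"]
modules = {token: ToyAttentionModule(token) for token in program_tokens}

def run_program(tokens):
    """Look up one module per token; return their names in execution order."""
    return [modules[token].name for token in tokens]

executed = run_program(["filter_shape[cube]", "filter_color[red]"])
print(executed)
```

Under this assumption, "red" and "blue" never appear as inputs: the two color filters are separate objects with separate weights.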
