haeusser / learning_by_association

This repository contains code for the paper Learning by Association - A versatile semi-supervised training method for neural networks (CVPR 2017) and the follow-up work Associative Domain Adaptation (ICCV 2017).

Home Page: https://vision.in.tum.de/members/haeusser

License: Apache License 2.0

Languages: Jupyter Notebook 82.70%, Python 17.25%, Shell 0.06%

learning_by_association's People

Contributors

haeusser, kchen92, tfrerix, znah

learning_by_association's Issues

Possible to get the parameter settings to reproduce all of the reported results in the paper?

Hi there,

First off - great paper and idea. I think it's a really creative use of neural nets for semi-supervised learning.

I'm looking into possible extensions of your work. To do so, it would help to use the results reported in your paper as baselines. I could try to work out the required parameter settings myself, but it would be much easier if I already had the exact settings you used.

Could you possibly post all of the parameter arguments that I should provide to your code to reproduce all of the results in the paper? Or at least, those coming close to or beating state of the art on SVHN (particularly 500/1k labels) and on MNIST (particularly 100 labels, permutation invariant and convolutional)? It would be really helpful.

Kind regards,
Liam Schoneveld

Paper / code parameters for reproducing experiments

Hi,
I'd like to ask a few questions about the code parameters, in order to reproduce the experiments in the paper exactly.
Referring to the SVHN -> MNIST and SYNTH SIGNS -> GTSRB experiments:

  • Both of them use svhn_model, right?

  • Which flag corresponds to the "Delay (steps) for L_assoc" setting?

  • sup_per_class defaults to 100, but it should be set to -1 to use the whole dataset, right?

  • Batch size is controlled by unsup_batch_size for the unlabeled target dataset and sup_per_batch for the labeled source dataset, so, for example, to get a target batch size of 1032 for GTSRB (43 classes), should this be set to 24? (See the arithmetic sketch below.)
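
For reference, the arithmetic behind the last question, under the per-class interpretation that the question itself assumes:

  num_classes = 43          # GTSRB
  samples_per_class = 24
  print(num_classes * samples_per_class)  # 1032, the target batch size asked about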

Thanks in advance!

Failed to reproduce the result from the paper

Hello,

I am trying to reproduce the results from the paper, specifically SVHN to MNIST.
At first I tried to find the "{stl10,svhn,synth}_tools.py" files or the "package" flag in {train, eval}.py as written in the README, but I couldn't find them.
Therefore, I wrote and ran the .sh files below for training and evaluation, with the hyperparameters given in the paper.

  • train.sh
python semisup/train.py \
 --dataset="svhn" \
 --target_dataset="mnist3" \
 --new_size=32 \
 --architecture="mnist_model" \
 --sup_per_class=-1 \
 --unsup_samples=-1 \
 --sup_per_batch=100 \
 --unsup_batch_size=1000 \
 --decay_steps=9000 \
 --visit_weight=0.2 \
 --walker_weight_envelope_delay=500 \
 --max_steps=25000 \
 --logdir=./log/svhn_to_mnist/reproduce
  • eval.sh
python semisup/eval.py \
 --logdir=./log/svhn_to_mnist/reproduce \
 --dataset="mnist3" \
 --new_size=32 \
 --architecture="mnist_model"

During training, the total loss and the walker loss converged, but the evaluation accuracy after training was only about 0.78, not 0.976 (the accuracy reported in the paper).

Do you know what I am missing?

Thank you!

Reproducing STL-10 Result

Hi,

Would you mind sharing your full list of hyperparameters for producing the STL-10 result shown in the paper? I was far from reproducing it after tweaking a few of the parameters used for the other datasets.

And one more question: in the paper it was mentioned that you used only 100 images per label. May I know whether there is a particular reason why you don't take advantage of all the labelled images available?

Failed to reproduce the result from the paper

Hi,
I'm trying to reproduce the results for the "svhn --> mnist" experiment published in your paper.

I used the hyperparameters you gave me, but I failed to reproduce the result from the paper.
After running eval.py, I only get an accuracy of about 95.66%.
In your paper this case is reported with 0.51% error (Table 5); does that mean an accuracy of about 99.5%?
Is there something I'm missing here?

I also referred to your response (issues/3):

Alright, so I re-ran the training myself again and everything seems fine. I uploaded the logs for you, including hyperparameters and TFEvents, so you can visualize the graph with TensorBoard: https://vision.in.tum.de/~haeusser/da_svhn_mnist.zip

The TensorFlow version was https://github.com/haeusser/tensorflow

I visualized your log: [screenshot]

The accuracy there is 97.59%, which differs from the result in your paper (0.51% error, Table 5).

Any thoughts, or exact instructions on how to replicate any of the results from the paper, would be greatly appreciated.

Hyemin

<SVHN -> MNIST>
Flags of train.py:

"target_dataset": "mnist3",
"walker_weight_envelope_delay": "500",
"max_checkpoints": 5,
"new_size": 32,
"dataset": "svhn",
"sup_per_batch": 100,
"decay_steps": 9000,
"unsup_batch_size": 1000,
"sup_per_class": -1,
"walker_weight_envelope_steps": 1,
"walker_weight_envelope": "linear",
"visit_weight_envelope": "linear",
"architecture": "svhn_model",
"visit_weight": 0.2,
"max_steps": "12000"

Flags of eval.py:

flags.DEFINE_string('dataset', 'mnist3', 'Which dataset to work on.')
flags.DEFINE_string('architecture', 'svhn_model', 'Which dataset to work on.')
flags.DEFINE_integer('eval_batch_size', 500, 'Batch size for eval loop.')
flags.DEFINE_integer('new_size', 32, 'If > 0, resize image to this width/height.'
                     'Needs to match size used for training.')
flags.DEFINE_integer('emb_size', 128, 'Size of the embeddings to learn.')
flags.DEFINE_integer('eval_interval_secs', 300,
                     'How many seconds between executions of the eval loop.')
flags.DEFINE_string('logdir', '/storage/transfer_learning/log2/semisup',
                    'Where the checkpoints are stored '
                    'and eval events will be written to.')
flags.DEFINE_string('master', '',
                    'BNS name of the TensorFlow master to use.')
flags.DEFINE_integer('timeout', 1200,
                     'The maximum amount of time to wait between checkpoints. '
                     'If left as None, then the process will wait '
                     'indefinitely.')

What data augmentation is used for SVHN?

In the paper "Learning by Association", data augmentation is used for SVHN. I tried the augmentation method described in the paper but failed to get the reported results. Could you share the data augmentation code?

Backpropagation

How would I calculate the backpropagation through the loss layer, i.e. the derivative of the loss with respect to the embeddings A and B? I am trying to implement this in MatConvNet, which requires me to code the backward pass.
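
For orientation, here is a minimal NumPy sketch of the forward pass of the association loss (walker plus visit term) as formulated in the paper; the function names, the 1e-8 epsilon, and the visit_weight default are illustrative rather than taken from the repository. In an autodiff framework the gradients with respect to A and B come for free, while a manual backward pass has to chain the derivatives of the two matrix products, the row-wise softmaxes, and the cross-entropy terms.

  import numpy as np

  def softmax(x, axis=-1):
      x = x - x.max(axis=axis, keepdims=True)
      e = np.exp(x)
      return e / e.sum(axis=axis, keepdims=True)

  def association_loss(A, B, labels, visit_weight=1.0):
      """A: [n_sup, d] labeled embeddings; B: [n_unsup, d] unlabeled embeddings."""
      M = A @ B.T                         # pairwise similarities
      p_ab = softmax(M, axis=1)           # transition probabilities A -> B
      p_ba = softmax(M.T, axis=1)         # transition probabilities B -> A
      p_aba = p_ab @ p_ba                 # round-trip probabilities A -> B -> A

      # Walker loss: a round trip should end on a sample of the same class,
      # with a uniform target over all correct-class samples.
      same_class = (labels[:, None] == labels[None, :]).astype(float)
      target = same_class / same_class.sum(axis=1, keepdims=True)
      walker_loss = -(target * np.log(p_aba + 1e-8)).sum(axis=1).mean()

      # Visit loss: every unlabeled sample should be visited with equal probability.
      p_visit = p_ab.mean(axis=0)
      visit_loss = -np.log(p_visit + 1e-8).mean()

      return walker_loss + visit_weight * visit_loss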

UnicodeDecodeError when extracting MNIST

Hello haeusser, I was also trying to reproduce the result from the paper, SVHN to MNIST, but I ran into a "UnicodeDecodeError". Do you know how to correct this error? Thank you very much.

Extracting /media/sward/_lyh1/datasets/mnist//train-images-idx3-ubyte.gz
Traceback (most recent call last):
  File "./learning_by_association/semisup/train.py", line 386, in <module>
    app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./learning_by_association/semisup/train.py", line 208, in main
    FLAGS.target_dataset_split)
  File "/media/sward/_lyh1/learning_by_association/semisup/tools/mnist3.py", line 34, in get_data
    images, labels = mnist.get_data(name)
  File "/media/sward/_lyh1/learning_by_association/semisup/tools/mnist.py", line 43, in get_data
    '/train-images-idx3-ubyte.gz'), extract_labels(
  File "/media/sward/_lyh1/learning_by_association/semisup/tools/mnist.py", line 60, in extract_images
    magic = _read32(bytestream)
  File "/media/sward/_lyh1/learning_by_association/semisup/tools/mnist.py", line 53, in _read32
    return np.frombuffer(bytestream.read(4), dtype=dt)[0]
  File "/usr/lib/python3.5/gzip.py", line 274, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.5/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.5/gzip.py", line 461, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.5/gzip.py", line 404, in _read_gzip_header
    magic = self._fp.read(2)
  File "/usr/lib/python3.5/gzip.py", line 91, in read
    self.file.read(size-self._length+read)
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
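
For what it's worth, this traceback pattern (gzip delegating to codecs.py and failing on the 0x8b magic byte) usually means the .gz file was opened in text mode under Python 3. A minimal sketch of the kind of change that avoids it; the function name here is illustrative and not the repository's code:

  import gzip
  import numpy as np

  def read_idx_magic(path):
      # Open the file in binary mode ('rb'); a text-mode open is what makes the
      # UTF-8 decoder choke on the 0x8b gzip magic byte under Python 3.
      with open(path, 'rb') as f:
          with gzip.GzipFile(fileobj=f) as bytestream:
              # IDX files start with a big-endian int32 magic number.
              return int(np.frombuffer(bytestream.read(4), dtype='>i4')[0])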

Inquiry about experimental result

Thank you for providing us with the code.
I'm running train.py for the domain adaptation case (SVHN -> MNIST).
I only modified the hyperparameters (visit_weight, walker weight = 0.5, steps = 9000).
The result of eval.py looks like this: [screenshot]
The accuracy of the selected architecture is 97.62%, which is lower than 99.5% (0.51% error, Table 5 of the paper).
What could be the reason? Do I have to modify other parameters (visit weight, walker weight, learning rate, steps)?

The hyperparameter settings I ran with are shown below: [screenshot]

Problem reproducing the result, SVHN to MNIST

Hello, I was also trying to reproduce the result from the paper, SVHN to MNIST. First I downloaded your code and changed 'target_dataset' from None to 'mnist3' in train.py, then ran train.py. After 100000 training steps finished, I ran eval.py, but I can't get an accuracy value. Please see the details of the result in the attachment.

result.txt

TensorFlow version syntax mismatch

In backend.py, line 70, tf.concat has the new signature:

  tf.concat(batch_images, 0), tf.concat(batch_labels, 0)

In train.py, lines 301 and 309, it has the old signature:

  t_sup_emb = tf.concat(0, [
                    t_sup_emb, semisup.create_virt_emb(FLAGS.virtual_embeddings,
                                                       FLAGS.emb_size)
                ])
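
For readers on TensorFlow 1.0 or later, the same call rewritten to the new signature (list of tensors first, axis second) would look roughly like this, reusing the names from the snippet above:

  t_sup_emb = tf.concat(
      [t_sup_emb,
       semisup.create_virt_emb(FLAGS.virtual_embeddings, FLAGS.emb_size)],
      0)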

Is this meant for TensorFlow 1.0 or above? Could you please list the dependencies in the README, such as the Python, TensorFlow, and NumPy versions?

When I ran it with Python 3.5 and TensorFlow 1.1, I ran into a problem when reading the STL-10 bin file:

  File "/root/DL/computer-vision/learning_by_association/semisup/tools/stl10.py", line 67, in extract_images
    imgs = np.fromstring(f.read(), np.uint8)
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 0: invalid start byte

So I ran it with Python 2.7 and TensorFlow 0.12.1 by changing the above tf.concat call to the old signature, and encountered another error:

  backend.py, line 179, in add_semisup_loss
    loss_aba = tf.losses.softmax_cross_entropy(
  AttributeError: 'module' object has no attribute 'losses'

This indicates that backend.py requires TensorFlow 1.0 or higher. However, in some other places I see indications of a lower version being used. Your clarification or fixes would be very much appreciated!
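
One way to run the same code on either side of the 1.0 boundary is a small compatibility wrapper; a minimal sketch, where the helper name concat_compat is made up for illustration:

  import tensorflow as tf
  from distutils.version import LooseVersion

  def concat_compat(values, axis):
      # tf.concat(values, axis) on TF >= 1.0, tf.concat(axis, values) before that.
      if LooseVersion(tf.__version__) >= LooseVersion('1.0'):
          return tf.concat(values, axis)
      return tf.concat(axis, values)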

New experiment: MNIST -> SVHN

Hi,
I'm trying to obtain results for the "mnist --> svhn" experiment, since in your paper you only published the "svhn --> mnist" case.
I have no problem running the experiment, as it only requires swapping the training and target dataset names, but my issue is that, no matter which hyperparameters I choose, the loss always exhibits the same behaviour: it starts from a small value around 0.5, suddenly jumps to 6.0 - 7.0 after a few iterations, and then settles around 6.5 until training is finished. The accuracy is 32%.
I tried several hyperparameter values, taken from the other experiments' settings, but so far I have had no better success than this.
Do you have any walker / visit / logit specific hyperparameters to suggest for this experiment?
Edit: I found that the sudden jump in the loss is caused by the activation of walker_weight, controlled by the variable walker_weight_envelope_delay. Now I'll try walker_weight_envelope_delay = 2000 instead of 500, or not activating it at all.
