
xlearn's Introduction

Transfer Learning Library

This is the transfer learning library for the following papers:

Learning Transferable Features with Deep Adaptation Networks

Unsupervised Domain Adaptation with Residual Transfer Networks

Deep Transfer Learning with Joint Adaptation Networks

The TensorFlow versions are under development.

Citation

If you use this code for your research, please consider citing:

    @inproceedings{DBLP:conf/icml/LongC0J15,
      author    = {Mingsheng Long and
                   Yue Cao and
                   Jianmin Wang and
                   Michael I. Jordan},
      title     = {Learning Transferable Features with Deep Adaptation Networks},
      booktitle = {Proceedings of the 32nd International Conference on Machine Learning,
                   {ICML} 2015, Lille, France, 6-11 July 2015},
      pages     = {97--105},
      year      = {2015},
      crossref  = {DBLP:conf/icml/2015},
      url       = {http://jmlr.org/proceedings/papers/v37/long15.html},
      timestamp = {Tue, 12 Jul 2016 21:51:15 +0200},
      biburl    = {http://dblp2.uni-trier.de/rec/bib/conf/icml/LongC0J15},
      bibsource = {dblp computer science bibliography, http://dblp.org}
    }
    
    @inproceedings{DBLP:conf/nips/LongZ0J16,
      author    = {Mingsheng Long and
                   Han Zhu and
                   Jianmin Wang and
                   Michael I. Jordan},
      title     = {Unsupervised Domain Adaptation with Residual Transfer Networks},
      booktitle = {Advances in Neural Information Processing Systems 29: Annual Conference
                   on Neural Information Processing Systems 2016, December 5-10, 2016,
                   Barcelona, Spain},
      pages     = {136--144},
      year      = {2016},
      crossref  = {DBLP:conf/nips/2016},
      url       = {http://papers.nips.cc/paper/6110-unsupervised-domain-adaptation-with-residual-transfer-networks},
      timestamp = {Fri, 16 Dec 2016 19:45:58 +0100},
      biburl    = {http://dblp.uni-trier.de/rec/bib/conf/nips/LongZ0J16},
      bibsource = {dblp computer science bibliography, http://dblp.org}
    }
    
    @inproceedings{DBLP:conf/icml/LongZ0J17,
      author    = {Mingsheng Long and
                   Han Zhu and
                   Jianmin Wang and
                   Michael I. Jordan},
      title     = {Deep Transfer Learning with Joint Adaptation Networks},
      booktitle = {Proceedings of the 34th International Conference on Machine Learning,
                   {ICML} 2017, Sydney, NSW, Australia, 6-11 August 2017},
      pages     = {2208--2217},
      year      = {2017},
      crossref  = {DBLP:conf/icml/2017},
      url       = {http://proceedings.mlr.press/v70/long17a.html},
      timestamp = {Tue, 25 Jul 2017 17:27:57 +0200},
      biburl    = {http://dblp.uni-trier.de/rec/bib/conf/icml/LongZ0J17},
      bibsource = {dblp computer science bibliography, http://dblp.org}
    }

Contact

If you have any problems with our code, feel free to contact us or describe your problem in Issues.

xlearn's Issues

Question about the code of the unbiased estimate of MMD/JMMD

Hi,
Thanks for your great work!
I have some questions about the MMD/JMMD loss.
The unbiased estimate of JMMD in the paper is as follows:
[image: equation from the paper]
while the code is:

# Linear version
loss = 0
for i in range(batch_size):
    s1, s2 = i, (i+1)%batch_size
    t1, t2 = s1+batch_size, s2+batch_size
    loss += kernels[s1, s2] + kernels[t1, t2]
    loss -= kernels[s1, t2] + kernels[s2, t1]
return loss / float(batch_size)

The sample pairs do not seem to match. For example, the first term of the equation needs n/2 sample pairs, but the code uses n sample pairs. Why are they different?

Looking forward to your reply.
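
The difference raised in this question can be made concrete with a small sketch. It assumes `kernels` is a 2n x 2n kernel matrix with the source samples in the first n rows and the target samples in the last n, which is what the indexing in the quoted loop suggests. `linear_mmd_circular` mirrors the quoted loop (n overlapping pairs), and `linear_mmd_disjoint` uses n/2 non-overlapping pairs as the question describes. Both function names, the 1-D Gaussian kernel, and the toy data are assumptions made for illustration, not code from the repository.

```python
import math


def gaussian_kernel_matrix(points, bandwidth=1.0):
    """Full kernel matrix k(x, y) = exp(-(x - y)^2 / bandwidth) for 1-D points."""
    return [[math.exp(-((a - b) ** 2) / bandwidth) for b in points] for a in points]


def linear_mmd_circular(kernels, batch_size):
    """Mirrors the quoted loop: n overlapping pairs (i, i + 1 mod n)."""
    loss = 0.0
    for i in range(batch_size):
        s1, s2 = i, (i + 1) % batch_size
        t1, t2 = s1 + batch_size, s2 + batch_size
        loss += kernels[s1][s2] + kernels[t1][t2]
        loss -= kernels[s1][t2] + kernels[s2][t1]
    return loss / float(batch_size)


def linear_mmd_disjoint(kernels, batch_size):
    """Uses n/2 non-overlapping pairs (2i, 2i + 1), as described in the question."""
    half = batch_size // 2
    loss = 0.0
    for i in range(half):
        s1, s2 = 2 * i, 2 * i + 1
        t1, t2 = s1 + batch_size, s2 + batch_size
        loss += kernels[s1][s2] + kernels[t1][t2]
        loss -= kernels[s1][t2] + kernels[s2][t1]
    return loss / float(half)


# Toy data: two well-separated 1-D "domains" of 8 samples each.
n = 8
source = [0.1 * i for i in range(n)]
target = [5.0 + 0.1 * i for i in range(n)]
kernels = gaussian_kernel_matrix(source + target)
circular = linear_mmd_circular(kernels, n)
disjoint = linear_mmd_disjoint(kernels, n)
print(circular, disjoint)
```

Assuming the samples in a batch are drawn i.i.d. and the batch size is larger than 1, every term in either loop has the same expectation, so the two averages agree in expectation. They differ in that the circular variant reuses each sample in two pairs, so its terms are correlated.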

Questions about the MMD loss

You have done a great job.

I have a question about the MMD loss. The MMD loss proposed in the paper is shown below; source domain samples are paired with each other, and so are target domain samples.
[image: MMD loss from the paper]

For example, if the batch size is 10 for both the source and target domains, there are 100 pairs in the first term of the loss, 100 pairs in the second term, and 100 pairs in the third term.

But in your code, shown below, there are only 10 pairs in the first term of the loss, 10 pairs in the second term, and 20 pairs in the third term.

[image: code from the repository]

So I think this code does not match the loss proposed in the paper. Can you explain this to me?

Thank you very much. Looking forward to your reply.
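
The pair counts in this question can be checked with a short sketch. It assumes the quadratic-time form that sums over every ordered pair (including i = j), which is what gives 100 pairs per term for a batch size of 10, and it assumes the same 2n x 2n kernel matrix layout (source rows first, target rows last). `quadratic_mmd` and `pair_counts` are hypothetical helpers written for this illustration.

```python
import math


def gaussian_kernel_matrix(points, bandwidth=1.0):
    """Full kernel matrix k(x, y) = exp(-(x - y)^2 / bandwidth) for 1-D points."""
    return [[math.exp(-((a - b) ** 2) / bandwidth) for b in points] for a in points]


def quadratic_mmd(kernels, n):
    """Quadratic-time estimate over all source/source, target/target and source/target pairs."""
    ss = sum(kernels[i][j] for i in range(n) for j in range(n))
    tt = sum(kernels[n + i][n + j] for i in range(n) for j in range(n))
    st = sum(kernels[i][n + j] for i in range(n) for j in range(n))
    return (ss + tt - 2.0 * st) / float(n * n)


def pair_counts(n):
    """Kernel evaluations per term: (source/source, target/target, cross-domain)."""
    return {
        "quadratic": (n * n, n * n, n * n),
        "linear": (n, n, 2 * n),
    }


n = 10
source = [0.1 * i for i in range(n)]
target = [5.0 + 0.1 * i for i in range(n)]
kernels = gaussian_kernel_matrix(source + target)
print(pair_counts(n))
print(quadratic_mmd(kernels, n))
```

The linear-time variant trades a noisier estimate for a cost that grows with n instead of n squared. Whether that trade-off is the authors' reason for the implementation is not stated in this thread.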

CUDNN compatibility?

I found that this repo cannot be built with cuDNN 6.0. Would you mind specifying the version you used to develop the code?

.../Xlearn/caffe/include/caffe/util/cudnn.hpp(112): error: too few arguments in function call

Backward mmd loss = NaN

Hello,
I used the Caffe implementation. Sometimes the MMD backward diff is NaN, and soon the whole network crashes.
In my implementation, the data is sliced into two branches at the fc layers, source data and target data, and both are inputs to the MK-MMD loss layer. It works well in the beginning, but after some epochs the MK-MMD loss backward diff turns into NaN and training has to be stopped.
Can you please tell me why this happens? Thank you so much!
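
The thread does not establish the cause. One plausible trigger, offered here as an assumption and not as a diagnosis of the Caffe layer: if the kernel bandwidth is derived from the mean pairwise distance within a batch, it goes to zero when the features collapse to the same point, and dividing by it produces a non-finite value. The sketch below shows a guard that floors the bandwidth. `mean_pairwise_sq_dist`, `safe_gaussian_kernel`, and `eps` are hypothetical names.

```python
import math


def mean_pairwise_sq_dist(features):
    """Mean squared Euclidean distance over all ordered pairs of feature vectors."""
    n = len(features)
    total = 0.0
    for a in features:
        for b in features:
            total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / float(n * n)


def safe_gaussian_kernel(a, b, bandwidth, eps=1e-8):
    """Gaussian kernel with the bandwidth floored at eps to avoid dividing by zero."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / max(bandwidth, eps))


# Collapsed features: every vector is identical, so the data-driven bandwidth is 0.
collapsed = [[1.0, 2.0] for _ in range(4)]
bandwidth = mean_pairwise_sq_dist(collapsed)
value = safe_gaussian_kernel(collapsed[0], collapsed[1], bandwidth)
print(bandwidth, value)
```

Logging the bandwidth and the loss at the iteration where NaN first appears would help confirm or rule out this explanation.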

Cross-Entropy Loss is not included in the total loss

Hi,

In the paper "Transferable Representation Learning with Deep Adaptation Networks", you use a cross-entropy loss (corresponding to equation 8 in the paper) to minimize the uncertainty of predicting the labels of the target data.

I found the corresponding implementation of that equation, defined as EntropyLoss() in loss.py. In the paper, the total loss is composed of three main parts: the classification loss, the MMD loss and the cross-entropy loss.

What confuses me is that in train.py you add the MMD loss and the classification loss together, but you don't actually add the cross-entropy loss. Am I missing something, or is this on purpose?

Looking forward to hearing from you soon.

Thank you,
Ke
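
For readers following this question, here is a minimal sketch of what adding the third term could look like. It assumes the entropy term is the mean Shannon entropy of the softmax predictions on target samples, and that the three parts are combined as a weighted sum. `entropy_loss`, `total_loss`, and `entropy_weight` are hypothetical names for this illustration; only `trade_off` appears in the configuration printed elsewhere on this page, and none of this is taken from loss.py or train.py.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def entropy_loss(batch_logits, eps=1e-12):
    """Mean Shannon entropy of the softmax predictions for a batch of target samples."""
    total = 0.0
    for logits in batch_logits:
        probs = softmax(logits)
        total -= sum(p * math.log(p + eps) for p in probs)
    return total / float(len(batch_logits))


def total_loss(classifier_loss, transfer_loss, target_logits,
               trade_off=1.0, entropy_weight=0.1):
    """Weighted sum of the three parts; the weights are assumptions, not repository values."""
    return (classifier_loss
            + trade_off * transfer_loss
            + entropy_weight * entropy_loss(target_logits))


uncertain = entropy_loss([[0.0, 0.0]])     # uniform prediction over two classes
confident = entropy_loss([[20.0, -20.0]])  # nearly one-hot prediction
print(uncertain, confident)
```

Setting `entropy_weight=0.0` reproduces a two-term loss, which is the behaviour the question reports seeing in train.py.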

How to set MMD penalty parameter automatically.

In the DAN paper, you say "we can automatically select the MMD penalty parameter(lambda) on a validation set by jointly assessing the test errors of the source classifier and the two-sample classifier", but in the code you simply set the parameter to 1. How should this be understood?
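
One way to read the quoted sentence is as a grid search over candidate values, scored on a validation set. The sketch below is an assumption about how the two errors might be combined, since the quote does not give a formula: it adds the source classifier error to a proxy A-distance, 2 * (1 - 2 * err), computed from the two-sample (domain) classifier error, and picks the candidate with the lowest sum. `select_trade_off`, `proxy_a_distance`, the `evaluate` callback, and the numbers in `fake_results` are all made up for illustration.

```python
def proxy_a_distance(domain_error):
    """2 * (1 - 2 * err): close to 0 when the domain classifier is at chance (err = 0.5)."""
    return 2.0 * (1.0 - 2.0 * domain_error)


def select_trade_off(candidates, evaluate):
    """Pick the candidate whose (source error + proxy A-distance) is lowest.

    `evaluate(lam)` is a hypothetical callback that trains with penalty `lam`
    and returns (source_error, domain_error) measured on a validation set.
    """
    best_lam, best_score = None, float("inf")
    for lam in candidates:
        source_error, domain_error = evaluate(lam)
        score = source_error + proxy_a_distance(domain_error)
        if score < best_score:
            best_lam, best_score = lam, score
    return best_lam


# Made-up validation numbers, purely to exercise the function.
fake_results = {
    0.1: (0.10, 0.20),
    0.5: (0.12, 0.40),
    1.0: (0.15, 0.45),
    2.0: (0.30, 0.48),
}
chosen = select_trade_off(sorted(fake_results), fake_results.get)
print(chosen)
```

The design intent is that a larger penalty makes the domains harder to tell apart (higher domain error) at some cost in source accuracy, so the search looks for a balance between the two.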

.item

{'trade_off': 1.0, 'name': 'JAN'}
Traceback (most recent call last):
File "train.py", line 288, in <module>
transfer_classification(config)
File "train.py", line 222, in transfer_classification
print image_classification_test(dset_loaders["target"], nn.Sequential(base_network, bottleneck_layer, classifier_layer), test_10crop=prep_dict["target"]["test_10crop"], gpu=use_gpu)
File "train.py", line 107, in image_classification_test
accuracy = torch.sum(torch.squeeze(predict).float() == all_label).item() / float(all_label.size()[0])
AttributeError: 'int' object has no attribute 'item'

When I remove the .item(), it works well. Has anyone else met a similar issue?
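
The error suggests that, in this environment, the summed comparison already came back as a plain Python int, which has no `.item()` method. That is an inference from the traceback, not a confirmed cause. A small helper that accepts both cases avoids editing the call site for each installed version. `as_python_number` and `FakeScalarTensor` are hypothetical names; the fake class only stands in for an object that exposes `.item()`.

```python
def as_python_number(value):
    """Return a plain Python number from a number or from an object exposing .item()."""
    item = getattr(value, "item", None)
    return item() if callable(item) else value


class FakeScalarTensor:
    """Stand-in for a zero-dimensional tensor, used only to exercise the helper."""

    def __init__(self, v):
        self._v = v

    def item(self):
        return self._v


correct = as_python_number(5)                       # plain int: returned unchanged
wrapped = as_python_number(FakeScalarTensor(7))     # tensor-like: unwrapped via .item()
accuracy = as_python_number(FakeScalarTensor(3)) / float(4)
print(correct, wrapped, accuracy)
```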

About DA usage scenario

Hi, it's my first time trying domain adaptation. Here is my scenario: I am doing an image classification task. There are already 100k labeled training samples (called A), and I can also obtain a large amount of unlabeled data (called B). The domain shift between A and B is small. My current plan is to train a model on A and use it to predict B directly, and the results are good. When I annotate more data from B and train on it together with A, the results are better. However, to reduce the annotation work, my question is: can I treat A as the source and B as the target to improve accuracy further (i.e., adding more data from B in the training phase compared with the current plan)?

When will TADA(AAAI'19) be released?

I have read the paper "Transferable Attention for Domain Adaptation", published at AAAI 2019, and am very interested in it. The paper says the code and datasets would be available at github.com/thuuml, that is, on this homepage, but I could not find them. When will the source code be available? I am working on a problem related to traffic forecasting, and I want to draw on the idea of attention in the TADA model. If possible, could you share the source code with me? Thanks a lot.

How can I solve this problem, Thank you!

F:\ProgramData\Anaconda3\python.exe "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py"
{'name': 'JAN', 'trade_off': 1.0}
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "F:\ProgramData\Anaconda3\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "F:\ProgramData\Anaconda3\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "F:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "E:\coding\domain adaptation\Xlearn-master\pytorch\src\train.py", line 5, in <module>
import torch
File "F:\ProgramData\Anaconda3\lib\site-packages\torch\__init__.py", line 78, in <module>
from torch._C import *
ImportError: DLL load failed: The paging file is too small for this operation to complete.
Traceback (most recent call last):
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 292, in <module>
transfer_classification(config)
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 224, in transfer_classification
print(image_classification_test(dset_loaders["target"], nn.Sequential(base_network, bottleneck_layer, classifier_layer), test_10crop=prep_dict["target"]["test_10crop"], gpu=use_gpu))
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 63, in image_classification_test
iter_test = [iter(loader['test'+str(i)]) for i in range(10)]
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 63, in <listcomp>
iter_test = [iter(loader['test'+str(i)]) for i in range(10)]
File "F:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 451, in __iter__
return _DataLoaderIter(self)
File "F:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 239, in __init__
w.start()
File "F:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
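
The traceback shows the `spawn` path of `multiprocessing` re-importing train.py inside a worker process, the `import torch` in that worker failing for lack of paging-file space, and the parent then hitting a broken pipe. Two mitigations are commonly tried in this situation: keep the script's entry point behind an `if __name__ == "__main__":` guard so workers can import the file without re-running it, and do the loading in the main process (for a PyTorch `DataLoader`, that means `num_workers=0`) so that no extra processes need to load the library. The sketch below illustrates both ideas with the standard library only. `run` and `square` are hypothetical names, and this is not the repository's code.

```python
import multiprocessing as mp


def square(x):
    """Work item; defined at module level so spawned workers can import it."""
    return x * x


def run(num_workers):
    """Process items in the main process when num_workers is 0, else in a spawned pool."""
    items = range(4)
    if num_workers == 0:
        return [square(x) for x in items]
    with mp.get_context("spawn").Pool(num_workers) as pool:
        return pool.map(square, items)


if __name__ == "__main__":
    # Only the launching process runs this; spawned workers import the module and stop here.
    print(run(num_workers=0))
```

Increasing the Windows paging file size is the other direction the error message itself points to.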

Random sampling used in MMD

In the MMD layer, random sampling is used in both the forward and backward computation.

https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L85-91
https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L144-150

There may be a problem if the loss and the gradient are computed with different samples. It may be better to use a mask to store the selected samples, as the dropout layer does.

Besides, it is a little confusing that the loss computed in the forward pass is not used in the backward computation.
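
The suggestion to store the selected samples can be sketched as follows. The toy layer below draws random index pairs once in `forward`, caches them, and reuses exactly those pairs in `backward`, so the gradient corresponds to the loss that was reported. The loss itself (mean squared difference over sampled pairs) is a stand-in chosen for brevity, and `SampledPairLoss` is a hypothetical class, not a port of mmd_layer.cu.

```python
import random


class SampledPairLoss:
    """Toy loss: mean of (x[i] - x[j])^2 over randomly sampled index pairs."""

    def __init__(self, num_pairs, seed=None):
        self.num_pairs = num_pairs
        self.rng = random.Random(seed)
        self.pairs = None  # filled by forward, reused by backward

    def forward(self, x):
        """Sample the pairs once, cache them, and return the loss on those pairs."""
        n = len(x)
        self.pairs = [(self.rng.randrange(n), self.rng.randrange(n))
                      for _ in range(self.num_pairs)]
        return sum((x[i] - x[j]) ** 2 for i, j in self.pairs) / float(self.num_pairs)

    def backward(self, x):
        """Gradient of the forward loss with respect to x, using the cached pairs."""
        if self.pairs is None:
            raise RuntimeError("forward must run before backward")
        grad = [0.0] * len(x)
        for i, j in self.pairs:
            diff = 2.0 * (x[i] - x[j]) / float(self.num_pairs)
            grad[i] += diff
            grad[j] -= diff
        return grad


layer = SampledPairLoss(num_pairs=5, seed=0)
x = [0.5, -1.0, 2.0, 0.25]
loss = layer.forward(x)
grad = layer.backward(x)
print(loss, grad)
```

Because the pairs are cached, a finite-difference check of `backward` against `forward`'s loss on the same pairs passes, which would not be guaranteed if `backward` drew fresh samples.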

When will TCL (AAAI'19) be released?

Hi,

I have read your paper "Transferable Curriculum for Weakly-Supervised Domain Adaptation" and was inspired a lot. I also noticed that, at the beginning of the Experiments section, you say the source code will be available in this repository.

I would like to know when the source code will be available. I cannot wait to see the beauty of this algorithm in detail :)

Thanks!

Pytorch results

I got the results below when I ran the code for the Amazon to Webcam transfer task. Why are there so many results? Do they correspond to AlexNet or ResNet?

$ python train.py 0

{'trade_off': 1.0, 'name': 'DAN'}
0.0188679245283
0.667924528302
0.774842767296
0.79748427673
0.8
0.810062893082
0.803773584906
0.822641509434
0.811320754717
0.817610062893
0.803773584906
0.813836477987
0.815094339623
0.79748427673
0.810062893082
0.792452830189
0.813836477987
0.796226415094
0.794968553459
0.8
0.8
0.793710691824
0.79748427673
0.813836477987

jmmd_loss is always zero and accuracy results.

Hi,

The jmmd_loss I get is always zero. Could you please tell me why?

I1006 08:06:35.640272 30352 solver.cpp:407] Test net output #0: accuracy = 0.625488
I1006 08:06:35.815197 30352 solver.cpp:231] Iteration 500, loss = 0.176767
I1006 08:06:35.815225 30352 solver.cpp:247] Train net output #0: jmmd_loss = 0 (* 0.3 = 0 loss)
I1006 08:06:35.815232 30352 solver.cpp:247] Train net output #1: softmax_loss = 0.176767 (* 1 = 0.176767 loss)

And for the Amazon to Webcam transfer task I got 69.5% accuracy. Am I doing something wrong?

(Pytorch) JAN Cannot Reproduce Results in Paper

Hello, I cannot reproduce your results for w->a using this command.

python train.py 0 (Using JAN, tradeoff:1.0)

The accuracy is only 0.60 instead of 0.70 reported in your paper for w->a.

Is there anything wrong in the code?

Using sigmoid for binary classification

I am using the current PyTorch implementation for binary classification, and I want to use a sigmoid function. Will that affect the MMD loss? According to the paper, I feel it shouldn't.

Thank You
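
A fact that may help here: a sigmoid over a single logit z gives the same probabilities as a softmax over the two logits [0, z]. So if the adaptation loss takes class probabilities as one of its inputs, the sigmoid output p can be expanded to [1 - p, p] without changing what the classifier predicts. Whether the loss in this repository consumes the classifier output for a given method is not settled in this thread, so treat the sketch as an illustration of the equivalence only. `sigmoid_to_two_class` is a hypothetical helper.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def sigmoid_to_two_class(z):
    """Expand a single-logit sigmoid output into a two-class probability vector."""
    p = sigmoid(z)
    return [1.0 - p, p]


for z in (-3.0, 0.0, 2.5):
    print(z, sigmoid_to_two_class(z), softmax([0.0, z]))
```

If the loss is computed only on intermediate features, the choice between sigmoid and softmax at the output does not enter it at all.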
