thuml / xlearn

Transfer Learning Library

CMake 1.15% Makefile 0.28% Shell 0.29% HTML 0.08% CSS 0.10% Jupyter Notebook 57.74% C++ 32.96% Python 4.37% Cuda 2.64% MATLAB 0.36% Dockerfile 0.03%

deep-learning transfer-learning

xlearn's Introduction

Xlearn (Obsolete, upgraded to https://github.com/thuml/Transfer-Learning-Library)

Transfer Learning Library

This is the transfer learning library for the following papers:

Learning Transferable Features with Deep Adaptation Networks

Unsupervised Domain Adaptation with Residual Transfer Networks

Deep Transfer Learning with Joint Adaptation Networks

The TensorFlow versions are under development.

Citation

If you use this code for your research, please consider citing:

    @inproceedings{DBLP:conf/icml/LongC0J15,
      author    = {Mingsheng Long and
                   Yue Cao and
                   Jianmin Wang and
                   Michael I. Jordan},
      title     = {Learning Transferable Features with Deep Adaptation Networks},
      booktitle = {Proceedings of the 32nd International Conference on Machine Learning,
                   {ICML} 2015, Lille, France, 6-11 July 2015},
      pages     = {97--105},
      year      = {2015},
      crossref  = {DBLP:conf/icml/2015},
      url       = {http://jmlr.org/proceedings/papers/v37/long15.html},
      timestamp = {Tue, 12 Jul 2016 21:51:15 +0200},
      biburl    = {http://dblp2.uni-trier.de/rec/bib/conf/icml/LongC0J15},
      bibsource = {dblp computer science bibliography, http://dblp.org}
    }
    
    @inproceedings{DBLP:conf/nips/LongZ0J16,
      author    = {Mingsheng Long and
                   Han Zhu and
                   Jianmin Wang and
                   Michael I. Jordan},
      title     = {Unsupervised Domain Adaptation with Residual Transfer Networks},
      booktitle = {Advances in Neural Information Processing Systems 29: Annual Conference
                   on Neural Information Processing Systems 2016, December 5-10, 2016,
                   Barcelona, Spain},
      pages     = {136--144},
      year      = {2016},
      crossref  = {DBLP:conf/nips/2016},
      url       = {http://papers.nips.cc/paper/6110-unsupervised-domain-adaptation-with-residual-transfer-networks},
      timestamp = {Fri, 16 Dec 2016 19:45:58 +0100},
      biburl    = {http://dblp.uni-trier.de/rec/bib/conf/nips/LongZ0J16},
      bibsource = {dblp computer science bibliography, http://dblp.org}
    }
    
    @inproceedings{DBLP:conf/icml/LongZ0J17,
      author    = {Mingsheng Long and
                   Han Zhu and
                   Jianmin Wang and
                   Michael I. Jordan},
      title     = {Deep Transfer Learning with Joint Adaptation Networks},
      booktitle = {Proceedings of the 34th International Conference on Machine Learning,
                   {ICML} 2017, Sydney, NSW, Australia, 6-11 August 2017},
      pages     = {2208--2217},
      year      = {2017},
      crossref  = {DBLP:conf/icml/2017},
      url       = {http://proceedings.mlr.press/v70/long17a.html},
      timestamp = {Tue, 25 Jul 2017 17:27:57 +0200},
      biburl    = {http://dblp.uni-trier.de/rec/bib/conf/icml/LongZ0J17},
      bibsource = {dblp computer science bibliography, http://dblp.org}
    }

Contact

If you have any problems with our code, feel free to contact us or describe your problem in Issues.

xlearn's Issues

Random sampling used in MMD

In the MMD layer, random sampling is used in both the forward and the backward computation.

https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L85-91
https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L144-150

There may be a problem if the loss and the gradient are computed with different samples. It may be better to store the selected samples in a mask, as the dropout layer does.

Besides, it is a little confusing that the loss computed in the forward pass is not used in the backward computation.
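The mask idea suggested above can be sketched in a few lines of plain Python. This is only an illustration of the caching pattern, not the Caffe layer's code: the `PairSampler` class and its method names are hypothetical, and the pairs stand in for whatever samples the layer draws.

```python
import random


class PairSampler:
    """Hypothetical helper: draw index pairs once, reuse them in backward."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._pairs = None  # plays the role of a dropout-style mask

    def forward(self, batch_size, num_pairs):
        # Sample the pairs and cache them so backward sees the same ones.
        self._pairs = [
            (self._rng.randrange(batch_size), self._rng.randrange(batch_size))
            for _ in range(num_pairs)
        ]
        return list(self._pairs)

    def backward(self):
        # Reuse the cached pairs instead of drawing new random samples.
        if self._pairs is None:
            raise RuntimeError("forward() must be called before backward()")
        return list(self._pairs)


sampler = PairSampler(seed=0)
forward_pairs = sampler.forward(batch_size=8, num_pairs=5)
backward_pairs = sampler.backward()
```

With this pattern the gradient is computed on exactly the pairs that produced the reported loss, which is the consistency the issue asks for.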

How to set the MMD penalty parameter automatically?

In the DAN paper, you say "we can automatically select the MMD penalty parameter(lambda) on a validation set by jointly assessing the test errors of the source classifier and the two-sample classifier", but in the code the parameter is simply set to 1. How should this be understood?

Using sigmoid for binary classification

I am using the current PyTorch implementation for binary classification and want to use a sigmoid function. Will that affect the MMD loss? According to the paper, I feel it shouldn't.

Thank You

CUDNN compatibility?

I found that this repo cannot be built with cuDNN 6.0. Would you mind specifying the version you used to develop the code?

.../Xlearn/caffe/include/caffe/util/cudnn.hpp(112): error: too few arguments in function call

When will TCL (AAAI'19) be released?

Hi,

I have read your paper "Transferable Curriculum for Weakly-Supervised Domain Adaptation" and was inspired a lot. I also noticed that, at the beginning of the Experiments section, you say the source code will be available in this repository.

I would like to know when the source code will be available. I cannot wait to see the beauty of this algorithm in detail :)

Thanks!

Looking for the code of 'Conditional Adversarial Domain Adaptation'

When will the PyTorch implementation be available?
It is exciting work!
Thank you!

The fc7_jmmd_loss and fc8_jmmd_loss are always zero during training

I have run JAN with ResNet-50 on my own dataset, but the JMMD losses are always zero. I would really appreciate it if anyone could tell me the reason. Thanks!

Cross-Entropy Loss is not included in the total loss

Hi,

In the paper "Transferable Representation Learning with Deep Adaptation Networks", you use a cross-entropy loss (corresponding to Equation 8 in the paper) to minimize the uncertainty of predicting the labels of the target data.

I found the corresponding implementation of that equation, defined as EntropyLoss() in loss.py. In the paper, the total loss is composed of three main parts: the classification loss, the MMD loss, and the cross-entropy loss.

What confuses me is that train.py adds the MMD loss and the classification loss together but does not actually add the cross-entropy loss. Am I missing something, or is this on purpose?
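To make the three-part loss concrete, here is a minimal, dependency-free sketch of an entropy penalty on target predictions and of how the three terms could be summed. It is not the repository's `EntropyLoss()` or `train.py`. The function names, the `entropy_weight` parameter and its default are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs, eps=1e-12):
    """Shannon entropy of one prediction; low when the model is confident."""
    return -sum(p * math.log(p + eps) for p in probs)


def total_loss(classifier_loss, transfer_loss, target_logits,
               trade_off=1.0, entropy_weight=0.1):
    """Classification + trade_off * MMD + entropy_weight * mean target entropy.

    entropy_weight is a hypothetical knob, not a value taken from the paper.
    """
    mean_entropy = sum(entropy(softmax(l)) for l in target_logits) / len(target_logits)
    return classifier_loss + trade_off * transfer_loss + entropy_weight * mean_entropy


uniform_entropy = entropy(softmax([0.0, 0.0]))     # maximally uncertain
confident_entropy = entropy(softmax([20.0, 0.0]))  # nearly certain
```

Setting `entropy_weight=0.0` reduces this to the two-term sum that the issue describes for train.py.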

Looking forward to hearing from you soon.

Thank you,
Ke

About the DA usage scenario

Hi, it's my first time trying domain adaptation. Here is my scenario: I am working on an image classification task. There are already 100k labeled training images (called A), and I can also obtain a large amount of unlabeled data (called B). The domain shift between A and B is small. My current plan is to train a model on A and use it to predict B directly, and the results are good. When I annotate more data from B and train on it together with A, the results are better. However, to reduce the annotation work, my question is: can I treat A as the source and B as the target to improve accuracy further (i.e., by adding more data from B in the training phase compared with the current plan)?

.item

{'trade_off': 1.0, 'name': 'JAN'}
Traceback (most recent call last):
File "train.py", line 288, in <module>
transfer_classification(config)
File "train.py", line 222, in transfer_classification
print image_classification_test(dset_loaders["target"], nn.Sequential(base_network, bottleneck_layer, classifier_layer), test_10crop=prep_dict["target"]["test_10crop"], gpu=use_gpu)
File "train.py", line 107, in image_classification_test
accuracy = torch.sum(torch.squeeze(predict).float() == all_label).item() / float(all_label.size()[0])
AttributeError: 'int' object has no attribute 'item'

When I remove the .item(), it works well. Just wondering whether anyone has met a similar issue.
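The AttributeError suggests that, in this environment, the sum came back as a plain Python int rather than a tensor, which is probably a matter of the installed PyTorch version. A version-tolerant way to handle both cases might look like the sketch below. The helper `as_python_number` and the `FakeScalar` stand-in are hypothetical and are not part of train.py.

```python
def as_python_number(value):
    """Return a plain Python number from a 0-dim tensor-like object or a number."""
    item = getattr(value, "item", None)
    return item() if callable(item) else value


class FakeScalar:
    """Stand-in for a 0-dim tensor; only used to exercise the helper."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


def accuracy(num_correct, num_total):
    # Works whether num_correct is an int or an object exposing .item().
    return as_python_number(num_correct) / float(num_total)
```

In train.py the same idea would wrap the result of `torch.sum(...)` before dividing by the number of labels.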

How can I solve this problem? Thank you!

F:\ProgramData\Anaconda3\python.exe "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py"
{'name': 'JAN', 'trade_off': 1.0}
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "F:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "F:\ProgramData\Anaconda3\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "F:\ProgramData\Anaconda3\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "F:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "E:\coding\domain adaptation\Xlearn-master\pytorch\src\train.py", line 5, in <module>
import torch
File "F:\ProgramData\Anaconda3\lib\site-packages\torch\__init__.py", line 78, in <module>
from torch._C import *
ImportError: DLL load failed: The paging file is too small for this operation to complete.
Traceback (most recent call last):
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 292, in <module>
transfer_classification(config)
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 224, in transfer_classification
print(image_classification_test(dset_loaders["target"], nn.Sequential(base_network, bottleneck_layer, classifier_layer), test_10crop=prep_dict["target"]["test_10crop"], gpu=use_gpu))
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 63, in image_classification_test
iter_test = [iter(loader['test'+str(i)]) for i in range(10)]
File "E:/coding/domain adaptation/Xlearn-master/pytorch/src/train.py", line 63, in <listcomp>
iter_test = [iter(loader['test'+str(i)]) for i in range(10)]
File "F:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 451, in __iter__
return _DataLoaderIter(self)
File "F:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 239, in __init__
w.start()
File "F:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "F:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

Looking for the code of 'Multi-Adversarial Domain Adaptation'

Hi there, the AAAI-18 paper 'Multi-Adversarial Domain Adaptation' says that the code is available on this repo (github.com/thuml). This repo seems to be the closest match, but it contains no such code.

Running the PyTorch code: why is the accuracy always zero?

I found that inputs = [data[j][0] for j in range(10)] and labels = data[0][1]. labels is a constant value. How should it be modified?

When will the TensorFlow version be available?

I would like to know approximately when the TensorFlow version will be available. I really need it to complete my experiments. Thanks so much.

Question about the code of the unbiased estimate of MMD/JMMD

Hi,
Thanks for your great work!
But I have some questions about the MMD/JMMD loss.
The unbiased estimate of JMMD in the paper is as follows:

while the code is:

# Linear version
loss = 0
for i in range(batch_size):
    s1, s2 = i, (i+1)%batch_size
    t1, t2 = s1+batch_size, s2+batch_size
    loss += kernels[s1, s2] + kernels[t1, t2]
    loss -= kernels[s1, t2] + kernels[s2, t1]
return loss / float(batch_size)

It seems the samples are not matched. For example, the first term of the equation needs n/2 sample pairs to calculate the loss, but the code uses n sample pairs. Why are they different?
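The difference in pair counts can be made explicit with a small sketch. It assumes the standard linear-time form, in which each term is h = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i), and compares n/2 disjoint pairs with the n circular pairs of the quoted loop. The scalar Gaussian kernel, the `sigma` value and all function names are illustrative assumptions, not the repository's code.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel on scalars; sigma is an illustrative choice."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def h(xs, ys, i, j, kernel=gaussian_kernel):
    """One term: k(x_i,x_j) + k(y_i,y_j) - k(x_i,y_j) - k(x_j,y_i)."""
    return (kernel(xs[i], xs[j]) + kernel(ys[i], ys[j])
            - kernel(xs[i], ys[j]) - kernel(xs[j], ys[i]))


def disjoint_pairs(n):
    """n/2 non-overlapping pairs: (0,1), (2,3), ..."""
    return [(2 * i, 2 * i + 1) for i in range(n // 2)]


def circular_pairs(n):
    """n overlapping pairs, as in the quoted loop: (0,1), (1,2), ..., (n-1,0)."""
    return [(i, (i + 1) % n) for i in range(n)]


def linear_mmd(xs, ys, pairs):
    """Average of h over the given index pairs."""
    return sum(h(xs, ys, i, j) for i, j in pairs) / float(len(pairs))


source = [0.0, 0.5, 1.0, 1.5]
target = [2.0, 2.5, 3.0, 3.5]
mmd_disjoint = linear_mmd(source, target, disjoint_pairs(len(source)))
mmd_circular = linear_mmd(source, target, circular_pairs(len(source)))
```

Both variants average the same kind of term. The circular version simply reuses every sample in two pairs, so it has twice as many terms for the same batch, at the cost of the terms no longer being independent of each other.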

Looking forward to your reply.

Questions about the MMD loss

You have done a great job.

But I have a question about the MMD loss. The MMD loss proposed in the paper is as follows; we can see that the source-domain samples are paired with each other, and so are the target-domain samples.

For example, if the batch size is 10 for both the source and target domains, there are 100 pairs in the first term of the loss, 100 pairs in the second term, and 100 pairs in the third term.

But in your code, as follows, there are only 10 pairs in the first term of the loss, 10 pairs in the second term, and 20 pairs in the third term.

So I think this code does not match the original loss proposed in the paper. Can you explain this to me?
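For comparison, the all-pairs counting described above corresponds to a quadratic-time estimate. The sketch below uses the biased form (mean of k over all source pairs, plus mean over all target pairs, minus twice the mean over all cross pairs) with a scalar Gaussian kernel. The function names and the toy data are assumptions for illustration and do not come from loss.py.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel on scalars; sigma is an illustrative choice."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def mean_kernel(us, vs, kernel=gaussian_kernel):
    """Average kernel value over all len(us) * len(vs) pairs, plus the pair count."""
    values = [kernel(u, v) for u in us for v in vs]
    return sum(values) / float(len(values)), len(values)


def quadratic_mmd(xs, ys):
    """Biased quadratic-time estimate: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""
    k_ss, n_ss = mean_kernel(xs, xs)
    k_tt, n_tt = mean_kernel(ys, ys)
    k_st, n_st = mean_kernel(xs, ys)
    return k_ss + k_tt - 2.0 * k_st, (n_ss, n_tt, n_st)


source = [0.1 * i for i in range(10)]
target = [1.0 + 0.1 * i for i in range(10)]
mmd_value, pair_counts = quadratic_mmd(source, target)
```

With a batch size of 10, `pair_counts` is 100 per term, matching the count in the issue. A linear-time estimate trades these n^2 kernel evaluations for a number proportional to n.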

Thank you very much. Looking forward to your reply.

Backward mmd loss = NaN

Hello,
I used the Caffe implementation. Sometimes the MMD backward diff becomes NaN, and soon the whole network crashes.
In my implementation, the data is sliced into two branches at the fc layers, source data and target data, and both are inputs to the MK-MMD loss layer. It works well in the beginning, but after some epochs the MK-MMD loss backward diff turns into NaN and training has to be stopped.
Could you please tell me why this happens? Thank you so much!

The loss of DAN and JAN appears negative (PyTorch versions)

After running train.py, I found that the loss is negative. Is this a bug in the program?
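One possible explanation, offered as a sketch rather than a confirmed diagnosis: if the transfer loss is a linear-time MMD estimate that averages over neighbouring sample pairs, it is not guaranteed to be non-negative on a single batch. The toy example below uses a scalar Gaussian kernel and a hand-picked batch. Both are assumptions chosen to make the effect visible.

```python
import math


def kernel_matrix(points, sigma=1.0):
    """Gaussian kernel matrix over source points followed by target points."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in points]
            for a in points]


def linear_mmd_loss(kernels, batch_size):
    """Linear-time estimate that pairs each sample with its neighbour."""
    loss = 0.0
    for i in range(batch_size):
        s1, s2 = i, (i + 1) % batch_size
        t1, t2 = s1 + batch_size, s2 + batch_size
        loss += kernels[s1][s2] + kernels[t1][t2]
        loss -= kernels[s1][t2] + kernels[s2][t1]
    return loss / float(batch_size)


# Toy batch: each source point coincides with the *other* target point,
# so the cross terms dominate the within-domain terms.
source = [0.0, 10.0]
target = [10.0, 0.0]
loss = linear_mmd_loss(kernel_matrix(source + target), batch_size=2)
```

Here `loss` comes out close to -2, so a negative value on some batches does not by itself indicate a bug in such an estimator.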

Does the "DAN loss" in loss.py use MK-MMD?

In loss.py, does the "DAN loss" use MK-MMD? It seems to me that it only uses MMD in a certain layer. Can anyone explain?

jmmd_loss is always zero and accuracy results.

Hi,

The jmmd_loss I get is always zero. Could you please tell me why?

I1006 08:06:35.640272 30352 solver.cpp:407] Test net output #0: accuracy = 0.625488
I1006 08:06:35.815197 30352 solver.cpp:231] Iteration 500, loss = 0.176767
I1006 08:06:35.815225 30352 solver.cpp:247] Train net output #0: jmmd_loss = 0 (* 0.3 = 0 loss)
I1006 08:06:35.815232 30352 solver.cpp:247] Train net output #1: softmax_loss = 0.176767 (* 1 = 0.176767 loss)

And for the Amazon-to-Webcam transfer task I got 69.5% accuracy. Am I doing something wrong?

When will there be Tensorflow and Pytorch implementations available?

I've been waiting for the Tensorflow or Pytorch implementations for many months. When will they be available?
DAN, RTN, and JAN are all outstanding work that deserves to be reproduced more!

PyTorch result always shows: tensor(0, device='cuda:0')

{'name': 'JAN', 'trade_off': 1}
tensor(0, device='cuda:0')
tensor(0, device='cuda:0')
tensor(0, device='cuda:0')
tensor(0, device='cuda:0')
tensor(0, device='cuda:0')
...

Thank you for your help!

A list of what is going to be deployed?

Pytorch results

I got the results below when I ran the Amazon-to-Webcam transfer task. Why are there so many results? Do they correspond to AlexNet or ResNet?

$python train.py 0

{'trade_off': 1.0, 'name': 'DAN'}
0.0188679245283
0.667924528302
0.774842767296
0.79748427673
0.8
0.810062893082
0.803773584906
0.822641509434
0.811320754717
0.817610062893
0.803773584906
0.813836477987
0.815094339623
0.79748427673
0.810062893082
0.792452830189
0.813836477987
0.796226415094
0.794968553459
0.8
0.8
0.793710691824
0.79748427673
0.813836477987

DAN, JAN (Caffe version): cannot reproduce the results reported in the papers

Hi, I ran DAN and JAN (AlexNet, Caffe version) on the A->W task. Here are my results:
AlexNet: 60%
DAN: 65%
JAN: 69%
while the results in the papers are:
DAN: 68%
JAN: 74%

Is there anything wrong? Thanks in advance.

(Pytorch) JAN Cannot Reproduce Results in Paper

Hello, I cannot reproduce your results for w->a using this command.

python train.py 0 (Using JAN, tradeoff:1.0)

The accuracy is only 0.60 instead of 0.70 reported in your paper for w->a.

Is there anything wrong in the code?

In the DAN PyTorch implementation, where is the code for the beta parameters of the different kernels?

PyTorch version of the DAN loss: I can't find the beta parameters of the multiple Gaussian kernels.

In loss.py, in the "DAN loss", I can't find the beta parameters of the multiple Gaussian kernels, and I don't know how they are updated. Can someone tell me? Thanks.

When will TADA (AAAI'19) be released?

I have read the paper "Transferable Attention for Domain Adaptation" published at AAAI 2019 and am very interested in it. The paper says the code and datasets will be available at github.com/thuml, that is, on this page. However, I could not find them. I would like to know when the source code will be available. I am working on a problem related to traffic forecasting, and I want to draw on the idea of attention in the TADA model. If possible, could you share the source code with me? Thanks a lot.

RuntimeError: $ Torch: not enough memory: you tried to allocate 0GB. Buy new RAM! at /pytorch/torch/lib/TH/THGeneral.c:270

Runtime Error: $ Torch: not enough memory

Some doubts about the code for the Gaussian kernel

When calculating the Gaussian kernel, I found this line in loss.py:
L2_distance = ((total0-total1)**2).sum(2)
Why is it sum(2) here?
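A likely reading, sketched here with plain nested lists, is that `total0` and `total1` are two broadcast copies of the stacked features with shape (n, n, d), so summing over axis 2 (the feature axis) leaves an n-by-n matrix of pairwise squared distances. That shape is an assumption, since the surrounding lines of loss.py are not quoted in the issue. The function name below is hypothetical.

```python
def pairwise_squared_l2(total):
    """total: list of n feature vectors (each a list of d floats)."""
    n = len(total)
    # total0[i][j] = total[j] and total1[i][j] = total[i]: two (n, n, d) views.
    total0 = [[total[j] for j in range(n)] for _ in range(n)]
    total1 = [[total[i] for _ in range(n)] for i in range(n)]
    # Summing over axis 2 (the feature axis) leaves an (n, n) distance matrix.
    return [
        [sum((a - b) ** 2 for a, b in zip(total0[i][j], total1[i][j]))
         for j in range(n)]
        for i in range(n)
    ]


distances = pairwise_squared_l2([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
```

Entry `distances[i][j]` is the squared Euclidean distance between samples i and j, which is the quantity a Gaussian kernel exponentiates.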