beijixiong3510 / owm

Code for Continual Learning of Context-dependent Processing in Neural Networks

Python 100.00%

owm's Introduction

OWM (Orthogonal Weights Modification)

Code for the paper "Continual Learning of Context-dependent Processing in Neural Networks". A free-access version of the paper is available at https://rdcu.be/bOaa3

A newer version based on TF2, provided by Dr. Jianyong Xue, is available at https://github.com/xuejianyong/OWM-tf2; it largely reproduces our work. Thanks to all researchers who care about and support our work!

Requirements

If your environment differs from the configuration below, the results may vary greatly or the code may fail to run. You may need to adjust details of the code to fit your setup.

Linux: Ubuntu 16.04
cuda9.0 & cudnn6.0
Python 3.5.4
torch 0.3.0 (pytorch)
tensorflow 1.5.0
tensorflow-gpu 1.4.0
torchvision 0.2.0
numpy 1.15.1
scipy 1.0.0

owm's Issues

no checkpoint found at './model_best.pth.tar'

When I run celebA_main_torch.py, no checkpoint file is generated. Why?

CUDA out of memory

The 'train_epoch' function in OWM_CNN/owm.py always fills up the GPU's memory.
Memory usage seems to accumulate with every training batch instead of being released.

What is the difference between OWM and GEM algorithm?

Hi, I have two questions about your work:

1. What is the difference between the OWM and GEM [1] algorithms? GEM projects the gradient onto the subspace of gradients from old samples. Is the subspace of inputs x better for solving the continual learning problem?
2. Why should we update p before projecting the gradient?
p.sub_(torch.mm(k, torch.t(k)) / (alpha + torch.mm(r, k)))
w.grad.data = torch.mm(w.grad.data, torch.t(p.data))
This means the gradient is orthogonal to the subspace of the new task, which will affect performance on the new task.

Thank you.

[1] Lopez-Paz, David. "Gradient episodic memory for continual learning." NIPS. 2017.
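
The second question can be examined numerically. The sketch below is not the repository's code: it re-implements the two quoted lines in NumPy, assuming that r is a row vector holding one input, that k = P r^T, that P starts as the identity, and that the weight gradient has shape (outputs x inputs). The function name update_projector and all sizes are hypothetical. It measures how strongly a raw and a projected gradient would change the layer's response to the input x that P has just absorbed, and to a direction orthogonal to x.

```python
import numpy as np

def update_projector(p, r, alpha):
    """One projector update for a single input row vector r (shape 1 x n).

    Mirrors the quoted p.sub_(...) line, assuming k = P r^T.
    """
    k = p @ r.T                                   # column vector, n x 1
    return p - (k @ k.T) / (alpha + (r @ k).item())

rng = np.random.default_rng(0)
n_in, n_out, alpha = 8, 3, 1e-3                   # hypothetical sizes

p = np.eye(n_in)                                  # assumed initial projector
x = rng.normal(size=(1, n_in))                    # one input sample
p = update_projector(p, x, alpha)                 # P is updated first ...

grad = rng.normal(size=(n_out, n_in))             # stand-in for w.grad
projected = grad @ p.T                            # ... then the gradient is projected

# Size of the change each gradient would cause in the response to x.
raw_effect = np.linalg.norm(grad @ x.T)
proj_effect = np.linalg.norm(projected @ x.T)

# A direction orthogonal to x passes through the projector unchanged.
y = rng.normal(size=(1, n_in))
y -= (y @ x.T) / (x @ x.T) * x
y_effect_raw = np.linalg.norm(grad @ y.T)
y_effect_proj = np.linalg.norm(projected @ y.T)
```

Under these assumptions the component of the gradient along x is scaled by alpha / (alpha + |x|^2), so with a small alpha the ordering described in the question does suppress updates along the input that was just added to P. Whether the repository applies the update at that point in training is something only the authors' code can confirm.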

RuntimeError: CUDA out of memory

RuntimeError: CUDA out of memory.

Hello, may I ask how much memory your GPU has?

I'm running the OWM code on a GPU with 8 GB of memory and want to fix this error (RuntimeError: CUDA out of memory).

How can I modify the program so that the experimental results stay consistent with yours?

Thank you.

How to get the chwdata_mat files for training

I am working with the CASIA_HWDB dataset, but I can't run CHW_OWM.py. My "test_each chwdata_mat" and "train_each chwdata_mat" files seem to be wrong.
The error appears to come from the following code:

ss = np.arange(train_length)
np.random.shuffle(ss)
trainimages = trainimages[ss, :]
trainlabels = trainlabels[ss]

Traceback (most recent call last):
  File "/media/hdd/yike/OWM/CASIA_HWDB/CHW_OWM.py", line 60, in <module>
    trainlabels = trainlabels[ss]
IndexError: index 289 is out of bounds for axis 0 with size 1

So how can I get the data?
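
The traceback is consistent with trainlabels having shape (1, N) rather than (N,). That is what scipy.io.loadmat returns by default for a MATLAB row vector, because it loads every variable as an array with at least two dimensions. This is only a guess about how the labels were stored. The sketch below uses made-up arrays in place of the real .mat contents to reproduce the error and show one possible fix.

```python
import numpy as np

train_length = 300
# Stand-ins for the arrays loaded from the .mat files (shapes are assumed).
trainimages = np.zeros((train_length, 16))
trainlabels = np.arange(train_length).reshape(1, -1)   # shape (1, N)

ss = np.arange(train_length)
np.random.shuffle(ss)

try:
    trainlabels[ss]            # indexes axis 0, which has size 1
    failed = False
except IndexError:
    failed = True              # same kind of error as in the traceback

# Possible fix: flatten the labels to shape (N,) before shuffling.
trainlabels = trainlabels.ravel()
trainimages = trainimages[ss, :]
trainlabels = trainlabels[ss]
```

If the labels do come from loadmat, passing squeeze_me=True when loading is an alternative to calling ravel() afterwards.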

OWM_CNN runs out of memory with PyTorch 0.4; how can it be modified?

The OWM_CNN code only works with PyTorch 0.3; with version 0.4 and above it runs out of memory. Is there a version of the code that is compatible with 0.4 and above?

Chinese character recognition task

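Hello, I would like to reproduce the Chinese character recognition experiment, but my GPU memory always runs out. How much GPU memory did you have when running the experiments? The out-of-memory error occurs at: self.w2.data -= lr_list[0] * dw2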

Supplementary material not found

Greetings,

I have read your paper but could not find any supplementary material with it. I used this link: https://arxiv.org/pdf/1810.01256.pdf

What is the "context" and how is it generated?

Thank you for your outstanding work! I have a question about the CDP module: what is the context, and how is it generated? I found "wordvet.mat" in the source code and I think it is related to the context, but how was it generated? I would appreciate an answer.

blog "Tensorflow and Chinese Handwritten Chinese Character Recognition" not found

Hi,
The website "http://python.jobbole.com/87509/" is unreachable, so I can't process the CASIA_HWDB dataset. Can you suggest a solution?

force_learn() got an unexpected keyword argument 'alpha'

Running OWM/Shuffled MNIST/run_shuffled_100_mnist_3Layers_2000.py raises the following error:

force_layer1.force_learn(w1, input, learning_rate, alpha=alpha_array[0])
TypeError: force_learn() got an unexpected keyword argument 'alpha'

ImportError: No module named 'OWM'

Running "OWM/CASIA_HWDB/CHW_ResNet18/CHW_3755_all.py" raises the following error:

Traceback (most recent call last):
  File "/media/hdd/yike/OWM/CASIA_HWDB/CHW_ResNet18/CHW_3755_all.py", line 17, in <module>
    from OWM.CHW.CHW_96_18_Norma.myresnet import *
ImportError: No module named 'OWM'
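
The import path OWM.CHW.CHW_96_18_Norma.myresnet appears to reflect a local directory layout rather than the layout of this repository. One possible workaround, sketched below under the assumption that a myresnet.py file exists somewhere in your checkout, is to put its directory on sys.path and import the module by its bare name. The helper import_from_dir is hypothetical, and the demo builds a throwaway stub module so the snippet runs without the repository.

```python
import importlib
import os
import sys
import tempfile

def import_from_dir(directory, module_name):
    """Import module_name after putting directory on sys.path."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()      # make sure newly created files are seen
    return importlib.import_module(module_name)

# Demo with a throwaway module standing in for myresnet.py.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo_resnet_stub.py"), "w") as fh:
    fh.write("NAME = 'stub'\n")

stub = import_from_dir(demo_dir, "demo_resnet_stub")
```

In CHW_3755_all.py the failing line could then become a plain "from myresnet import *" placed after the sys.path insertion, provided the file really is named myresnet.py.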

The link to "Tensorflow and Chinese Handwritten Chinese Character Recognition" is dead!

Is there a solution?

Why is the projection factor P mathematically possible?

Dear author, I have read through your paper; the idea is very good and the method is very effective. However, I cannot see why the construction of the projection factor P is mathematically valid. If possible, please give a detailed mathematical proof. Thank you!
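
A short derivation may help with this question. It assumes the projector is defined as P = I - A (A^T A + alpha I)^{-1} A^T, where the columns of A are the inputs seen so far and alpha > 0 is a small constant. The notation and the argument below are a sketch, not the authors' proof.

```latex
% Woodbury identity rewrites the definition as a regularized inverse:
P \;=\; I - A\,(A^{\top}A + \alpha I)^{-1}A^{\top}
  \;=\; \alpha\,(\alpha I + A A^{\top})^{-1}.

% With the eigendecomposition A A^{\top} = \sum_i \lambda_i u_i u_i^{\top}, \lambda_i \ge 0:
P \;=\; \sum_i \frac{\alpha}{\alpha + \lambda_i}\, u_i u_i^{\top}.
% Directions spanned by old inputs (\lambda_i \gg \alpha) are scaled towards 0;
% directions never seen (\lambda_i = 0) are kept with factor 1.

% Adding one input x gives A' = [A, x], hence A'A'^{\top} = A A^{\top} + x x^{\top}.
% The Sherman--Morrison formula then yields the recursive update:
P' \;=\; \alpha\,(\alpha I + A A^{\top} + x x^{\top})^{-1}
   \;=\; P - \frac{P x\, x^{\top} P}{\alpha + x^{\top} P x}.
```

With k = P x and P symmetric, the last line reads P' = P - k k^T / (alpha + x^T k), which has the same form as the p.sub_(...) line quoted in the GEM issue above. As alpha approaches 0, P approaches the orthogonal projector onto the complement of the span of the old inputs, so a weight change of the form (gradient) P leaves the responses to those inputs almost unchanged.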

ConnectionResetError: [Errno 104] Connection reset by peer

Running "/media/hdd/yike/OWM/Disjoint MNIST/OWM/run_dis_mnist_2Layers.py" raises the following error:

Traceback (most recent call last):
  File "/media/hdd/yike/OWM/Disjoint MNIST/OWM/run_dis_mnist_2Layers.py", line 27, in <module>
    mnist = input_data.read_data_sets("./data/MNIST_data/", one_hot=True)
  File "/home/gzz/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py", line 240, in read_data_sets
    source_url + TRAIN_IMAGES)
  File "/home/gzz/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py", line 208, in maybe_download
    temp_file_name, _ = urlretrieve_with_retry(source_url)
  File "/home/gzz/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py", line 165, in wrapped_fn
    return fn(*args, **kwargs)
  File "/home/gzz/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py", line 190, in urlretrieve_with_retry
    return urllib.request.urlretrieve(url, filename)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/urllib/request.py", line 188, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/urllib/request.py", line 1297, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/urllib/request.py", line 1257, in do_open
    r = h.getresponse()
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/http/client.py", line 1198, in getresponse
    response.begin()
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/socket.py", line 576, in readinto
    return self._sock.recv_into(b)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/ssl.py", line 937, in recv_into
    return self.read(nbytes, buffer)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/ssl.py", line 799, in read
    return self._sslobj.read(len, buffer)
  File "/media/hdd/yike/anaconda3/envs/own/lib/python3.5/ssl.py", line 583, in read
    v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer
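
The traceback shows the failure inside the automatic MNIST download, not in the OWM code itself. A possible workaround is to download the four archives by hand and place them in ./data/MNIST_data/. As far as I know, read_data_sets skips the download for files that already exist in that directory, but treat that as an assumption. The helper missing_mnist_files below is hypothetical and only reports which archives are still absent.

```python
import os
import tempfile

# Archive names of the classic MNIST distribution.
MNIST_FILES = (
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
)

def missing_mnist_files(data_dir):
    """Return the MNIST archives that are not yet present in data_dir."""
    return [name for name in MNIST_FILES
            if not os.path.isfile(os.path.join(data_dir, name))]

# Demo on a temporary directory that contains only the first archive.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, MNIST_FILES[0]), "wb").close()
still_missing = missing_mnist_files(demo_dir)
```

Calling missing_mnist_files("./data/MNIST_data/") before starting the script would show whether a download is still going to be attempted.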

I cannot get the same result

I ran the OWM_CNN code, but the results do not match res.txt. After training on each task, the test accuracy on the other tasks is zero. All parameters are at their defaults. Did I run it incorrectly?

About stride

In OWM_CNN, your convolution layers use a 2*2 kernel with stride=1, but lines 149, 152 and 155 in owm.py use stride=2. Is this a mistake? How should I understand it? Thank you!

How to generate the .mat files for the training and validation datasets in CHW_OWM.py?

You provide a Baidu cloud disk download link for the HWDB1 dataset, but it only contains the images. I would like the .mat files so that I can run your CASIA_HWDB code directly. Can you give me a download URL?

About the ImageNet experiment

Is there a plan to release the code for reproducing the ImageNet experiments?

About the feature extractor

Thanks for your great work. I have some questions about OWM.
1. In the setting where the feature extractor and the classifier are trained jointly, the architecture of the feature extractor is very simple. If we replace it with ResNet-18 or VGG, will OWM still work?
2. Why are the biases of the conv layers all set to zero? If we use ResNet-18 as the feature extractor, should we set the bias of each conv layer to zero?