
Comments (5)

brjathu commented on August 24, 2024

Hi, @JoyHuYY1412, it's just an average over all the tasks. No, we haven't used any normalizations.

See

ll = torch.stack(reptile_grads[i])

from itaml.
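For context, a minimal sketch of what such a Reptile-style averaging step could look like. The `reptile_grads` structure and the two-task values here are illustrative assumptions, not the exact iTAML code:

```python
import torch

# Hypothetical collection of task-specific copies of one parameter tensor,
# gathered after the inner-loop updates (one list entry per task).
reptile_grads = {0: [torch.full((3, 3), 2.0), torch.full((3, 3), 4.0)]}

averaged = {}
for i in reptile_grads:
    ll = torch.stack(reptile_grads[i])  # shape: (num_tasks, 3, 3)
    averaged[i] = ll.mean(dim=0)        # plain average over all tasks
```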

JoyHuYY1412 commented on August 24, 2024

Hi, @JoyHuYY1412, it's just an average over all the tasks. No, we haven't used any normalizations.

See

ll = torch.stack(reptile_grads[i])

Thank you for your reply.
So if different tasks have parameters of different scales, e.g., A >> B, it seems the averaged network will be biased toward A. So when we try to recover the network for task B, do we assume the memory samples of B can help it fit? I'm not sure whether I understand this correctly.

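A toy numerical illustration of the concern above (the values are purely illustrative): when task A's parameters are much larger in magnitude than task B's, a plain average lands close to A's scale.

```python
import torch

theta_A = torch.full((4,), 100.0)  # task A: large-scale parameters
theta_B = torch.full((4,), 1.0)    # task B: small-scale parameters

# A plain average over the two tasks sits at 50.5 everywhere: far closer
# to A's scale than to B's, which is why recovering task B would lean on
# fine-tuning with B's memory samples.
phi_avg = torch.stack([theta_A, theta_B]).mean(dim=0)
```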

brjathu commented on August 24, 2024

Yes, that's one reason the classifier for a task is trained only in the inner loop. Also, to reduce this bias, we take a weighted average of the weights as we progress. And yes, fine-tuning with the memory samples helps to get a better model.

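The "weighted average of the weights as we progress" could be sketched as follows. The `1/t` decay schedule here is an assumption for illustration, not the authors' exact schedule:

```python
import torch

def weighted_merge(phi_old, phi_tasks, t):
    """Hypothetical progressive weighted averaging: the current task-averaged
    weights shift phi less and less as more tasks (t) have been seen, so
    earlier knowledge is not washed out by a single large-scale task."""
    phi_new = torch.stack(phi_tasks).mean(dim=0)  # average over current tasks
    alpha = 1.0 / t                               # weight on the new average
    return (1 - alpha) * phi_old + alpha * phi_new
```

With `t = 2`, old and new weights contribute equally; as `t` grows, updates become increasingly conservative.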

JoyHuYY1412 commented on August 24, 2024

Yes, that's one reason the classifier for a task is trained only in the inner loop. Also, to reduce this bias, we take a weighted average of the weights as we progress.
Thank you so much. I have two more questions.

  1. I read the pseudocode (Algorithm 1) in your paper. After we update phi in line 14, is the theta used in line 7 for task 1 initialized from the updated phi? And does task 2 then initialize its theta from task 1's?
  2. If so, does this operation somehow relieve the imbalance between tasks, since after each update in the outer loop the backbone network is reset?


brjathu commented on August 24, 2024
  1. No, in the inner loop the thetas for all tasks are initialized with the last updated phi. Once we have learned all the thetas, we combine them to get the new phi, which is later used to initialize the thetas for the next batch.
  2. Yes, the outer-loop meta-update tries to minimize forgetting, while the imbalance is minimized mostly by the exponential averaging of the weights.

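The update order described above can be sketched schematically. This is pseudocode-style Python with assumed names (`inner_update`, the toy step size), not the authors' implementation:

```python
import torch

phi = torch.zeros(4)  # shared meta-parameters

def inner_update(theta, task_id):
    # Placeholder for the task-specific inner-loop training of theta
    # (and of the task's classifier, which stays in the inner loop).
    return theta + (task_id + 1) * 0.1

for batch in range(2):          # outer loop over incoming data batches
    thetas = []
    for task_id in range(3):    # inner loop: EVERY task starts from phi
        theta = phi.clone()
        thetas.append(inner_update(theta, task_id))
    # Outer meta-update: combine all task-specific thetas into the new phi,
    # which then initializes the thetas for the next batch.
    phi = torch.stack(thetas).mean(dim=0)
```

Note that no task initializes from another task's theta; all tasks branch off the same phi, and phi alone carries state across batches.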
