sun-hailong / lamda-pilot

🎉 PILOT: A Pre-trained Model-Based Continual Learning Toolbox

License: MIT License

Python 100.00%

machine-learning continual-learning deep-learning incremental-learning pre-trained-models vision-language-model vision-transformer reproducible-research lifelong-learning pytorch toolkit

lamda-pilot's Issues

DualPrompt's implementation seems to have some issues

In backbone/vision_transformer_dual_prompt.py, line 250, it seems that the argument "g_prompt" isn't concatenated with x and so doesn't participate in forward propagation. Maybe I made a mistake, but as stated in the original paper, g_prompt needs to be updated, and e_prompt has a similar problem. I'm looking forward to your response. Thank you very much.

imageneta-B0-Inc20

Running dualprompt

Hi,
first of all, thanks for your great work!

I was running into errors when trying to run the code with python main.py --config=./exps/dualprompt.json:

- File "MA-Pilot-MM/backbone/vit_dualprompt.py", line 856, in _create_vision_transformer: pretrained_custom_load='npz' in pretrained_cfg['url'], TypeError: 'PretrainedCfg' object is not subscriptable -> it seems that the call should be pretrained_cfg.url, which is also a problem for coda_prompt.json
- from backbone.prompt import EPrompt: No module named 'backbone' -> __init__.py is missing in backbone/
- File "MA-Pilot-MM/backbone/vit_dualprompt.py", line 852, in _create_vision_transformer: model = build_model_with_cfg(; File "/home/users1/ostertms/.local/lib/python3.12/site-packages/timm/models/_builder.py", line 398, in build_model_with_cfg: model = model_cls(**kwargs); TypeError: VisionTransformer.__init__() got an unexpected keyword argument 'pretrained_custom_load'
- File "/home/users1/ostertms/.local/lib/python3.12/site-packages/torchvision/datasets/folder.py", line 41, in find_classes: classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir()); FileNotFoundError: [Errno 2] No such file or directory: './data/imagenet-r/train/'

Maybe you have stronger requirements on the package/python versions?

My set-up: I cloned the repo and then installed the requirements as described in the README.md in a conda environment with python=3.12.

Best,
Magnus

Optimization and weight_decay on the final linear classifier

Hi everyone,
Thank you all for your hard work! :)
I have a question regarding the management of optimization for the incremental linear classifier. In the following, I use the FineTuning approach as an example.

Here is the definition of the optimizer when self._cur_task != 0:

optimizer = optim.SGD(
    self._network.parameters(),
    lr=self.args["lrate"],
    momentum=0.9,
    weight_decay=self.args["weight_decay"],
)

In the function self._update_representation, within the training loop and using the above optimizer, you execute the following lines:

logits = self._network(inputs)["logits"]
fake_targets = targets - self._known_classes
loss_clf = F.cross_entropy(
    logits[:, self._known_classes :], fake_targets
)

loss = loss_clf
optimizer.zero_grad()
loss.backward()

I understand that by using logits[:, self._known_classes:], you consider only the new classes in the cross-entropy loss, as desired (the others were already trained during the previous steps). However, the optimizer includes all the network parameters (self._network.parameters()), in particular all the parameters of the linear head. Consequently, the weight decay (set to 0.0002) should also affect the weights related to the old classes. Is this correct?

If this is true, I suppose this is not a desired behavior because the optimum of the weight regularization for weights that are not involved in the cross-entropy should be zero.
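
The concern can be checked without any framework. Below is a minimal sketch, assuming the usual SGD-with-momentum update in which L2 weight decay is added to the gradient before the momentum buffer is updated (dampening of 0). The function name and the learning rate of 0.01 are illustrative assumptions, not values taken from the toolbox. It simulates a single old-class weight whose loss gradient is always zero:

```python
def simulate_zero_grad_weight(w0, lr, weight_decay, momentum, steps):
    """Simulate SGD updates on one weight that never receives a loss gradient.

    Assumed update rule (SGD with momentum, L2 decay folded into the gradient):
        g   = grad + weight_decay * w
        buf = momentum * buf + g
        w   = w - lr * buf
    """
    w, buf = w0, 0.0
    for _ in range(steps):
        loss_grad = 0.0                      # old-class weight: no cross-entropy signal
        g = loss_grad + weight_decay * w     # only the L2 term remains
        buf = momentum * buf + g
        w = w - lr * buf
    return w


# Hypothetical settings: lr is assumed; momentum and weight_decay follow the issue text.
w_after = simulate_zero_grad_weight(w0=1.0, lr=0.01, weight_decay=2e-4,
                                    momentum=0.9, steps=1000)
print(w_after)  # strictly smaller than the initial value of 1.0
```

Under these assumptions the weight shrinks a little on every step even though the classification loss never touches it, which is the behavior the question describes.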

Many thanks!

Final average accuracy not printed in logs

Hi @sun-hailong,

At the end of a run, the metrics (in this case for exps/simplecil_inr.json) are printed out like this:

2024-01-18 10:00:30,251 [trainer.py] => No NME accuracy.
2024-01-18 10:00:30,251 [trainer.py] => CNN: {'total': 61.28, '00-19': 60.96, '20-39': 61.39, '40-59': 62.38, '60-79': 63.4, '80-99': 56.06, '100-119': 60.45, '120-139': 63.44, '140-159': 62.41, '160-179': 61.3, '180-199': 60.1, 'old': 61.41, 'new': 60.1}
2024-01-18 10:00:30,251 [trainer.py] => CNN top1 curve: [78.96, 72.14, 70.11, 68.29, 66.0, 64.42, 64.05, 63.08, 62.28, 61.28]
2024-01-18 10:00:30,251 [trainer.py] => CNN top5 curve: [95.36, 90.01, 86.4, 84.07, 82.35, 80.8, 79.9, 78.94, 77.75, 76.68]

Average Accuracy (CNN): 67.061
2024-01-18 10:00:30,251 [trainer.py] => Average Accuracy (CNN): 67.061 

Accuracy Matrix (CNN):
[[78.96 75.33 71.41 68.94 67.2  65.02 64.59 62.84 61.39 60.96]
 [ 0.   68.67 66.46 65.66 65.35 64.56 63.77 62.5  62.34 61.39]
 [ 0.    0.   72.44 69.64 66.83 65.68 65.02 64.69 63.37 62.38]
 [ 0.    0.    0.   69.   67.78 65.85 65.5  64.97 64.1  63.4 ]
 [ 0.    0.    0.    0.   62.01 60.78 59.34 57.7  56.88 56.06]
 [ 0.    0.    0.    0.    0.   63.94 63.18 61.67 61.06 60.45]
 [ 0.    0.    0.    0.    0.    0.   66.26 65.55 64.15 63.44]
 [ 0.    0.    0.    0.    0.    0.    0.   63.92 63.37 62.41]
 [ 0.    0.    0.    0.    0.    0.    0.    0.   63.18 61.3 ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.   60.1 ]]
2024-01-18 10:00:30,251 [trainer.py] => Forgetting (CNN): 6.2877777777777775

The issue is that none of the printed/logged metrics include the "Final average accuracy", which is the most commonly used CIL metric.

The printed "Average Accuracy" is the average of the CNN top-1 curve, which is defined quite differently from the "Final Average Accuracy".
The "Final Average Accuracy" is easy to compute from the information you do print: it's just the average of the per-task values in the CNN dict, or equivalently the average of the final column in "Accuracy Matrix (CNN)".
In this example, the Final Average Accuracy calculated from the accuracy matrix is 61.189%.
- it's often close to the last value in the top-1 curve, but not always, depending on the dataset
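
As a quick check of the two definitions, here is a small sketch that recomputes both numbers from the values in the log excerpt above. The variable and function names are only illustrative:

```python
# Values copied from the log excerpt above.
top1_curve = [78.96, 72.14, 70.11, 68.29, 66.0, 64.42, 64.05, 63.08, 62.28, 61.28]
# Final column of "Accuracy Matrix (CNN)": per-task accuracy after the last stage.
final_column = [60.96, 61.39, 62.38, 63.4, 56.06, 60.45, 63.44, 62.41, 61.3, 60.1]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


average_accuracy = mean(top1_curve)          # what is logged as "Average Accuracy (CNN)"
final_average_accuracy = mean(final_column)  # average over tasks after the final stage

print(round(average_accuracy, 3))        # 67.061
print(round(final_average_accuracy, 3))  # 61.189
```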

It would be appreciated if you could change what gets printed out so that comparisons with values reported in the literature are much more easily made.

As an aside, "Accuracy Matrix (CNN)" is only printed to the terminal, and not logged to the log file. It would be good to have it in the log file.

memo method: TypeError: argument of type 'PretrainedCfg' is not iterable

Hello,

Thank you for such an amazing framework.
I can run Foster and DER methods without any error but encounter an issue when I try to run the 'memo' method.
I would be so happy if you could help me with this.
Thank you so much.

`File "C:\PycharmProjects\LAMDA-PILOT\main.py", line 25, in <module>
    main()
File "C:\PycharmProjects\LAMDA-PILOT\main.py", line 11, in main
    train(args)
File "C:\PycharmProjects\LAMDA-PILOT\trainer.py", line 19, in train
    _train(args)
File "C:\PycharmProjects\LAMDA-PILOT\trainer.py", line 63, in _train
    model = factory.get_model(args["model_name"], args)
File "C:\PycharmProjects\LAMDA-PILOT\utils\factory.py", line 36, in get_model
    return Learner(args)
File "C:\PycharmProjects\LAMDA-PILOT\models\memo.py", line 23, in __init__
    self._network = AdaptiveNet(args, True)
File "C:\PycharmProjects\LAMDA-PILOT\utils\inc_net.py", line 829, in __init__
    self.TaskAgnosticExtractor , _ = get_backbone(args, pretrained) #Generalized blocks
File "C:\PycharmProjects\LAMDA-PILOT\utils\inc_net.py", line 25, in get_backbone
    _basenet, _adaptive_net = timm.create_model("vit_base_patch16_224_memo", pretrained=True, num_classes=0)
File "C:\Anaconda3\envs\yildirimceren\lib\site-packages\timm\models\_factory.py", line 117, in create_model
    model = create_fn(
File "C:\PycharmProjects\LAMDA-PILOT\backbone\vision_transformer_memo.py", line 896, in vit_base_patch16_224_memo
    base_model = _create_vision_transformer_base('vit_base_patch16_224', pretrained=pretrained, **model_kwargs)
File "C:\PycharmProjects\LAMDA-PILOT\backbone\vision_transformer_memo.py", line 867, in _create_vision_transformer_base
    pretrained_custom_load='npz' in pretrained_cfg,

TypeError: argument of type 'PretrainedCfg' is not iterable
`
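
Both this traceback and the DualPrompt report earlier on this page point to code that treats the pretrained config as a dict while the installed timm version passes an object with attributes. One possible workaround, shown here only as a sketch, is to read the URL in a way that tolerates both shapes. FakePretrainedCfg, get_cfg_url and uses_npz_checkpoint are hypothetical names introduced for illustration; they are not part of timm or of the toolbox:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakePretrainedCfg:
    """Hypothetical stand-in for an attribute-style config object."""
    url: Optional[str] = None


def get_cfg_url(cfg: Any) -> str:
    """Return the 'url' entry of a config given as a dict or as an object."""
    if isinstance(cfg, dict):
        return cfg.get("url") or ""
    return getattr(cfg, "url", None) or ""


def uses_npz_checkpoint(cfg: Any) -> bool:
    """True when the config's URL mentions an .npz checkpoint."""
    return "npz" in get_cfg_url(cfg)


# Works for the old dict style and for the attribute style alike.
print(uses_npz_checkpoint({"url": "https://example.invalid/model.npz"}))
print(uses_npz_checkpoint(FakePretrainedCfg(url="https://example.invalid/model.pth")))
```

Pinning timm to the version the repository was developed against would be the other obvious route; which version that is would need to be confirmed by the maintainers.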

No module named 'easydict'

I tried to run your new SOTA method 'ease', but it imports an EasyDict config from the external module 'easydict'. I could not find 'easydict' in your library, and I also tried to search for this module with 'pip', but could not find it.

Pretrained ResNet

Hello,
I really enjoy working with this great framework, thank you so much.
I'm interested in utilizing the ResNet architecture as well. Would you be able to provide the pre-trained ResNet implementation?
Thank you in advance.

Code Support of "Learning without Forgetting for Vision-Language Models" (PROOF)

Thanks to the authors for releasing this fantastic toolbox! This great work saves my life! I'm just wondering whether there will be support for your recent work "Learning without Forgetting for Vision-Language Models" (PROOF)? It would be great if you could release support for it as well.

Many thanks in advance!

Error when using a non-continual setting to obtain the “Upper-Bound” result

"prefix": "reproduce",
"dataset": "imagenetr",
"memory_size": 0,
"memory_per_class": 0,
"fixed_memory": false,
"shuffle": true,
"init_cls": 200,
"increment": 0,
"model_name": "finetune",
"backbone_type": "vit_base_patch16_224",
"device": ["1"],
"seed": [1993],
"init_epoch": 20,
"init_lr": 1e-3,
"init_milestones": [60, 120, 170],
"init_lr_decay": 0.1,
"init_weight_decay": 0.0005,
"epochs": 20,
"lrate": 1e-3,
"milestones": [40, 70],
"lrate_decay": 0.1,
"batch_size": 128,
"weight_decay": 2e-4

As you can see, I set init_cls=200 and increment=0, which matches the intuition for a non-continual learning scenario.
But an error occurred:

Traceback (most recent call last):
  File "/media/user/data1/ldz cil/clip4prompt/main.py", line 33, in <module>
    main()
  File "/media/user/data1/ldz cil/clip4prompt/main.py", line 19, in main
    train(args)
  File "/media/user/data1/ldz cil/clip4prompt/trainer.py", line 19, in train
    _train(args)
  File "/media/user/data1/ldz cil/clip4prompt/trainer.py", line 77, in _train
    cnn_accy, nme_accy = model.eval_task()
  File "/media/user/data1/ldz cil/clip4prompt/models/base.py", line 118, in eval_task
    cnn_accy = self._evaluate(y_pred, y_true)
  File "/media/user/data1/ldz cil/clip4prompt/models/base.py", line 106, in _evaluate
    grouped = accuracy(y_pred.T[0], y_true, self._known_classes, self.args["init_cls"], self.args["increment"])
  File "/media/user/data1/ldz cil/clip4prompt/utils/toolkit.py", line 121, in accuracy
    for class_id in range(init_cls, np.max(y_true), increment):
ValueError: range() arg 3 must not be zero

Is this setting correct? Or do the authors not provide an interface for obtaining the upper-bound result?
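
The ValueError itself comes from passing a step of zero to range(). A minimal sketch of a guard is shown below, assuming the per-task grouping follows the key pattern visible in the logs elsewhere on this page ('00-19', '20-39', ...). class_groups is a hypothetical helper written for illustration, not a function from the toolbox:

```python
def class_groups(init_cls, max_label, increment):
    """Return (first_class, last_class) pairs, one per task.

    When increment is zero or negative there are no incremental tasks,
    so only the initial group is returned and range() is never called
    with a zero step.
    """
    groups = [(0, init_cls - 1)]
    if increment <= 0:
        return groups
    for start in range(init_cls, max_label + 1, increment):
        groups.append((start, start + increment - 1))
    return groups


# Joint-training style setting from the config above: all 200 classes at once.
print(class_groups(init_cls=200, max_label=199, increment=0))
# Ordinary incremental setting with 20 classes per task.
print(class_groups(init_cls=20, max_label=199, increment=20))
```

Whether the rest of the training loop tolerates increment=0 is a separate question that this sketch does not address.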

Questions regarding dualprompt

Hi @sun-hailong,
thanks for the great work!

I was going through the code for dualprompt and now have a few small questions / observations:

- the easydict library is imported but is not listed in the requirements
- vision_transformer_adapter in inc_net is undefined (it should probably be vit_adapter)
- why do we need get_original_backbone() for the DualPrompt ViT?
- in dualprompt_inr.json and dualprompt.json, a few parameters seem to be set strangely or unnecessarily:
  - reinit_optimizer doesn't change anything in the code; the optimizer is reinitialized regardless
  - prefix is only logged, but not used
  - global_pool, initializer, and predefined_key are set but not used anywhere
  - in timm.create_model(), for both dualprompt and l2p, the prompt_init and prompt_key_init arguments are both set from the same argument (so the prompt_init parameter is effectively unused), which is also the case in the upstream DualPrompt implementation

Best,
Magnus

Why is the L2 norm not always applied during exemplar construction?

Hi,

Firstly, thanks for sharing the code with the community!

While reading the code, I found that when the exemplar set is constructed, $L_2$ normalization is sometimes applied and sometimes not.

Could you please provide any help?

Calculating the sample features vectors,

LAMDA-PILOT/models/base.py

Line 230 in b253937