
Comments (9)

Blealtan commented on August 17, 2024

Select the corresponding parameters and call .requires_grad_(False) on them. It's the same as freezing some parameters when you do parameter-efficient finetuning. If I have time later, I may add corresponding helper methods for that.

The parameters are:

sp_trainable -> spline_scaler
sb_trainable -> base_weight

Sadly I forgot to implement the bias term! (and thank you for letting me notice that XD)
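A minimal sketch of that freezing step (the efficient_kan import path and the model.layers attribute are assumptions; check them against your installed version):

```python
from efficient_kan import KAN  # assumed import path

model = KAN([2, 16, 1])

# Freeze the parameters that sp_trainable / sb_trainable would control:
for layer in model.layers:  # assumes the KANLinear layers are exposed this way
    if hasattr(layer, "spline_scaler"):       # sp_trainable -> spline_scaler
        layer.spline_scaler.requires_grad_(False)
    layer.base_weight.requires_grad_(False)   # sb_trainable -> base_weight
```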

rafaelcp commented on August 17, 2024

Has anyone else found that KAN's effectiveness against catastrophic forgetting only holds on 1D tasks? Since the spline locality applies to each input dimension independently, it can't isolate more complex patterns that depend on multiple input dimensions (e.g. an MNIST digit). I think a simple 2D experiment with some Gaussian bumps would be enough to demonstrate this shortcoming.

ASCIIJK commented on August 17, 2024

> Has anyone else found that KAN's effectiveness against catastrophic forgetting only holds on 1D tasks? Since the spline locality applies to each input dimension independently, it can't isolate more complex patterns that depend on multiple input dimensions (e.g. an MNIST digit). I think a simple 2D experiment with some Gaussian bumps would be enough to demonstrate this shortcoming.

I have run this experiment with the official KAN, and it shows the catastrophic forgetting issue. Specifically, we use a mixture of 2D Gaussians with 5 peaks to construct a sequence of CL (continual learning) tasks, shown below:
[image: Ground_task5]

The model learns each peak from 50,000 data points. For example, the data points of the first task are shown below:
[image: Ground_task0]

Then, we get the results after 5 tasks:
[image: Pred_task4]

This forgetting issue occurs at each task, for example task 1:
[image: Pred_task1]

PS: We use the model "model = KAN(width=[2, 16, 1], grid=5, k=6, noise_scale=0.1, bias_trainable=False, sp_trainable=False, sb_trainable=False)", and we have made sure that the loss goes down to zero on each task, so you can see a perfect peak matching the current task's training data. We think it may be hard for KAN to learn high-dimensional data without forgetting.
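A minimal sketch of this setup (the peak centers, width, and step count are illustrative assumptions, and the dict-based model.train() interface follows older pykan releases, so it may differ in current versions):

```python
import torch
from kan import KAN  # official pykan implementation

model = KAN(width=[2, 16, 1], grid=5, k=6, noise_scale=0.1,
            bias_trainable=False, sp_trainable=False, sb_trainable=False)

# One task per Gaussian peak; these centers and width are illustrative.
centers = torch.tensor([[-0.8, -0.8], [-0.4, 0.4], [0.0, 0.0],
                        [0.4, -0.4], [0.8, 0.8]])
sigma = 0.1

for c in centers:
    x = torch.rand(50_000, 2) * 2 - 1  # uniform inputs in [-1, 1]^2
    y = torch.exp(-((x - c) ** 2).sum(dim=1, keepdim=True) / (2 * sigma ** 2))
    dataset = {'train_input': x, 'train_label': y,
               'test_input': x, 'test_label': y}
    model.train(dataset, opt="LBFGS", steps=50)  # train on the current task only
```

After the last task, evaluating the model on the earlier peaks is what exposes the forgetting.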

ASCIIJK commented on August 17, 2024

Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

lukmanulhakeem97 commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

I'm working on a similar case. For me, I just trained two tasks initially, but it still has the catastrophic forgetting issue.

ASCIIJK commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

> Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

> I simply replace the last fc with a KANLayer and compare it with a replay method that stores only 20 old samples for future training. The results show little improvement. PS: I use "KANLinear(in_features=512, out_features=100)"; I directly initialize this layer with all 100 categories to avoid adjusting its dimension in subsequent tasks.

> Okay, so for the KANLayer you pass only task-specific data, without the old tasks' replay samples?

No, I use the same replay data, which contains 20 samples per category. That's why I say it may be hard for KAN to deal with high-dimensional data. But I actually find that KAN can handle simple data (such as classifying a 2D scatter) and avoid forgetting.
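A minimal sketch of that head replacement (the efficient_kan import path is an assumption; KANLinear is assumed to take in/out feature counts like torch.nn.Linear):

```python
import torch
import torchvision
from efficient_kan import KANLinear  # assumed import path

# ResNet-18 backbone: its final fc consumes a 512-dim feature vector,
# matching the 512-size vector mentioned above.
backbone = torchvision.models.resnet18()
backbone.fc = KANLinear(backbone.fc.in_features, 100)  # all 100 classes up front

x = torch.randn(4, 3, 32, 32)  # CIFAR-sized dummy batch
logits = backbone(x)           # shape: (4, 100)
```

Initializing with all 100 outputs up front avoids resizing the layer between incremental tasks, as described above.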

lukmanulhakeem97 commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

ASCIIJK commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

> Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

I simply replace the last fc with a KANLayer and compare it with a replay method that stores only 20 old samples for future training. The results show little improvement. PS: I use "KANLinear(in_features=512, out_features=100)"; I directly initialize this layer with all 100 categories to avoid adjusting its dimension in subsequent tasks.

lukmanulhakeem97 commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

> Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

> I simply replace the last fc with a KANLayer and compare it with a replay method that stores only 20 old samples for future training. The results show little improvement. PS: I use "KANLinear(in_features=512, out_features=100)"; I directly initialize this layer with all 100 categories to avoid adjusting its dimension in subsequent tasks.

Okay, so for the KANLayer you pass only task-specific data, without the old tasks' replay samples?
