
Comments (9)

Blealtan commented on August 17, 2024

Select the corresponding parameters and call .requires_grad_(False) on them. It's the same as freezing some parameters when you do parameter-efficient finetuning. If I have time later, I may add corresponding helper methods for that.

The parameters are:

sp_trainable -> spline_scaler
sb_trainable -> base_weight

Sadly I forgot to implement the bias term! (and thank you for letting me notice that XD)
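A minimal sketch of that freezing step (the efficient_kan import path and the model.layers attribute are assumptions; check them against your installed version):

```python
from efficient_kan import KAN  # assumed import path

model = KAN([2, 16, 1])

# Freeze the parameters that sp_trainable / sb_trainable would control:
for layer in model.layers:  # assumes the KANLinear layers are exposed this way
    if hasattr(layer, "spline_scaler"):       # sp_trainable -> spline_scaler
        layer.spline_scaler.requires_grad_(False)
    layer.base_weight.requires_grad_(False)   # sb_trainable -> base_weight
```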

rafaelcp commented on August 17, 2024

Has anyone else found that KAN's effectiveness against catastrophic forgetting only holds on 1D tasks? Since the spline locality applies to each input dimension independently, it can't isolate more complex patterns that depend on multiple input dimensions (e.g. an MNIST digit). I think a simple 2D experiment with some Gaussian bumps would be enough to demonstrate this shortcoming.

ASCIIJK commented on August 17, 2024

> Has anyone else found that KAN's effectiveness against catastrophic forgetting only holds on 1D tasks? Since the spline locality applies to each input dimension independently, it can't isolate more complex patterns that depend on multiple input dimensions (e.g. an MNIST digit). I think a simple 2D experiment with some Gaussian bumps would be enough to demonstrate this shortcoming.

I have run this experiment with the official KAN, and it shows the catastrophic forgetting issue. Specifically, we use a mixture of 2D Gaussians with 5 peaks to construct a sequence of CL (continual learning) tasks, shown below:
[image: Ground_task5]

The model learns each peak from 50,000 data points. For example, the data points of the first task are shown below:
[image: Ground_task0]

Then, we get the results after 5 tasks:
[image: Pred_task4]

This forgetting issue occurs at each task, for example task 1:
[image: Pred_task1]

PS: We use the model "model = KAN(width=[2, 16, 1], grid=5, k=6, noise_scale=0.1, bias_trainable=False, sp_trainable=False, sb_trainable=False)", and we have made sure that the loss goes down to zero on each task, so you can see a perfect peak matching the current task's training data. We think it may be hard for KAN to learn high-dimensional data without forgetting.
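A minimal sketch of this setup (the peak centers, width, and step count are illustrative assumptions, and the dict-based model.train() interface follows older pykan releases, so it may differ in current versions):

```python
import torch
from kan import KAN  # official pykan implementation

model = KAN(width=[2, 16, 1], grid=5, k=6, noise_scale=0.1,
            bias_trainable=False, sp_trainable=False, sb_trainable=False)

# One task per Gaussian peak; these centers and width are illustrative.
centers = torch.tensor([[-0.8, -0.8], [-0.4, 0.4], [0.0, 0.0],
                        [0.4, -0.4], [0.8, 0.8]])
sigma = 0.1

for c in centers:
    x = torch.rand(50_000, 2) * 2 - 1  # uniform inputs in [-1, 1]^2
    y = torch.exp(-((x - c) ** 2).sum(dim=1, keepdim=True) / (2 * sigma ** 2))
    dataset = {'train_input': x, 'train_label': y,
               'test_input': x, 'test_label': y}
    model.train(dataset, opt="LBFGS", steps=50)  # train on the current task only
```

After the last task, evaluating the model on the earlier peaks is what exposes the forgetting.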

ASCIIJK commented on August 17, 2024

Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

lukmanulhakeem97 commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

I'm working on a similar case. For me, I just trained two tasks initially, but it still has the catastrophic forgetting issue.

ASCIIJK commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

> Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

> I simply replace the last fc with a KANLayer and compare it with a replay method that stores only 20 old samples for future training. The results show little improvement. PS: I use "KANLinear(in_features=512, out_features=100)"; I directly initialize this layer with all 100 categories to avoid adjusting its dimension in subsequent tasks.

> Okay, so for the KANLayer you pass only task-specific data, without the old tasks' replay samples?

No, I use the same replay data, which contains 20 samples per category. That's why I say it may be hard for KAN to deal with high-dimensional data. But I actually find that KAN can handle simple data (such as classifying a 2D scatter) and avoid forgetting.
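A minimal sketch of that head replacement (the efficient_kan import path is an assumption; KANLinear is assumed to take in/out feature counts like torch.nn.Linear):

```python
import torch
import torchvision
from efficient_kan import KANLinear  # assumed import path

# ResNet-18 backbone: its final fc consumes a 512-dim feature vector,
# matching the 512-size vector mentioned above.
backbone = torchvision.models.resnet18()
backbone.fc = KANLinear(backbone.fc.in_features, 100)  # all 100 classes up front

x = torch.randn(4, 3, 32, 32)  # CIFAR-sized dummy batch
logits = backbone(x)           # shape: (4, 100)
```

Initializing with all 100 outputs up front avoids resizing the layer between incremental tasks, as described above.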

lukmanulhakeem97 commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

ASCIIJK commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

> Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

I simply replace the last fc with a KANLayer and compare it with a replay method that stores only 20 old samples for future training. The results show little improvement. PS: I use "KANLinear(in_features=512, out_features=100)"; I directly initialize this layer with all 100 categories to avoid adjusting its dimension in subsequent tasks.

lukmanulhakeem97 commented on August 17, 2024

> Is this the reason KAN shows little effectiveness in CIL (class-incremental learning)? I have tried replacing the fc layer directly with a KAN layer on an image classification task (CIFAR-100 B50inc5). It gives only a small improvement (maybe 0.5%?).

> Bro, you pass the output from the CNN (a 512-dim vector for each image) into the KAN here, is that right? Can you share your code?

> I simply replace the last fc with a KANLayer and compare it with a replay method that stores only 20 old samples for future training. The results show little improvement. PS: I use "KANLinear(in_features=512, out_features=100)"; I directly initialize this layer with all 100 categories to avoid adjusting its dimension in subsequent tasks.

Okay, so for the KANLayer you pass only task-specific data, without the old tasks' replay samples?
