cnn-kan's Introduction

CNN-KAN

A single-epoch CNN+KAN trial on MNIST reaching 96% accuracy. You can try it with the Notebook or via the Colab link.

Remarks

  • Loss is very unstable.

Maybe it's related to the parameter settings, but whenever I train for a few epochs (3+), the weights keep exploding.

Train Epoch: 0 [0000/6000 (00%)]	Loss: 1.662824
Train Epoch: 0 [0640/6000 (11%)]	Loss: 0.500479
Train Epoch: 0 [1280/6000 (21%)]	Loss: 0.287909
Train Epoch: 0 [1920/6000 (32%)]	Loss: 0.221105
Train Epoch: 0 [2560/6000 (43%)]	Loss: 0.107704
Train Epoch: 0 [3200/6000 (53%)]	Loss: 0.146229
Train Epoch: 0 [3840/6000 (64%)]	Loss: 0.077108
Train Epoch: 0 [4480/6000 (74%)]	Loss: 0.157832
Train Epoch: 0 [5120/6000 (85%)]	Loss: 0.075530
Train Epoch: 0 [5760/6000 (96%)]	Loss: 0.064992

Test set: Average loss: 0.0007, Accuracy: 9620/10000 (96%)
  • Other optimization algorithms I tried did not work.
  • SELU instead of the usual ReLU.
  • Two KAN layers were more stable than a single one.
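A common mitigation for the exploding weights mentioned above is gradient clipping before each optimizer step. A minimal sketch, not part of this repo (the model, optimizer, and threshold are placeholders):

```python
import torch
import torch.nn as nn

# Toy stand-in for the CNN+KAN model (hypothetical).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

data = torch.randn(8, 1, 28, 28)        # fake MNIST-shaped batch
target = torch.randint(0, 10, (8,))

optimizer.zero_grad()
loss = loss_fn(model(data), target)
loss.backward()
# Rescale gradients so their global norm is at most 1.0 before stepping.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```

The `max_norm` value is a tunable hyperparameter; values around 0.5 to 5.0 are typical starting points.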

Acknowledgement

Kolmogorov-Arnold Networks (original work): pyKAN

Fourier coefficient instead of splines: FourierKAN
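The Fourier variant replaces each spline-based edge function with a truncated Fourier series, phi(x) = sum_k a_k cos(kx) + b_k sin(kx). A simplified sketch of that idea (not the actual FourierKAN implementation; layer name and initialization are illustrative):

```python
import torch
import torch.nn as nn

class NaiveFourierKANLayer(nn.Module):
    """Each input-output edge learns phi(x) = sum_k a_k cos(kx) + b_k sin(kx)."""
    def __init__(self, in_dim, out_dim, grid_size=5):
        super().__init__()
        self.grid_size = grid_size
        # Fourier coefficients: (2, out_dim, in_dim, grid_size) for cos/sin terms.
        self.coeffs = nn.Parameter(
            torch.randn(2, out_dim, in_dim, grid_size) / (in_dim * grid_size) ** 0.5
        )

    def forward(self, x):                      # x: (batch, in_dim)
        k = torch.arange(1, self.grid_size + 1, device=x.device)
        kx = x.unsqueeze(-1) * k               # (batch, in_dim, grid_size)
        cos, sin = torch.cos(kx), torch.sin(kx)
        # Sum the learned edge functions over input dims and frequencies.
        y = torch.einsum('big,oig->bo', cos, self.coeffs[0])
        y = y + torch.einsum('big,oig->bo', sin, self.coeffs[1])
        return y

layer = NaiveFourierKANLayer(16, 10)
out = layer(torch.randn(4, 16))                # shape (4, 10)
```

Unlike splines, the Fourier basis is smooth and global, which changes the optimization behavior relative to pyKAN's local spline grid.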


cnn-kan's Issues

loss?

Hello, sorry to bother you. During training, did you run into the memory filling up? (I fixed that error in the way shown below.) Also, why does the loss of this code not decrease during training?

import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()

# Standard training loop: one optimizer step per batch.
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data.to(device))
        loss = loss_fn(output, target.to(device))
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:
            print(f'Train Epoch: {epoch} [{batch_idx * len(data)}/{len(train_loader.dataset)} ({100. * batch_idx / len(train_loader):.0f}%)]\tLoss: {loss.item():.6f}')

# Closure-based variant: required by optimizers such as torch.optim.LBFGS,
# which re-evaluate the loss several times per step. (This redefines the
# function above; use one or the other.)
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        def closure():
            optimizer.zero_grad()
            output = model(data.to(device))
            loss = loss_fn(output, target.to(device))
            loss.backward()
            return loss
        optimizer.step(closure)
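The closure form is what torch.optim.LBFGS expects, since it may re-evaluate the loss and gradients several times per step. A minimal usage sketch on fake data (the linear model and hyperparameters are placeholders, not from this repo):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                        # placeholder model
optimizer = torch.optim.LBFGS(
    model.parameters(), lr=0.1, max_iter=5, line_search_fn='strong_wolfe'
)
loss_fn = nn.CrossEntropyLoss()
data, target = torch.randn(16, 4), torch.randint(0, 2, (16,))

def closure():
    # LBFGS calls this repeatedly; it must recompute loss and gradients.
    optimizer.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    return loss

before = closure().item()
optimizer.step(closure)                         # LBFGS drives the closure internally
after = loss_fn(model(data), target).item()
```

With the strong-Wolfe line search, each step is guaranteed not to increase the loss on this convex logistic-regression problem.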
