single-index-ht

Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise

This repository explores methods for generating heavy tails in the weight matrix spectrum of neural networks without the influence of gradient noise. Specifically, we train shallow neural networks with full-batch Gradient Descent (GD) or the Adam optimizer, using large learning rates over multiple steps.

Setup

To get started, set up your virtual environment and install the required dependencies:

$ python3.9 -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements.txt

Experiments

Single Configuration Runs

Investigate the properties of weights, features, overlap matrices, and more for a single configuration:

(.venv) $ python main.py configs/main.yml

To run with a learning rate schedule:

(.venv) $ python main.py configs/main_lr_schedule.yml

Varying Learning Rates for GD/Adam

Conduct experiments with multiple runs to plot losses, Kernel Target Alignment (KTA), and Power Law (PL) Alphas for different learning rates and optimizers:

(.venv) $ python bulk_lr.py configs/bulk_lr.yml
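
For readers unfamiliar with the two metrics plotted here, the following is a minimal standard-library sketch, not the repository's implementation. It assumes the usual definition of KTA as the normalized Frobenius inner product between a kernel matrix and the label outer product, and it assumes a Hill-type estimator for the PL alpha; the estimator actually used by bulk_lr.py may differ. The function names and the tail-size parameter `k` are hypothetical.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product of two equally shaped matrices (lists of lists)."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def kernel_target_alignment(kernel, labels):
    """KTA = <K, y y^T>_F / (||K||_F * ||y y^T||_F) (assumed standard definition)."""
    target = [[yi * yj for yj in labels] for yi in labels]
    numerator = frobenius_inner(kernel, target)
    denominator = math.sqrt(
        frobenius_inner(kernel, kernel) * frobenius_inner(target, target)
    )
    return numerator / denominator


def hill_alpha(eigenvalues, k):
    """Hill-type estimate of the power-law exponent from the k largest eigenvalues.

    Assumes a tail density p(x) ~ x^(-alpha); the (k+1)-th largest value
    serves as the tail threshold x_min.
    """
    xs = sorted(eigenvalues, reverse=True)
    x_min = xs[k]
    return 1.0 + k / sum(math.log(x / x_min) for x in xs[:k])


# A kernel equal to y y^T is perfectly aligned with the labels.
labels = [1.0, -1.0, 1.0]
perfect_kernel = [[yi * yj for yj in labels] for yi in labels]
print(kernel_target_alignment(perfect_kernel, labels))  # 1.0
```

A smaller alpha indicates a heavier tail in the eigenvalue spectrum. The estimate is sensitive to the choice of `k`, so it is worth checking several tail sizes.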

Losses with Varying Parameters

Perform experiments with multiple runs to plot the losses for different parameter settings:

Varying Dataset Size: n

(.venv) $ python bulk_losses.py configs/bulk_losses_vary_n.yml

Varying Regularization Parameter for Regression: reg_lambda

(.venv) $ python bulk_losses.py configs/bulk_losses_vary_reg_lambda.yml

Varying Label Noise: label_noise_std

(.venv) $ python bulk_losses.py configs/bulk_losses_vary_label_noise_std.yml

Varying Decay Factor of StepLR Learning Rate Schedule: gamma

(.venv) $ python bulk_losses.py configs/bulk_losses_vary_step_lr_gamma.yml
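
As a reminder of what gamma controls, a step schedule multiplies the learning rate by gamma once every fixed number of steps. The sketch below reproduces that rule in plain Python for illustration only; `base_lr`, `step_size`, and `num_steps` are hypothetical names and do not necessarily match the keys in the YAML config.

```python
def step_lr(base_lr, gamma, step_size, num_steps):
    """Learning rate at each step when it is scaled by gamma every step_size steps."""
    return [base_lr * gamma ** (step // step_size) for step in range(num_steps)]


schedule = step_lr(base_lr=1.0, gamma=0.5, step_size=2, num_steps=6)
print(schedule)  # [1.0, 1.0, 0.5, 0.5, 0.25, 0.25]
```

With gamma equal to 1 the learning rate stays constant, and smaller values of gamma shrink it faster.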

Output

Experiment outputs are stored in the out/ directory, under names derived from a hash of the experiment context.
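
One way such hash-based naming can work is sketched below. This is an assumption for illustration, not the repository's actual scheme: the choice of SHA-256, the JSON canonicalization, the 12-character truncation, and the helper names `context_hash` and `output_dir` are all hypothetical.

```python
import hashlib
import json
import os


def context_hash(context, length=12):
    """Short deterministic hash of a JSON-serializable experiment context."""
    # Sorting keys makes the hash independent of dictionary ordering.
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


def output_dir(context, root="out"):
    """Path under root named after the hash of the context (hypothetical layout)."""
    return os.path.join(root, context_hash(context))


# Hypothetical context; the real config keys may differ.
print(output_dir({"optimizer": "adam", "lr": 0.1, "n": 2000}))
```

Under this kind of scheme, identical configurations map to the same directory and any changed parameter produces a new one, which keeps runs from overwriting each other.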

Citation

@misc{kothapalli2024crafting,
      title={Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise}, 
      author={Vignesh Kothapalli and Tianyu Pang and Shenyang Deng and Zongmin Liu and Yaoqing Yang},
      year={2024},
      eprint={2406.04657},
      archivePrefix={arXiv},
}
