State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks

This repo contains the code for reproducing the results of STDS (State Transition of Dendritic Spines) reported in the paper of the same name. It is adapted from the open-source code of SEW ResNet.

Directory Tree
Dependency
Environment
Usage
Citation

Directory Tree

.
├── CIFAR10
│   ├── model.py
│   ├── optim.py
│   ├── train.py
│   └── logs
└── ImageNet
    ├── optim.py
    ├── sew_resnet.py
    ├── train.py
    ├── utils.py
    └── logs
        ├── linear
        └── sine

Dependency

The major dependencies of this code are listed below.

# Name                    Version
cudatoolkit               10.2.89
cudnn                     8.2.1.32
cupy                      9.6.0
numpy                     1.21.4
python                    3.7.11 
pytorch                   1.9.1
spikingjelly              <Specific Version>
tensorboard               2.7.0
torchvision               0.10.1

Note: the required version of spikingjelly is specified in the Usage section.
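
As a quick sanity check, the pinned versions above can be compared against what is installed. The following is a minimal sketch, not part of this repo. It assumes the packages are visible to Python's package metadata, that conda's pytorch package appears as torch, and that cupy is installed under the plain name cupy (CUDA-specific wheels use other names, so adjust the key if needed). The function name compare_versions is made up for illustration.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport, needed on Python 3.7

# Versions copied from the dependency list above. cudatoolkit, cudnn and python
# are left out because they are not Python distributions.
EXPECTED = {
    "cupy": "9.6.0",
    "numpy": "1.21.4",
    "torch": "1.9.1",  # assumption: conda's "pytorch" shows up as "torch"
    "tensorboard": "2.7.0",
    "torchvision": "0.10.1",
}


def compare_versions(expected, lookup=metadata.version):
    """Return {package: (expected, found)} for each mismatch or missing package."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        # A local build tag such as "1.9.1+cu102" still counts as a match.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in compare_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package matches. The check does not cover spikingjelly, which is pinned by git commit rather than by version number.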

Environment

Running the code requires an NVIDIA GPU. It has been tested with CUDA 10.2 on Ubuntu 16.04. The hardware platform used in the experiments is shown below.

GPU: Tesla V100-SXM3-32GB 350 Watts version
CPU: Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz

Each trial on ImageNet requires 8 GPUs. For CIFAR-10, each trial requires only a single GPU.
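
Before launching a trial, it may help to confirm that enough GPUs are visible. This is a minimal sketch under the GPU counts stated in this section. The helper names are hypothetical and not part of this repo. torch.cuda.device_count() is the standard PyTorch call for counting visible CUDA devices.

```python
# GPU counts stated in the Environment section.
REQUIRED_GPUS = {"ImageNet": 8, "CIFAR-10": 1}


def has_enough_gpus(dataset, available):
    """True if `available` GPUs cover the count stated for `dataset`."""
    return available >= REQUIRED_GPUS[dataset]


def visible_gpus():
    """Count CUDA devices via PyTorch; returns 0 if PyTorch is not importable."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    n = visible_gpus()
    for dataset, needed in REQUIRED_GPUS.items():
        status = "ok" if has_enough_gpus(dataset, n) else "not enough GPUs"
        print(f"{dataset}: needs {needed}, found {n} -> {status}")
```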

Usage

This code requires a specific version of SpikingJelly, an open-source SNN framework. To install it, first clone the repo from GitHub:

$ git clone https://github.com/fangwei123456/spikingjelly.git

Then check out the version used in these experiments and install it:

$ cd spikingjelly
$ git checkout d8cc6a5
$ python setup.py install

With the dependencies above installed, you should be able to run the following commands:

ImageNet

Dense training:

$ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --cos_lr_T 320 --model sew_resnet18 -b 32 --output-dir <log dir> --tb --print-freq 4096 --amp --connect_f ADD --T 4 --lr 0.1 --epoch 320 --data-path <dataset path> --sparse-function identity

Our proposed algorithm:

$ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --cos_lr_T 320 --model sew_resnet18 -b 32 --output-dir <log dir> --tb --print-freq 4096 --amp --connect_f ADD --T 4 --lr 0.1 --epoch 320 --data-path <dataset path> --sparse-function stmod --flat-width <D> --gradual <scheduler type>

Grad R:

$ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --cos_lr_T 320 --model sew_resnet18 -b 32 --output-dir <log dir> --tb --print-freq 4096 --amp --connect_f ADD --T 4 --lr 0.1 --epoch 320 --alpha-gr <alpha in Grad R> --data-path <dataset path> --sparse-function stmod --flat-width <mu in Grad R>

The TensorBoard logs and checkpoints will be placed in two separate directories in ./logs.

Running Arguments

Argument	Description	Value	Type
--cos_lr_T	Total steps of the cosine annealing learning-rate scheduler	320	int
-b,--batch-size	Training batch size	32	int
--alpha-gr	Hyperparameter $\alpha$ in Grad R	None	float
--data-path	Path of datasets		str
--output-dir	Path for dumping models and logs		str
--print-freq	How often the training status is printed	4096	int
--amp	Whether to use mixed precision training		bool
--connect_f	Connection function of SEW ResNet	ADD	str
--T	Simulation time-steps of SNNs	4	int
--lr	Learning rate	0.1	float
--epoch	Number of training epochs	320	int
--sparse-function	Reparameterization function	'stmod' for pruning, 'identity' for training dense model	str
--flat-width	Hyperparameter $D$ in our work and $\mu$ in Grad R		float
--gradual	Scheduler type	'sine', 'linear'	str
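
To make the interaction between --lr and --cos_lr_T concrete, here is a sketch of the standard cosine annealing schedule in closed form. It assumes that train.py uses the usual formula (the one behind PyTorch's CosineAnnealingLR), that the minimum learning rate is 0, and that the scheduler advances once per epoch. None of these is confirmed by this README, so treat the numbers as illustrative.

```python
import math


def cosine_lr(epoch, base_lr=0.1, t_max=320, eta_min=0.0):
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2


if __name__ == "__main__":
    for epoch in (0, 80, 160, 240, 320):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.5f}")
```

Under these assumptions the rate starts at 0.1, passes 0.05 at epoch 160, and reaches 0 at epoch 320. This is why --cos_lr_T is set equal to --epoch in the commands above.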

CIFAR-10

Dense training:

$ python train.py --dataset-dir <dataset path> --dump-dir . --sparse-function identity --amp

Our proposed algorithm:

$ python train.py --dataset-dir <dataset path> --dump-dir . --sparse-function stmod --gradual <scheduler type> --flat-width <D> --amp

Running Arguments

Argument	Description	Value	Type
-b, --batch-size	Training batch size	16	int
--lr	Learning rate	1e-4	float
--dataset-dir	Path of datasets		str
--dump-dir	Path for dumping models and logs		str
-T	Simulation time-steps of SNNs	8	int
-N, --epoch	Number of training epochs	2048	int
-test	Whether to run testing only		bool
--amp	Whether to use mixed precision training		bool
--sparse-function	Reparameterization function	'stmod' for pruning, 'identity' for training dense model	str
--flat-width	Hyperparameter $D$ in our work and $\mu$ in Grad R		float
--gradual	Scheduler type	'sine', 'linear'	str
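
Since each CIFAR-10 trial needs only one GPU, it is convenient to generate the commands for a small sweep over scheduler types and values of D. The sketch below builds one command string per combination using only the flags documented above. The function name cifar10_sweep and the D values in the usage lines are hypothetical placeholders, not the settings used in the paper.

```python
import itertools
import shlex

SCHEDULERS = ("sine", "linear")  # the two values listed for --gradual


def cifar10_sweep(dataset_dir, flat_widths, dump_dir="."):
    """Yield one shell command string per (scheduler, D) pair."""
    for gradual, d in itertools.product(SCHEDULERS, flat_widths):
        args = [
            "python", "train.py",
            "--dataset-dir", dataset_dir,
            "--dump-dir", dump_dir,
            "--sparse-function", "stmod",
            "--gradual", gradual,
            "--flat-width", str(d),
            "--amp",
        ]
        yield " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Placeholder path and D values; substitute your own.
    for command in cifar10_sweep("/path/to/cifar10", [0.25, 0.5]):
        print(command)
```

Whether runs sharing one --dump-dir overwrite each other's logs is not stated here, so passing a separate directory per run is the cautious choice.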

Citation

Please cite the following paper if this work is useful for your research.

@InProceedings{pmlr-v162-chen22ac,
  title = 	 {State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks},
  author =       {Chen, Yanqi and Yu, Zhaofei and Fang, Wei and Ma, Zhengyu and Huang, Tiejun and Tian, Yonghong},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {3701--3715},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/chen22ac/chen22ac.pdf},
  url = 	 {https://proceedings.mlr.press/v162/chen22ac.html}
}
