
self-adaptive-training's Introduction

Self-Adaptive Training

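This is the PyTorch implementation of Self-Adaptive Training, as described in the NeurIPS 2020 paper "Self-Adaptive Training: beyond Empirical Risk Minimization" and its journal version "Self-Adaptive Training: Bridging the Supervised and Self-Supervised Learning".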

Self-adaptive training significantly improves the generalization of deep networks under noise and enhances self-supervised representation learning. It also advances the state of the art on learning with noisy labels, adversarial training, and linear evaluation of the learned representation.

News

  • 2021.10: The code of Selective Classification for SAT has been released here.
  • 2021.01: We have released the journal version of Self-Adaptive Training, which is a unified algorithm for both supervised and self-supervised learning. Code for self-supervised learning will be available soon.
  • 2020.09: Our work has been accepted at NeurIPS'2020.

Requirements

  • Python >= 3.6
  • PyTorch >= 1.0
  • CUDA
  • Numpy

Usage

Standard training

main.py contains the training and evaluation functions for the standard training setting.

Runnable scripts

  • Training and evaluation using the default parameters

    We provide our training scripts in the directory scripts/. As a concrete example, the command below trains the default model (i.e., ResNet-34) on the CIFAR10 dataset with uniform label noise injected (e.g., 40%):

    $ bash scripts/cifar10/run_sat.sh [TRIAL_NAME]
    

    The argument TRIAL_NAME is optional. It helps identify different trials of the same experiment without modifying the training script. Evaluation is performed automatically when training finishes.

  • Additional arguments

    • noise-rate: the percentage of data that is corrupted
    • noise-type: the type of random corruption (i.e., corrupted_label, Gaussian, random_pixel, shuffled_pixel)
    • sat-es: initial epochs of our approach
    • sat-alpha: the momentum term $\alpha$ of our approach
    • arch: the architecture of backbone model, e.g., resnet34/wrn34
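To make the roles of sat-es and sat-alpha more concrete, here is a rough sketch of one training step. It assumes the update described in the paper: soft targets start as the given (possibly noisy) one-hot labels and, once the epoch reaches sat-es, are moved toward the model's predictions by an exponential moving average with momentum sat-alpha. The function names, the pure-Python style, and the exact `epoch >= sat_es` condition are illustrative assumptions, not the repository's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sat_step(targets, logits_batch, indices, epoch, sat_es, sat_alpha):
    """Update soft targets in place and return the re-weighted loss.

    targets      -- per-sample soft labels (hypothetical container),
                    initialised as one-hot vectors of the given labels
    logits_batch -- model outputs for the current mini-batch
    indices      -- dataset indices of the samples in the mini-batch
    """
    probs = [softmax(z) for z in logits_batch]

    # Assumption: targets are only updated once the warm-up of sat_es epochs is over.
    if epoch >= sat_es:
        for i, p in zip(indices, probs):
            targets[i] = [sat_alpha * t + (1.0 - sat_alpha) * q
                          for t, q in zip(targets[i], p)]

    # Each sample is weighted by the confidence of its soft target.
    weights = [max(targets[i]) for i in indices]
    losses = [-sum(t * math.log(max(q, 1e-12)) for t, q in zip(targets[i], p))
              for i, p in zip(indices, probs)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

With sat-alpha close to 1 the targets change slowly, so a handful of wrong predictions does not immediately overwrite a correct label.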

Results on CIFAR datasets under uniform label noise

  • Test Accuracy (%) on CIFAR10

    Noise Rate    0.2     0.4     0.6     0.8
    ResNet-34     94.14   92.64   89.23   78.58
    WRN-28-10     94.84   93.23   89.42   80.13

  • Test Accuracy (%) on CIFAR100

    Noise Rate    0.2     0.4     0.6     0.8
    ResNet-34     75.77   71.38   62.69   38.72
    WRN-28-10     77.71   72.60   64.87   44.17

Runnable scripts for reproducing the double-descent phenomenon

You can use the commands below to train the default model (i.e., ResNet-18) on the CIFAR10 dataset with 16.67% uniform label noise injected (i.e., a 15% label error rate):

$ bash scripts/cifar10/run_sat_dd_parallel.sh [TRIAL_NAME]
$ bash scripts/cifar10/run_ce_dd_parallel.sh [TRIAL_NAME]
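The gap between the 16.67% noise rate and the 15% label error rate can be checked with a short calculation. The sketch below assumes that a corrupted label is redrawn uniformly from all classes, including the true one, so a corrupted sample keeps its correct label with probability 1/num_classes. The function names are illustrative.

```python
def label_error_rate(noise_rate, num_classes):
    """Fraction of labels that end up wrong after uniform label noise."""
    return noise_rate * (num_classes - 1) / num_classes


def noise_rate_for_error(error_rate, num_classes):
    """Noise rate needed to reach a target fraction of wrong labels."""
    return error_rate * num_classes / (num_classes - 1)


if __name__ == "__main__":
    # CIFAR10 has 10 classes: 0.15 * 10 / 9 is roughly 0.1667.
    print(round(noise_rate_for_error(0.15, 10), 4))
```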

Double-descent ERM vs. single-descent self-adaptive training

Double-descent ERM vs. single-descent self-adaptive training on the error-capacity curve. The vertical dashed line represents the interpolation threshold.

Double-descent ERM vs. single-descent self-adaptive training on the epoch-capacity curve. The dashed vertical line represents the initial epoch E_s of our approach.

Adversarial training

We use the state-of-the-art adversarial training algorithm TRADES as our baseline. main_adv.py contains the training and evaluation functions for the adversarial training setting on the CIFAR10 dataset.

Training scripts

  • Training and evaluation using the default parameters

    We provide our training scripts in the directory scripts/cifar10. As a concrete example, the command below trains the default model (i.e., WRN34-10) on the CIFAR10 dataset, using the PGD-10 attack ($\epsilon$=0.031) to generate adversarial examples:

    $ bash scripts/cifar10/run_trades_sat.sh [TRIAL_NAME]
    
  • Additional arguments

    • beta: hyper-parameter $1/\lambda$ in TRADES that controls the trade-off between natural accuracy and adversarial robustness
    • sat-es: initial epochs of our approach
    • sat-alpha: the momentum term $\alpha$ of our approach
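For orientation, the role of beta can be read off the TRADES objective. The form below is a sketch in notation introduced here ($f$ for the model's predictive distribution, $\mathrm{CE}$ for cross-entropy, $\mathrm{KL}$ for the Kullback-Leibler divergence) and assumes an $\ell_\infty$ perturbation budget $\epsilon$; it describes plain TRADES, not the self-adaptive variant.

```latex
$$
\min_{f}\; \mathbb{E}_{(x,y)} \Big[\, \mathrm{CE}\big(f(x),\, y\big)
  \;+\; \beta \max_{\|x' - x\|_\infty \le \epsilon} \mathrm{KL}\big(f(x)\,\|\,f(x')\big) \Big],
\qquad \beta = 1/\lambda
$$
```

A larger beta puts more weight on the robustness term, typically at some cost in natural accuracy.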

Robust evaluation script

Evaluate robust WRN-34-10 models on CIFAR10 under PGD-20 attack:

  $ python pgd_attack.py --model-dir "/path/to/checkpoints"

This command evaluates the 71st to 100th checkpoints in the specified path.
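As an illustration of which files that range covers, the sketch below lists the expected checkpoint paths. The `checkpoint_{epoch}.tar` naming is an assumption taken from the evaluation log quoted in the issues further down (`checkpoint_76.tar`); the helper itself is hypothetical and not part of pgd_attack.py.

```python
import os


def checkpoint_paths(model_dir, first_epoch=71, last_epoch=100,
                     pattern="checkpoint_{}.tar"):
    """Return the checkpoint paths for an inclusive epoch range."""
    return [os.path.join(model_dir, pattern.format(epoch))
            for epoch in range(first_epoch, last_epoch + 1)]


if __name__ == "__main__":
    paths = checkpoint_paths("/path/to/checkpoints")
    # Report any checkpoint that is missing before starting a long evaluation.
    missing = [p for p in paths if not os.path.isfile(p)]
    print(len(paths), "expected,", len(missing), "missing")
```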

Results

Self-Adaptive Training mitigates the overfitting issue and consistently improves TRADES.

We provide the checkpoint of our best-performing model on Google Drive and compare its natural and robust accuracy with TRADES below.

Attack (submitted by) \ Method        TRADES   TRADES + SAT
None (initial entry)                  84.92    83.48
PGD-20 (initial entry)                56.68    58.03
MultiTargeted-2000 (initial entry)    53.24    53.46
Auto-Attack+ (Francesco Croce)        53.08    53.29

Reference

For technical details, please check the conference version or the journal version of our paper.

@inproceedings{huang2020self,
  title={Self-Adaptive Training: beyond Empirical Risk Minimization},
  author={Huang, Lang and Zhang, Chao and Zhang, Hongyang},
  booktitle={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}

@article{huang2021self,
  title={Self-Adaptive Training: Bridging the Supervised and Self-Supervised Learning},
  author={Huang, Lang and Zhang, Chao and Zhang, Hongyang},
  journal={arXiv preprint arXiv:2101.08732},
  year={2021}
}

Contact

If you have any questions about this code, feel free to open an issue or contact [email protected].

self-adaptive-training's People

Contributors

layneh avatar

self-adaptive-training's Issues

Checkpoints for TRADES

Hi,

Thanks for making the code public. I am particularly intrigued by the improvement in robust accuracy (in combination with TRADES). This seems to be a very strong result (it could be improved further using https://arxiv.org/abs/1905.13736), and I am really interested in testing it further. Would it be possible for you to make the adversarially trained network's checkpoints public?

Test the cifar100 dataset with low natural accuracy and robust accuracy

Hi,

I used the training procedure to run self-adaptive training; however, during testing, the accuracy and robust_accuracy drop to about 0.2.

The only change I made to pgd_attack.py was to load CIFAR-100 instead of CIFAR-10; the model is set up with a 100-class output.

I used the default run_sat.sh settings to train the adversarial defense model.

What is the root cause of the low accuracy, and how can I resolve it?

Thanks & Regards!
Momo

About the computation power of "Self-Adaptive Training: beyond Empirical Risk Minimization"

Hello,

Thanks for sharing and for your outstanding contributions to adversarial learning.

I wonder how much computation power is needed to run the default settings, e.g., batch_size=128.

I tried to reproduce your experiment on a server with a 2080Ti (11 GB of memory), but the server restarted.

Should I make the batch size smaller to reproduce it?

Could you list the GPU architectures that are able to run it? How much GPU memory is needed?

Thanks & Regards!
Momo

About the evaluation problem after adversarial-training

Thanks a lot for sharing.

I have run the training program on our workstation with a 2080 GPU and batchsize=64.

However, the following error occurred during evaluation:

Files already downloaded and verified
evaluating /home/cdhk409/xiaoyang/Python/adv_training/self-adaptive-training/trained_models/checkpoint_76.tar...
Traceback (most recent call last):
  File "pgd_attack.py", line 106, in <module>
    main()
  File "pgd_attack.py", line 93, in main
    model.load_state_dict(torch.load(model_path)['state_dict'])
  File "/home/cdhk409/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.conv1.weight", "module.block1.layer.0.bn1.weight", "module.block1.layer.0.bn1.bias", "module.block1.layer.0.bn1.running_mean", "module.block1.layer.0.bn1.running_var", "module.block1.layer.0.conv1.weight", "module.block1.layer.0.bn2.weight", "module.block1.layer.0.bn2.bias", "module.block1.layer.0.bn2.running_mean", "module.block1.layer.0.bn2.running_var", "module.block1.layer.0.conv2.weight", "module.block1.layer.0.convShortcut.weight", "module.block1.layer.1.bn1.weight", "module.block1.layer.1.bn1.bias", "module.block1.layer.1.bn1.running_mean", "module.block1.layer.1.bn1.running_var", "module.block1.layer.1.conv1.weight", "module.block1.layer.1.bn2.weight", "module.block1.layer.1.bn2.bias", "module.block1.layer.1.bn2.running_mean", "module.block1.layer.1.bn2.running_var", "module.block1.layer.1.conv2.weight", "module.block1.layer.2.bn1.weight", "module.block1.layer.2.bn1.bias", "module.block1.layer.2.bn1.running_mean", "module.block1.layer.2.bn1.running_var", "module.block1.layer.2.conv1.weight", "module.block1.layer.2.bn2.weight", "module.block1.layer.2.bn2.bias", "module.block1.layer.2.bn2.running_mean", "module.block1.layer.2.bn2.running_var", "module.block1.layer.2.conv2.weight", "module.block1.layer.3.bn1.weight", "module.block1.layer.3.bn1.bias", "module.block1.layer.3.bn1.running_mean", "module.block1.layer.3.bn1.running_var", "module.block1.layer.3.conv1.weight", "module.block1.layer.3.bn2.weight", "module.block1.layer.3.bn2.bias", "module.block1.layer.3.bn2.running_mean", "module.block1.layer.3.bn2.running_var", "module.block1.layer.3.conv2.weight", "module.block1.layer.4.bn1.weight", "module.block1.layer.4.bn1.bias", "module.block1.layer.4.bn1.running_mean", "module.block1.layer.4.bn1.running_var", "module.block1.layer.4.conv1.weight", "module.block1.layer.4.bn2.weight", "module.block1.layer.4.bn2.bias", "module.block1.layer.4.bn2.running_mean", "module.block1.layer.4.bn2.running_var", 
"module.block1.layer.4.conv2.weight", "module.block2.layer.0.bn1.weight", "module.block2.layer.0.bn1.bias", "module.block2.layer.0.bn1.running_mean", "module.block2.layer.0.bn1.running_var", "module.block2.layer.0.conv1.weight", "module.block2.layer.0.bn2.weight", "module.block2.layer.0.bn2.bias", "module.block2.layer.0.bn2.running_mean", "module.block2.layer.0.bn2.running_var", "module.block2.layer.0.conv2.weight", "module.block2.layer.0.convShortcut.weight", "module.block2.layer.1.bn1.weight", "module.block2.layer.1.bn1.bias", "module.block2.layer.1.bn1.running_mean", "module.block2.layer.1.bn1.running_var", "module.block2.layer.1.conv1.weight", "module.block2.layer.1.bn2.weight", "module.block2.layer.1.bn2.bias", "module.block2.layer.1.bn2.running_mean", "module.block2.layer.1.bn2.running_var", "module.block2.layer.1.conv2.weight", "module.block2.layer.2.bn1.weight", "module.block2.layer.2.bn1.bias", "module.block2.layer.2.bn1.running_mean", "module.block2.layer.2.bn1.running_var", "module.block2.layer.2.conv1.weight", "module.block2.layer.2.bn2.weight", "module.block2.layer.2.bn2.bias", "module.block2.layer.2.bn2.running_mean", "module.block2.layer.2.bn2.running_var", "module.block2.layer.2.conv2.weight", "module.block2.layer.3.bn1.weight", "module.block2.layer.3.bn1.bias", "module.block2.layer.3.bn1.running_mean", "module.block2.layer.3.bn1.running_var", "module.block2.layer.3.conv1.weight", "module.block2.layer.3.bn2.weight", "module.block2.layer.3.bn2.bias", "module.block2.layer.3.bn2.running_mean", "module.block2.layer.3.bn2.running_var", "module.block2.layer.3.conv2.weight", "module.block2.layer.4.bn1.weight", "module.block2.layer.4.bn1.bias", "module.block2.layer.4.bn1.running_mean", "module.block2.layer.4.bn1.running_var", "module.block2.layer.4.conv1.weight", "module.block2.layer.4.bn2.weight", "module.block2.layer.4.bn2.bias", "module.block2.layer.4.bn2.running_mean", "module.block2.layer.4.bn2.running_var", "module.block2.layer.4.conv2.weight", 
"module.block3.layer.0.bn1.weight", "module.block3.layer.0.bn1.bias", "module.block3.layer.0.bn1.running_mean", "module.block3.layer.0.bn1.running_var", "module.block3.layer.0.conv1.weight", "module.block3.layer.0.bn2.weight", "module.block3.layer.0.bn2.bias", "module.block3.layer.0.bn2.running_mean", "module.block3.layer.0.bn2.running_var", "module.block3.layer.0.conv2.weight", "module.block3.layer.0.convShortcut.weight", "module.block3.layer.1.bn1.weight", "module.block3.layer.1.bn1.bias", "module.block3.layer.1.bn1.running_mean", "module.block3.layer.1.bn1.running_var", "module.block3.layer.1.conv1.weight", "module.block3.layer.1.bn2.weight", "module.block3.layer.1.bn2.bias", "module.block3.layer.1.bn2.running_mean", "module.block3.layer.1.bn2.running_var", "module.block3.layer.1.conv2.weight", "module.block3.layer.2.bn1.weight", "module.block3.layer.2.bn1.bias", "module.block3.layer.2.bn1.running_mean", "module.block3.layer.2.bn1.running_var", "module.block3.layer.2.conv1.weight", "module.block3.layer.2.bn2.weight", "module.block3.layer.2.bn2.bias", "module.block3.layer.2.bn2.running_mean", "module.block3.layer.2.bn2.running_var", "module.block3.layer.2.conv2.weight", "module.block3.layer.3.bn1.weight", "module.block3.layer.3.bn1.bias", "module.block3.layer.3.bn1.running_mean", "module.block3.layer.3.bn1.running_var", "module.block3.layer.3.conv1.weight", "module.block3.layer.3.bn2.weight", "module.block3.layer.3.bn2.bias", "module.block3.layer.3.bn2.running_mean", "module.block3.layer.3.bn2.running_var", "module.block3.layer.3.conv2.weight", "module.block3.layer.4.bn1.weight", "module.block3.layer.4.bn1.bias", "module.block3.layer.4.bn1.running_mean", "module.block3.layer.4.bn1.running_var", "module.block3.layer.4.conv1.weight", "module.block3.layer.4.bn2.weight", "module.block3.layer.4.bn2.bias", "module.block3.layer.4.bn2.running_mean", "module.block3.layer.4.bn2.running_var", "module.block3.layer.4.conv2.weight", "module.bn1.weight", "module.bn1.bias", 
"module.bn1.running_mean", "module.bn1.running_var", "module.fc.weight", "module.fc.bias".
Unexpected key(s) in state_dict: "conv1.weight", "block1.layer.0.bn1.weight", "block1.layer.0.bn1.bias", "block1.layer.0.bn1.running_mean", "block1.layer.0.bn1.running_var", "block1.layer.0.bn1.num_batches_tracked", "block1.layer.0.conv1.weight", "block1.layer.0.bn2.weight", "block1.layer.0.bn2.bias", "block1.layer.0.bn2.running_mean", "block1.layer.0.bn2.running_var", "block1.layer.0.bn2.num_batches_tracked", "block1.layer.0.conv2.weight", "block1.layer.0.convShortcut.weight", "block1.layer.1.bn1.weight", "block1.layer.1.bn1.bias", "block1.layer.1.bn1.running_mean", "block1.layer.1.bn1.running_var", "block1.layer.1.bn1.num_batches_tracked", "block1.layer.1.conv1.weight", "block1.layer.1.bn2.weight", "block1.layer.1.bn2.bias", "block1.layer.1.bn2.running_mean", "block1.layer.1.bn2.running_var", "block1.layer.1.bn2.num_batches_tracked", "block1.layer.1.conv2.weight", "block1.layer.2.bn1.weight", "block1.layer.2.bn1.bias", "block1.layer.2.bn1.running_mean", "block1.layer.2.bn1.running_var", "block1.layer.2.bn1.num_batches_tracked", "block1.layer.2.conv1.weight", "block1.layer.2.bn2.weight", "block1.layer.2.bn2.bias", "block1.layer.2.bn2.running_mean", "block1.layer.2.bn2.running_var", "block1.layer.2.bn2.num_batches_tracked", "block1.layer.2.conv2.weight", "block1.layer.3.bn1.weight", "block1.layer.3.bn1.bias", "block1.layer.3.bn1.running_mean", "block1.layer.3.bn1.running_var", "block1.layer.3.bn1.num_batches_tracked", "block1.layer.3.conv1.weight", "block1.layer.3.bn2.weight", "block1.layer.3.bn2.bias", "block1.layer.3.bn2.running_mean", "block1.layer.3.bn2.running_var", "block1.layer.3.bn2.num_batches_tracked", "block1.layer.3.conv2.weight", "block1.layer.4.bn1.weight", "block1.layer.4.bn1.bias", "block1.layer.4.bn1.running_mean", "block1.layer.4.bn1.running_var", "block1.layer.4.bn1.num_batches_tracked", "block1.layer.4.conv1.weight", "block1.layer.4.bn2.weight", "block1.layer.4.bn2.bias", "block1.layer.4.bn2.running_mean", "block1.layer.4.bn2.running_var", 
"block1.layer.4.bn2.num_batches_tracked", "block1.layer.4.conv2.weight", "block2.layer.0.bn1.weight", "block2.layer.0.bn1.bias", "block2.layer.0.bn1.running_mean", "block2.layer.0.bn1.running_var", "block2.layer.0.bn1.num_batches_tracked", "block2.layer.0.conv1.weight", "block2.layer.0.bn2.weight", "block2.layer.0.bn2.bias", "block2.layer.0.bn2.running_mean", "block2.layer.0.bn2.running_var", "block2.layer.0.bn2.num_batches_tracked", "block2.layer.0.conv2.weight", "block2.layer.0.convShortcut.weight", "block2.layer.1.bn1.weight", "block2.layer.1.bn1.bias", "block2.layer.1.bn1.running_mean", "block2.layer.1.bn1.running_var", "block2.layer.1.bn1.num_batches_tracked", "block2.layer.1.conv1.weight", "block2.layer.1.bn2.weight", "block2.layer.1.bn2.bias", "block2.layer.1.bn2.running_mean", "block2.layer.1.bn2.running_var", "block2.layer.1.bn2.num_batches_tracked", "block2.layer.1.conv2.weight", "block2.layer.2.bn1.weight", "block2.layer.2.bn1.bias", "block2.layer.2.bn1.running_mean", "block2.layer.2.bn1.running_var", "block2.layer.2.bn1.num_batches_tracked", "block2.layer.2.conv1.weight", "block2.layer.2.bn2.weight", "block2.layer.2.bn2.bias", "block2.layer.2.bn2.running_mean", "block2.layer.2.bn2.running_var", "block2.layer.2.bn2.num_batches_tracked", "block2.layer.2.conv2.weight", "block2.layer.3.bn1.weight", "block2.layer.3.bn1.bias", "block2.layer.3.bn1.running_mean", "block2.layer.3.bn1.running_var", "block2.layer.3.bn1.num_batches_tracked", "block2.layer.3.conv1.weight", "block2.layer.3.bn2.weight", "block2.layer.3.bn2.bias", "block2.layer.3.bn2.running_mean", "block2.layer.3.bn2.running_var", "block2.layer.3.bn2.num_batches_tracked", "block2.layer.3.conv2.weight", "block2.layer.4.bn1.weight", "block2.layer.4.bn1.bias", "block2.layer.4.bn1.running_mean", "block2.layer.4.bn1.running_var", "block2.layer.4.bn1.num_batches_tracked", "block2.layer.4.conv1.weight", "block2.layer.4.bn2.weight", "block2.layer.4.bn2.bias", "block2.layer.4.bn2.running_mean", 
"block2.layer.4.bn2.running_var", "block2.layer.4.bn2.num_batches_tracked", "block2.layer.4.conv2.weight", "block3.layer.0.bn1.weight", "block3.layer.0.bn1.bias", "block3.layer.0.bn1.running_mean", "block3.layer.0.bn1.running_var", "block3.layer.0.bn1.num_batches_tracked", "block3.layer.0.conv1.weight", "block3.layer.0.bn2.weight", "block3.layer.0.bn2.bias", "block3.layer.0.bn2.running_mean", "block3.layer.0.bn2.running_var", "block3.layer.0.bn2.num_batches_tracked", "block3.layer.0.conv2.weight", "block3.layer.0.convShortcut.weight", "block3.layer.1.bn1.weight", "block3.layer.1.bn1.bias", "block3.layer.1.bn1.running_mean", "block3.layer.1.bn1.running_var", "block3.layer.1.bn1.num_batches_tracked", "block3.layer.1.conv1.weight", "block3.layer.1.bn2.weight", "block3.layer.1.bn2.bias", "block3.layer.1.bn2.running_mean", "block3.layer.1.bn2.running_var", "block3.layer.1.bn2.num_batches_tracked", "block3.layer.1.conv2.weight", "block3.layer.2.bn1.weight", "block3.layer.2.bn1.bias", "block3.layer.2.bn1.running_mean", "block3.layer.2.bn1.running_var", "block3.layer.2.bn1.num_batches_tracked", "block3.layer.2.conv1.weight", "block3.layer.2.bn2.weight", "block3.layer.2.bn2.bias", "block3.layer.2.bn2.running_mean", "block3.layer.2.bn2.running_var", "block3.layer.2.bn2.num_batches_tracked", "block3.layer.2.conv2.weight", "block3.layer.3.bn1.weight", "block3.layer.3.bn1.bias", "block3.layer.3.bn1.running_mean", "block3.layer.3.bn1.running_var", "block3.layer.3.bn1.num_batches_tracked", "block3.layer.3.conv1.weight", "block3.layer.3.bn2.weight", "block3.layer.3.bn2.bias", "block3.layer.3.bn2.running_mean", "block3.layer.3.bn2.running_var", "block3.layer.3.bn2.num_batches_tracked", "block3.layer.3.conv2.weight", "block3.layer.4.bn1.weight", "block3.layer.4.bn1.bias", "block3.layer.4.bn1.running_mean", "block3.layer.4.bn1.running_var", "block3.layer.4.bn1.num_batches_tracked", "block3.layer.4.conv1.weight", "block3.layer.4.bn2.weight", "block3.layer.4.bn2.bias", 
"block3.layer.4.bn2.running_mean", "block3.layer.4.bn2.running_var", "block3.layer.4.bn2.num_batches_tracked", "block3.layer.4.conv2.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", "bn1.running_var", "bn1.num_batches_tracked", "fc.weight", "fc.bias".

I want to test epochs 1 to 100, but the same issue also occurs with your default evaluation epoch range. Here are the checkpoint files; the model path parameter points to the directory that contains them.
[image: listing of the checkpoint files]

How can I resolve the issue?

Thanks & Regards!
Momo
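In the log above, every missing key carries a `module.` prefix and every unexpected key lacks it. That pattern usually means a checkpoint saved from a bare model is being loaded into a model wrapped in `torch.nn.DataParallel`, or the other way round. A minimal sketch of one possible workaround is to rename the keys before loading. The helper names are hypothetical, and the sketch works on any dict-like state dict, so it needs no PyTorch import.

```python
from collections import OrderedDict

PREFIX = "module."  # prefix that DataParallel adds to parameter names


def add_module_prefix(state_dict):
    """Rename keys so they match a DataParallel-wrapped model."""
    return OrderedDict(
        (key if key.startswith(PREFIX) else PREFIX + key, value)
        for key, value in state_dict.items()
    )


def strip_module_prefix(state_dict):
    """Rename keys so they match a bare (unwrapped) model."""
    return OrderedDict(
        (key[len(PREFIX):] if key.startswith(PREFIX) else key, value)
        for key, value in state_dict.items()
    )


# Possible use for the case in the log (wrapped model, unprefixed checkpoint):
#   state = torch.load(model_path)['state_dict']
#   model.load_state_dict(add_module_prefix(state))
```

Loading the unprefixed state dict into `model.module` instead of `model` should have the same effect, since `model.module` is the bare network inside the wrapper.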

tabular data/ noisy instances

Hi,
thanks for sharing your implementation. I have two questions about it:

  1. Does it also work on tabular data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

Error when loading pretrained checkpoint.

I use your code models/wideresnet.py to load the given checkpoint model-wideres-epoch78.pth. But it raised the error:

RuntimeError: Error(s) in loading state_dict for DataParallel: Unexpected key(s) in state_dict: "module.sub_block1.layer.0.bn1.weight", "module.sub_block1.layer.0.bn1.bias", "module.sub_block1.layer.0.bn1.running_mean", "module.sub_block1.layer.0.bn1.running_var", "module.sub_block1.layer.0.bn1.num_batches_tracked", "module.sub_block1.layer.0.conv1.weight", "module.sub_block1.layer.0.bn2.weight", "module.sub_block1.layer.0.bn2.bias", "module.sub_block1.layer.0.bn2.running_mean", "module.sub_block1.layer.0.bn2.running_var", "module.sub_block1.layer.0.bn2.num_batches_tracked", "module.sub_block1.layer.0.conv2.weight", "module.sub_block1.layer.0.convShortcut.weight", 
"module.sub_block1.layer.1.bn1.weight", "module.sub_block1.layer.1.bn1.bias", "module.sub_block1.layer.1.bn1.running_mean", "module.sub_block1.layer.1.bn1.running_var", "module.sub_block1.layer.1.bn1.num_batches_tracked", "module.sub_block1.layer.1.conv1.weight", "module.sub_block1.layer.1.bn2.weight", "module.sub_block1.layer.1.bn2.bias", "module.sub_block1.layer.1.bn2.running_mean", "module.sub_block1.layer.1.bn2.running_var", "module.sub_block1.layer.1.bn2.num_batches_tracked", "module.sub_block1.layer.1.conv2.weight", 
"module.sub_block1.layer.2.bn1.weight", "module.sub_block1.layer.2.bn1.bias", "module.sub_block1.layer.2.bn1.running_mean", "module.sub_block1.layer.2.bn1.running_var", "module.sub_block1.layer.2.bn1.num_batches_tracked", "module.sub_block1.layer.2.conv1.weight", "module.sub_block1.layer.2.bn2.weight", "module.sub_block1.layer.2.bn2.bias", "module.sub_block1.layer.2.bn2.running_mean", "module.sub_block1.layer.2.bn2.running_var", "module.sub_block1.layer.2.bn2.num_batches_tracked", "module.sub_block1.layer.2.conv2.weight", 
"module.sub_block1.layer.3.bn1.weight", "module.sub_block1.layer.3.bn1.bias", "module.sub_block1.layer.3.bn1.running_mean", "module.sub_block1.layer.3.bn1.running_var", "module.sub_block1.layer.3.bn1.num_batches_tracked", "module.sub_block1.layer.3.conv1.weight", "module.sub_block1.layer.3.bn2.weight", "module.sub_block1.layer.3.bn2.bias", "module.sub_block1.layer.3.bn2.running_mean", "module.sub_block1.layer.3.bn2.running_var", "module.sub_block1.layer.3.bn2.num_batches_tracked", "module.sub_block1.layer.3.conv2.weight", 
"module.sub_block1.layer.4.bn1.weight"
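All of the unexpected keys quoted above belong to a `sub_block1` module that the checkpoint contains but the model definition apparently does not. One possible workaround, under the unverified assumption that these weights are not needed by the model's forward pass, is to drop them before loading. The helper below is a hypothetical sketch that works on any dict-like state dict.

```python
def drop_keys_containing(state_dict, fragment="sub_block1"):
    """Split a state dict into the entries to keep and the names dropped.

    Returns (kept, dropped): `kept` is a new dict without any key that
    contains `fragment`; `dropped` lists the removed key names so they
    can be logged and inspected.
    """
    kept = {key: value for key, value in state_dict.items()
            if fragment not in key}
    dropped = [key for key in state_dict if fragment in key]
    return kept, dropped


# Possible use (checkpoint layout assumed, not confirmed by the source):
#   kept, dropped = drop_keys_containing(torch.load(path))
#   model.load_state_dict(kept)
```

Passing `strict=False` to `load_state_dict` also lets the load proceed despite unexpected keys, but it ignores missing keys as well, so filtering explicitly makes it easier to see what was skipped.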
