
ai-secure / infobert

82 stars · 3 watchers · 7 forks · 120 KB

[ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Python 98.02%, Shell 1.98%
bert language-models adversarial-attacks adversarial-defense adversarial-robustness roberta information-theory

infobert's People

Contributors

boxin-wbx, quantumtechniker


infobert's Issues

Runtime Error - "Distributed package doesn't have NCCL" when running runexp

When I run source setup.sh && runexp anli-full infobert roberta-large 2e-5 32 128 -1 1000 42 1e-5 5e-3 6 0.1 0 4e-2 8e-2 0 3 5e-3 0.5 0.9, as specified in the README in the ANLI directory, I encounter the error RuntimeError: Distributed package doesn't have NCCL built in.

Do you have any advice on how to fix this?

Thank you! I really admire your work.
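
For reference, here is a minimal diagnostic sketch (not part of this repository) that checks whether the installed PyTorch build actually ships the NCCL backend; torch.distributed.launch can only create an NCCL process group if it does, so the usual workarounds are reinstalling a CUDA-enabled PyTorch build or switching the process group to the gloo backend.

# Diagnostic only: confirm that distributed support and the NCCL backend are
# compiled into the installed PyTorch build before launching runexp.
import torch
import torch.distributed as dist

print(torch.__version__, torch.version.cuda)  # CUDA version is None for CPU-only builds
print(dist.is_available())                    # distributed package compiled in?
print(dist.is_nccl_available())               # NCCL backend compiled in? (False produces the error above)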

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

Hi, I'm trying to run the following command:
source setup.sh && runexp anli-part infobert roberta-base 2e-5 32 128 -1 1000 42 1e-5 5e-3 6 0.1 0 4e-2 8e-2 0 3 5e-3 0.5 0.9
but I get the following error.
Traceback:

04/08/2022 19:30:17 - INFO - datasets.anli -   Saving features into cached file anli_data/cached_dev_RobertaTokenizer_128_anli-part [took 0.690 s]
04/08/2022 19:30:17 - INFO - filelock -   Lock 139893720074960 released on anli_data/cached_dev_RobertaTokenizer_128_anli-part.lock
04/08/2022 19:30:17 - INFO - local_robust_trainer -   You are instantiating a Trainer but W&B is not installed. To use wandb logging, run `pip install wandb; wandb login` see https://docs.wandb.com/huggingface.
04/08/2022 19:30:17 - INFO - local_robust_trainer -   ***** Running training *****
04/08/2022 19:30:17 - INFO - local_robust_trainer -     Num examples = 942069
04/08/2022 19:30:17 - INFO - local_robust_trainer -     Num Epochs = 3
04/08/2022 19:30:17 - INFO - local_robust_trainer -     Instantaneous batch size per device = 32
04/08/2022 19:30:17 - INFO - local_robust_trainer -     Total train batch size (w. parallel, distributed & accumulation) = 32
04/08/2022 19:30:17 - INFO - local_robust_trainer -     Gradient Accumulation steps = 1
04/08/2022 19:30:17 - INFO - local_robust_trainer -     Total optimization steps = 88320
Iteration:   0%|          | 0/29440 [00:00<?, ?it/s]
Epoch:   0%|          | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "./run_anli.py", line 395, in <module>
    main()
  File "./run_anli.py", line 239, in main
    model_path=model_args.model_name_or_path if os.path.isdir(model_args.model_name_or_path) else None
  File "/root/InfoBERT/ANLI/local_robust_trainer.py", line 731, in train
    full_loss, loss_dict = self._adv_training_step(model, inputs, optimizer)
  File "/root/InfoBERT/ANLI/local_robust_trainer.py", line 1031, in _adv_training_step
    outputs = model(**inputs)
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 447, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/InfoBERT/ANLI/models/roberta.py", line 345, in forward
    inputs_embeds=inputs_embeds,
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/InfoBERT/ANLI/models/bert.py", line 822, in forward
    output_hidden_states=output_hidden_states,
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/InfoBERT/ANLI/models/bert.py", line 494, in forward
    output_attentions,
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/InfoBERT/ANLI/models/bert.py", line 416, in forward
    hidden_states, attention_mask, head_mask, output_attentions=output_attentions,
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/InfoBERT/ANLI/models/bert.py", line 347, in forward
    hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions,
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/InfoBERT/ANLI/models/bert.py", line 239, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
Traceback (most recent call last):
  File "/root/miniconda3/envs/infobert/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/miniconda3/envs/infobert/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/root/miniconda3/envs/infobert/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/envs/infobert/bin/python', '-u', './run_anli.py', '--local_rank=0', '--model_name_or_path', 'roberta-base', '--task_name', 'anli-part', '--do_train', '--do_eval', '--data_dir', 'anli_data', '--max_seq_length', '128', '--per_device_train_batch_size', '32', '--learning_rate', '2e-5', '--max_steps', '-1', '--warmup_steps', '1000', '--weight_decay', '1e-5', '--seed', '42', '--beta', '5e-3', '--logging_dir', 'infobert-roberta-base-anli-part-sl128-lr2e-5-bs32-ts-1-ws1000-wd1e-5-seed42-beta5e-3-alpha5e-3--cl0.5-ch0.9-alr4e-2-amag8e-2-anm0-as3-hdp0.1-adp0-version6', '--output_dir', 'infobert-roberta-base-anli-part-sl128-lr2e-5-bs32-ts-1-ws1000-wd1e-5-seed42-beta5e-3-alpha5e-3--cl0.5-ch0.9-alr4e-2-amag8e-2-anm0-as3-hdp0.1-adp0-version6', '--version', '6', '--evaluate_during_training', '--logging_steps', '500', '--save_steps', '500', '--hidden_dropout_prob', '0.1', '--attention_probs_dropout_prob', '0', '--overwrite_output_dir', '--adv_lr', '4e-2', '--adv_init_mag', '8e-2', '--adv_max_norm', '0', '--adv_steps', '3', '--alpha', '5e-3', '--cl', '0.5', '--ch', '0.9']' returned non-zero exit status 1.

Do you know how to fix this?
Thank you so much.

Other Information:

  • OS: Ubuntu 20.04.3 LTS
  • GPU: NVIDIA A100
  • Python 3.7.13
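
One common cause, offered here only as a hypothesis: CUBLAS_STATUS_EXECUTION_FAILED on an A100 frequently means the installed PyTorch build targets a CUDA version that does not support the GPU's compute capability (the A100 is sm_80 and requires CUDA 11 or newer). A small check, independent of this repository:

# Illustrative check: compare the PyTorch/CUDA build against the GPU, and reproduce
# the failing cublasSgemm call with a tiny matmul in isolation.
import torch

print(torch.__version__, torch.version.cuda)  # build versions; CUDA < 11 cannot target sm_80
print(torch.cuda.get_device_name(0))          # should report the A100
print(torch.cuda.get_device_capability(0))    # (8, 0) for an A100
x = torch.randn(4, 4, device="cuda")
print(x @ x)                                  # same cublasSgemm path as in the traceback above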

How are the A1, A2, A3 scores calculated from the R1, R2, R3 test results?

The evaluation results shown here in the README and the ones in the paper are based on different metrics. The paper reports A1, A2, A3 scores along with accuracy on adv-MNLI, adv-SNLI, etc., whereas running the evaluation here gives round-wise accuracy on both the test and dev data. Is there a formula to calculate the A1, A2, A3 scores from the test scores reported in this repo?

Thanks in advance for the help :)
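
If A1, A2, and A3 simply denote the per-round (R1/R2/R3) test accuracies, which is an assumption rather than something the repository states, then no extra formula is needed for them; the only derived quantity would be an overall ANLI accuracy, i.e. the example-count-weighted average over the rounds. A hypothetical illustration with made-up numbers:

# Hypothetical: combine per-round accuracies into one overall accuracy, weighting each
# round by its number of test examples (both the accuracies and the counts are made up).
def combined_accuracy(accuracies, counts):
    total = sum(counts)
    return sum(a * n for a, n in zip(accuracies, counts)) / total

print(combined_accuracy([0.75, 0.50, 0.48], [1000, 1000, 1200]))  # ~0.57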

Training a Model from Huggingface with InfoBERT

Hi, I am attempting to use an ALBERT model, e.g. albert-base-v2 from https://huggingface.co/albert-base-v2, passing albert-base-v2 as the model argument when I run source setup.sh && runexp.

Unfortunately, I am getting an AttributeError: 'AlbertForSequenceClassification' object has no attribute 'clear_mask'.

Are there any functions I need to implement to make this work, or do you have any advice on getting an approach like this to work with your code base?

Thank you!
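
For what it's worth, here is a hypothetical sketch of the shape of a workaround, assuming the trainer calls methods such as clear_mask that InfoBERT's custom BERT/RoBERTa classes define but stock Hugging Face models lack. The stub below only silences the AttributeError and does not reproduce InfoBERT's behaviour; the real implementation would have to be ported from the repository's models/roberta.py.

# Hypothetical sketch only: subclass the stock Hugging Face model and add the hook(s)
# the trainer expects. A bare stub makes the call succeed but does NOT implement
# InfoBERT's information-theoretic regularizers.
from transformers import AlbertForSequenceClassification

class AlbertForInfoBERT(AlbertForSequenceClassification):
    def clear_mask(self):
        # Port the real implementation from InfoBERT's models/roberta.py here.
        pass

model = AlbertForInfoBERT.from_pretrained("albert-base-v2")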
