Comments (3)
@BaohaoLiao Hey, thanks for your question! For the MNLI dataset, we use the validation_matched split for both validation and testing. (I will make this clear in the next revision. I don't think the RED paper was clear about this either, so I figured it out by emailing the authors! I might also summarize what the RED paper's appendix says in the ReFT paper, so that the validation setup and evaluation metric (accuracy, correlation, etc.) are self-contained.)
To reproduce, here is an example script for RoBERTa-base. For RoBERTa-large, you can copy the hyperparameters from our appendix to reproduce:
python train.py -task glue \
-train_dataset mnli \
-model FacebookAI/roberta-base \
-seed 42 -l all -r 1 -p f1 -e 40 -lr 6e-4 \
-type LoreftIntervention \
-gradient_accumulation_steps 1 \
-batch_size 32 \
-eval_batch_size 32 \
-test_split validation_matched \
-max_length 256 \
--metric_for_best_model accuracy \
--dropout 0.05 \
--weight_decay 0.0000 \
--warmup_ratio 0.00 \
--logging_steps 20 \
--allow_cls_grad
Use the seeds {42,43,44,45,46}. For the validation set partition, please refer to our code for details. Basically, we split the validation set randomly (based on the seed) into two parts: one is used to select the best model, and we report the final accuracy on the held-out part.
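The seeded partition described above could be sketched roughly as follows. This is only an illustration, not the code from the repository: the helper name `split_validation`, the 50/50 `holdout_fraction`, and the placeholder dataset size are all assumptions, so please check the actual training code for the real split sizes.

```python
import random


def split_validation(indices, seed, holdout_fraction=0.5):
    """Randomly partition validation indices into (selection, held_out).

    Hypothetical helper: the 50/50 default is an assumption, not the
    ratio used in the repository.
    """
    rng = random.Random(seed)  # the same seed always gives the same partition
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    # First part selects the best checkpoint, second part is only reported on.
    return sorted(shuffled[:cut]), sorted(shuffled[cut:])


seeds = [42, 43, 44, 45, 46]
n_examples = 1000  # placeholder size, not the real validation_matched size

for seed in seeds:
    selection, held_out = split_validation(range(n_examples), seed)
    # The two parts never overlap, so the reported accuracy is on unseen data.
    assert not set(selection) & set(held_out)
    print(seed, len(selection), len(held_out))
```

Because the partition depends on the seed, each of the five runs selects its best model and reports accuracy on a different random split of the same validation data.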
Please let me know if you have other questions! And feel free to close the ticket if you feel like your question is addressed.
Thanks for your interest!
Also attaching the GLUE benchmark description that will be added to the Appendix to provide more details. Please also see Appendix A.1 of the RED paper for the original implementation (I basically paraphrased their setup description, so credit goes to them).
Thank you very much for your timely help.