Comments (2)
Hey, thanks for trying it out!
We did implement the modified dropout in the code for both fairseq and Hugging Face Transformers. Most of the modifications are in the modeling code. For easier reference, see the lines involving the variable dp_mask in huggingface-transformers/src/transformers/modeling_albert.py.
For a comparison of the results with and without such a modification, please refer to Table 4 in the paper.
from freelb.
Wow. Thanks for your prompt reply! I get it.
I use the vanilla transformers framework and was confused about the "dp_mask" parameter. It does play a role in your revised "Albert" version, where it controls the dropout mask used in the forward pass. Thanks. Maybe we need a balance between performance and the amount of code modification when following the same dropout suggestion, especially for the vanilla transformers framework, where other BERT-variant models would need the same modification as your "Albert".
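For anyone else puzzling over what a reusable dropout mask means in practice, here is a minimal plain-Python sketch of the idea. It is not the FreeLB code: `sample_dropout_mask` and `forward` are hypothetical helpers, and the toy "layer" only applies dropout. The assumption is that the mask is sampled once and then passed back in as `dp_mask` on later forward passes, so every pass drops the same units.

```python
import random


def sample_dropout_mask(size, p, rng):
    """Sample an inverted-dropout mask: 0.0 for dropped units, 1/(1-p) for kept ones."""
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else keep_scale for _ in range(size)]


def forward(hidden, dp_mask=None, p=0.1, rng=None):
    """Toy layer (hypothetical): apply dropout to `hidden`.

    If `dp_mask` is given it is reused as-is; otherwise a fresh mask is sampled.
    Returns the output together with the mask that was actually used.
    """
    if dp_mask is None:
        dp_mask = sample_dropout_mask(len(hidden), p, rng or random.Random())
    return [h * m for h, m in zip(hidden, dp_mask)], dp_mask


rng = random.Random(0)
hidden = [1.0] * 8

# First pass samples the mask.
_, dp_mask = forward(hidden, p=0.5, rng=rng)

# A later pass on slightly perturbed inputs reuses the same mask,
# so exactly the same positions are dropped.
perturbed = [h + 0.01 for h in hidden]
out, reused = forward(perturbed, dp_mask=dp_mask)
```

Porting this to another BERT-variant model would mean threading such a mask argument through every module that calls dropout, which is the code-modification cost mentioned above.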
Related Issues (18)
- Could you add some comments in the code? HOT 1
- FreeLB didn't use the original training samples? HOT 2
- Reproducing results from the paper with roberta using fairseq HOT 5
- Having issues with training RoBERTa. Loss not decreasing HOT 2
- the "dp_mask" problem HOT 4
- API Key HOT 1
- Invariance of the word embedding space
- Errors generated during data preprocessing HOT 2
- Some confusion about the detach operation and embeds_init
- NaN encounted if FreeLB is used at the beginning of finetune stage
- FreeLB-RoBERTa within HuggingFace's transformers? HOT 1
- 'AlbertForSequenceClassification' object has no attribute 'encoder' HOT 2
- Is it still working with update_freq > 1? HOT 2
- Regarding the release of FreeLB ^_^
- ImportError: cannot import name 'glue_criterion_metrics' from 'transformers' HOT 6
- Does anyone meet the Nan error during the end epochs of training? HOT 5
- Would you please release the hyper-parameters for FreeLB based on ALBERT(hugging-face) HOT 6