Comments (12)
Ok, thanks!
I'll try testing something like this and will report back.
Hey @pvcastro, a couple questions:
- In all experiments (BERT-AllenNLP, RoBERTa-AllenNLP, BERT-transformers, RoBERTa-transformers) were you using the same optimizer?
- When you used transformers directly (for BERT-transformers and RoBERTa-transformers), was that a CRF model as well, or was that just using the `(Ro|B)ertaForSequenceClassification` models?
Hi @epwalsh , thanks for the feedback!
- Yes, I was using the `huggingface_adamw` optimizer.
- No, it wasn't an adaptation with CRF; I used the stock `run_ner` script from HF's examples. But I believe the CRF layer would only improve results, as it usually does with BERT models. A minimal sketch of what that non-CRF transformers baseline boils down to is below.
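To be concrete, the non-CRF path is roughly the following (just a sketch, not the actual `run_ner` configuration; the model name, label count, and toy input are all illustrative):

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Illustrative setup; the real experiments used the stock run_ner script.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForTokenClassification.from_pretrained("roberta-base", num_labels=9)

inputs = tokenizer("Maria lives in Lisbon", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_wordpieces, num_labels)

# Greedy per-wordpiece decoding; a CRF layer would replace this argmax
predictions = logits.argmax(dim=-1)
```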
Gotcha! Oh yes, I meant `BertForTokenClassification`, not `BertForSequenceClassification` 🤦
So I think the most likely source for a bug would be in `PretrainedTransformerMismatched(Embedder|TokenIndexer)`. And any differences between BERT and RoBERTa would probably have to do with tokenization. See, for example: `allennlp/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py`, lines 295 to 311 at commit `8571d93`.
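For example, the special tokens and whitespace handling differ between the two models. A quick illustrative check (the model names and sample sentence are arbitrary):

```python
from transformers import AutoTokenizer

bert = AutoTokenizer.from_pretrained("bert-base-cased")
roberta = AutoTokenizer.from_pretrained("roberta-base")

sentence = "Maria lives in Lisbon"

# BERT: WordPiece, wrapped in [CLS] ... [SEP]
print(bert.convert_ids_to_tokens(bert(sentence)["input_ids"]))

# RoBERTa: byte-level BPE, wrapped in <s> ... </s>;
# word-initial pieces carry a leading "Ġ" (an encoded space)
print(roberta.convert_ids_to_tokens(roberta(sentence)["input_ids"]))
```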
I was assuming that just running some unit tests from the AllenNLP repository, to confirm that these embedders/tokenizers produce tokens with the same special tokens as the RoBERTa architecture, would be enough to rule these out. I ran some tests using RoBERTa and confirmed that it's not relying on CLS. Was this too superficial to reach any conclusions?
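For reference, a minimal version of that special-token check might look like this (just a sketch; the sample sentence is arbitrary):

```python
from allennlp.data.tokenizers import PretrainedTransformerTokenizer

tokenizer = PretrainedTransformerTokenizer("roberta-base")
tokens = tokenizer.tokenize("Maria lives in Lisbon")

# For RoBERTa the sequence should be wrapped in <s> ... </s>,
# not in BERT's [CLS] ... [SEP]
print([t.text for t in tokens])
```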
I'm not sure. I mean, I thought we did have pretty good test coverage there, but I know for a fact that's one of the most brittle pieces of code in the whole library. It would break all the time with new releases of transformers. So that's my best guess.
Do you think it makes sense for me to run additional tests for the embedder, comparing embeddings produced by a raw `RobertaModel` and the actual `PretrainedTransformerMismatchedEmbedder`? To try to see if they are somehow getting "corrupted" in the framework.
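The AllenNLP side of that comparison could look roughly like this (a sketch assuming the 1.x/2.x APIs; the toy sentence is illustrative). One caveat: the mismatched embedder averages the wordpiece vectors within each word, so its outputs will not match raw `RobertaModel` wordpiece outputs one-to-one:

```python
import torch
from allennlp.data import Instance, Token, Vocabulary
from allennlp.data.batch import Batch
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import PretrainedTransformerMismatchedIndexer
from allennlp.modules.token_embedders import PretrainedTransformerMismatchedEmbedder

indexer = PretrainedTransformerMismatchedIndexer("roberta-base")
embedder = PretrainedTransformerMismatchedEmbedder("roberta-base")

words = "Maria lives in Lisbon".split()
field = TextField([Token(w) for w in words], {"tokens": indexer})
batch = Batch([Instance({"text": field})])
batch.index_instances(Vocabulary())
tensors = batch.as_tensor_dict()

# One vector per *word*: wordpiece vectors are pooled internally
with torch.no_grad():
    embeddings = embedder(**tensors["text"]["tokens"])
print(embeddings.shape)  # (batch, len(words), hidden_size)
```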
I guess I would start by looking very closely at the exact tokens that are being used for each word by the `PretrainedTransformerMismatchedEmbedder`. Maybe pick out a couple test instances to check where the performance gap between the BERT and RoBERTa predictions is largest.
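One low-tech way to do that inspection (a sketch; `intra_word_tokenize` is, as far as I know, what the mismatched indexer uses under the hood, and the sentence is arbitrary):

```python
from allennlp.data.tokenizers import PretrainedTransformerTokenizer

tokenizer = PretrainedTransformerTokenizer("roberta-base")
words = "Maria lives in Lisbon".split()

# Returns the full wordpiece sequence plus, for each word,
# the (start, end) offsets of its wordpieces in that sequence
wordpieces, offsets = tokenizer.intra_word_tokenize(words)
for word, offset in zip(words, offsets):
    if offset is None:  # a word can tokenize to no wordpieces
        print(word, "-> (no wordpieces)")
    else:
        start, end = offset  # inclusive indices into `wordpieces`
        print(word, "->", [t.text for t in wordpieces[start : end + 1]])
```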
This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread.
Sorry, I'll try to get back to this next week; haven't had the time yet.
No rush, I thought adding the "question" label would stop @github-actions bot from closing this, but I guess not.
Related Issues (20)
- The DataLoader Needs to Handle Dirty Examples. HOT 8
- error message occurred: "zipfile.BadZipFile: File is not a zip file" HOT 3
- will update to support latest pytorch? HOT 9
- Rich 12.1.0 has been yanked, but has been pinned in `requirements.txt` HOT 1
- Incompatibile packages HOT 2
- Unclear how to use text2sql model HOT 5
- Can't load models with .zip extension HOT 2
- AllenNLP-Light! HOT 2
- Is it possible to load my own quantized model from local HOT 3
- Questions about start training from checkpoint using --recover HOT 1
- Is it possible to load my own quantized model from local HOT 9
- SRL BERT performing poorly for german dataset HOT 1
- Remove upper bounds for requirements HOT 1
- Alternative semantic role labeling model HOT 3
- AutoTokenizer config error when load clipmodel HOT 2
- When 'instances_per_epoch' is set up in the class MultiTaskDataLoader, the function __len__ in it will return a wrong answer. HOT 1
- New version with upper bounds on dependencies removed HOT 2
- Incomplete model_state_epoch files HOT 1
- allennlp.common.checks.ConfigurationError: key "token_embedders" is required at location "model.text_field_embedder." HOT 3