Comments (10)
DirectProbe does not need to distinguish between training and test sets. It takes a labeled dataset and produces a set of clusters; it has no notion of a training or test set.
In Section 4.1, we apply DirectProbe to the training and test sets to show how fine-tuning changes the geometry of the embeddings, i.e., fine-tuning makes the training and test sets diverge.
I hope that answers your questions.
Thanks!
from bert-fine-tuning-analysis.
Thanks! Now I understand that line.
Then are train.txt and test.txt in the config file probed separately?
entities_path = ${common}/entities/train.txt
test_entities_path = ${common}/entities/test.txt
But when I ran the code with train.txt and test.txt, only one of each result text file was saved, as shown below.
I also found that I have to set both the train and test files.
How can I interpret the results? Or should I set both the train and test files to the same file?
Oh, I found that the current code actually probes only entities_path and embeddings_path and does not probe the test files, right?
Yes, you are correct. DirectProbe clusters only one dataset per run. test_entities_path is left over from a previous version; we do not use it in the paper "A Closer Look at How Fine-tuning Changes BERT."
By the way, you may want to pull the latest version. We recently fixed a minor bug in the code.
You need to provide something for test_entities_path, but it will not be used. The output contains the results for entities_path.
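As a rough illustration of the point above, here is a minimal sketch of a config in which both keys point at the same file. The section name `[data]` and the value of `common` are placeholders that do not come from this thread, and using Python's `configparser` with `ExtendedInterpolation` is only an assumption about how the `${common}` syntax is resolved; the project may load its config differently.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Hypothetical config text. Only the two *_path keys appear in the thread;
# the section name and the `common` value are placeholders.
CONFIG_TEXT = """
[data]
common = /path/to/data
entities_path = ${common}/entities/train.txt
test_entities_path = ${common}/entities/train.txt
"""

parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(CONFIG_TEXT)
section = parser["data"]

# entities_path is the dataset that gets clustered; test_entities_path only
# needs to be present, so here it simply reuses the same file.
print(section["entities_path"])
print(section["test_entities_path"])
```

To probe the test set, one would presumably run again with entities_path pointing at test.txt.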
Thanks! I will update the code to the latest version.
Thank you for your clear comments. :)
@flyaway1217
Hi, I have one more question. When I run the code on my own data, which has 4 labels and about 7600 entities, it takes a very long time (up to 4-5 hours) and prints the warning below.
UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak. "timeout or by a memory leak.", UserWarning
Do you have any solution to this problem? Or should I just wait till the end of the clustering?
I sometimes get the same warning. Usually, I just wait until the run finishes.
If DirectProbe takes a long time to finish, your representation is non-linear for the given task.
One thing you could try is changing rate in the config.ini file. It controls the step size during the clustering process. I suggest trying values in [0.05, 0.2].
In our own experiments, we did not see many non-linear cases, but 4-5 hours is within the normal range.
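For trying several values of rate, a small helper like the following could generate config variants. This is only a sketch: the section name `[clustering]` in the sample is hypothetical, the helper names are made up for illustration, and the code assumes config.ini is a standard INI file that Python's `configparser` can read.

```python
import io
from configparser import ConfigParser


def rate_candidates(low=0.05, high=0.2, steps=4):
    """Evenly spaced values in [low, high], endpoints included."""
    width = (high - low) / (steps - 1)
    return [round(low + i * width, 4) for i in range(steps)]


def with_rate(config_text, rate):
    """Return config_text with every existing `rate` option set to `rate`."""
    parser = ConfigParser(interpolation=None)  # leave ${...} references untouched
    parser.read_string(config_text)
    for name in parser.sections():
        if parser.has_option(name, "rate"):
            parser.set(name, "rate", str(rate))
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical sample; the real config.ini may place `rate` in another section.
SAMPLE = """
[clustering]
rate = 0.1
"""

for rate in rate_candidates():
    print(with_rate(SAMPLE, rate))
```

Note that `ConfigParser.write` drops comments from the original file, so it may be simpler to edit rate by hand when only one or two values are being tried.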
Also, the running time depends on how many CPUs you have, because DirectProbe uses multiple processes for the linearity check. More CPUs usually mean faster clustering.
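To see how many CPUs a run can actually use, a quick check like this may help. It is a generic Python sketch and not part of DirectProbe; `available_cpus` is a name made up for illustration.

```python
import os


def available_cpus():
    """Number of CPUs this process may use, falling back to the machine total."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: respects CPU affinity limits set by job schedulers or taskset.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


print(available_cpus())
```

On a shared cluster this number can be smaller than the machine's total core count, which would partly explain a slow run.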