Comments (12)
I'd very much like to help, but I only have a 1080 Ti with 12 GB, and I don't know how to change the code to get a BLEU score.
Sorry.
from pointer_summarizer.
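For reference, scoring decoded summaries with BLEU does not require changing the training code: it can be computed offline from the decoded output and reference files. A minimal self-contained sketch (not part of this repo; a standard BLEU with clipped n-gram precision and brevity penalty over token lists):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, candidate, max_n=4):
    """BLEU with uniform n-gram weights and a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped counts: a candidate n-gram is credited at most as
        # often as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_avg)
```

Running this over each (reference, decoded) pair from the decode directory and averaging gives a corpus-level estimate.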
You can compare the ROUGE score too. I used a 1070 with 8 GB, and it took 3 days to train for 500k iterations. On a 1080 Ti it should be faster.
from pointer_summarizer.
I have finished the 100K test and am now running the 500K test.
from pointer_summarizer.
That's great. One more option would be to train for 700k iterations, saving a checkpoint every 50k, and check which checkpoint gives the best result.
from pointer_summarizer.
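The checkpoint sweep suggested above can be sketched as a simple loop. `evaluate_rouge` is a hypothetical stand-in for decoding the validation set with a checkpoint and scoring it with pyrouge; the numbers are placeholders for illustration only:

```python
def evaluate_rouge(checkpoint_step):
    # Placeholder: in practice this would run decoding on the
    # validation set with the given checkpoint and score it with
    # pyrouge. The values below are arbitrary illustration numbers.
    dummy_scores = {50000: 0.31, 100000: 0.34, 200000: 0.35,
                    350000: 0.36, 500000: 0.358, 700000: 0.355}
    return dummy_scores[checkpoint_step]

def best_checkpoint(steps):
    """Score every checkpoint and return the best (step, score)."""
    scores = {step: evaluate_rouge(step) for step in steps}
    best = max(scores, key=scores.get)
    return best, scores[best]

step, score = best_checkpoint([50000, 100000, 200000,
                               350000, 500000, 700000])
```

The point is that validation ROUGE typically saturates or degrades past some step, so the best checkpoint need not be the last one.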
I've done 288K of the 500K run, and I will start the 700k test in 2 days. Then I'll upload the model to Dropbox, or you can pick one for me.
from pointer_summarizer.
You don't need to upload the model. You can just report the ROUGE score.
from pointer_summarizer.
ok.
from pointer_summarizer.
@atulkum did you try this model on some external data? For example, how do you convert a plain CSV file of text data to the bin format? And could you upload pretrained weights as well? @pengzhi123
from pointer_summarizer.
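On the bin-format question: in the pointer-generator data pipeline each record in a `.bin` file is a serialized `tf.train.Example` (with `article` and `abstract` byte features) prefixed by an 8-byte length. Converting a CSV would mean serializing each row into such an Example and writing it with that framing. A minimal sketch of just the length-prefixed framing, using plain bytes in place of the proto (an assumption-laden illustration, not the project's actual conversion script):

```python
import io
import struct

def write_records(f, records):
    """Write each record as an 8-byte length header ('q' format,
    as in the pointer-generator reader) followed by the raw bytes.
    In the real pipeline each record would be a serialized
    tf.train.Example built from a CSV row."""
    for rec in records:
        f.write(struct.pack('q', len(rec)))
        f.write(rec)

def read_records(f):
    """Read back length-prefixed records until end of stream."""
    records = []
    while True:
        header = f.read(8)
        if not header:
            break
        (length,) = struct.unpack('q', header)
        records.append(f.read(length))
    return records

# Round-trip two stand-in records through an in-memory buffer.
buf = io.BytesIO()
write_records(buf, [b'example one', b'example two'])
buf.seek(0)
roundtrip = read_records(buf)
```

With real data the loop body would read rows via the `csv` module, tokenize the article and abstract, and serialize them into the Example proto before framing.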
Sorry for only uploading the data now. Our machine broke, and I only trained to 660K. Here are the experimental results:
100k (batch size 8):
ROUGE-1:
rouge_1_f_score: 0.3420 with confidence interval (0.3397, 0.3443)
rouge_1_recall: 0.3830 with confidence interval (0.3803, 0.3856)
rouge_1_precision: 0.3288 with confidence interval (0.3263, 0.3312)
ROUGE-2:
rouge_2_f_score: 0.1401 with confidence interval (0.1382, 0.1420)
rouge_2_recall: 0.1568 with confidence interval (0.1545, 0.1590)
rouge_2_precision: 0.1350 with confidence interval (0.1331, 0.1369)
ROUGE-L:
rouge_l_f_score: 0.3105 with confidence interval (0.3083, 0.3126)
rouge_l_recall: 0.3475 with confidence interval (0.3448, 0.3500)
rouge_l_precision: 0.2987 with confidence interval (0.2964, 0.3010)
500k (batch size 8):
ROUGE-1:
rouge_1_f_score: 0.3603 with confidence interval (0.3580, 0.3624)
rouge_1_recall: 0.4006 with confidence interval (0.3980, 0.4032)
rouge_1_precision: 0.3475 with confidence interval (0.3449, 0.3500)
ROUGE-2:
rouge_2_f_score: 0.1538 with confidence interval (0.1515, 0.1560)
rouge_2_recall: 0.1703 with confidence interval (0.1679, 0.1727)
rouge_2_precision: 0.1492 with confidence interval (0.1469, 0.1514)
ROUGE-L:
rouge_l_f_score: 0.3292 with confidence interval (0.3270, 0.3313)
rouge_l_recall: 0.3659 with confidence interval (0.3633, 0.3684)
rouge_l_precision: 0.3177 with confidence interval (0.3153, 0.3202)
from pointer_summarizer.
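For readers reproducing these numbers: the reported scores come from pyrouge, but the core of ROUGE-1 is just clipped unigram overlap. A minimal self-contained sketch (an illustration, not the official ROUGE-1.5.5 scorer, which also averages per-example F1 and bootstraps the confidence intervals):

```python
from collections import Counter

def rouge_1(reference, candidate):
    """ROUGE-1 precision, recall, and F1 from clipped unigram overlap
    between a reference and a candidate token list."""
    ref_counts = Counter(reference)
    cand_counts = Counter(candidate)
    # Each candidate token is credited at most as often as it
    # appears in the reference.
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = overlap / len(candidate) if candidate else 0.0
    recall = overlap / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return precision, recall, f1
```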
Thanks for doing this. Did you enable coverage loss for this result?
from pointer_summarizer.
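For context on the coverage question: the coverage loss from See et al. (2017), which this code implements, penalizes attending repeatedly to the same source positions. At each decoder step it adds the elementwise minimum of the current attention distribution and the coverage vector (the running sum of all previous attention distributions, all-zero at step 0). A plain-Python sketch of that computation:

```python
def coverage_loss(attention_steps):
    """Sum over decoder steps t of sum_i min(a_t[i], c_t[i]), where
    the coverage vector c_t is the sum of the attention distributions
    from all previous steps (all zeros at step 0)."""
    coverage = [0.0] * len(attention_steps[0])
    loss = 0.0
    for attn in attention_steps:
        loss += sum(min(a, c) for a, c in zip(attn, coverage))
        # Accumulate this step's attention into the coverage vector.
        coverage = [a + c for a, c in zip(attn, coverage)]
    return loss
```

Attending to the same positions twice is penalized, while attention spread over fresh positions incurs no loss.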
https://www.dropbox.com/s/czed99yiqjo34f3/training_log?dl=0
from pointer_summarizer.
@pengzhi123 Hi there, can I ask what machine you're running it on? It seems really fast.
from pointer_summarizer.
Related Issues (20)
- Python3 support? HOT 13
- During the training and verification process, when "step = 0", the "coverage" is initialized differently. During training, the coverage is an all-zero tensor, but this is not the case during prediction. HOT 1
- url correction HOT 1
- What is the version of tensorflow? HOT 1
- Test time custom decoding!!
- Training saturates early? HOT 3
- Vector encode input extend vocab HOT 3
- what's function of the eval.py when i check the train.py ,it does't call the eval.py , save the model directly? HOT 1
- question about eval HOT 1
- eval.py decode.py HOT 2
- when i train it with coverage ,the loss is nan when i get 250k iter? HOT 1
- how to use valid dataset to select a bestmodel to test? HOT 1
- How to train with Coverage? HOT 1
- 'Encoder' object has no attribute 'tx_proj' HOT 3
- What is the specific implementation of pointer network HOT 1
- Can the code here be trained with multiple GPUs HOT 1
- Discrepancy with implementation and the paper
- Retraining model cause optimizer duplicate parameter error HOT 3
- How to choose the best training model
- Duplicated computation with LSTM?