Comments (4)
Thanks for the question!
In general, you should use ConvLab/convlab/modules/nlg/multiwoz/evaluate.py.
The way we calculate BLEU differs from the machine translation setting: we group the sentences by their dialog act. For example, if the Inform-Hotel-Addr dialog act has 3 gold sentences [r1,r2,r3], then the reference set for every sentence generated for that act is [r1,r2,r3].
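The grouping described above can be sketched as follows. This is a minimal illustration, not the code in evaluate.py: the function name `group_references_by_act`, the bracketed placeholder style, and the example sentences are all assumptions made up for the sketch.

```python
from collections import defaultdict

def group_references_by_act(pairs):
    """Map each dialog act to the tokenized gold sentences observed with it."""
    refs = defaultdict(list)
    for act, sentence in pairs:
        refs[act].append(sentence.split())
    return dict(refs)

# Hypothetical gold data: (dialog act, delexicalized gold sentence).
gold = [
    ("Inform-Hotel-Addr", "the address is [hotel_addr]"),
    ("Inform-Hotel-Addr", "it is located at [hotel_addr]"),
    ("Inform-Hotel-Addr", "you can find it at [hotel_addr]"),
    ("Request-Hotel-Area", "which area would you like ?"),
]

references = group_references_by_act(gold)

# Every sentence generated for Inform-Hotel-Addr is scored against
# all three gold sentences, not only the one from its own turn.
refs_for_act = references["Inform-Hotel-Addr"]
```

Here `refs_for_act` plays the role of [r1,r2,r3] in the example above.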
The differences between ConvLab/convlab/modules/nlg/multiwoz/evaluate.py and ConvLab/convlab/modules/nlg/multiwoz/sc_lstm/bleu.py:
- The former replaces each value in the sentence with the corresponding dialog_act-slot, while the latter uses the delexicalized output (see sclstm.res) of sclstm directly. So the former can be used for other NLG models.
- The former generates one sentence at a time (beam_size=1), which is slower than generating in batches.
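The value replacement in the first point might look roughly like the sketch below. The `delexicalize` helper, the layout of the dialog-act dictionary, and the example values are hypothetical and chosen only for illustration; the actual logic in evaluate.py may differ.

```python
def delexicalize(sentence, dialog_act):
    """Replace each slot value in `sentence` with a `dialog_act-slot` placeholder.

    `dialog_act` is assumed to map an act name to a list of [slot, value] pairs.
    """
    replacements = []
    for act, slot_values in dialog_act.items():
        for slot, value in slot_values:
            replacements.append((value, f"{act}-{slot}"))
    # Replace longer values first so a short value cannot clobber part of a longer one.
    for value, placeholder in sorted(replacements, key=lambda item: -len(item[0])):
        sentence = sentence.replace(value, placeholder)
    return sentence

# Hypothetical dialog act and surface sentence.
da = {"Inform-Hotel": [["Addr", "124 tenison road"], ["Phone", "01223 315702"]]}
text = "The address is 124 tenison road and the phone is 01223 315702."

delex = delexicalize(text, da)
# delex == "The address is Inform-Hotel-Addr and the phone is Inform-Hotel-Phone."
```

Because this step only needs the generated text and the dialog act, it can be applied to the output of any NLG model, which is the point made in the first bullet.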
from convlab.
@zqwerty Thank you!
I think I understand how you group the sentences by dialog act. This way, a system response generated from one dialog act (a single response, since beam_size=1 as you said) may have multiple references.
Compared with scoring each generated system response against only its own single gold response, the approach used in ConvLab may yield a higher BLEU-4 score, right?
And one more question: in ConvLab/convlab/modules/nlg/multiwoz/evaluate.py, the SCLSTM model is loaded from a remote source.
print("Loading", model_name)
if model_name == 'SCLSTM':
    model_sys = SCLSTM(model_file="https://convlab.blob.core.windows.net/models/nlg-sclstm-multiwoz.zip")
And config.cfg shows that the model was trained and evaluated only on Boo_ResDataSplitRand0925.json rather than train.json. Would this be a problem?
[DATA]
vocab_file = %(dir)s/resource/vocab.txt
feat_file = %(dir)s/resource/feat.json
text_file = %(dir)s/resource/text.json
template_file = %(dir)s/resource/template.txt
dataSplit_file= %(dir)s/resource/Boo_ResDataSplitRand0925.json
batch_size = 256
shuffle = true
dir =
[MODEL]
dec_type = sclstm
hidden_size = 100
dropout = 0.25
clip = 0.5
learning_rate = 0.001
[TRAINING]
model_epoch = best
n_epochs = 75
Compared with scoring each generated system response against only its own single gold response, the approach used in ConvLab may yield a higher BLEU-4 score, right?
Yes.
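To see why, here is a minimal sketch of BLEU's clipped n-gram precision written from scratch (it is not the ConvLab implementation, and the sentences are invented). Each candidate n-gram is credited up to the maximum count found in any single reference, so adding references can only keep or raise this precision term. The brevity penalty is computed separately and is not covered here.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision of `candidate` against a set of references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the highest count observed in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

# Hypothetical delexicalized sentences sharing one dialog act.
candidate = "the hotel is located at [addr]".split()
own_ref = "the address is [addr]".split()
other_refs = ["it is located at [addr]".split(),
              "the hotel is at [addr]".split()]

single = modified_precision(candidate, [own_ref], 2)
multi = modified_precision(candidate, [own_ref] + other_refs, 2)
```

In this toy case no candidate bigram appears in the turn's own reference, while every candidate bigram appears in one of the other references for the same act, so `multi` exceeds `single`.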
And config.cfg shows that the model was trained and evaluated only on Boo_ResDataSplitRand0925.json rather than train.json. Would this be a problem?
We migrated SCLSTM from the MultiWOZ benchmark, where Boo_ResDataSplitRand0925.json is used. This JSON file splits the dataset into train/valid/test sets in the same way as ConvLab, which uses valListFile.json and testListFile.json from the original dataset.