Comments (4)

zqwerty commented on June 1, 2024

Thanks for the question!

In general, you should use ConvLab/convlab/modules/nlg/multiwoz/evaluate.py.

The way we calculate BLEU differs from the usual machine translation setup. We group the sentences by their dialog act. For example, if the Inform-Hotel-Addr dialog act has 3 gold sentences [r1, r2, r3], then every sentence generated for that dialog act is scored against the reference set [r1, r2, r3].
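To make the grouping concrete, here is a minimal standard-library sketch of scoring one generated sentence against all gold sentences that share its dialog act. It is not the code from evaluate.py: the function names (group_references, clipped_precision, bleu), the placeholder token [hotel-addr], and the sample sentences are assumptions made for illustration, and the score is a plain unsmoothed sentence-level BLEU.

```python
import math
from collections import Counter, defaultdict


def group_references(pairs):
    """Map each dialog act to the tokenized gold sentences observed for it."""
    refs = defaultdict(list)
    for dialog_act, sentence in pairs:
        refs[dialog_act].append(sentence.split())
    return dict(refs)


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hyp, refs, n):
    """Modified n-gram precision: each count is clipped by its max count in any reference."""
    hyp_counts = ngrams(hyp, n)
    if not hyp_counts:
        return 0.0
    max_ref = Counter()
    for ref in refs:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return clipped / sum(hyp_counts.values())


def bleu(hyp, refs, max_n=4):
    """Unsmoothed sentence-level BLEU against one or more references."""
    if not hyp:
        return 0.0
    precisions = [clipped_precision(hyp, refs, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    # Brevity penalty uses the reference length closest to the hypothesis length.
    ref_len = min((len(r) for r in refs), key=lambda l: (abs(l - len(hyp)), l))
    bp = 1.0 if len(hyp) >= ref_len else math.exp(1.0 - ref_len / len(hyp))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical delexicalized gold data: (dialog act, sentence) pairs.
gold = [
    ("Inform-Hotel-Addr", "the address is [hotel-addr]"),
    ("Inform-Hotel-Addr", "it is located at [hotel-addr]"),
    ("Inform-Hotel-Addr", "the hotel is at [hotel-addr]"),
]
references = group_references(gold)
hypothesis = "the address is [hotel-addr]".split()

multi_ref = bleu(hypothesis, references["Inform-Hotel-Addr"])
single_ref = bleu(hypothesis, [references["Inform-Hotel-Addr"][1]])
```

In this toy case the generated sentence matches r1 exactly, so scoring against the whole group rewards it, while scoring against r2 alone does not. More generally, adding references can only raise or keep the clipped n-gram counts, which is why grouping tends to produce higher scores than a single-reference setup.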

The differences between ConvLab/convlab/modules/nlg/multiwoz/evaluate.py and ConvLab/convlab/modules/nlg/multiwoz/sc_lstm/bleu.py:

  • The former replaces each value in the sentence with the corresponding dialog_act-slot, while the latter directly uses the delexicalized output of SCLSTM (see sclstm.res). The former can therefore be used for other NLG models.
  • The former generates one sentence at a time (beam_size=1), which is slower than generating in batches.
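The value replacement described in the first point can be sketched roughly as follows. This is an illustration only, not the logic in evaluate.py: the function name delexicalize, the dialog act layout (a dict from act name to slot-value pairs), the bracketed placeholder format, and the sample address are all assumptions.

```python
def delexicalize(sentence, dialog_act):
    """Replace each slot value in the sentence with an [act-slot] placeholder.

    dialog_act is assumed to look like {"Inform-Hotel": [("Addr", "124 tenison road")]}.
    """
    replacements = [
        (value, f"[{act}-{slot}]")
        for act, slot_values in dialog_act.items()
        for slot, value in slot_values
        if value  # skip empty values, which would match everywhere
    ]
    # Replace longer values first so a short value cannot break up a longer one.
    for value, placeholder in sorted(replacements, key=lambda p: -len(p[0])):
        sentence = sentence.replace(value, placeholder)
    return sentence


example = delexicalize(
    "the address is 124 tenison road",
    {"Inform-Hotel": [("Addr", "124 tenison road")]},
)
```

Because the replacement works on the surface text of any generated sentence, a step like this does not depend on a model emitting delexicalized output itself, which matches the point that the former script can be reused for other NLG models.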

from convlab.

ToSev7en commented on June 1, 2024

@zqwerty Thank you!

I think I understand how you group the sentences by dialog act. This way, a system response generated from one dialog act (a single response, because beam_size=1 as you said) may have multiple references.

Compared with scoring a system response generated from one dialog act against its single gold system response, the approach used in ConvLab may give a higher BLEU-4 score, right?

One more question: in ConvLab/convlab/modules/nlg/multiwoz/evaluate.py, the SCLSTM model is loaded from a remote source.

print("Loading", model_name)
if model_name == 'SCLSTM':
    model_sys = SCLSTM(model_file="https://convlab.blob.core.windows.net/models/nlg-sclstm-multiwoz.zip")

And config.cfg shows that the model was only trained and evaluated on Boo_ResDataSplitRand0925.json rather than train.json. Would this be a problem?

[DATA]
vocab_file =	%(dir)s/resource/vocab.txt
feat_file =		%(dir)s/resource/feat.json
text_file =		%(dir)s/resource/text.json
template_file =	%(dir)s/resource/template.txt
dataSplit_file= %(dir)s/resource/Boo_ResDataSplitRand0925.json
batch_size = 256
shuffle = true
dir = 

[MODEL]
dec_type = sclstm
hidden_size = 100
dropout = 0.25
clip = 0.5
learning_rate = 0.001

[TRAINING]
model_epoch = best
n_epochs = 75

zqwerty commented on June 1, 2024

Compared with scoring a system response generated from one dialog act against its single gold system response, the approach used in ConvLab may give a higher BLEU-4 score, right?

Yes.

truthless11 commented on June 1, 2024

And config.cfg shows that the model was only trained and evaluated on Boo_ResDataSplitRand0925.json rather than train.json. Would this be a problem?

We migrated SCLSTM from the MultiWOZ benchmark, where Boo_ResDataSplitRand0925.json is used. This JSON file splits the dataset into train/valid/test sets in the same way as ConvLab, which uses valListFile.json and testListFile.json from the original dataset.
