SPI - Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference

Code for the paper Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference. [PDF]

The paper was published at ICML 2023. The code is written in PyTorch. If you use the source code from this repository in your work, please cite the following paper:

@inproceedings{pmlr-v202-xu23j,
	author = {Xu, Yan and Kong, Deqian and Xu, Dehong and Ji, Ziwei and Pang, Bo and Fung, Pascale and Wu, Ying Nian},
	booktitle = {Proceedings of the 40th International Conference on Machine Learning},
	editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
	month = {23--29 Jul},
	pages = {38518--38534},
	pdf = {https://proceedings.mlr.press/v202/xu23j/xu23j.pdf},
	publisher = {PMLR},
	series = {Proceedings of Machine Learning Research},
	title = {Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference},
	url = {https://proceedings.mlr.press/v202/xu23j.html},
	volume = {202},
	year = {2023}
}

Environment

Create and activate the conda environment with the following commands:

conda env create -f dial.yml python=3.10
conda activate dial-env
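
After activating the environment, a quick sanity check can confirm that the interpreter version and PyTorch are visible. This is a minimal sketch and not part of the repository: the helper name check_environment is hypothetical, and the 3.10 requirement is read off the install command above.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10), packages=("torch",)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ expected, "
            f"found {sys.version_info[0]}.{sys.version_info[1]}"
        )
    for name in packages:
        # find_spec returns None when a top-level package cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append(f"package not found: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("environment looks OK" if not issues else "\n".join(issues))
```

Add further package names to the packages tuple if you want to check more of the dependencies listed in dial.yml.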

Data

To run both training and prediction of SPI, you need to prepare the data for experiments.

  1. Download datasets from the official links.

  2. Put the downloaded datasets under the data folder, naming them wizard_of_wikipedia and holle, respectively.

  3. Preprocess the data by unwrapping each dialogue and converting it into data samples. Run the following commands from the repository root:

python src/data_utils/wow_proc.py --preproc_dir data/processed_wow
python src/data_utils/holle_proc.py --preproc_dir data/processed_holle
  4. The processed data will be stored under the given preproc_dir. Do not change the paths above; otherwise you will have to update the hard-coded paths in src/data_utils/wizard_of_wikipedia.py and src/data_utils/holle.py.
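
To illustrate what "unwrapping" a dialogue can mean, the sketch below turns one multi-turn dialogue into one sample per response turn, each pairing the preceding history with the reply. It is an assumption about the general idea, not the logic in wow_proc.py or holle_proc.py: the function name unwrap_dialogue and the field names history and response are hypothetical, and the knowledge passages that the real scripts presumably attach are ignored here.

```python
def unwrap_dialogue(turns):
    """Split a list of utterances into (history, response) samples.

    Every utterance after the first becomes the response of one sample,
    with all earlier utterances as its history.
    """
    samples = []
    for i in range(1, len(turns)):
        samples.append({"history": list(turns[:i]), "response": turns[i]})
    return samples


dialogue = ["Hi!", "Hello, how are you?", "Great, thanks."]
for sample in unwrap_dialogue(dialogue):
    print(sample)
```

A dialogue with n utterances yields n - 1 samples under this scheme, so the processed set is larger than the number of raw dialogues.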

Alternatively, you can download the pre-processed data directly from here.
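
Whichever route you take, it may help to confirm that the folders are where the scripts expect them. A small sketch, assuming the layout described above; missing_data_dirs is a hypothetical helper and not part of the repository. If you downloaded the pre-processed data, the raw dataset folders may not be needed, so pass only the folders you care about.

```python
from pathlib import Path

# Folder names taken from the steps above; adjust if your layout differs.
EXPECTED_DIRS = (
    "data/wizard_of_wikipedia",
    "data/holle",
    "data/processed_wow",
    "data/processed_holle",
)


def missing_data_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected folders that do not exist under `root`."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    print("all data folders found" if not missing else f"missing: {missing}")
```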

Training

For reproducibility, you can access our pre-trained weights from here. You can also train the models yourself:

sh run_spi.sh

run_spi.sh contains the training commands for four different models. Use whichever ones you need.

Prediction

Given a checkpoint, the model can be evaluated with the following command. The script will 1) compute the perplexity of generating the gold responses and 2) generate responses for the data samples in the test set.

sh predict_spi.sh
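
For reference, perplexity is conventionally the exponential of the average negative log-likelihood per token of the gold response. The sketch below shows that standard definition on plain Python floats. It is not taken from predict_spi.sh, and perplexity with its token_log_probs argument are hypothetical names.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-probability (natural log) per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model assigning probability 0.25 to each of four gold tokens
log_probs = [math.log(0.25)] * 4
print(perplexity(log_probs))  # approximately 4.0
```

Lower values mean the model assigns higher probability to the gold responses; a value of 1.0 would mean every gold token was predicted with certainty.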

Coming Soon

Our uncleaned code and pre-trained model weights are tentatively available here.

Contributors

deqiankong, yana-xuyan

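Issues

Path import errors

Hello, I am also a student from Xi'an. While reproducing your model, I keep running into various path import errors. Could you tell me which machine environment and operating system you used? I have installed the required packages as instructed, though some versions may not match exactly. Could that cause these "module cannot be imported" errors? I look forward to your reply and guidance, and I would be very grateful!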
