looperxx / agif

Open source code for EMNLP 2020 Findings Paper "AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling"

License: GNU General Public License v2.0

Python 83.10% Perl 16.90%
spoken-language-understanding intent-detection dialogue-systems task-oriented-dialogue end-to-end multi-intent slot-filling

agif's Introduction

AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling

This repository contains the official PyTorch implementation of the paper:

AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling. Libo Qin, Xiao Xu, Wanxiang Che, Ting Liu. EMNLP 2020 Accept-Findings. [Paper(Arxiv)] [Paper]

If you use any of the source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

@inproceedings{qin-etal-2020-agif,
    title = "{AGIF}: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling",
    author = "Qin, Libo  and
      Xu, Xiao  and
      Che, Wanxiang  and
      Liu, Ting",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.findings-emnlp.163",
    pages = "1807--1816",
    abstract = "In real-world scenarios, users usually have multiple intents in the same utterance. Unfortunately, most spoken language understanding (SLU) models either mainly focused on the single intent scenario, or simply incorporated an overall intent context vector for all tokens, ignoring the fine-grained multiple intents information integration for token-level slot prediction. In this paper, we propose an Adaptive Graph-Interactive Framework (AGIF) for joint multiple intent detection and slot filling, where we introduce an intent-slot graph interaction layer to model the strong correlation between the slot and intents. Such an interaction layer is applied to each token adaptively, which has the advantage to automatically extract the relevant intents information, making a fine-grained intent information integration for the token-level slot prediction. Experimental results on three multi-intent datasets show that our framework obtains substantial improvement and achieves the state-of-the-art performance. In addition, our framework achieves new state-of-the-art performance on two single-intent datasets.",
}

[Figure: example]

The following sections explain how to use this repository step by step.

Architecture

[Figure: framework]

Results

[Figure: result_multi]

[Figure: result_single]

Tip: We found some repeated sentences in the MixATIS and MixSNIPS datasets, so we cleaned both datasets and named the cleaned versions MixATIS_clean and MixSNIPS_clean.

MixATIS_clean contains [13162, 759, 828] utterances for training, validation and testing, and MixSNIPS_clean contains [39776, 2198, 2199].
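The exact cleaning procedure is not described here. As a minimal sketch of one plausible approach, the snippet below drops exact repeats while keeping the first occurrence of each utterance. The in-memory format (a list of token lists) and the function name drop_repeated are assumptions made for illustration, not part of this repository.

```python
def drop_repeated(utterances):
    """Return the utterances with exact repeats removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for tokens in utterances:
        key = tuple(tokens)  # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            unique.append(tokens)
    return unique


# Toy data for illustration only; these are not sentences taken from the datasets.
sample = [
    ["show", "flights", "to", "boston"],
    ["play", "some", "jazz"],
    ["show", "flights", "to", "boston"],
]
cleaned = drop_repeated(sample)
print(len(sample), "->", len(cleaned))
```

Comparing the length before and after is a quick way to check whether a split contains any repeats at all.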

We recommend using the cleaned datasets. We reran all the experiments on them, and the results are as follows:

[Figure: result_multi_clean]

Preparation

Our code is based on PyTorch 1.2. Required Python packages:

  • numpy==1.18.1
  • tqdm==4.32.1
  • pytorch==1.2.0
  • python==3.7.3
  • cudatoolkit==9.2

We strongly suggest using Anaconda to manage your Python environment.
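Before training, it can help to confirm that the active environment matches the pinned versions. The following is a minimal sketch, not a script shipped with this repository; the helper names installed_version and version_report are hypothetical. Note that the pytorch package is imported as torch, and that cudatoolkit is not an importable Python module, so it is not checked here.

```python
import sys

# Versions pinned in the package list above ("pytorch" is imported as "torch").
PINNED = {"python": "3.7.3", "numpy": "1.18.1", "tqdm": "4.32.1", "torch": "1.2.0"}


def installed_version(name):
    """Return the installed version string for `name`, or None if it cannot be imported."""
    if name == "python":
        return "{}.{}.{}".format(*sys.version_info[:3])
    try:
        module = __import__(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_report(pinned, lookup=installed_version):
    """Map each package to (wanted, found, matches)."""
    report = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # startswith tolerates local build suffixes appended to the version string.
        matches = found is not None and found.startswith(wanted)
        report[name] = (wanted, found, matches)
    return report


for name, (wanted, found, ok) in version_report(PINNED).items():
    print("{:<8} wanted {:<8} found {:<12} {}".format(name, wanted, str(found), "OK" if ok else "MISMATCH"))
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the versions the results were obtained with.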

How to Run It

The script train.py is the main entry point of the project. You can run the experiments with the following commands.

# MixATIS dataset
python train.py -g -bs=16 -ne=100 -dd=./data/MixATIS -lod=./log/MixATIS -sd=./save/MixATIS -nh=4 -wed=32 -sed=128 -ied=64 -sdhd=64 -dghd=64 -ln=MixATIS.txt

# MixSNIPS dataset
python train.py -g -bs=64 -ne=50 -dd=./data/MixSNIPS -lod=./log/MixSNIPS -sd=./save/MixSNIPS -nh=8 -wed=32 -ied=64 -sdhd=64 -ln=MixSNIPS.txt

# ATIS dataset
python train.py -g -bs=16 -ne=300 -dd=./data/ATIS -lod=./log/ATIS -sd=./save/ATIS -nh=4 -wed=64 -ied=128 -sdhd=128 -ln=ATIS.txt

# SNIPS dataset
python train.py -g -bs=16 -ne=200 -dd=./data/SNIPS -lod=./log/SNIPS -sd=./save/SNIPS -nh=8 -wed=64 -ied=64 -sdhd=64 -ln=SNIPS.txt 

We also provide the model parameters behind our reported results in the save/best directory. You can run the following commands to evaluate them.

# MixATIS dataset
python train.py -g -bs=16 -ne=0 -dd=./data/MixATIS -lod=./log/MixATIS -sd=./save/best/MixATIS -ld=./save/best/MixATIS -nh=4 -wed=32 -sed=128 -ied=64 -sdhd=64 -dghd=64 -ln=MixATIS.txt

# MixSNIPS dataset
python train.py -g -bs=64 -ne=0 -dd=./data/MixSNIPS -lod=./log/MixSNIPS -sd=./save/best/MixSNIPS -ld=./save/best/MixSNIPS -nh=8 -wed=32 -ied=64 -sdhd=64 -ln=MixSNIPS.txt

# ATIS dataset
python train.py -g -bs=16 -ne=0 -dd=./data/ATIS -lod=./log/ATIS -sd=./save/best/ATIS -ld=./save/best/ATIS -nh=4 -wed=64 -ied=128 -sdhd=128 -ln=ATIS.txt

# SNIPS dataset
python train.py -g -bs=16 -ne=0 -dd=./data/SNIPS -lod=./log/SNIPS -sd=./save/best/SNIPS -ld=./save/best/SNIPS -nh=8 -wed=64 -ied=64 -sdhd=64 -ln=SNIPS.txt 
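The training and evaluation commands differ only in a few flags: the evaluation commands set -ne=0, point -sd at save/best, and add -ld. As a rough sketch, the helper below rebuilds both variants from one table of per-dataset settings. The function build_command and the SETTINGS table are hypothetical conveniences; only the flags and values themselves are taken from the commands above.

```python
# Per-dataset flag values copied from the commands above.
SETTINGS = {
    "MixATIS": {"bs": 16, "ne": 100, "nh": 4, "wed": 32, "sed": 128, "ied": 64, "sdhd": 64, "dghd": 64},
    "MixSNIPS": {"bs": 64, "ne": 50, "nh": 8, "wed": 32, "ied": 64, "sdhd": 64},
    "ATIS": {"bs": 16, "ne": 300, "nh": 4, "wed": 64, "ied": 128, "sdhd": 128},
    "SNIPS": {"bs": 16, "ne": 200, "nh": 8, "wed": 64, "ied": 64, "sdhd": 64},
}


def build_command(dataset, evaluate=False):
    """Return the argument list for training (default) or evaluating a released model."""
    cfg = SETTINGS[dataset]
    save_dir = "./save/best/" + dataset if evaluate else "./save/" + dataset
    args = [
        "python", "train.py", "-g",
        "-bs={}".format(cfg["bs"]),
        "-ne={}".format(0 if evaluate else cfg["ne"]),
        "-dd=./data/" + dataset,
        "-lod=./log/" + dataset,
        "-sd=" + save_dir,
    ]
    if evaluate:
        args.append("-ld=" + save_dir)
    # Flags that a dataset's command omits are simply skipped.
    for flag in ("nh", "wed", "sed", "ied", "sdhd", "dghd"):
        if flag in cfg:
            args.append("-{}={}".format(flag, cfg[flag]))
    args.append("-ln={}.txt".format(dataset))
    return args


print(" ".join(build_command("MixATIS")))
print(" ".join(build_command("ATIS", evaluate=True)))
```

The returned list can be passed to subprocess.run from the repository root if you prefer launching runs from Python.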

Due to some stochastic factors (e.g., GPU and environment), you may need to slightly tune the hyper-parameters using grid search to reproduce the results reported in our paper. All the hyper-parameters are defined in utils/config.py, and the suggested hyper-parameter settings are:

  • Number of attention heads [4, 8]
  • Intent Embedding Dim [64, 128]
  • Word Embedding Dim [32, 64]
  • Slot Embedding Dim [32, 64, 128]
  • Decoder GAT Hidden Dim [16, 32, 64]
  • Batch size [16, 32, 64]
  • Intent Embedding Dim must be equal to Slot Decoder Hidden Dim
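The suggested values above can be enumerated as a grid. The sketch below is only an illustration of how such a search space might be generated; it is not code from this repository. The short keys are assumed to correspond to the command-line flags used earlier (nh, ied, wed, sed, dghd, bs, sdhd), a mapping inferred from the flag abbreviations rather than confirmed against utils/config.py.

```python
from itertools import product

# Suggested values from the list above, keyed by the (assumed) matching flag names.
GRID = {
    "nh": [4, 8],            # number of attention heads
    "ied": [64, 128],        # intent embedding dim
    "wed": [32, 64],         # word embedding dim
    "sed": [32, 64, 128],    # slot embedding dim
    "dghd": [16, 32, 64],    # decoder GAT hidden dim
    "bs": [16, 32, 64],      # batch size
}


def iter_configs(grid):
    """Yield one dict per grid point, tying sdhd to ied as the last bullet requires."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        cfg["sdhd"] = cfg["ied"]  # slot decoder hidden dim must equal intent embedding dim
        yield cfg


configs = list(iter_configs(GRID))
print(len(configs), "configurations")
print(configs[0])
```

Because sdhd is derived from ied instead of being searched separately, every generated configuration satisfies the equality constraint by construction.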

P.S. We only tuned the hyper-parameters slightly.

If you have any questions, please open an issue on the project or email me or lbqin, and we will reply soon.

Acknowledgement

A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding. Libo Qin, Wanxiang Che, Yangming Li, Haoyang Wen and Ting Liu. (EMNLP 2019). Long paper. [pdf] [code]

We are highly grateful for the public code of Stack-Propagation!

agif's Issues

Does the alpha parameter contradict the paper?

    random_slot, random_intent = random.random(), random.random()
    if random_slot < self.__dataset.slot_forcing_rate:
        slot_out, intent_out = self.__model(text_var, seq_lens, forced_slot=slot_var)
    else:
        slot_out, intent_out = self.__model(text_var, seq_lens)
    slot_var = torch.cat([slot_var[i][:seq_lens[i]] for i in range(0, len(seq_lens))], dim=0)
    slot_loss = self.__criterion(slot_out, slot_var)
    intent_loss = self.__criterion_intent(intent_out, intent_var)
    batch_loss = slot_loss + intent_loss

Doesn't this implementation contradict the alpha*L1 + (1-alpha)*L2 objective in the paper?
Why is it implemented this way? Does multiplying directly by the coefficients give worse results?
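For readers following this question, the two objectives being compared can be written side by side. This is only an illustrative sketch with plain floats, not the authors' answer and not code from the repository; the function names are hypothetical, and which of L1 and L2 is the intent loss is not stated in the question.

```python
def weighted_loss(l1, l2, alpha):
    """The form quoted in the question: alpha * L1 + (1 - alpha) * L2."""
    return alpha * l1 + (1 - alpha) * l2


def summed_loss(l1, l2):
    """The form used in the snippet above: an unweighted sum of the two losses."""
    return l1 + l2


l1, l2 = 2.0, 4.0
print(weighted_loss(l1, l2, alpha=0.5))  # half of the unweighted sum
print(summed_loss(l1, l2))
```

With alpha = 0.5 the weighted form is exactly half of the unweighted sum, so the two differ only by a constant scale factor. For any other alpha the relative weight of the two losses changes.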

LICENSE file

Would it be possible to add a LICENSE file to the repository? I think it is a piece of code worth making available for other parties to build on.

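Inconsistent label categories in the MixATIS data

Hello, after downloading the data I found that the MixATIS test data contains examples of the atis_day_name category, but the training data does not. Is the data itself like this?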
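A check like the one described in this question can be sketched with a set difference. The function name unseen_labels and the toy label lists are made up for illustration; how labels are read from the dataset files is left out because the file format is not described here.

```python
def unseen_labels(train_labels, test_labels):
    """Return the labels that occur in the test split but never in the training split."""
    return sorted(set(test_labels) - set(train_labels))


# Toy label lists for illustration only; they are not the real dataset contents.
train = ["atis_flight", "atis_airfare", "atis_flight"]
test = ["atis_flight", "atis_day_name"]
print(unseen_labels(train, test))
```

An empty result means every test label has at least one training example.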

Prediction script

Can someone please help me to create a prediction script for a new text sample? I want to load a trained model and make a prediction on a newly provided text.
