The cmrc2018 from fsudong

CMRC2018

Task Description

This year's evaluation focuses on span-extraction machine reading comprehension, a further extension of the cloze-style (fill-in-the-blank) reading comprehension task. Although datasets such as Stanford SQuAD and NewsQA are widely used in English reading comprehension research, comparable Chinese resources are still lacking. This evaluation releases the first Chinese span-extraction reading comprehension dataset. Participants must model the passage and the question, then extract a contiguous span of the passage as the answer. As before, the training and development sets are public and the test set is hidden to keep the evaluation fair.
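
To make the task concrete, the sketch below shows what span extraction means in code: the answer is a contiguous slice of the passage, identified by start and end character offsets. The record layout (context, question, answer_text, answer_start) is an assumption modeled on SQuAD-style data, and the passage is an invented example; the actual CMRC2018 files may be organized differently.

```python
# Hypothetical SQuAD-style record. The field names are assumptions for
# illustration, not the confirmed CMRC2018 schema.
record = {
    "context": "北京是中华人民共和国的首都。",   # "Beijing is the capital of the PRC."
    "question": "中华人民共和国的首都是哪里?",  # "What is the capital of the PRC?"
    "answer_text": "北京",                      # "Beijing"
    "answer_start": 0,
}


def answer_span(rec):
    """Return (start, end) character offsets of the answer, end exclusive."""
    start = rec["answer_start"]
    end = start + len(rec["answer_text"])
    return start, end


def extract(context, start, end):
    """A span-extraction model predicts start and end; the answer is the slice."""
    return context[start:end]


start, end = answer_span(record)
predicted = extract(record["context"], start, end)
```

A model for this task therefore only has to score candidate start and end positions; it never generates text that is absent from the passage.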

Introduction

A PyTorch implementation of QANet, trained on the CMRC2018 dataset. The Chinese word2vec embeddings come from Chinese-Word-Vectors. Contributions are welcome!

Usage

Install PyTorch 0.4 for Python 3.6+.
Run pip install -r requirements.txt to install the Python dependencies.
Run download.sh to download the dataset.
Download the pyltp model and the Chinese word2vec model from Baidu Cloud, following the instructions for pyltp and for the Chinese word2vec model, then decompress the model files into "./data/ltp" and "./data/word2vec".
Run python main.py --mode data to build tensors from the raw dataset.
Run python main.py --mode train to train the model. After training, log/model.pt will be generated.
Run python main.py --mode test to test a pretrained model. The default model file is log/model.pt.
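
The three --mode values above suggest a simple dispatcher at the program entry. The following is a minimal sketch of how such an entry point could be organized, not the actual contents of main.py: the handler functions are placeholders, and the --model_path flag is a hypothetical addition (the steps above only state that the default model file is log/model.pt).

```python
import argparse


def build_parser():
    """Build a parser for the three modes named in the usage steps."""
    parser = argparse.ArgumentParser(description="QANet on CMRC2018 (sketch)")
    parser.add_argument("--mode", choices=["data", "train", "test"], required=True)
    # Hypothetical flag: only the default path log/model.pt is documented.
    parser.add_argument("--model_path", default="log/model.pt")
    return parser


# Placeholder handlers that only describe what each mode would do.
def prepare_data(args):
    return "build tensors from the raw dataset"


def train_model(args):
    return "train, then save to " + args.model_path


def evaluate_model(args):
    return "evaluate the model at " + args.model_path


HANDLERS = {"data": prepare_data, "train": train_model, "test": evaluate_model}


def run(argv):
    """Parse argv and call the handler registered for the chosen mode."""
    args = build_parser().parse_args(argv)
    return HANDLERS[args.mode](args)


print(run(["--mode", "train"]))
```

Restricting --mode with choices makes argparse reject a misspelled mode before any data is loaded.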

Hardware Usage

Training the model requires PyTorch with GPU support.

Structure

preproc.py: downloads dataset and builds input tensors.

main.py: program entry; functions about training and testing.

models.py: QANet structure.

config.py: configurations.

Differences from the paper

The paper doesn't mention which activation function the authors used. I use ReLU.
I don't make the embedding of <UNK> trainable.
The connector between the embedding layers and the embedding encoders may differ from Google's implementation, since the description in the paper is inconsistent (a residual block can't be used because the input and output dimensions differ) and the authors don't say how they implemented it.
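
The dimension mismatch behind the last point can be illustrated with a small PyTorch sketch. It shows one common way to bridge the gap, a kernel-size-1 convolution that projects the embedding width to the encoder width before any residual block; this is an assumption for illustration, not necessarily what models.py does. The class name EmbeddingConnector and the sizes (500 and 128) are likewise illustrative.

```python
import torch
import torch.nn as nn


class EmbeddingConnector(nn.Module):
    """Project embeddings of width d_emb to the encoder width d_model."""

    def __init__(self, d_emb, d_model):
        super().__init__()
        # kernel_size=1 acts as a position-wise linear map over channels.
        self.proj = nn.Conv1d(d_emb, d_model, kernel_size=1)

    def forward(self, x):
        # x: (batch, d_emb, seq_len) -> (batch, d_model, seq_len)
        return self.proj(x)


# Illustrative sizes only.
batch, seq_len, d_emb, d_model = 2, 10, 500, 128
x = torch.randn(batch, d_emb, seq_len)

connector = EmbeddingConnector(d_emb, d_model)
h = connector(x)

# After the projection, input and output widths match, so a residual
# connection around a width-preserving convolution is well defined.
conv = nn.Conv1d(d_model, d_model, kernel_size=7, padding=3)
out = h + torch.relu(conv(h))
```

Adding x directly to a d_model-wide output would fail with a shape error, which is why a residual block cannot sit on the embedding output itself.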

TODO

Contributors

InitialBug: found two bugs: (1) positional encodings required gradients; (2) wrong weight sharing among encoders.
linthieda: fixed one issue about dependencies and offered computing resources.
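
For readers curious about the first bug listed above, the sketch below shows one way to keep fixed positional encodings out of the trainable parameters in PyTorch: storing them with register_buffer. It uses the standard sinusoidal formulation from the Transformer literature as an assumption; the encoding actually used in models.py may be laid out differently, and d_model is assumed to be even.

```python
import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Add fixed sinusoidal position signals to a (batch, seq_len, d_model) input."""

    def __init__(self, d_model, max_len):
        super().__init__()
        position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2, dtype=torch.float32)
            * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)  # even channels
        pe[:, 1::2] = torch.cos(position * div_term)  # odd channels
        # A buffer follows the module across devices and is saved in the
        # state dict, but it is not a Parameter, so it receives no gradient
        # and the optimizer never updates it.
        self.register_buffer("pe", pe)

    def forward(self, x):
        # Slice to the current sequence length; broadcasts over the batch.
        return x + self.pe[: x.size(1)]


encoder = PositionalEncoding(d_model=8, max_len=16)
y = encoder(torch.zeros(2, 5, 8))
```

Because the table is a buffer, list(encoder.parameters()) is empty and encoder.pe.requires_grad is False.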
