text_matching

Text matching models

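This project implements most current text matching models and is updated continuously. For notes on the papers, see the article "Text Similarity: A Summary of Text Matching Models".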

The dataset is QA_corpus: 100,000 training examples, plus 10,000 examples each for the validation set and the test set.

The args.py file in each model's folder holds that model's hyperparameters.
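As a rough illustration of how a per-model args.py can be laid out, here is a minimal sketch. Every name and value below is hypothetical and chosen only for illustration; check the args.py in the model folder you are using for the real settings.

```python
# Hypothetical args.py: names and values are illustrative assumptions,
# not copied from the repository.

seq_length = 20            # maximum number of characters kept per sentence
char_embedding_size = 100  # dimension of each character embedding
hidden_size = 128          # width of the hidden layers
learning_rate = 1e-3       # optimizer step size
batch_size = 256           # examples per training step
epochs = 20                # passes over the training set
keep_prob = 0.8            # dropout keep probability
```

A training script can then read the values with `import args` followed by attribute access such as `args.batch_size`, so tuning a model means editing this one file.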

Training: python train.py

Testing: python test.py

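Word vectors: the input differs from model to model. Some models take only character embeddings, while others take character embeddings plus word embeddings. Word embeddings can in turn be static (not updated during training) or dynamic (updated during training). All of these input variants are already wrapped; generate the vectors as follows:

For static word embeddings, run python word2vec_gensim.py. This version trains the word vectors with gensim.

For dynamic word embeddings, run python word2vec.py. This version trains the word vectors with TensorFlow. When training finishes, it saves the embedding matrix, the vocabulary, and an image that plots the relative positions of the word vectors in two dimensions. On systems other than Windows 10, saving the image may fail because of font issues.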
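The difference between static and dynamic word embeddings comes down to whether the embedding table receives gradient updates. The following is a toy sketch in plain Python, not the project's TensorFlow code; the class `EmbeddingTable` and its methods are hypothetical names introduced only for this illustration.

```python
class EmbeddingTable:
    """Toy lookup table; `trainable` decides whether updates are applied."""

    def __init__(self, vectors, trainable):
        # Copy the rows so that two tables never share state.
        self.vectors = [list(v) for v in vectors]
        self.trainable = trainable

    def lookup(self, ids):
        """Return the embedding row for each token id."""
        return [self.vectors[i] for i in ids]

    def apply_gradient(self, token_id, grad, lr=0.1):
        """One SGD step on a single row; a no-op for a static table."""
        if not self.trainable:
            return
        row = self.vectors[token_id]
        for k, g in enumerate(grad):
            row[k] -= lr * g


# Stand-in for pretrained vectors (for example, ones trained with gensim).
pretrained = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]

static_emb = EmbeddingTable(pretrained, trainable=False)
dynamic_emb = EmbeddingTable(pretrained, trainable=True)

# Apply the same gradient to both tables.
for emb in (static_emb, dynamic_emb):
    emb.apply_gradient(token_id=1, grad=[1.0, -1.0], lr=0.5)

print(static_emb.lookup([1]))   # row 1 is unchanged
print(dynamic_emb.lookup([1]))  # row 1 has moved against the gradient
```

In a deep learning framework the same switch is usually a flag on the embedding variable or layer that marks it as trainable or frozen.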

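Test set results:

| Model | Loss | Acc | Input | Paper |
|---|---|---|---|---|
| DSSM | 0.7613157 | 0.6864 | character embeddings | DSSM |
| ConvNet | 0.6872447 | 0.6977 | character embeddings | ConvNet |
| ESIM | 0.55444807 | 0.736 | character embeddings | ESIM |
| ABCNN | 0.5771452 | 0.7503 | character embeddings | ABCNN |
| BiMPM | 0.4852 | 0.764 | character embeddings + static word embeddings | BiMPM |
| DIIN | 0.48298636 | 0.7694 | character embeddings + dynamic word embeddings | DIIN |
| DRCN | 0.6549849 | 0.7811 | character embeddings + static word embeddings + dynamic word embeddings + shared-word flag | DRCN |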
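The DRCN row lists an extra input that marks whether a word is shared between the two sentences (是否有相同词). One common way to build such a feature is a per-token 0/1 flag. The sketch below assumes that formulation; the function name `shared_token_flags` is hypothetical and this is not necessarily how the repository computes it.

```python
def shared_token_flags(tokens_a, tokens_b):
    """Return two 0/1 lists: for each token, 1 if it also occurs in the other sentence."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    flags_a = [1 if tok in set_b else 0 for tok in tokens_a]
    flags_b = [1 if tok in set_a else 0 for tok in tokens_b]
    return flags_a, flags_b


# Character-level tokens for two short sentences.
sentence_a = list("今天天气好")
sentence_b = list("天气不错")

flags_a, flags_b = shared_token_flags(sentence_a, sentence_b)
print(flags_a)  # [0, 1, 1, 1, 0]
print(flags_b)  # [1, 1, 0, 0]
```

Each flag can then be concatenated to the corresponding token's embedding, so the model sees lexical overlap directly instead of having to infer it.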

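The results above may not be the best each model can achieve, and the hyperparameters are not necessarily optimal. If you want to use a model in your own project, tune the hyperparameters yourself.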
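When re-tuning, it helps to recompute both reported metrics on your own predictions. The sketch below assumes binary labels, predicted probabilities for the positive class, and a mean cross-entropy loss; the README does not state how loss and acc were computed, so treat these definitions as assumptions.

```python
import math


def accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded prediction equals the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def mean_cross_entropy(labels, probs, eps=1e-12):
    """Average binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
print(accuracy(labels, probs))            # 0.5
print(mean_cross_entropy(labels, probs))
```

Note that the two metrics can disagree in ranking: in the results above, DRCN has the highest accuracy but not the lowest loss.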

Contributors

ishine, terrifyzhao
