
papers-for-text-summarization's Introduction

Papers-For-Graduation-Project

Literature Survey

Papers on Text Summarization

1. LCSTS: A Large-Scale Chinese Short Text Summarization Dataset (paper link)

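  1. The authors crawled and filtered more than 2.4 million [summary, short text] pairs posted by verified ("blue V") Weibo accounts. This is the corpus used by this system.
  2. The paper proposes two ways of preparing the data, word-based and character-based (the latter to mitigate the UNK problem), and provides two models, RNN and RNN+context, as baselines.
  3. The RNN+Context+Char combination performs best: ROUGE-1 0.299, ROUGE-2 0.174, ROUGE-L 0.272.
  • The paper's main contribution is a training set for short-text summarization, together with baselines. My earlier attempt at extracting key sentences with TF-IDF reached a ROUGE-1 of 0.28, which did not beat it. Long-text and short-text summarization seem to differ somewhat: short-text summarization probably leans more on sentence compression, while long-text summarization leans more on information extraction. I am setting short text aside for now and will come back to this dataset if the graduation project needs it.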
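To make the word-based versus character-based distinction concrete, here is a minimal sketch of why character-level input reduces UNK tokens. It is not the paper's preprocessing code: the helper names (`build_vocab`, `encode`, `char_tokens`), the `<unk>` symbol, and the toy sentences are all assumptions made for illustration, and the "word-based" input is assumed to be already segmented with spaces.

```python
from collections import Counter

UNK = "<unk>"  # assumed placeholder symbol for out-of-vocabulary tokens


def build_vocab(token_lists, max_size=None):
    """Keep the max_size most frequent tokens (all tokens if max_size is None)."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    return {tok for tok, _ in counts.most_common(max_size)}


def encode(tokens, vocab):
    """Replace every token that is missing from the vocabulary with UNK."""
    return [tok if tok in vocab else UNK for tok in tokens]


def char_tokens(text):
    """Character-based tokenization: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


# Toy, pre-segmented training sentences (hypothetical data).
train = ["北京 大学 发布 公告", "天津 天气 晴朗"]
test = "北大 发布 公告"  # "北大" never appears as a word in train

word_vocab = build_vocab([s.split() for s in train])
char_vocab = build_vocab([char_tokens(s) for s in train])

word_encoded = encode(test.split(), word_vocab)      # the unseen word becomes UNK
char_encoded = encode(char_tokens(test), char_vocab)  # every character was seen
```

In this toy case the unseen word "北大" maps to `<unk>` at the word level, while its characters "北" and "大" are both in the character vocabulary, so the character-level encoding contains no `<unk>` at all.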

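2. The Automatic Creation of Literature Abstracts (paper link)

  1. Compute keywords with TF-IDF -> score each sentence by how densely the keywords cluster in it -> assemble the top-scoring key sentences into the summary.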
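The keyword -> key sentence -> summary pipeline can be sketched roughly as follows. This is an illustrative reading, not the paper's exact procedure: Luhn's original paper selects significant words by raw frequency, whereas the note above uses TF-IDF, so `tfidf_keywords` treats each sentence as a "document" and uses a smoothed IDF, both of which are assumptions. The gap threshold `max_gap=4` and all function names are likewise assumed. Sentences are taken to be lists of already-segmented tokens.

```python
import math
from collections import Counter


def tfidf_keywords(sentences, top_k):
    """Pick top_k words by TF-IDF, treating each sentence as a document (assumption)."""
    n = len(sentences)
    tf = Counter(w for s in sentences for w in s)
    df = Counter(w for s in sentences for w in set(s))
    score = {w: tf[w] * math.log(1 + n / df[w]) for w in tf}
    return set(sorted(score, key=lambda w: (-score[w], w))[:top_k])


def luhn_score(tokens, keywords, max_gap=4):
    """Best cluster score: (keywords in cluster)^2 / cluster length.

    A cluster is a run of tokens that starts and ends with a keyword, where
    consecutive keywords are separated by at most max_gap other tokens.
    """
    positions = [i for i, w in enumerate(tokens) if w in keywords]
    if not positions:
        return 0.0
    best = 0.0
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev - 1 <= max_gap:
            count += 1                      # keyword extends the current cluster
        else:
            best = max(best, count * count / (prev - start + 1))
            start, count = pos, 1           # gap too large: start a new cluster
        prev = pos
    return max(best, count * count / (prev - start + 1))


def summarize(sentences, keywords, n_sentences=1):
    """Return the n_sentences highest-scoring sentences, in document order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -luhn_score(sentences[i], keywords))
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


doc = [["the", "cat", "sat"],
       ["summary", "of", "text", "summary"],
       ["text", "here"]]
keywords = tfidf_keywords(doc, top_k=2)
summary = summarize(doc, keywords, n_sentences=1)
```

Squaring the keyword count rewards sentences in which many keywords sit close together, which is the "density" idea in the note above.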

3. TextRank: Bringing Order into Texts (paper link)

  1. Uses the TextRank method to extract keywords and key sentences from text.
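As a rough illustration of sentence extraction with TextRank, the sketch below builds a sentence graph weighted by word overlap and runs a weighted PageRank by plain power iteration. The similarity formula and the damping factor 0.85 follow the TextRank paper as I understand it. The fixed iteration count, the function names, and the toy input are assumptions, and sentences are taken to be lists of already-segmented tokens.

```python
import math


def similarity(a, b):
    """Shared words divided by log|a| + log|b|."""
    if len(a) < 2 or len(b) < 2:  # log(1) == 0, so guard against division by zero
        return 0.0
    overlap = len(set(a) & set(b))
    return overlap / (math.log(len(a)) + math.log(len(b)))


def textrank(sentences, d=0.85, iterations=50):
    """Weighted PageRank over the sentence-similarity graph (power iteration)."""
    n = len(sentences)
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]  # total edge weight leaving each node
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - d) + d * sum(w[j][i] / out_sum[j] * scores[j]
                              for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


# Toy input: the middle sentence shares words with both of its neighbours.
sents = [["a", "b", "c"], ["a", "b", "d", "e"], ["d", "e", "f"]]
scores = textrank(sents)
best = scores.index(max(scores))  # index of the top-ranked sentence
```

For keyword extraction the same ranking step applies, but the graph nodes would be words linked by co-occurrence within a window rather than sentences linked by overlap.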

Evaluation Metrics

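1. ROUGE: A Package for Automatic Evaluation of Summaries (paper link)

  1. A method for automatically evaluating summaries. It includes ROUGE-N, ROUGE-L, ROUGE-W, ROUGE-S, and ROUGE-SU.
  • Currently the most widely recognized automatic evaluation method for summaries. Most implementations available online target English. I implemented part of ROUGE for Chinese, but my version is not authoritative.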
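For reference, a simplified character-level ROUGE-N and ROUGE-L for Chinese might look like the sketch below. It is not the author's implementation and not the official ROUGE package: it assumes a single reference summary, character tokens, ROUGE-N reported as recall, and ROUGE-L reported as an F-measure with `beta=1.0`. All function names are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram matches / n-grams in the reference."""
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-measure from LCS-based recall and precision."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    recall = lcs / len(reference)
    precision = lcs / len(candidate)
    return (1 + beta ** 2) * recall * precision / (recall + beta ** 2 * precision)


reference = list("今天天气很好")   # character tokens
candidate = list("今天天气不错")

r1 = rouge_n(candidate, reference, n=1)  # 4 of 6 reference characters matched
r2 = rouge_n(candidate, reference, n=2)  # 3 of 5 reference bigrams matched
rl = rouge_l(candidate, reference)       # LCS is "今天天气", length 4
```

Splitting on characters sidesteps word segmentation, which keeps the sketch self-contained, but scores computed this way are not directly comparable with word-level ROUGE scores.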

papers-for-text-summarization's Issues

About the dataset

Does anyone have a text summarization dataset? Could you share it? Thanks!
