Headline_Generation_NLP

Refer to Report.pdf for a detailed description.

1. Problem Formulation

Headline generation is a form of text summarization. In this project, we aim to improve on an existing approach to automatically generating headlines from the text of news articles. Because a headline is terse and must convey the most important theme of the input text, selecting a subset of sentences from the original text (extractive summarization) is not appropriate. Instead, the headline should be generated from a semantic representation of the text (abstractive summarization).

The headline generation model we are trying to improve is the one proposed by Konstantin Lopyrev [1], which adopts an end-to-end encoder-decoder framework with an attention mechanism. The encoder and the decoder are each a recurrent neural network [2]. The encoder encodes a source article into a sequence of latent vectors, and the decoder outputs a summary word by word based on those latent vectors. The attention mechanism allows the decoder to attend to different parts of the source at each output step.
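
As a rough illustration of one attention step, the sketch below scores each encoder vector against the current decoder state, normalizes the scores with a softmax, and forms a weighted context vector. The dot-product scoring, the function names `softmax` and `attend`, and the toy vectors are assumptions made for illustration; they are not taken from [1] or from this repository's code.

```python
import math

def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Scoring is a plain dot product (an assumption for this sketch).
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of the encoder states.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context

# Toy input: three encoder states (one per source word) and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy input the first and third encoder states align best with the decoder state, so they receive the largest weights and dominate the context vector that the decoder would use for its next word.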

2. Project Plans

We plan to use ELMo embeddings [3] instead of GloVe embeddings [4] to encode the input text. ELMo captures contextual information by training a large-scale bidirectional language model and has been shown to improve performance on many supervised NLP tasks. In this project, we will use the pretrained model, which can be obtained here.

We also plan to implement a bidirectional RNN to preserve information from both directions. A unidirectional RNN computes each hidden state from only the current and preceding words, whereas a bidirectional RNN also takes the following words into account.
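
The difference can be seen in a minimal sketch that uses a scalar tanh RNN. The function names `rnn_pass` and `bidirectional_states` and the weight values are hypothetical and chosen only for illustration; a real model would use vector states and learned weights.

```python
import math

def rnn_pass(inputs, w_in, w_rec):
    """Run a scalar tanh RNN over inputs and return the hidden state at each step."""
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def bidirectional_states(inputs, fwd=(0.5, 0.8), bwd=(0.4, 0.7)):
    """Pair each position's forward state with its backward state.

    fwd and bwd are (w_in, w_rec) weights for each direction (arbitrary values).
    """
    forward = rnn_pass(inputs, *fwd)
    # Run the second RNN over the reversed sequence, then restore the original order.
    backward = rnn_pass(inputs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))

# Two sequences that differ only in their last element.
a = bidirectional_states([1.0, 2.0, 3.0])
b = bidirectional_states([1.0, 2.0, -3.0])
```

At the first position the forward states of `a` and `b` are identical, because the forward pass has not yet seen the last element, while the backward states differ. That is the extra right-hand context a bidirectional encoder provides.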

3. Dataset and Evaluation

The ideal dataset for this project would be English Gigaword, which Lopyrev used; however, we may need to deal with copyright issues first. As an alternative, we use the All the news dataset [5], which contains 143 thousand articles from 15 American publications.

We will evaluate the generated headlines with BLEU [6]. In general, BLEU measures how many of the n-grams in a machine-generated headline also appear in the human reference headline, computed for several n-gram sizes.
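
A simplified sentence-level version of this measure can be sketched as follows, assuming a single reference headline, whitespace tokenization, and n-grams up to length 2 (standard BLEU typically uses up to 4-grams and may use several references). The helper names are illustrative and not part of this repository.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with clipped counts."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Clipping: an n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total

def bleu(candidate, reference, max_n=2):
    """Geometric mean of the n-gram precisions times the brevity penalty."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1.0 - len(reference) / len(candidate))
    return bp * math.exp(log_mean)

generated = "stocks rise as markets rally".split()
reference = "stocks rise as global markets rally".split()
score = bleu(generated, reference)
```

Here every unigram of the generated headline appears in the reference, but only three of its four bigrams do, and the brevity penalty lowers the score further because the generated headline is shorter than the reference.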

References

[1] Lopyrev, K. Generating news headlines with recurrent neural networks. arXiv preprint arXiv:1512.01712, 2015.
[2] Sutskever, I.; Vinyals, O.; Le, Q. V. Sequence to sequence learning with neural networks. CoRR, abs/1409.3215, 2014.
[3] Peters, M. E.; Neumann, M.; Iyyer, M.; et al. Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.
[4] Pennington, J.; Socher, R.; Manning, C. GloVe: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532-1543.
[5] All the news dataset. Kaggle. link
[6] Papineni, K.; Roukos, S.; Ward, T.; Zhu, W. J. BLEU: a method for automatic evaluation of machine translation. ACL-2002: 40th Annual Meeting of the Association for Computational Linguistics, pp. 311-318.
