ChinaNews-Data

ChinaNews-Data is a real-world dataset for cross-domain emotion distribution learning, crawled from the ChinaNews website. Each zipped file contains the news documents of one domain. In total, the dataset covers 39 domains and about 115,000 news articles. In our experiments, we selected the four domains with the most documents: Economics_Channel (E), Culture_News (C), Law_News (L), and Society_News (S).

Domain categories

  • Economics_Channel: 财经频道
  • Culture_News: 文化新闻
  • Law_News: 法治新闻
  • Society_News: 社会新闻
  • IT_News: IT新闻
  • IT_Channel: IT频道
  • Economics_Center: 财经中心
  • Local_News: 地方新闻
  • Legal_Channel: 法制频道
  • Legal_News: 法制新闻
  • Realty_Channel: 房产频道
  • Realty_News: 房产新闻
  • SAR_News: 港澳新闻
  • Scroll_News: 滚动新闻
  • International_News: 国际新闻
  • Domestic_News: 国内新闻
  • Oversea_Report: 海外华文报摘
  • Chinese_News: 华人新闻
  • Chinese_Report: 华文报摘
  • Chinese_Education: 华文教育
  • Healthy_News: 健康新闻
  • Education_News: 教育新闻
  • Financial_Channel: 金融频道
  • Economics_News: 经济新闻
  • Abroad_Life: 留学生活
  • Energy_Channel: 能源频道
  • Automobile_Channel: 汽车频道
  • Automobile_News: 汽车新闻
  • Overseas_Chinese: 侨乡传真
  • Life_Channel: 生活频道
  • Life_News: 生活新闻
  • The_world_Expo_News: 世博会
  • Taiwan_News: 台湾新闻
  • Sports_News: 体育新闻
  • Photo_News: 图片新闻
  • Securities_Channel: 证券频道
  • Oversea_News: **侨界
  • World_Expo: 中新世博
  • Other_News: 其他新闻

Emotion categories

  • "moved": 感动
  • "sympathy": 同情
  • "boring": 无聊
  • "anger": 愤怒
  • "funny": 搞笑
  • "sad": 难过
  • "delighted": 高兴
  • "not-interested": 路过
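Emotion distribution learning works with a probability distribution over these eight categories rather than a single label. The following is a minimal sketch of one way to get there, assuming the per-emotion reader votes of an article have already been read into a dict keyed by the Chinese labels. The function name `to_distribution` and the all-zero fallback for articles without votes are assumptions for illustration, not part of the dataset.

```python
# English emotion names mapped to the Chinese labels listed above.
EMOTIONS = {
    "moved": "感动",
    "sympathy": "同情",
    "boring": "无聊",
    "anger": "愤怒",
    "funny": "搞笑",
    "sad": "难过",
    "delighted": "高兴",
    "not-interested": "路过",
}


def to_distribution(votes):
    """Normalize per-emotion vote counts into a probability distribution.

    `votes` maps Chinese emotion labels to integer counts; missing labels
    are treated as zero votes.
    """
    counts = [votes.get(zh, 0) for zh in EMOTIONS.values()]
    total = sum(counts)
    if total == 0:
        # Assumption: an article without votes gets an all-zero vector.
        return {en: 0.0 for en in EMOTIONS}
    return {en: count / total for en, count in zip(EMOTIONS, counts)}
```

For example, `to_distribution({"感动": 1, "愤怒": 3})` assigns 0.25 to "moved", 0.75 to "anger", and 0.0 to every other category.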

Samples

URL: XXXX

Title: XXXX

Category: XXXX

Time: XXXX年XX月XX日

News content: XXXXXX

Total: XX

感动: X

同情: X

无聊: X

愤怒: X

搞笑: X

难过: X

高兴: X

路过: X
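A record in this layout can be read with a small parser. The sketch below assumes each field starts on its own line as `Key: value` with an ASCII colon, exactly as in the template above, and that news content may continue over several lines. The actual files may differ (for instance in encoding or delimiters), and `parse_record` is a hypothetical helper name.

```python
import re
from datetime import date

EMOTION_LABELS = ["感动", "同情", "无聊", "愤怒", "搞笑", "难过", "高兴", "路过"]
TEXT_FIELDS = ["URL", "Title", "Category", "Time", "News content"]
COUNT_FIELDS = ["Total"] + EMOTION_LABELS


def parse_record(text):
    """Parse one sample record into a dict (assumed 'Key: value' layout)."""
    fields = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        for key in TEXT_FIELDS + COUNT_FIELDS:
            if line.startswith(key + ":"):
                current = key
                fields[key] = line[len(key) + 1:].strip()
                break
        else:
            # Lines without a known key continue the previous text field.
            if current in TEXT_FIELDS:
                fields[current] += "\n" + line
    # Vote counts become integers.
    for key in COUNT_FIELDS:
        if key in fields:
            fields[key] = int(fields[key])
    # Dates of the form XXXX年XX月XX日 become datetime.date objects.
    match = re.fullmatch(r"(\d{4})年(\d{1,2})月(\d{1,2})日", fields.get("Time", ""))
    if match:
        fields["Time"] = date(*map(int, match.groups()))
    return fields
```

Matching only the known field names keeps colons inside a URL or the article body from being mistaken for new fields.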

Contributors

  • hostnlp
