bdci2019-sentiment-classification's Introduction

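CCF BDCI 2019 Internet News Sentiment Analysis: top-1 solution in the second round

Team name: 我们都上哈工深 (roughly, "We All Get Into HIT Shenzhen")

Competition website: Internet News Sentiment Analysis (互联网新闻情感分析)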

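Reproduction

This project contains three main folders, data, model, and source, along with a 复赛融合 (second-round ensemble) folder:

  1. The data folder contains the training data for the preliminary and second rounds, the second-round test data (to be downloaded from the official website), and the data-processing code.
  2. The model folder contains the saved checkpoints of each model and their prediction results.
  3. The source folder contains the competition code and the Jupyter notebooks used to run the models, 12 in total.
  4. The 复赛融合 folder contains the ensembling code and the ensembling results.

For reproduction details, see "互联网新闻情感分析复现文档-我们都上哈工深.docx".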

Solution

image-20191224221646530

image-20191224221704779

image-20191224221718455

image-20191224221730191

image-20191224221756483

image-20191224221809071

The sentence-pair approach gave only mediocre results and was not used in the end.

image-20191224221836114

image-20191224221848348

image-20191224221900868

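Final notes

The code is adapted from guoday's baseline. The pretrained BERT models are RoBERTa-wwm-ext-large from HIT (Harbin Institute of Technology) and brightmart's Roberta_zh. Many thanks to both.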

bdci2019-sentiment-classification's People

Contributors

cxy229, xs229

bdci2019-sentiment-classification's Issues

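Citing this work as a reference

Hi, I would like to cite your work as a reference. Do you have an English link? I could not find one on Google. It is quite urgent, thanks a lot!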

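Error when running "004_数据预处理(替换英文字符)"

While reproducing the experiments, I get an error when running "004_数据预处理(替换英文字符)" (data preprocessing: replacing English characters). The error message is: FileNotFoundError: [Errno 2] No such file or directory: 'replacement/train.csv'. Why can't the replacement directory be found? Thanks!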
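The traceback alone does not say whether the whole replacement directory is missing or only train.csv inside it. A minimal sketch for narrowing that down is shown here; describe_path is a hypothetical helper, not part of this repository, and it does not establish which step of the project is expected to create the directory.

```python
from pathlib import Path


def describe_path(relative_path, base_dir=None):
    """Report which part of a relative path is missing under base_dir (default: cwd)."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    target = base / relative_path
    if target.exists():
        return f"found: {target}"
    if not target.parent.exists():
        # The directory itself is absent, so an earlier step may not have run,
        # or the notebook was started from a different working directory.
        return f"missing directory: {target.parent}"
    return f"missing file: {target}"


# The path from the error message, resolved against the current working directory.
print(describe_path("replacement/train.csv"))
```

Running this from the same working directory as the notebook shows whether the relative path resolves where the notebook expects it to.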

Confusion regarding the model structure

Hi, I'm deeply impressed by your work on the sentiment classification competition, and I'm trying to learn from it. However, I still have some trouble understanding your model.

You concatenate the pooled_output vector of BERT with the first-token (CLS) representations from the last three (or two) layers of BERT. But isn't the pooled_output vector itself the first token's vector from the last layer of BERT?
If so, what is the point of concatenating identical vectors? If not, what does the pooled_output vector mean? Is it the average of the CLS representations across all layers of BERT?

Hoping for a reply.
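For reference, in the standard BERT implementations (Google's original TensorFlow code and Hugging Face transformers), pooled_output is neither the raw last-layer CLS vector nor an average over layers: it is the last layer's CLS vector passed through one extra dense layer with a tanh activation. The pure-Python sketch below illustrates only that transformation. The toy values and the name bert_pooler are hypothetical, and it assumes this repository uses the standard pooler unchanged.

```python
import math


def bert_pooler(cls_vector, weight, bias):
    """Standard BERT pooler: tanh(W . h_cls + b) applied to the last layer's [CLS] vector."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, cls_vector)) + b)
        for row, b in zip(weight, bias)
    ]


# Toy values (hypothetical, hidden size 3); real weights come from the pretrained checkpoint.
cls_last_layer = [0.5, -1.0, 2.0]
weight = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
bias = [0.0, 0.0, 0.0]

pooled_output = bert_pooler(cls_last_layer, weight, bias)

# Even with an identity weight matrix, tanh squashes every value into (-1, 1),
# so pooled_output is not identical to the raw [CLS] vector.
print(pooled_output)
```

Under that assumption, concatenating pooled_output with the raw CLS vectors does not duplicate a vector, although the pooled vector is a learned function of the last layer's CLS representation.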

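A question about feature concatenation

Hi, I have been working on text classification recently. Following your approach, I concatenated the CLS outputs of the last three layers with the pooler output into one long vector and then applied softmax. During training, however, I sometimes get the error "Found Inf or NaN global norm. : Tensor had NaN values". Did you run into this problem when concatenating features? I wonder whether it is caused by the features being on different scales.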
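The setup described in this question can be sketched in plain Python as follows. The function names and toy numbers are hypothetical, and the sketch does not establish the cause of the reported NaN. It only shows the concatenation step and one common guard against overflow, a softmax that subtracts the maximum logit before exponentiating.

```python
import math


def concat_features(pooled_output, cls_vectors):
    """Concatenate the pooler output with the [CLS] vectors of several layers."""
    features = list(pooled_output)
    for vec in cls_vectors:
        features.extend(vec)
    return features


def stable_softmax(logits):
    """Softmax that subtracts the max logit first, so exp() never sees a large positive input."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example with hidden size 2: pooler output plus [CLS] vectors from three layers.
pooled = [0.1, -0.2]                                 # bounded in (-1, 1) by tanh
cls_layers = [[3.0, -4.0], [2.5, 0.5], [-6.0, 1.0]]  # unbounded hidden states
features = concat_features(pooled, cls_layers)

# Hypothetical large logits: a naive math.exp(1000.0) raises OverflowError,
# while the shifted version stays finite.
probs = stable_softmax([1000.0, 1001.0, 999.0])
print(len(features), probs)
```

The toy vectors also show the scale difference the question mentions: the pooler output is bounded by tanh, while the raw CLS hidden states are not.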
