
meumy-live-showcase's Introduction

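A showcase page for Meumy live-stream danmaku (bullet comment) data

This project has two parts: a static web page and a data processing server. It is purely a hobby project.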

Static web page

Built with the Bootstrap 5 framework, Vue.js, and ECharts.

Data processing server

live_listener.py

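Records live-stream danmaku data into a sqlite database. It can listen to several live rooms at the same time.

2022-05-25: danmaku data is no longer recorded through the bilibili-api package.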

Msg_db.py

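A simple class wrapping a sqlite database that records danmaku, gifts, and live-stream time information. It uses peewee as the ORM and stores each day's data in a separate file.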
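The one-file-per-day layout can be sketched as follows. This is only an illustration: it uses the standard-library sqlite3 module instead of peewee to stay dependency-free, and the file naming scheme, table name, columns, and the helpers `db_path_for` and `record_danmu` are all assumptions rather than what Msg_db.py actually does.

```python
import sqlite3
from datetime import date, datetime
from pathlib import Path


def db_path_for(day: date, root: Path) -> Path:
    """Return the database file for one calendar day (hypothetical naming scheme)."""
    return root / f"danmu_{day.isoformat()}.db"


def record_danmu(root: Path, room_id: int, user: str, text: str, when: datetime) -> Path:
    """Append one danmaku message to the file that belongs to its day."""
    path = db_path_for(when.date(), root)
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            # Assumed schema; room_id lets several live rooms share one file.
            conn.execute(
                "CREATE TABLE IF NOT EXISTS danmu ("
                "room_id INTEGER, user TEXT, text TEXT, sent_at TEXT)"
            )
            conn.execute(
                "INSERT INTO danmu VALUES (?, ?, ?, ?)",
                (room_id, user, text, when.isoformat()),
            )
    finally:
        conn.close()
    return path
```

Messages sent just before and just after midnight land in different files, which keeps each daily file small and lets the analysis step open only today's file.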

danmu_analyse.py

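Scans today's danmaku database every ten minutes. If a stream has started, it analyses the danmaku and generates the JSON data files that the web page displays.

  • Uses the scipy package to smooth the danmaku volume and to detect peaks
  • Uses the jieba package for Chinese word segmentation and keyword extraction
    • Regular expressions filter out some junk danmaku, such as ohhhhhhhh
    • A custom dictionary, user_dict.txt, improves the segmentation slightly
    • Segmentation still needs tuning: some words are not recognised, so the custom dictionary needs frequent maintenance
    • Keyword extraction does not work well where internet slang dominates: the keywords all come out as "哈哈哈" ("hahaha"), and I have no good fix for that
  • Uses regular expressions to recognise "call" danmaku (audience cheering), and from them the periods when the streamer is singing
  • Computes a new IDF dictionary, idf/idf_live.txt, from past danmaku data, in an attempt to extract keywords for the latest stream based on live-danmaku word frequencies. It does not seem very effective. One case where it worked: the keyword for 呜米 on 2021-09-10 was "确实" ("indeed"), and that one was indeed extracted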
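The regular-expression steps in the list above can be sketched with the standard library alone. Both patterns are assumptions made for illustration: the junk filter treats any character repeated five or more times as junk, and the call pattern assumes call danmaku look like `\name/`. The real patterns in danmu_analyse.py may differ, and the helper names and the `min_calls` threshold are hypothetical. The scipy smoothing and jieba steps are not covered here.

```python
import re
from collections import Counter

# Assumed junk pattern: one character repeated 5+ times, e.g. "ohhhhhhhh".
SPAM_RE = re.compile(r"(.)\1{4,}")
# Assumed call pattern: text wrapped in a backslash and a slash, e.g. "\name/".
CALL_RE = re.compile(r"^\\+.+/+$")


def is_spam(text: str) -> bool:
    """True when the message contains a long run of one repeated character."""
    return SPAM_RE.search(text) is not None


def singing_minutes(messages, min_calls=3):
    """messages: iterable of (seconds_since_start, text).

    Return the sorted minute indexes whose call count reaches min_calls.
    """
    calls = Counter(
        int(t // 60) for t, text in messages if CALL_RE.match(text.strip())
    )
    return sorted(minute for minute, n in calls.items() if n >= min_calls)


def merge_minutes(minutes):
    """Merge consecutive minute indexes into inclusive (start, end) ranges."""
    ranges = []
    for m in minutes:
        if ranges and m == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], m)
        else:
            ranges.append((m, m))
    return ranges
```

Counting calls per minute and merging adjacent minutes gives rough singing intervals. A threshold keeps a single stray call from being reported as a song.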

meumy-live-showcase's People

Contributors

k-bai


meumy-live-showcase's Issues

Index out of range, not sure how to fix it

(base) [root@VM-20-10-centos server]# python danmu_analyse.py
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.813 seconds.
Prefix dict has been built successfully.
Traceback (most recent call last):
  File "/rec/server/danmu_analyse.py", line 378, in <module>
    dir_update()
  File "/rec/server/danmu_analyse.py", line 275, in dir_update
    if live_list[0][0] != None:
IndexError: list index out of range
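The traceback shows `live_list[0][0]` being read while `live_list` is empty, which probably means no stream has been recorded yet for the day. A possible guard is sketched below. The shape of `live_list` is an assumption (a list of rows whose first field is None until a stream starts), and `first_live_start` is a hypothetical helper, not a function from danmu_analyse.py.

```python
def first_live_start(live_list):
    """Return the first field of the first row, or None when there is no usable row."""
    if not live_list:   # no rows at all: the case that raised IndexError
        return None
    first_row = live_list[0]
    if not first_row:   # an empty row would fail on first_row[0]
        return None
    return first_row[0]
```

With such a helper, the check on line 275 could become `if first_live_start(live_list) is not None:`, so an empty result simply skips the analysis instead of crashing.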

invalid date

When a live-stream time carries a suffix such as -2, the link jumps to the recording group with "invalid date" as the search term.
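One way to avoid this is to separate the suffix from the date before the date is formatted into a search term. The page itself is JavaScript, so the Python sketch below only illustrates the idea. The label format (`YYYY-MM-DD` with an optional `-N` suffix) is an assumption, and `split_live_label` is a hypothetical helper.

```python
import re
from datetime import date

# Assumed label format: "YYYY-MM-DD" with an optional "-N" stream index.
LABEL_RE = re.compile(r"^(\d{4})-(\d{1,2})-(\d{1,2})(?:-(\d+))?$")


def split_live_label(label: str):
    """Split a label into (date, index); the index defaults to 1 when absent."""
    m = LABEL_RE.match(label.strip())
    if m is None:
        raise ValueError(f"unrecognised live label: {label!r}")
    year, month, day, index = m.groups()
    return date(int(year), int(month), int(day)), (int(index) if index else 1)
```

The date part can then be formatted for the search link on its own, and the index is available separately if the link needs to distinguish several streams on the same day.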
