Bangumi Data Analysis

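This is the final project for the 2021 cohort's "Python Application and Development" (Python应用与开发) course at Wenzhou University. It is a data analysis tool built on the Bangumi website that helps you analyze the development trends and popularity of ACG-related data.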

Installation

  1. Clone this repository to your machine:
git clone https://github.com/murlors/Bangumi-Analysis-Coursework.git
  2. Install the dependencies:
pip install -r requirements.txt

If you use conda, you can install the dependencies with the following command instead:

conda env create -f environment.yml

Usage

  1. Run the crawler:
python crawler.py -cfg config.yml
  2. Run the analyzer:
python analysis.py -cfg config.yml
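
Both commands take the configuration file through a `-cfg` option. As a minimal sketch of how such an option can be parsed with the standard library's `argparse`, consider the following. The function name `build_parser` and the default value `config.yml` are assumptions made for illustration; the actual argument handling in crawler.py and analysis.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts a -cfg option (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Illustrative -cfg parser")
    # argparse accepts single-dash multi-letter options such as -cfg;
    # the parsed value is stored in args.cfg.
    # Assumption: falling back to "config.yml" when -cfg is omitted.
    parser.add_argument(
        "-cfg",
        default="config.yml",
        help="path to the YAML configuration file",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# behaves the same wherever it is run.
args = build_parser().parse_args(["-cfg", "config.yml"])
print(f"Using configuration file: {args.cfg}")
```

Passing an explicit list to `parse_args` keeps the example deterministic; a real script would normally call `parse_args()` with no arguments so that the command line is read.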

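Configuration file

You can use the config.yml file to set the parameters for the crawler and the data analysis. Here is a sample configuration file: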

# config.yml
type: music # anime, book, music, game, real

crawler:
  start: 1 # start page
  end: 50 # end page
  user-agent: your_name/bangumi-analysis (https://github.com/your_name/bangumi-analysis)
  access-token: # access token

data:
  path: 'data' # data path

figure:
  path: 'figures' # figure path
  rcParams: # matplotlib rcParams
    font.family: 'Microsoft YaHei' # font family
    savefig.dpi: 300 # dpi
    figure.figsize: [12, 8] # figure size
    figure.autolayout: True # auto layout

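The type parameter specifies which subject type the crawler collects; it can be anime, book, music, game, or real. In the crawler section you configure the crawler itself: the start and end parameters set the range of pages to crawl, and the user-agent parameter sets the crawler's User-Agent. In the data section you configure where the data is saved. In the figure section you configure where figures are saved and the matplotlib rcParams.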
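
The constraints described above (an allowed set of types and a start/end page range) can be checked before any crawling starts. The sketch below is not the project's code: `page_range` and `ALLOWED_TYPES` are hypothetical names, the config is assumed to have already been parsed from YAML into a dictionary, and the end page is assumed to be included in the crawl.

```python
from typing import Any, Mapping

# Subject types listed in the sample configuration.
ALLOWED_TYPES = {"anime", "book", "music", "game", "real"}


def page_range(config: Mapping[str, Any]) -> range:
    """Validate a parsed config and return the pages to crawl (hypothetical helper)."""
    subject_type = config.get("type")
    if subject_type not in ALLOWED_TYPES:
        raise ValueError(
            f"type must be one of {sorted(ALLOWED_TYPES)}, got {subject_type!r}"
        )

    crawler = config.get("crawler") or {}
    start, end = crawler.get("start"), crawler.get("end")
    if not (isinstance(start, int) and isinstance(end, int)):
        raise ValueError("crawler.start and crawler.end must be integers")
    if start < 1 or end < start:
        raise ValueError("expected 1 <= crawler.start <= crawler.end")

    # Assumption: the end page is crawled too, hence end + 1.
    return range(start, end + 1)


# Dictionary equivalent of the relevant keys in the sample config.yml.
sample = {"type": "music", "crawler": {"start": 1, "end": 50}}
pages = page_range(sample)
print(f"{len(pages)} pages, from {pages[0]} to {pages[-1]}")
```

Failing early on a misspelled type or an inverted page range avoids spending a long crawl on a configuration that was never valid.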

Note that the analysis in analysis.py depends on the data collected by crawler.py, so make sure you have run crawler.py before you run analysis.py.
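
One way to guard against running the analysis too early is to check that the data directory already contains files. This is only a sketch under stated assumptions: `has_crawled_data` is a hypothetical helper, the directory name `data` is taken from the sample configuration, and the file names and formats that crawler.py actually writes are not specified here.

```python
from pathlib import Path


def has_crawled_data(data_path: str) -> bool:
    """Return True if data_path is a directory holding at least one file."""
    directory = Path(data_path)
    # rglob("*") also looks inside subdirectories, since the layout
    # produced by the crawler is an unknown here.
    return directory.is_dir() and any(p.is_file() for p in directory.rglob("*"))


# "data" matches data.path in the sample configuration.
if not has_crawled_data("data"):
    print("No crawled data found; run crawler.py first.")
```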

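Because this project was originally written only to analyze music data, you will need to modify the code in crawler.py and analysis.py yourself if you want to analyze other types of data.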

Contributing

If you find a problem or have a suggestion for improvement, feel free to open an issue or a pull request.

License

This project is released under the MIT License. See the LICENSE file for details.
