liu233w / acm-statistics

An online tool (crawler) that analyzes users' performance on online judges (competitive programming websites). Supported OJs: POJ, HDU, HYSBZ, CodeForces, UVA, ICPC Live Archive, FZU, SPOJ, Timus (URAL), LeetCode_CN, CSU, LibreOJ, 洛谷, 牛客OJ, Lutece (UESTC), AtCoder, AIZU, CodeChef, El Judge, BNUOJ, Codewars, UOJ, NBUT, 51Nod, DMOJ, VJudge

Home Page: https://ojhunt.com

License: GNU Affero General Public License v3.0

Languages: C# 51.87%, JavaScript 33.41%, Vue 9.94%, Makefile 2.41%, HTML 1.13%, Dockerfile 0.92%, Shell 0.24%, SCSS 0.07%
Topics: acm-icpc, crawler, javascript, nodejs, docker, csharp, vue, codeforces-api, codechef-api, spoj-api

acm-statistics's Introduction

This repo contains the source code of OJ Analyzer.

Simplified Chinese version: README_zh-hans.md

Features

  • Query the AC (accepted) and submission counts of a user on each OJ
  • Store the query history

Under development

  • Email support
  • Ranks
  • ……

Directory structure

  • frontend: The front end
  • crawler: Crawlers that query the OJs; used by both the frontend and the backend
  • crawler-api-backend: A microservice that provides the query API
  • e2e: E2E tests
  • backend: The back end, a monolithic service
  • captcha-service: A microservice that provides captcha support
  • ohunt: A stateful, standalone crawler microservice used to support certain OJs such as ZOJ
  • build: Code to build and deploy the project. Tool chain: docker, docker-compose, GNU make
  • tools: Utility scripts and config files used in operations

See the README file in each module for module-specific documentation.

Developing and deploying in docker

  • The project needs docker and docker-compose to function correctly.

Development

  • This project uses a makefile to manage dependencies between modules. Run make help in the repository root to view the documentation.
  • GNU make is required.

Deploy

There are two ways to deploy this project on a server.

One-liner

Run the following command in a shell to deploy the project on port 3000.

curl -s https://raw.githubusercontent.com/Liu233w/acm-statistics/master/tools/remote-docker-up.sh | bash

The VJudge crawler is not available with this method.

Config file version

This method lets you customise the configuration and enable all features.

# Create a folder to store config files
mkdir -p ~/www/acm-statistics
cd ~/www/acm-statistics
# Download the runner script and make it executable
curl https://raw.githubusercontent.com/Liu233w/acm-statistics/master/tools/remote-docker-up.sh -o run.sh
chmod +x run.sh
# Run the script once to generate the configuration file. It exits after printing the line `.env file created, remember to edit it`.
./run.sh
# Edit the config file, following the descriptions inside it.
vim .env
# Now the project can be started with the script
./run.sh

You can then use a tool such as systemd to run ./run.sh.

./tools/acm-statistics.service is a template systemd config file.

run.sh checks for updates when it starts. If template.env has been updated, run.sh exits and asks you to compare the two files. The script detects updates by comparing the line counts of the two files, so make sure the line counts stay identical when you edit the config.
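
The line-count comparison described above can be pictured roughly as follows. This is only a sketch in JavaScript, not the actual logic of run.sh (which is a shell script); the function names countLines and envMatchesTemplate are hypothetical.

```javascript
// Hypothetical helper: count the lines in a file's contents,
// ignoring one trailing newline so "a\nb\n" and "a\nb" both count as 2.
function countLines(text) {
  if (text.length === 0) return 0;
  const trimmed = text.endsWith('\n') ? text.slice(0, -1) : text;
  return trimmed.split('\n').length;
}

// Hypothetical check: the two files are treated as "in sync" only when
// their line counts match. This is why the edited config has to keep
// the same number of lines as the template.
function envMatchesTemplate(envText, templateText) {
  return countLines(envText) === countLines(templateText);
}
```

A check like this cannot see changes that keep the line count the same, so it is worth comparing the files by hand whenever the script asks for it.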

Management

  • Set the URL of adminer in the .env file. It is /adminer by default.
    • You can view and edit the database via adminer.
    • The name of the database is acm_statistics. The username is root. You can set the password in .env.
  • Backups are created automatically at 3:00 am each day and stored in the db-backup folder, which is located in the folder that contains the config files.

License

  • All source code except the code in crawler/crawlers is under the AGPL-3.0 license.
  • The code in crawler/crawlers is under the BSD 2-Clause license.

Contribution

  • All contributions, especially crawlers, are welcome.
  • Please follow the Commit Message Conventions when writing git commit messages.
  • You may use cz-cli to help write commit messages.

Contributors ✨

Thanks go to these wonderful people (emoji key):


Adelard Collins🔗
🐛

BackSlashDelta🔗
🐛

Bodhisatan_Yao🔗
🐛

Geekxiong🔗
🤔

Halorv🔗
🤔

Kido Zhang🔗
🚇 🤔

Liu233w🔗
💻 🤔 🚇 ⚠️

Meulsama🔗
🤔

Michael Xiang🔗
🐛

Zhao🔗
🐛

bluebear4🔗
🐛

ct🔗
🐛

flylai🔗
💻 🐛

fzu-h4cky🔗
🐛

wwawwaww🔗
🐛

zby🔗
🤔 🐛

This project follows the all-contributors specification. Contributions of any kind welcome!

acm-statistics's Issues

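Complete the main features

Finish the crawler API, the API documentation, and so on.

Frontend: add an "About us" page and finish polishing the pages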

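New feature: query multiple usernames on the same OJ

Add a button to the toolbar of the search card. Clicking it duplicates the card and shows the copy next to the original.

A different username can be entered on the new card. In the statistics list, the results of the two cards should appear as two separate entries rather than being merged.

If two or more cards are shown for the same OJ, each card should have a "Delete" button that removes that card.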

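Add the ability to stop a query

When querying sites with a large number of problems, the username of a CrawlerWorker cannot be changed while it is in the Working state, and neither can the global username.

The promise could be stored in the worker. After the user clicks the "Stop" button, the promise is forcibly stopped and the state is reset to Waiting.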
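
A JavaScript promise cannot literally be stopped from outside, so one way to approximate the idea above is to mark the in-flight run as stale and ignore its result when it arrives. The sketch below assumes this approach; the class name CrawlerWorkerSketch, the runId counter, and the 'Done' and 'Error' states are hypothetical (only Working and Waiting come from the issue).

```javascript
// Hypothetical worker: "stopping" means invalidating the current run,
// so that a late result from the crawler is silently dropped.
class CrawlerWorkerSketch {
  constructor(crawl) {
    this.crawl = crawl;       // async function (username) => result
    this.status = 'Waiting';
    this.result = null;
    this.runId = 0;           // incremented on every start and stop
  }

  async start(username) {
    const myRun = ++this.runId;
    this.status = 'Working';
    try {
      const result = await this.crawl(username);
      if (myRun !== this.runId) return; // stopped or restarted meanwhile
      this.result = result;
      this.status = 'Done';
    } catch (err) {
      if (myRun !== this.runId) return; // errors of stale runs are ignored too
      this.status = 'Error';
    }
  }

  stop() {
    this.runId++;             // invalidates the in-flight run
    this.status = 'Waiting';  // the username can be edited again
  }
}
```

The underlying HTTP request keeps running in this sketch; actually aborting it would need support from the HTTP client in use.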

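Frontend: merge vjudge results into the statistics of other CrawlerWorkers

  • Each CrawlerWorker can show the list of problems the user has solved

    • For example, show the list when the user clicks submission/solved
  • Add a switch on the statistics page that lets the user control whether vjudge problems are counted twice

  • Each CrawlerWorker still shows the result it crawled itself; the aggregated result is shown on the statistics page

  • Fallback: a crawler may be unable to read the problem list, in which case the returned solved_list itself should be null or undefined

    • Clicking the button that shows the list should display a notice
    • If vjudge contains problems from this OJ, a warning should be shown on the vjudge card telling the user that the problems counted here may contain duplicates
  • Polish WorkerCard: the button tooltips appear in the wrong position; the problem list is too cramped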

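Frontend: generate a statistics report

Add a "Generate report" button to the problem statistics page. Clicking it opens a dialog that lists the problem counts in a table, with a bar chart of the problems below it, modeled on the style of the old query site.

The report dialog should also include the time the report was generated, and the report should fit on a single screen where possible.

Other requirements

  • Results from multiple vjudges should be merged together first and deduplicated before being merged into the other OJs (#385)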

Inaccessible OJs

The OJ may be down, or it may only be reachable from inside the campus network:

  • CSU http://acm.csu.edu.cn/OnlineJudge/userinfo.php?user=
  • HUST http://acm.hust.edu.cn/u/
  • CQU http://acm.cqu.edu.cn/oj/userinfo.php?name=
  • UESTC http://acm.uestc.edu.cn/user/userCenterData/

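Complete the crawler's auxiliary features

History, rankings, email reminders, problem-count tracking, and so on.

All of these need a database, but I do not want to write the backend in Node. This part should therefore be built as a separate microservice in Java or ASP.NET Core.

Write a proxyScriptGenerator for axios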

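Crawler: logic for merging vjudge with other OJs

  • Add a field to the config file that marks a crawler as a virtual judge crawler

    • This makes it easier to add more vjudges later
  • Results from multiple vjudges should be merged together first and deduplicated before being merged into the other OJs (moved to #22)

  • Each vjudge should normalize the problem list it returns to the form name-problemNumber

    • name is the crawler's name, as set in the config file
    • problemNumber is the problem number, which should match the number returned by the other OJ
    • (Other OJs do not need to attach the OJ name to their results; outside code can add it automatically)
  • Crawler results should be validated (for example, the number of elements in solved_list must equal solved), throwing an error or warning when necessary.

  • Since all CrawlerWorkers run dynamically in the frontend, the merging code should go into the frontend's computed properties

    • How would the future backend feature for automatic problem lookup use it?
      • The merging code could be abstracted into crawler. When the backend sends email notifications, it specifies the computation order manually (vjudge first, then the other OJs) to do the merge; the frontend calls it from computed.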
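
The merge order and validation outlined in this issue could look roughly like the sketch below. It is an illustration under assumptions rather than the project's code: the result shape { solved, solved_list } follows the field names used in the issue, while the function names validateResult, normalize, and mergeSolved, and the way a missing list is counted, are hypothetical.

```javascript
// Validate a crawler result: when a list is present, its length must equal `solved`.
function validateResult(result) {
  if (result.solved_list != null && result.solved_list.length !== result.solved) {
    throw new Error('solved_list length does not match solved');
  }
}

// Attach the crawler name to plain problem numbers: ("poj", ["1000"]) -> ["poj-1000"].
function normalize(name, solvedList) {
  return solvedList.map((problemNumber) => `${name}-${problemNumber}`);
}

// vjudgeResults: array of results whose solved_list is already "name-problemNumber".
// ojResults: object mapping crawler name -> result with plain problem numbers.
function mergeSolved(vjudgeResults, ojResults) {
  const solved = new Set();
  // 1. Merge all virtual judges first; the Set removes duplicates.
  for (const r of vjudgeResults) {
    validateResult(r);
    for (const id of r.solved_list || []) solved.add(id);
  }
  // 2. Then merge the ordinary OJs into the same set.
  let withoutList = 0;
  for (const [name, r] of Object.entries(ojResults)) {
    validateResult(r);
    if (r.solved_list == null) {
      // Assumption: with no list, duplicates cannot be detected, so just add the count.
      withoutList += r.solved;
    } else {
      for (const id of normalize(name, r.solved_list)) solved.add(id);
    }
  }
  return solved.size + withoutList;
}
```

Because mergeSolved is a pure function, the same code could be called from a computed property in the frontend and from a backend job, which is the reuse the last bullet asks about.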

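Call for test accounts on some OJs

I am currently porting the crawlers from npuacm.info to this site and need accounts on some OJs for testing. If you are willing to offer your account, please reply below.

Notes:

  • Only the username is needed, not the password
  • Test accounts are written into the test cases, so anyone browsing this site's source code can see the username

Sites that need testing: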

  • FZU http://acm.fzu.edu.cn/user.php?uname=
  • Timus OJ http://acm.timus.ru/search.aspx?Str=
  • SPOJ http://www.spoj.com/users/
  • SGU http://acm.sgu.ru/find.php?find_id=

Thank you all.

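Add a crawler API documentation page

  • Add descriptive metadata to the crawler backend's /api/crawler: besides name, it should also return title and description
  • Write a swagger definition for the crawler backend to present its endpoints
  • Add a crawler API documentation page that embeds the swagger above and presents it to users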

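Improve the makefile and docker setup

  • Add make help and make it the default make target
  • Improve the documentation in README.md
  • Add automatic image updates to docker compose
    • Covering both the images and the docker-compose config file
  • Find a way to develop the frontend in dev mode
    • With the current direct-deployment setup, hot reload is unavailable. Running the frontend outside the container makes it awkward to reach data in the other containers, and mounting the source code into the container does not pick up code changes automatically.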

Tests can no longer be run from cmd after VirtualModulePlugin was introduced

Tests cannot be run from cmd, but they still work in bash. The cause is unknown.

Error:

    ERROR in C:/sources/project/acm-statistics/frontend/node_modules/babel-loader/lib?{"babelrc":false,"cacheDirectory":false,"plugins":[["transform-imports",{"vuetify":{"transform":"vuetify/es5/components/${member}","preventFullImport":true}}],"lodash"],"presets":[["C://sources//project//acm-statistics//frontend//node_modules//babel-preset-vue-app//dist//index.common.js",{"targets":{"ie":9,"uglify":true}}]]}!C:/sources/project/acm-statistics/frontend/node_modules/vue-loader/lib/selector.js?type=script&index=0!C:/sources/project/acm-statistics/frontend/pages/statistics.vue
    Module not found: Error: Can't resolve '~/dynamic/crawlers' in 'C:\sources\project\acm-statistics\frontend\pages'
     @ C:/sources/project/acm-statistics/frontend/node_modules/babel-loader/lib?{"babelrc":false,"cacheDirectory":false,"plugins":[["transform-imports",{"vuetify":{"transform":"vuetify/es5/components/${member}","preventFullImport":true}}],"lodash"],"presets":[["C://sources//project//acm-statistics//frontend//node_modules//babel-preset-vue-app//dist//index.common.js",{"targets":{"ie":9,"uglify":true}}]]}!C:/sources/project/acm-statistics/frontend/node_modules/vue-loader/lib/selector.js?type=script&index=0!C:/sources/project/acm-statistics/frontend/pages/statistics.vue 25:16-45
     @ C:/sources/project/acm-statistics/frontend/pages/statistics.vue
     @ C:/sources/project/acm-statistics/frontend/.nuxt/router.js
     @ C:/sources/project/acm-statistics/frontend/.nuxt/index.js
     @ C:/sources/project/acm-statistics/frontend/.nuxt/client.js

see ad8bcbf

Anti-probing mechanism?

Traces of probing were found on the server:

Apr 16 02:10:27 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:27.667Z nuxt:render Rendering url /explicit_not_exist_path
Apr 16 02:10:28 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:28.077Z nuxt:render Rendering url /index.html
Apr 16 02:10:28 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:28.331Z nuxt:render Rendering url /index.php
Apr 16 02:10:28 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:28.586Z nuxt:render Rendering url /index.jsp
Apr 16 02:10:28 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:28.838Z nuxt:render Rendering url /admin/
Apr 16 02:10:30 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:30.231Z nuxt:render Rendering url /wp-login.php
Apr 16 02:10:30 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:30.483Z nuxt:render Rendering url /readme.html
Apr 16 02:10:30 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:30.736Z nuxt:render Rendering url /license.txt
Apr 16 02:10:30 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:30.990Z nuxt:render Rendering url /wp-includes/js/wplink.js
Apr 16 02:10:31 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:31.240Z nuxt:render Rendering url /wp-admin/js/customize-controls.js
Apr 16 02:10:31 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:31.549Z nuxt:render Rendering url /wp-admin/js/nav-menu.js
Apr 16 02:10:31 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:31.803Z nuxt:render Rendering url /wp-includes/js/plupload/handlers.js
Apr 16 02:10:32 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:32.055Z nuxt:render Rendering url /wp-includes/js/tinymce/wp-tinymce.js.gz
Apr 16 02:10:32 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:32.747Z nuxt:render Rendering url /README
Apr 16 02:10:34 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:34.005Z nuxt:render Rendering url /robots.txt
Apr 16 02:10:35 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:35.043Z nuxt:render Rendering url /phpMyAdmin/
Apr 16 02:10:35 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:35.297Z nuxt:render Rendering url /phpmyadmin/
Apr 16 02:10:35 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:35.560Z nuxt:render Rendering url /pma/
Apr 16 02:10:35 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:35.811Z nuxt:render Rendering url /solr/
Apr 16 02:10:36 izq1mjvj51er6lz npm[8534]: 2018-04-15T18:10:36.064Z nuxt:render Rendering url /wcm/
Apr 19 16:37:59 izq1mjvj51er6lz npm[5455]: 2018-04-19T08:37:59.411Z nuxt:render Rendering url /swagger/elpsycongroo
Apr 19 19:00:08 izq1mjvj51er6lz npm[6122]: 2018-04-19T11:00:08.419Z nuxt:render Rendering url /ZeroClipboard.swf
Apr 19 19:00:08 izq1mjvj51er6lz npm[6122]: 2018-04-19T11:00:08.478Z nuxt:render Rendering url /js/ZeroClipboard.swf
Apr 19 19:00:08 izq1mjvj51er6lz npm[6122]: 2018-04-19T11:00:08.529Z nuxt:render Rendering url /script/ZeroClipboard.swf
Apr 19 19:00:08 izq1mjvj51er6lz npm[6122]: 2018-04-19T11:00:08.580Z nuxt:render Rendering url /lib/ZeroClipboard.swf
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.503Z nuxt:render Rendering url /api.php
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.564Z nuxt:render Rendering url /checktable.php
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.621Z nuxt:render Rendering url /theme/default/images/kindeditor/save.gif
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.686Z nuxt:render Rendering url /js/kindeditor/Makefile
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.733Z nuxt:render Rendering url /theme/default/images/treeview/file.gif
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.780Z nuxt:render Rendering url /js/jquery/treeview/min.js
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.828Z nuxt:render Rendering url /theme/default/images/main/logo.png
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.877Z nuxt:render Rendering url /js/jquery/syntaxhighlighter/scripts/shBrushPlain.js
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.925Z nuxt:render Rendering url /theme/default/style.css
May 05 08:51:22 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:22.981Z nuxt:render Rendering url /js/my.full.js
May 05 08:51:23 izq1mjvj51er6lz npm[459]: 2018-05-05T00:51:23.048Z nuxt:render Rendering url /theme/zui/fonts/zenicon.eot
May 14 17:32:57 izq1mjvj51er6lz npm[460]: 2018-05-14T09:32:57.123Z nuxt:render Rendering url /zfnoxfvj.html

Dependabot couldn't reach r.cnpmjs.org as it timed out

Dependabot couldn't reach r.cnpmjs.org as it timed out.

Is r.cnpmjs.org accessible over the internet? If it is then this may be a transitory issue and can be ignored - Dependabot will close it on its next successful update run.

You can mention @dependabot in the comments below to contact the Dependabot team.

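Problem counts reported for VJ differ slightly from the official ones

Problem count reported by the new query site:
(image)

Problem count reported by the old query site:
234/705

This differs from the new site by one, because the old site left one duplicate on every page that was never removed. See the linked code.

Problem count on the official VJudge site:
(image)

It appears that the original counting method did not remove duplicate ACs, so the results differ. Submissions, on the other hand, should not be deduplicated.

The correct way to count is:

  • solved: ACs after deduplication ("Overall solved" on the official site)
  • submissions: the total number of submissions ("Submissions" on the query site)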
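
The counting rule above can be expressed in a few lines. This is a minimal sketch, assuming each submission is a record of the hypothetical shape { problem, accepted }; the function name countVjudge is also hypothetical and does not reflect how the real crawler reads VJudge data.

```javascript
// Hypothetical submission record: { problem: 'POJ-1000', accepted: true }
function countVjudge(submissions) {
  const solvedProblems = new Set();
  for (const s of submissions) {
    // Several ACs on the same problem collapse into one Set entry.
    if (s.accepted) solvedProblems.add(s.problem);
  }
  return {
    solved: solvedProblems.size,      // distinct accepted problems
    submissions: submissions.length,  // every submission, not deduplicated
  };
}
```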

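Separate the crawler name from its title

Some crawler names currently cannot be used as file names. A title field should be added alongside the name field to hold the OJ's actual name.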

User login

  • Users can register and log in
  • Users must verify their email address when registering

Optimize the frontend build

lodash used to be bundled into the page_statistics file, but since commit 345dec7 it is bundled into app instead. The cause is unknown.

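Improve frontend loading performance

  • Move images and similar files from static to assets so that hashed file names are generated at build time
  • Configure the file server to skip 304 checks and force caching instead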
