gxtrobot / bustag
a tag and recommend system for old bus driver (a catalog-number recommendation system for "old drivers")
License: MIT License
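Some items list actors at the data source, but after bustag crawls them the actor field is blank.
If actors are one of the features the recommendation relies on, this will hurt model accuracy.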
/app
dirname:/app/src/bustag/bustag/app
Bustag server starting: version: 0.2.0
CWD: /app
system error
Press Enter to continue ...
Traceback (most recent call last):
File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/app/src/bustag/bustag/app/index.py", line 221, in <module>
input()
EOFError: EOF when reading a line
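For example, I already have 3000 samples. The scheduled refresh is set to fetch another 1000, but fewer than 1000 new videos were published in the last 30 minutes, say only 30. Will it go further back into older history to fetch 1000? Or only the 30 new ones? Or the 30 new ones plus 970 older ones?
0 tags.
I tried the sample database too, and it also shows 0 tags.
It works fine when run on Windows.
The port in the config is 8080, but the page does not open at http://{Synology IP}:8080
As the title says;
that way, different users who log in and tag items can each have their own model based on their own preferences.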
After running docker run --rm -d -e TZ=Asia/Shanghai -e PYTHONUNBUFFERED=1 -v ${PWD}/data:/app/data -p 8000:8000 gxtrobot/bustag-app, an error occurs.
The error message is as follows:
C:\Program Files\Docker\Docker\Resources\bin\docker.exe: Error response from daemon: driver failed programming external connectivity on endpoint heuristic_kilby (22adc3bae8fafcecfc8692577f0dcb92f121bbcc18ff9fee9cb39fe4d4c80d52): Error starting userland proxy: /forwards/expose/port returned unexpected status: 500.
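I don't know why. On the first run, everything that needed downloading was downloaded, and the bustag.exe window showed no errors, but localhost:8000 cannot be reached.
I recall that in the previous version the page stayed at the same position after a refresh.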
ref #17
serving on 0.0.0.0:8000 view at http://127.0.0.1:8000
start download
Job "download (trigger: date[2019-09-16 23:39:29 CST], next run at: 2019-09-16 23:39:29 CST)" raised an exception
Traceback (most recent call last):
File "lib\site-packages\apscheduler\executors\base.py", line 125, in run_job
File "bustag\app\schedule.py", line 18, in download
KeyError: 'download.root_path'
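I hope this can eventually become a plugin that works directly with sites such as BUS and the Library; that would be even better~
Then nothing would need to be set up locally: filtering would happen on the web page, and the page would give recommendations directly~ :)
Thanks to the author, and please keep it up, hahaha~
Which features are extracted for the ladies? I'd really like to know QAQ.
In the README's section on installing from local source, the phrase "并按照 requirements.txt 的 python 包" should probably read "并安装 requirements.txt 的 python 包" (that is, "install" rather than "according to" the Python packages in requirements.txt).
Please add AJAX submission; at the moment the page refreshes every time I click "like".
In non-Docker mode, how do I change the default port 8000?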
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 0: invalid continuation byte
Environment: Windows 10
After clicking "train model" all values are 0, and the command line shows
\bustag\model\classifier.py:37: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
/do-trainin
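Filtering by tag would make it possible to quickly pick out large amounts of data I like or dislike, so tagging can be finished quickly.
The "viewed" mark is needed because many people tag quickly but never actually download what they liked. A "viewed" flag could be added so that items in the liked list that were really downloaded can be marked as viewed, making it easier for users to manage what they have and have not watched.
Hmm, about the data...
I have already tagged 400 items (likes and dislikes combined).
About 40 more remain untagged. Then I clicked start training. The console printed the following, and the model's accuracy and coverage are both 0.
The recommendation page also shows nothing at all.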
F:\BaiduNetdiskDownload\bustag_win_0.1.1\bustag\bustag\model\classifier.py:36: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
F:\BaiduNetdiskDownload\bustag_win_0.1.1\bustag\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples.
F:\BaiduNetdiskDownload\bustag_win_0.1.1\bustag\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 due to no predicted samples.
tp: 0, fp: 0
INFO:bustag:tp: 0, fp: 0
fn: 7, tn: 92
INFO:bustag:fn: 7, tn: 92
precision_score: 0.0
INFO:bustag:precision_score: 0.0
recall_score: 0.0
INFO:bustag:recall_score: 0.0
f1_score: 0.0
INFO:bustag:f1_score: 0.0
new model trained
INFO:bustag:new model trained
/do-training
127.0.0.1 - - [09/Sep/2019:13:49:29 +0800] "GET /do-training HTTP/1.1" 200 4375 "http://127.0.0.1:8000/do-training" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
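The zero scores in the log above follow directly from the confusion-matrix counts: with tp = 0 and fp = 0 the model predicted no positives at all, so precision has a zero denominator and is reported as 0.0. A minimal sketch of that arithmetic in plain Python (the function name is hypothetical and this is not bustag's own code, which the warnings suggest uses scikit-learn):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts.

    A ratio with a zero denominator is reported as 0.0, mirroring the
    fallback value named in the warnings in the log.
    """
    predicted_positive = tp + fp  # items the model labelled "like"
    actual_positive = tp + fn     # items the user really liked
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / actual_positive if actual_positive else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts copied from the log: tp=0, fp=0, fn=7 (tn=92 is not needed here).
scores = precision_recall_f1(tp=0, fp=0, fn=7)
print(scores)  # (0.0, 0.0, 0.0)
```

With only 7 liked items among 99 test samples, a classifier that always answers "dislike" produces exactly these numbers, so tagging more liked items is one plausible way to get non-zero scores.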
Sorry, the requested URL 'http://localhost:8000/tagit' caused an error:
Internal Server Error
Exception:
OperationalError('database is locked')
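"database is locked" is the message SQLite gives when one connection cannot obtain a lock held by another, for example while a background download is writing. Assuming bustag stores its data in SQLite (the message suggests so, but the source does not confirm it), a common mitigation is a longer busy timeout plus short write transactions. The sketch below uses Python's standard sqlite3 module; the table name and file path are hypothetical and are not bustag's real schema.

```python
import os
import sqlite3
import tempfile


def open_db(path, busy_timeout_seconds=30.0):
    """Open a SQLite database that waits for locks instead of failing fast."""
    # `timeout` is a standard sqlite3.connect parameter: how long this
    # connection waits for another connection's lock before raising
    # OperationalError('database is locked').
    return sqlite3.connect(path, timeout=busy_timeout_seconds)


# Hypothetical demo database, used only for illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = open_db(demo_path)
with conn:  # commits on success, so the write lock is released promptly
    conn.execute(
        "CREATE TABLE IF NOT EXISTS item_rate (item_id INTEGER, rate INTEGER)"
    )
    conn.execute("INSERT INTO item_rate VALUES (?, ?)", (1, 1))
row_count = conn.execute("SELECT COUNT(*) FROM item_rate").fetchone()[0]
conn.close()
```

Keeping each write inside a short `with conn:` block matters as much as the timeout, since a long-running transaction blocks every other writer for its whole duration.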
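I tagged more than 500 items, and after training the model, what it recommends is simply the full data set minus the items I had already tagged.
So what is the point of this recommendation...
As the title says.
After all, unless you have actually watched a title, you can only judge whether it suits your taste from the poster, the tags, or the timed screenshots.
Right now the poster is too small to see clearly, and only one tag is displayed.
I eagerly pulled the docker image, only to discover at the end that it works on x86 only....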
standard_init_linux.go:211: exec user process caused "exec format error"
It will not run when installed with docker on a Raspberry Pi; is the arm architecture not supported?
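An "exec format error" usually means the binary inside the image was built for a different CPU architecture than the host. A minimal sketch for checking the host before pulling an image reported above to be x86-only; the helper name and the set of machine strings are assumptions for illustration, not part of bustag:

```python
import platform

# Machine strings commonly reported on x86 hosts (assumed, not exhaustive).
X86_MACHINES = {"x86_64", "amd64", "i386", "i686"}


def is_x86(machine=None):
    """Return True if the given (or current) machine string looks like x86."""
    machine = (machine or platform.machine()).lower()
    return machine in X86_MACHINES


print(platform.machine(), "x86 host:", is_x86())
```

On a Raspberry Pi, `platform.machine()` typically reports an ARM string such as `armv7l` or `aarch64`, for which the helper returns False.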
It was installed in Synology docker and had been running normally. Yesterday, after rebooting the Synology, I entered
docker run --rm -d -v $(pwd)/data:/app/data -p 8000:8000 gxtrobot/bustag-app
and got this error:
docker: Error response from daemon: Bind mount failed: '/root/data' does not exists.
After creating /root/data
there is no error message any more, but docker ps shows nothing running and the browser cannot reach the page. Any help?
6666
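After tagging some data, I found that although the items I like and dislike have gone into the corresponding categories, the remaining untagged items are all ones I find mediocre, and keeping them seems pointless. How can I clear this data?
As the title says, I hope you will consider it!
After all, everyone's taste is different. If you could first add categories by hand and crawl the catalog numbers in those categories, then tag manually, it would save a great deal of manual tagging time.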
$ docker run --rm -v $(pwd)/data:/app/data -p 8000:8000 gxtrobot/bustag-app
/app
dirname:/app/src/bustag/bustag/app
Bustag server starting: version: 0.2.0
CWD: /app
system error
Press Enter to continue ...
Traceback (most recent call last):
File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/app/src/bustag/bustag/app/index.py", line 221, in <module>
input()
EOFError: EOF when reading a line
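Say I have an existing domain foo.com already serving a perfectly normal website, and I want to add a path, foo.com/bustag, that proxies to port 8000 in docker. The current code does not support this; I suggest adding support.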
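One way such sub-path mounting is often handled in WSGI applications (the container log elsewhere on this page shows a Bottle server, which is WSGI-based) is a small middleware that moves the prefix from PATH_INFO to SCRIPT_NAME. The sketch below is an illustration under that assumption, not code from bustag; `demo_app` is a stand-in for the real application and `call` is a hypothetical helper for trying it out.

```python
def mount_under_prefix(app, prefix):
    """Wrap a WSGI app so that it answers only under `prefix`, e.g. '/bustag'."""
    prefix = prefix.rstrip("/")

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == prefix or path.startswith(prefix + "/"):
            # Move the prefix from PATH_INFO to SCRIPT_NAME, which is how
            # WSGI describes an application mounted below the server root.
            environ = dict(environ)
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
            environ["PATH_INFO"] = path[len(prefix):] or "/"
            return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]

    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that echoes the path it was given."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["PATH_INFO"].encode("utf-8")]


def call(app, path):
    """Invoke a WSGI app with a minimal environ and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path, "SCRIPT_NAME": ""}, start_response))
    return captured["status"], body


mounted = mount_under_prefix(demo_app, "/bustag")
print(call(mounted, "/bustag/tagit"))  # ('200 OK', b'/tagit')
```

This only covers routing of incoming requests. Links and redirects generated by the application would also need to honour SCRIPT_NAME, which is probably the part that requires changes in the project itself.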
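I compared what was crawled today with what the source site's front page shows, and a large share of the site's items never appeared.
I don't know why bustag did not pick them up; the update rule is the default, 300 items every 30 minutes.
So I am curious about the update crawling logic.
Also, the actors for some titles are not displayed.
The larger posters in the new version are great. I suggest making them larger still, or adding a click-to-enlarge option.
Resolved.
Otherwise there are too few liked items... and building the model takes too long.
Picking a title is not about tags, unless you have a specific, targeted need. What usually drives a click is whether the cover is attractive enough, whether the lead is young/pretty/sexy, and whether the screenshots match expectations. I think your next project could be machine learning based on cover images.
Later I created a .pth file under .venv and wrote the project root path into it, after which everything ran normally.
I'm not very familiar with Python!
Is there any other solution here?
http://127.0.0.1:8000 won't open? It just keeps printing debug output.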
DEBUG:bustag:save tag_item: 4452
save tag: 1160
DEBUG:bustag:save tag: 1160
save tag_item: 4453
DEBUG:bustag:save tag_item: 4453
save tag_item: 4454
DEBUG:bustag:save tag_item: 4454
save tag_item: 4455
DEBUG:bustag:save tag_item: 4455
save tag_item: 4456
DEBUG:bustag:save tag_item: 4456
save tag_item: 4457
DEBUG:bustag:save tag_item: 4457
save tag_item: 4458
DEBUG:bustag:save tag_item: 4458
Error: 500 Internal Server Error
Sorry, the requested URL 'http://0.0.0.0:8000/other' caused an error:
File "/Users/mvpma/PycharmProjects/bustag/bustag/app/index.py", line 97, in other_settings
_, model_scores = clf.load()
NameError: name 'clf' is not defined
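I got this error. How should I handle it?
If I don't import any database file, does that mean no data is downloaded automatically? After deploying with docker, no recommendations or other data are shown. After importing a database file, there are still no updates or recommendations. I checked the container log, which reads as follows: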
/app,
Bottle v0.12.17 server starting up (using PasteServer())...,
Listening on http://0.0.0.0:8000/,
Hit Ctrl-C to quit.,
,
2019-10-08 03:52:24,828 - bustag - WARNING - classifier.py - recommend ,
no data for recommend ,
2019-10-08 03:52:25,909 - aspider - WARNING - crawling.py - exit_on_empty_queue ,
empty queue, now quit ,
2019-10-08 03:52:25,909 - aspider - WARNING - crawling.py - crawl ,
closing the crawler ,
2019-10-08 03:52:25,909 - aspider - WARNING - crawling.py - work ,
canceling the worker ,
2019-10-08 03:52:25,909 - aspider - WARNING - crawling.py - work ,
canceling the worker ,
2019-10-08 03:52:25,909 - aspider - WARNING - crawling.py - work ,
canceling the worker ,
2019-10-08 03:52:25,909 - aspider - WARNING - crawling.py - work ,
canceling the worker ,
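How do I open bustag on a Mac? By default it is treated as a document file and cannot be opened.
In 0.2.0, training the model reports: not enough values to unpack (expected 4, got 1)
As the title says.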
Exception in thread Thread-1:
Traceback (most recent call last):
File "threading.py", line 926, in _bootstrap_inner
File "threading.py", line 870, in run
File "bustag\app\schedule.py", line 48, in start_scheduler
File "lib\site-packages\apscheduler\schedulers\base.py", line 87, in __init__
File "lib\site-packages\apscheduler\schedulers\base.py", line 126, in configure
File "lib\site-packages\apscheduler\schedulers\asyncio.py", line 48, in _configure
File "lib\site-packages\apscheduler\schedulers\base.py", line 697, in _configure
File "lib\site-packages\tzlocal\win32.py", line 93, in get_localzone
File "lib\site-packages\tzlocal\win32.py", line 84, in get_localzone_name
pytz.exceptions.UnknownTimeZoneError: 'Can not find timezone '
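After importing 2000+ records into 2.0, only 24 are left.
There were no errors. During the import the command line showed 2000+ records; once the import finished it updated automatically and ended with ALL DONE.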
win10 x64 ver 1903 18362.387