Comments (8)
Yes, this segmenter is really slow. I'm not sure whether the problem is in how the author wrote it.
from jieba-php.
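@sinojyj @liupan182 I haven't run into this problem. I just ran ten sentences, including the sentence “我在本地调试的时候,20字的句子切词需要30秒以上,请问会是哪些因素影响速度?” ("When I debug locally, segmenting a 20-character sentence takes more than 30 seconds; which factors could be affecting the speed?"), and they finished in about 5 seconds. That said, the current performance is indeed not on par with the Python version. If you see anything that can be improved, please help improve it; that is the whole point of open source~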
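Jieba segmentation is quite well known, and I appreciate the author's effort and hard work. However, many segmenters can process several hundred characters in under a second. With this program, both on first use and on your online demo, the speed is a little hard to accept: the online demo took about seven seconds in my test, and after downloading it my local test also took about seven or eight seconds. That feels a bit slow.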
@liupan182 All we can do is keep improving~
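@sinojyj Caching is one direction. I'm thinking we could cache the results of both the build step and the word-frequency computation, process the custom dictionary separately, and have jieba read the cached results first at init. That would probably be somewhat faster~ I'll give it a try when I have time.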
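The caching idea in the comment above could be sketched roughly as follows. This is only an illustration in Python, not jieba-php code: the function names `build_counts`, `load_counts`, and `merge_user_dict`, the JSON cache file, and the "word count [tag]" line format are all assumptions made for the example. The parsed base dictionary is stored on disk and reused while it is at least as new as the dictionary file, and custom entries are overlaid in memory afterwards so they never touch the cache.

```python
import json
import os


def build_counts(lines):
    """Parse 'word count [tag]' lines into a {word: count} dict.

    Hypothetical stand-in for the costly build step on a large dictionary.
    """
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or not parts[1].isdecimal():
            continue  # skip blank or malformed lines
        counts[parts[0]] = int(parts[1])
    return counts


def load_counts(dict_path, cache_path):
    """Reuse the cache if it is at least as new as the dictionary, else rebuild it."""
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(dict_path)):
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)
    with open(dict_path, encoding="utf-8") as fh:
        counts = build_counts(fh)
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump(counts, fh, ensure_ascii=False)
    return counts


def merge_user_dict(base_counts, user_lines):
    """Overlay custom entries on a copy, so the cached base table is never modified."""
    merged = dict(base_counts)
    merged.update(build_counts(user_lines))
    return merged
```

Keeping the custom dictionary out of the cache means a changed user dictionary never forces the large base table to be rebuilt. Whether this actually helps in jieba-php would depend on where init really spends its time, which this thread does not measure.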
@sinojyj It sounds like you have already put a lot of work into this. If you have any results, please send a pull request!