Comments (3)
@wolf8210137 Keyword extraction is available and uses the TF/IDF algorithm. See feature 3) Keyword Extraction in the Readme.
from jieba-php.
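Could cut get an option that returns the complete segmentation as an array including each word's IDF and part-of-speech tag? The result would look something like this: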
array(21) {
[0]=>
array(3) {
["word"]=>
string(3) "这"
["idf"]=>
float(1.22223333)
["tag"]=>
string(1) "r"
}....
}
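I use Google's simhash algorithm to compare article similarity, which needs the weight of every segmented word in the article. Sentiment analysis additionally needs each word's part of speech.
In other words, it would be enough for cut's return value to carry the IDF and the part-of-speech tag as well. @fukuball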
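To make the request concrete, here is a minimal, library-agnostic sketch in Python of the record shape described above and of how a weighted simhash could consume it. It does not call jieba-php. The function names `annotate`, `simhash`, and `hamming_distance`, the `default_idf` fallback, and every IDF value other than the one shown in the dump are assumptions made up for illustration.

```python
import hashlib


def annotate(tagged_words, idf_table, default_idf):
    """Merge (word, tag) pairs with an IDF lookup into word/idf/tag records.

    Words missing from idf_table fall back to default_idf (an assumption;
    a real dictionary may handle unknown words differently).
    """
    return [
        {"word": word, "idf": idf_table.get(word, default_idf), "tag": tag}
        for word, tag in tagged_words
    ]


def simhash(records, bits=64):
    """Weighted simhash over word records.

    Each occurrence of a word votes on every bit position with its IDF as
    the weight, so repeated words accumulate a TF * IDF style weight.
    bits must be a multiple of 8 and at most 128 (the MD5 digest size).
    """
    votes = [0.0] * bits
    for rec in records:
        digest = hashlib.md5(rec["word"].encode("utf-8")).digest()
        word_hash = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            if (word_hash >> i) & 1:
                votes[i] += rec["idf"]
            else:
                votes[i] -= rec["idf"]
    # A bit is set in the fingerprint when its weighted votes are positive.
    return sum(1 << i for i in range(bits) if votes[i] > 0)


def hamming_distance(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical input: (word, tag) pairs as a part-of-speech tagger might
# return them, plus a made-up IDF table.
tagged = [("这", "r"), ("是", "v"), ("测试", "vn")]
idf_table = {"这": 1.22223333, "是": 1.5, "测试": 6.0}

records = annotate(tagged, idf_table, default_idf=10.0)
fingerprint = simhash(records)
```

Two articles would then be compared by taking `hamming_distance` of their fingerprints, with a smaller distance suggesting more similar text. The `tag` field is not used by simhash; it is carried along for the sentiment-analysis use case.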
I could spend time adding a feature like this. Let's see whether anyone wants to help, or wait until I have time XD