Comments (2)
I don't see a stop-word dictionary in the config directory.
from elasticsearch-analysis-hao.
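You have probably set the autoWordLength parameter; using it is not recommended.
autoWordLength | After the text is split on spaces, punctuation, letters, digits, and similar characters, any run of Chinese characters shorter than autoWordLength is automatically treated as a single word. Default is -1 (disabled); a value >= 2 enables it.
Also, this plugin does not support stop-word configuration or remote stop-word dictionaries.
If you need stop words, use the stop-word feature that Elasticsearch provides natively.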
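To make the description of autoWordLength concrete, here is a minimal Python sketch of the behaviour as worded above. It is only an illustration of that one sentence, not the plugin's actual code: the function name `auto_words` and the use of the basic CJK Unified Ideographs range to detect Chinese characters are assumptions made for the example.

```python
import re

# Assumption: "Chinese text" is approximated by the basic CJK Unified
# Ideographs block. Spaces, punctuation, letters and digits act as separators.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def auto_words(text, auto_word_length=-1):
    """Return the runs of Chinese characters that would be kept as one word.

    Hypothetical helper: mirrors the documented rule that a run shorter than
    auto_word_length is treated as a single word, and that values below 2
    (including the default -1) leave the feature off.
    """
    if auto_word_length < 2:
        return []
    return [run for run in HAN_RUN.findall(text) if len(run) < auto_word_length]
```

With `auto_word_length=3`, a two-character run such as 你好 is kept whole, while a longer run is left to the normal dictionary-based segmentation. Short runs that merely look like stop words become whole tokens this way, which is one plausible reason the parameter is discouraged.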
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-stop-tokenfilter.html
PUT /my-index-000001
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "whitespace",
          "filter": [ "my_custom_stop_words_filter" ]
        }
      },
      "filter": {
        "my_custom_stop_words_filter": {
          "type": "stop",
          "stopwords_path": "path to the stop-word file, one word per line",
          "ignore_case": true
        }
      }
    }
  }
}
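As a rough illustration of what the analyzer above does, the following Python sketch mimics a whitespace tokenizer followed by a stop filter with `ignore_case`. It is a simplified model for reasoning about the settings, not Elasticsearch's implementation; the helper names `load_stopwords` and `analyze` are hypothetical.

```python
def load_stopwords(lines):
    """Parse stop words from an iterable of lines, one word per line.

    Blank lines are skipped and surrounding whitespace is trimmed.
    """
    return [line.strip() for line in lines if line.strip()]


def analyze(text, stopwords, ignore_case=True):
    """Split on whitespace, then drop tokens that match a stop word.

    With ignore_case=True the comparison is case-insensitive, but the
    surviving tokens are returned unchanged.
    """
    if ignore_case:
        stop = {word.lower() for word in stopwords}
        return [tok for tok in text.split() if tok.lower() not in stop]
    stop = set(stopwords)
    return [tok for tok in text.split() if tok not in stop]
```

For example, `analyze("The quick fox 的 洞", load_stopwords(["the", "的"]))` drops both `The` and `的`. In a real setup the whitespace tokenizer would be replaced by one of the plugin's tokenizers, with the stop filter applied afterwards in the same way.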