Comments (2)
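@toplevmas I've thought it over, and this problem is not that simple. If we treat "Steve Jobs" as a single word, it causes another problem: Steve and Jobs can no longer be separated into two words. That is acceptable in accurate mode, but it is a problem in search-engine mode, unless we also add Steve and Jobs to the dictionary, which is not realistic.
Another approach is to segment the text the existing way first and then try to merge the tokens. What do you think?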
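
To make the "segment first, then merge" idea concrete, here is a minimal sketch in Python. It is not code from jieba or jieba.net: `merge_phrases`, its `phrases` argument, and the `keep_parts` flag are hypothetical names used only for illustration. It also assumes the segmenter's whitespace tokens have already been dropped, so "Steve" and "Jobs" are adjacent in the token list.

```python
def merge_phrases(tokens, phrases, keep_parts=False):
    """Merge runs of adjacent tokens that spell out a known multi-word phrase.

    tokens:     tokens from an existing segmenter (whitespace tokens removed)
    phrases:    space-separated multi-word entries, e.g. ["Steve Jobs"]
    keep_parts: if True, also emit the individual words before the merged
                phrase, which is what a search-engine mode would want
    """
    # Index each phrase by its word tuple for O(1) window lookups.
    phrase_keys = {tuple(p.split()) for p in phrases}
    max_len = max((len(k) for k in phrase_keys), default=0)

    out = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest possible window first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            window = tuple(tokens[i:i + n])
            if window in phrase_keys:
                if keep_parts:
                    out.extend(window)
                out.append(" ".join(window))
                i += n
                matched = True
                break
        if not matched:
            out.append(tokens[i])
            i += 1
    return out


tokens = ["Steve", "Jobs", "创立", "了", "苹果"]
accurate = merge_phrases(tokens, ["Steve Jobs"])
search = merge_phrases(tokens, ["Steve Jobs"], keep_parts=True)
```

With `keep_parts=False` the pair collapses into one token, as accurate mode would want; with `keep_parts=True` the parts are kept alongside the merged phrase, so Steve and Jobs do not need their own dictionary entries.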
from jieba.net.
@toplevmas This problem also exists in the Python version of jieba. I once tried to fix it there but never finished. I'll take a look when I have time.