Comments (7)
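A solution has been found:
Use swoole to set up an HTTP server. Because the process stays resident in memory, the dictionary is loaded once, when the server starts. Queries then just call the HTTP endpoint, and it is very fast. This is the ultimate solution for PHP.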
from jieba-php.
@GlaryJoker Thanks. I'll leave this issue open so that others can refer to it as one possible solution.
from jieba-php.
Here is some sample code:
`require_once dirname(__DIR__).'/vendor/autoload.php';
use Fukuball\Jieba\Jieba;
use Fukuball\Jieba\JiebaAnalyse;
use Fukuball\Jieba\Finalseg;
// Load the dictionaries once at startup; they stay in memory for the life of the server process.
//Jieba::init(array('mode'=> 'Default','dict' => 'big'));
Jieba::init(array('mode'=>'Search Engine','dict'=>'small'));
Finalseg::init();
JiebaAnalyse::init();
$dictPath = dirname(__DIR__). '/dict/text_dict.txt';
$stopDictPath = dirname(__DIR__).'/dict/chinese_sw.txt';
Jieba::loadUserDict($dictPath);
JiebaAnalyse::setStopWords($stopDictPath);
$topLimit = 20;
$http = new swoole_http_server("127.0.0.1",9501);
$http->on("request",function($request,$response) use ($topLimit){
    $response->header("Content-Type", "application/json; charset=utf-8");
    // 'none' marks a missing field.
    $title = $request->post['title'] ?? 'none';
    $content = $request->post['content'] ?? 'none';
    $token = $request->post['token'] ?? 'none';
    $content = urldecode($content);
    $title = urldecode($title);
    // Reject requests without a token, and return so nothing is written after end().
    if($token === 'none'){
        $response->end(json_encode([]));
        return;
    }
    // Only extract tags from text long enough to be worth analysing.
    if($title !== 'none' && mb_strlen($title) > 10){
        $titleTags = implode(',',array_keys(JiebaAnalyse::extractTags($title, $topLimit)));
    }
    if($content !== 'none' && mb_strlen($content) > 15){
        $contentTags = implode(',',array_keys(JiebaAnalyse::extractTags($content,$topLimit)));
    }
    // end() sends the body and finishes the response.
    $response->end(json_encode([
        'title' => $titleTags ?? 'none',
        'content' => $contentTags ?? 'none',
        'ini_memory' => ini_get('memory_limit'),
        'usage' => memory_get_usage()/1024/1024
    ],JSON_UNESCAPED_UNICODE));
});
$http->start();`
nginx configuration:
`server {
    listen 9583;
    server_name www.example.com;
    large_client_header_buffers 4 128k;
    location / {
        proxy_http_version 1.1;
        proxy_set_header Connection "keep-alive";
        proxy_set_header X-Real-IP $remote_addr;
        # Kept as posted; the colon inside the header name looks like a typo.
        proxy_set_header Transfer-Encoding: "gzip";
        proxy_pass http://127.0.0.1:9501;
    }
}
`
The PHP script runs as a daemon, and it is very fast.
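For completeness, a caller might query this service roughly as follows. This is only a minimal sketch: the function names `buildTagRequestBody` and `requestTags` are hypothetical, the URL assumes the swoole server is reached directly on 127.0.0.1:9501, and the token value is a placeholder (the handler above only checks that a token is present).

```php
<?php
declare(strict_types=1);

/**
 * Build the form-encoded POST body with the three fields the handler
 * reads from $request->post: title, content and token.
 */
function buildTagRequestBody(string $title, string $content, string $token): string
{
    return http_build_query([
        'title'   => $title,
        'content' => $content,
        'token'   => $token,
    ]);
}

/**
 * POST the text to the tag service and return the decoded JSON reply,
 * or null if the request fails or the reply is not a JSON array/object.
 */
function requestTags(string $url, string $title, string $content, string $token): ?array
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => buildTagRequestBody($title, $content, $token),
            'timeout' => 5, // seconds
        ],
    ]);

    $raw = @file_get_contents($url, false, $context);
    if ($raw === false) {
        return null;
    }

    $decoded = json_decode($raw, true);
    return is_array($decoded) ? $decoded : null;
}

// Example call (placeholder token, assumed address):
// $tags = requestTags('http://127.0.0.1:9501/', $title, $content, 'my-token');
```

Because the dictionary is already loaded in the server process, the caller pays only for the HTTP round trip and the extraction itself.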
from jieba-php.
How can a user's own dictionary be hot-reloaded?
from jieba-php.
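> How can a user's own dictionary be hot-reloaded?
You can put the dictionary in Redis. That requires changing the source code, so you may want to fork your own copy.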
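As a lighter alternative to the Redis approach, and not something proposed in this thread, a resident process could poll the dictionary file's modification time and call `Jieba::loadUserDict` again when it changes. The sketch below is an assumption-laden illustration: `reloadIfChanged` is a hypothetical helper, and as far as I know reloading only adds entries, so words deleted from the file would remain until the process restarts.

```php
<?php
declare(strict_types=1);

/**
 * Call $loader($path) if the file at $path is newer than $lastMtime.
 * Updates $lastMtime and returns true when a reload happened.
 */
function reloadIfChanged(string $path, int &$lastMtime, callable $loader): bool
{
    // PHP caches stat results, so clear them before reading the mtime.
    clearstatcache(true, $path);
    $mtime = @filemtime($path);

    if ($mtime === false || $mtime <= $lastMtime) {
        return false; // missing file or no change
    }

    $loader($path);
    $lastMtime = $mtime;
    return true;
}

// Possible use inside each swoole worker (assumed wiring, not tested here):
//
// $last = filemtime($dictPath);
// Swoole\Timer::tick(5000, function () use (&$last, $dictPath) {
//     reloadIfChanged($dictPath, $last, ['Fukuball\Jieba\Jieba', 'loadUserDict']);
// });
```

Each worker process holds its own copy of the dictionary, so the check has to run in every worker rather than once in the master process.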
from jieba-php.
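> A solution has been found:
> Use swoole to set up an HTTP server. Because the process stays resident in memory, the dictionary is loaded once, when the server starts. Queries then just call the HTTP endpoint, and it is very fast. This is the ultimate solution for PHP.
I also built a simple service a while ago that uses swoole to keep the dictionary resident in memory.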
https://github.com/wyq2214368/laravel-jieba
from jieba-php.
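@wyq2214368 A question: why does the controller need a constructor for the dictionary to stay resident in memory? I tried it, and when the controller has no constructor, nothing stays resident.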
from jieba-php.