risent / nlpbamboo
Automatically exported from code.google.com/p/nlpbamboo
License: GNU General Public License v3.0
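1. Version
select version
"EnterpriseDB 8.4.4.10 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.4.4, 32-bit"
2. select to_tsvector('chinesecfg','**人');
Result: '**':1 '人':2
Queries always return a single segmentation, whereas some Java segmenters return two results, "?国人" and "** 人".
How can I get multiple segmentation results? Do I need to update the lexicon, or is there another way?
Thanks!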
Original issue reported on code.google.com by [email protected]
on 18 Oct 2010 at 10:24
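Before keyword extraction, add a configurable named-entity recognition step, so that the keyword extraction algorithm can use the NER results to give entity words extra weight.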
Original issue reported on code.google.com by [email protected]
on 9 Dec 2008 at 2:12
I really need to call Bamboo from Python.
When will a Python interface be available?
Can I help?
Original issue reported on code.google.com by [email protected]
on 13 Jun 2009 at 1:46
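Hi, ifengle
Since we adopted the factory, we have ended up with many more configuration files. Please help settle the naming and basic format of each configuration file.
Thanks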
Original issue reported on code.google.com by [email protected]
on 3 Dec 2008 at 2:04
What steps will reproduce the problem?
1. Run lexicon -b -i user_define.idx -s source.txt to add the custom word '高畫質官方版' to the user_define.idx dictionary
2. In postgres, run select to_tsvector('chinesecfg','高畫質官方版');
3. The result is "'方版':3 '質官':2 '高畫':1"
What version of the product are you using? On what operating system?
nlpbamboo 1.1.1
The config definitely has use_single_combine=1
Original issue reported on code.google.com by [email protected]
on 23 Aug 2010 at 9:00
What steps will reproduce the problem?
1. cd /opt/bamboo/exts/postgres/bamboo && make && make install
2. postgresql restart
3. $ psql
postgres=# \i /opt/mookr/postgresql/share/contrib/bamboo.sql
What is the expected output? What do you see instead?
psql:/opt/mookr/postgresql/share/contrib/bamboo.sql:1: ERROR: could not load library "/opt/mookr/postgresql-8.1.9/lib/bamboo.so": /opt/mookr/postgresql-8.1.9/lib/libbamboo.so: undefined symbol: bamboo_parse
What version of the product are you using? On what operating system?
Debian etch
Please provide any additional information below.
postgresql 8.1.9
Original issue reported on code.google.com by [email protected]
on 17 Nov 2008 at 10:36
Add topic-phrase extraction to keyword extraction.
Original issue reported on code.google.com by [email protected]
on 8 Dec 2008 at 8:02
rt.
Original issue reported on code.google.com by [email protected]
on 8 Dec 2008 at 5:55
What steps will reproduce the problem?
1. installed psycopg2
2. install crf/cmake
3. did a makeall on bamboo
4. make install /opt/bamboo
5. tried "cd /opt/bamboo/exts/postgresql/bamboo
$ make "
FAILED
What is the expected output? What do you see instead?
expected to make the file.
error: "make: *** No targets specified and no makefile found. Stop."
What version of the product are you using? On what operating system?
nlpbamboo 1.1.0/Ubuntu 8.04
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 9 Sep 2009 at 6:47
Do NOT send error messages to stderr; instead, store them in a global bamboo_error and add a bamboo_get_error function.
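One way this request could look is sketched below. Only the names bamboo_error and bamboo_get_error come from the issue; the setter bamboo_set_error, the use of std::string for storage, and the signatures are assumptions made for illustration, not the project's actual API.

```cpp
#include <string>

// Global error slot named in the issue. Storing it as std::string is an
// assumption; the issue does not say how the message is held.
static std::string bamboo_error;

// Hypothetical helper: library code would call this where it currently
// prints to stderr.
void bamboo_set_error(const std::string &message) {
    bamboo_error = message;
}

// Accessor named in the issue. Returns the most recent message, or an
// empty string if no error has been recorded.
extern "C" const char *bamboo_get_error() {
    return bamboo_error.c_str();
}
```

A caller would then check the return code of a library function and fetch the text with bamboo_get_error() rather than scraping stderr. A single global like this is not thread-safe; a per-handle or thread-local slot would be the usual refinement.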
Original issue reported on code.google.com by [email protected]
on 12 Nov 2008 at 6:57
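The Bamboo home page needs updating, for example:
1. Could the changelog be renamed "What's New", showing the most recent part of the change log there?
2. Could the description of Bamboo's features be richer, covering every feature with a brief outline of the approach behind it?
Alternatively, let's discuss next week how the home page should be designed; after all, it is Bamboo's public face and quite important.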
Original issue reported on code.google.com by [email protected]
on 7 Dec 2008 at 4:24
regress_seg.o(.text+0x171): In function `main':
/home/bingzhen/nlpbamboo/source/regress_test/regress_seg.cc:14: undefined reference to `bamboo::ParserFactory::get_instance()'
regress_seg.o(.text+0x18f):/home/bingzhen/nlpbamboo/source/regress_test/regress_seg.cc:15: undefined reference to `bamboo::ParserFactory::create(char const*, char const*, bool)'
../../build/opt/bamboo/lib/libbamboo.a(bamboo.cxx.o)(.text+0x204): In function `bamboo_init':
: undefined reference to `bamboo::ParserFactory::get_instance()'
../../build/opt/bamboo/lib/libbamboo.a(bamboo.cxx.o)(.text+0x21f): In function `bamboo_init':
: undefined reference to `bamboo::ParserFactory::create(char const*, char const*, bool)'
Original issue reported on code.google.com by [email protected]
on 5 Dec 2008 at 1:25
Whether to concatenate words connected by a hyphen should be configurable.
Original issue reported on code.google.com by [email protected]
on 28 Nov 2008 at 6:31
Remove the size_t definitions in Trie and use int or long long int instead.
Use 4-byte alignment for structs.
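A minimal sketch of what fixed-width fields with 4-byte packing look like, assuming the goal is a record whose layout does not change between 32-bit and 64-bit builds (the issue does not state its motivation). The struct name TrieNode and its fields are hypothetical and are not taken from the Bamboo sources.

```cpp
#include <cstdint>

// Pack members on 4-byte boundaries so the layout is the same on
// 32-bit and 64-bit targets (supported by GCC, Clang and MSVC).
#pragma pack(push, 4)
struct TrieNode {            // hypothetical record, for illustration only
    std::int32_t check;      // fixed 32-bit field instead of size_t
    std::int64_t value;      // fixed 64-bit field instead of size_t
};
#pragma pack(pop)

// 4 + 8 bytes with no padding, because packing caps alignment at 4.
static_assert(sizeof(TrieNode) == 12, "TrieNode layout must be 12 bytes");
```

Without the pragma, a typical 64-bit compiler would pad this struct to 16 bytes, so the size check is what makes the packing visible.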
Original issue reported on code.google.com by [email protected]
on 4 Jan 2009 at 5:10
What steps will reproduce the problem?
1. I ran auto_build to build the CRF model, but it has taken over 72 hours and has not finished yet. For the CRF2 model it took just 6 hours.
2.
3.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 3 Nov 2008 at 1:55
r516,
make install reports:
CMake Error at source/tools/cmake_install.cmake:106 (FILE):
file INSTALL cannot find
"/home/chaik/dev/nlp/nlpbamboo-read-only/source/tools/build_settings".
Call Stack (most recent call first):
source/cmake_install.cmake:69 (INCLUDE)
cmake_install.cmake:38 (INCLUDE)
Original issue reported on code.google.com by [email protected]
on 18 Feb 2010 at 4:29
Add a Java interface so that Bamboo can be called from Java.
Original issue reported on code.google.com by [email protected]
on 8 Dec 2008 at 6:36
local branch: hypen
Original issue reported on code.google.com by [email protected]
on 24 Nov 2008 at 2:03
The training directory contains only CRF .tmpl templates, so I suggest renaming it to template.
Original issue reported on code.google.com by [email protected]
on 4 Dec 2008 at 1:15
I need an RPM, but the ones here are all i386. Could you upload the src.rpm?
Original issue reported on code.google.com by [email protected]
on 23 Nov 2009 at 6:55
When configuring user_combine, allow choosing between maxforward_combine and single_combine, to suit different application scenarios.
Original issue reported on code.google.com by [email protected]
on 8 Dec 2008 at 6:34
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
Please use labels and text to provide additional information.
Original issue reported on code.google.com by [email protected]
on 9 Dec 2008 at 6:15
/home/jianingy/devel/bamboo/source/ycake/ycake.cxx: In member function ‘int bamboo::ycake::KeywordExtractor::get_keyword(const char*, const char*, std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&)’:
/home/jianingy/devel/bamboo/source/ycake/ycake.cxx:109: error: ‘partial_sort’ is not a member of ‘std’
make[2]: *** [source/ycake/CMakeFiles/ycake_shared.dir/ycake.cxx.o] Error 1
make[1]: *** [source/ycake/CMakeFiles/ycake_shared.dir/all] Error 2
make: *** [all] Error 2
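An error of this form usually means the translation unit relies on some other header to pull in std::partial_sort instead of including <algorithm> itself. The sketch below shows the include together with a typical top-N use; the helper top_keywords and its (word, score) representation are hypothetical and are not the code at ycake.cxx:109.

```cpp
#include <algorithm>  // declares std::partial_sort and std::min
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, double> ScoredWord;  // hypothetical type

// Order by descending score.
static bool higher_score(const ScoredWord &a, const ScoredWord &b) {
    return a.second > b.second;
}

// Hypothetical helper: return the top_n highest-scoring words, best first.
std::vector<ScoredWord> top_keywords(std::vector<ScoredWord> scored,
                                     std::size_t top_n) {
    top_n = std::min(top_n, scored.size());
    // Only the first top_n positions are sorted; the rest stay unordered.
    std::partial_sort(scored.begin(),
                      scored.begin() + static_cast<std::ptrdiff_t>(top_n),
                      scored.end(), higher_score);
    scored.resize(top_n);
    return scored;
}
```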
Original issue reported on code.google.com by [email protected]
on 24 Nov 2008 at 2:16
Add a Parser::setopt function:
Parser::setopt(enum BAMBOO_OPTION, void *value);
enum BAMBOO_OPTION {
BAMBOO_TEXT = 0,
BAMBOO_TITLE
};
The C interface is called as bamboo_setopt(void *handle, enum BAMBOO_OPTION, void *value).
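A rough sketch of how the proposed setopt pair could fit together. The enum and the two signatures follow the issue; everything else is an assumption for illustration: the stand-in Parser class (the real one lives in namespace bamboo), treating value as a NUL-terminated string, and the text()/title() accessors.

```cpp
#include <string>

enum BAMBOO_OPTION {
    BAMBOO_TEXT = 0,
    BAMBOO_TITLE
};

// Hypothetical stand-in for the real parser class.
class Parser {
public:
    // Assumption: for both options, value points to a NUL-terminated string.
    void setopt(enum BAMBOO_OPTION option, void *value) {
        const char *s = static_cast<const char *>(value);
        switch (option) {
        case BAMBOO_TEXT:  text_ = s;  break;
        case BAMBOO_TITLE: title_ = s; break;
        }
    }
    const std::string &text() const { return text_; }
    const std::string &title() const { return title_; }

private:
    std::string text_;
    std::string title_;
};

// C-callable wrapper: the opaque handle is assumed to be a Parser pointer.
extern "C" void bamboo_setopt(void *handle, enum BAMBOO_OPTION option,
                              void *value) {
    static_cast<Parser *>(handle)->setopt(option, value);
}
```

Passing options through void * keeps the C signature stable as options are added, at the cost of type checking; each option has to document what value must point to.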
Original issue reported on code.google.com by [email protected]
on 3 Dec 2008 at 1:58
rt.
Original issue reported on code.google.com by [email protected]
on 3 Dec 2008 at 9:35
Using a model I trained myself:
./bin/bamboo -p crf_pos
ERROR: max_token_length must greater than 0
What is the problem, and how do I solve it? I am on a 64-bit operating system.
Original issue reported on code.google.com by [email protected]
on 31 Oct 2009 at 8:40
line 148: s is on the stack and is not persistent.
Original issue reported on code.google.com by [email protected]
on 1 Nov 2008 at 3:06
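When adding a "frequency word" record to the dictionary text file, I forgot the frequency. bin/lexicon -s -i then went into an infinite loop while building the index, without any warning.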
Original issue reported on code.google.com by [email protected]
on 21 May 2009 at 3:27
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
8.3
Please provide any additional information below.
Windows Version build by VS2005/VS2008 etc.
Original issue reported on code.google.com by [email protected]
on 3 Nov 2008 at 1:52
The PostgreSQL extension segfaults when there is no dictionary file.
Original issue reported on code.google.com by [email protected]
on 9 Dec 2008 at 6:17
Many error messages lack the detail needed to find where things went wrong.
We need to fix this.
Original issue reported on code.google.com by [email protected]
on 1 Dec 2008 at 4:52
In file included from /home/jianingy/devel/bamboo/source/ycake/text_parser.hxx:4,
from /home/jianingy/devel/bamboo/source/ycake/text_parser.cxx:1:
/home/jianingy/devel/bamboo/source/ycake/ycake_doc.hxx: In destructor ‘bamboo::ycake::YCToken::~YCToken()’:
/home/jianingy/devel/bamboo/source/ycake/ycake_doc.hxx:26: error: ‘free’ was not declared in this scope
/home/jianingy/devel/bamboo/source/ycake/ycake_doc.hxx: In member function ‘void bamboo::ycake::YCToken::set_token(const char*)’:
/home/jianingy/devel/bamboo/source/ycake/ycake_doc.hxx:45: error: ‘free’ was not declared in this scope
make[2]: *** [source/ycake/CMakeFiles/ycake.dir/text_parser.cxx.o] Error 1
make[1]: *** [source/ycake/CMakeFiles/ycake.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
Original issue reported on code.google.com by [email protected]
on 24 Nov 2008 at 2:16
The mfm parser does not work.
Original issue reported on code.google.com by [email protected]
on 20 Feb 2009 at 3:23
rt.
Original issue reported on code.google.com by [email protected]
on 8 Dec 2008 at 5:57
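Please take a look at this layout and reply. Jianing will then tally the responses, settle on a final version, and update the cmake install and deployment process.
bin/ : executables and training scripts
etc/ : configuration files
data/ : training corpus data (.txt)
template : training templates (.tmpl)
index/: dictionary and model files
(since .idx and .model have different extensions, I suggest handling them separately and creating a model directory as well)
lib/: library files
include/: header files
exts/: the various extensions
processer/: the processer .so files; I suggest merging them into lib
build/ : I don't quite understand what is in here?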
Original issue reported on code.google.com by [email protected]
on 4 Dec 2008 at 1:45
In source/bamboo.cxx of v1.1.1:
snprintf is not in scope.
Original issue reported on code.google.com by [email protected]
on 18 Jun 2009 at 7:25
1. Define separate classes for Tokenization, NER and KeyWord
2. Each class has its own configuration file
3. A configuration file can include others
Original issue reported on code.google.com by [email protected]
on 24 Nov 2008 at 11:58
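First of all, this is a great project; thank you.
One of my websites uses a postgres database, version 8.3.5.
To support Chinese full-text search I compiled this project, version 1.1.1.
The crf++ version is 0.53.
The site's load averages a little over 10,000 unique IPs per day.
After roughly 2 days, memory is exhausted. Before I added this support, memory usage stayed under 2 GB for many days on end.
I noticed one detail: when a postgres backend process handles a full-text search, its memory usage becomes very large. RES peaks at over 500 MB and averages over 300 MB.
Without this plugin it was only a few tens of MB.
I hope this can be resolved soon. Many thanks.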
Original issue reported on code.google.com by [email protected]
on 26 Jan 2010 at 9:30
/source/mmap/mmap.cxx:35:21: warning: no newline at end of file
/source/utf8/utf8.cxx:95:21: warning: no newline at end of file
/source/trie/double_array.cxx:206:21: warning: no newline at end of file
/source/trie/kvtrie.cxx:35:21: warning: no newline at end of file
/source/trie/datrie.cxx:163:21: warning: no newline at end of file
/source/lexicon/lexicon.cxx:37:21: warning: no newline at end of file
/source/processor/processor.cxx:35:21: warning: no newline at end of file
/source/trie/kvtrie_interface.cxx:54:21: warning: no newline at end of file
/source/config/config_factory.cxx:35:21: warning: no newline at end of file
/source/processor/break_processor.cxx:106:21: warning: no newline at end of file
/source/processor/crf_processor.cxx:128:21: warning: no newline at end of file
/source/processor/maxforward_processor.cxx:79:21: warning: no newline at end of file
/source/processor/maxforward_combine_processor.cxx:110:21: warning: no newline at end of file
/source/processor/single_combine_processor.cxx:156:21: warning: no newline at end of file
/source/processor/unigram_processor.cxx:122:21: warning: no newline at end of file
Original issue reported on code.google.com by [email protected]
on 13 Oct 2008 at 7:30
[deleted issue]
rt.
Original issue reported on code.google.com by [email protected]
on 24 Nov 2008 at 2:01
Apply maxforward-combine to all words, not just single-character words.
Original issue reported on code.google.com by [email protected]
on 8 Dec 2008 at 5:56
$ /opt/bamboo/bin/auto_build -t seg -p 2 -d /media/nlp/data/raw
/opt/bamboo/bin/auto_build: line 136: =/opt/bamboo/bin: No such file or directory
/opt/bamboo/bin/auto_build: line 141: syntax error near unexpected token `fi'
/opt/bamboo/bin/auto_build: line 141: `fi'
This was broken by accident in r516. The patch is as follows:
Index: auto_build
===================================================================
--- auto_build (revision 516)
+++ auto_build (working copy)
@@ -133,9 +133,10 @@
thread_num=1
fi
-$top=$(dirname $(readlink -f $0))
+top=$(dirname $(readlink -f $0))
if [ -r "$top/etc/bamboo/build_settings" ]
source $top/etc/build_settings
+fi
if [ -r "/etc/bamboo/build_settings" ]
source /etc/bamboo/build_settings
fi
Original issue reported on code.google.com by [email protected]
on 18 Feb 2010 at 4:45
When two punctuation marks appear together, as in (人名日报),记者
the ")" and "," are treated as a single token "),".
Original issue reported on code.google.com by [email protected]
on 28 Nov 2008 at 5:58
Change __FUNCTION__ to __func__. As info gcc says:
`__FUNCTION__' is another name for `__func__'. Older versions of GCC
recognize only this name. However, it is not standardized. For
maximum portability, we recommend you use `__func__', but provide a
fallback definition with the preprocessor:
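The quoted passage stops at the colon. A C++ adaptation of such a fallback might look like the sketch below; it is not the manual's exact text, the C++11 check is an assumption about which builds need the fallback, and current_function_name is a hypothetical function added only to show __func__ in use.

```cpp
#include <string>

// C++11 and later define __func__ themselves, so only older dialects
// need a fallback (assumption: __cplusplus is reported accurately).
#if !defined(__cplusplus) || __cplusplus < 201103L
#  if defined(__GNUC__) && __GNUC__ >= 2
#    define __func__ __FUNCTION__   // GCC's older spelling
#  else
#    define __func__ "<unknown>"    // no compiler support available
#  endif
#endif

// Hypothetical function: returns its own unqualified name via __func__.
std::string current_function_name() {
    return __func__;
}
```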
Original issue reported on code.google.com by [email protected]
on 12 Nov 2008 at 6:56
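The pre-trained models shipped with this project include only the crf_seg model file; the model files for crf_pos, crf_ner_nr, crf_ner_ns, crf_ner_nt and keyword are missing.
If anyone has already trained the other models, please share them.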
Original issue reported on code.google.com by [email protected]
on 17 Dec 2009 at 3:16
[jianingy(0)@nby ~/devel/build/bamboo]$ bin/bamboo -p keyword news
ERROR: cannot mmap file
bingzhen, could you take a look and see whether a more specific error message can be given?
Original issue reported on code.google.com by [email protected]
on 3 Dec 2008 at 8:47
1. PHP exts
2. Perl exts
3. PG exts
Original issue reported on code.google.com by [email protected]
on 24 Nov 2008 at 2:02
Would it be possible to implement the Lucene tokenizer interface?
Original issue reported on code.google.com by [email protected]
on 3 May 2009 at 5:48
When prepare produces an overly long word, crf++ core dumps.
Original issue reported on code.google.com by [email protected]
on 20 Feb 2009 at 3:23