shuopei / fudannlp

Automatically exported from code.google.com/p/fudannlp

fudannlp's Issues

There seems to be no example for tree, and none in the source code either


1. Is there a parameter for specifying the model path?

Original issue reported on code.google.com by [email protected] on 20 Jan 2011 at 4:19

Where can the POS tagging documentation be downloaded?

What steps will reproduce the problem?
1.
2.
3.

What is the expected output? What do you see instead?


What version of the product are you using? On what operating system?


Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 14 Mar 2012 at 2:41

php jdk

class NplRequest{
    // Base URL of the FudanNLP web service.
    private $fudanUrl = "http://jkx.fudan.edu.cn/fudannlp/";
    private $connecttimeout = 20;
    private $timeout = 10;
    private $ssl_verifypeer = FALSE;

    public $http_code;
    public $http_info = array();
    public $url;

    // Requests {base URL}{key}/{text} and returns the raw response body.
    function npl($key,$str){
        // Percent-encode the text so Chinese characters and spaces survive in the URL path.
        $response = $this->http($this->fudanUrl.$key."/".rawurlencode($str),"GET","");
        return $response;
    }

    function http($url,$method,$param){
        $ci = curl_init();
        curl_setopt($ci, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0);
        curl_setopt($ci, CURLOPT_CONNECTTIMEOUT, $this->connecttimeout);
        curl_setopt($ci, CURLOPT_TIMEOUT, $this->timeout);
        curl_setopt($ci, CURLOPT_RETURNTRANSFER, TRUE);
        curl_setopt($ci, CURLOPT_ENCODING, "");
        curl_setopt($ci, CURLOPT_SSL_VERIFYPEER, $this->ssl_verifypeer);
        // The value 1 is deprecated for this option: use 2 to check the host name, 0 to skip the check.
        curl_setopt($ci, CURLOPT_SSL_VERIFYHOST, $this->ssl_verifypeer ? 2 : 0);
        curl_setopt($ci, CURLOPT_HEADER, FALSE);

        if($method == "POST"){
            curl_setopt($ci, CURLOPT_POST, TRUE);
            // Send the parameters as the POST body.
            curl_setopt($ci, CURLOPT_POSTFIELDS, $param);
        }else if($param != ""){
            // Append a query string only when there is one, to avoid a dangling "?".
            $url = "{$url}?{$param}";
        }
        curl_setopt($ci, CURLOPT_URL, $url );
        curl_setopt($ci, CURLINFO_HEADER_OUT, TRUE );

        $response = curl_exec($ci);
        $this->http_code = curl_getinfo($ci, CURLINFO_HTTP_CODE);
        $this->http_info = array_merge($this->http_info, curl_getinfo($ci));
        $this->url = $url;
        curl_close ($ci);
        return $response;
    }
}

Original issue reported on code.google.com by [email protected] on 13 May 2013 at 9:24
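
The PHP class above builds its request as the base URL, then a key, then "/" and the text. A minimal Java sketch of the same URL construction follows. It assumes the service accepts percent-encoded UTF-8 in the path, which the original post does not confirm. The class name FudanNlpUrl, the method buildUrl, and the key "seg" are hypothetical and are not part of FudanNLP.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FudanNlpUrl {
    // Base URL copied from the PHP class in the issue.
    static final String BASE = "http://jkx.fudan.edu.cn/fudannlp/";

    // Builds BASE + key + "/" + percent-encoded text.
    // The key is passed through unchanged (assumption: it is plain ASCII).
    static String buildUrl(String key, String text) {
        try {
            // URLEncoder targets form data and writes a space as "+";
            // inside a URL path a space has to be "%20" instead.
            String encoded = URLEncoder.encode(text, "UTF-8").replace("+", "%20");
            return BASE + key + "/" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is available on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "seg" is a made-up key used only for illustration.
        System.out.println(buildUrl("seg", "\u4f60\u597d"));
    }
}
```

Encoding the text matters here because the request puts raw Chinese text and possibly spaces into the path, which many HTTP clients and servers will reject or mangle.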

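How can custom POS tags be used?

While testing POSTagger I added a custom POS tag to dict_pos.txt, and running it produced this error:
自定义词性标签只能在下面列表中：... ("Custom POS tags must be in the list below: ...")

How can custom POS tags be used?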

Original issue reported on code.google.com by [email protected] on 5 Jun 2013 at 8:02

No update?

Why has there been no update for such a long time?

Original issue reported on code.google.com by [email protected] on 5 Apr 2012 at 12:33

A small bug

The POS tagging result for "今天好不热闹" is:
今天/时间短语 (time phrase)
好不热闹/标点 (punctuation)

Original issue reported on code.google.com by [email protected] on 18 May 2013 at 9:01

Some bad cases

浙江省了大批投资
浙江省了解这个情况的人不多
从北京经济南下徐州
发展**家服装需求大增
我们提供高档和服务必前来选购
我们提供高档设备和服务。
这台计算机系统盘出了故障
丹东西安全是我喜欢的地方
南京的市长江大桥说南京市长江大桥好长
这事儿的确定不下来

Original issue reported on code.google.com by [email protected] on 12 Jul 2013 at 10:41

Parsing throws an exception; what can be done about running out of memory?

What steps will reproduce the problem?
1. Parsing throws the exception: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
2.at java.lang.reflect.Array.newArray(Native Method)
    at java.lang.reflect.Array.newInstance(Unknown Source)
3.

What is the expected output? What do you see instead?


What version of the product are you using? On what operating system?


Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 21 Jul 2013 at 8:42

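The SetTagType function in the POSTagger class is not implemented in version 1.5 of fudannlp

The SetTagType function in the edu.fudan.nlp.cn.tag.POSTagger class is presumably meant to set the type of POS tags, isn't it? But it is not implemented in either version 1.05 or version 1.5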

Original issue reported on code.google.com by [email protected] on 31 Mar 2013 at 10:28

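Time recognition is sometimes inaccurate; for example, "2009年以前" (before 2009) is recognized as 2004

Time recognition is sometimes inaccurate; for example, "2009年以前" (before 2009) is recognized as 2004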

Original issue reported on code.google.com by [email protected] on 7 Feb 2013 at 5:56

Are there plans for event extraction?

Event extraction...
Sentiment evaluation...

Original issue reported on code.google.com by [email protected] on 1 Jul 2011 at 4:03

FudanNLP-bin-0.95.zip: POS tagging fails

I downloaded the project, imported it into Eclipse, and ran the example code PartsOfSpeechTag.java; the following error occurred

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.lang.reflect.Array.newArray(Native Method)
    at java.lang.reflect.Array.newInstance(Array.java:52)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1630)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1322)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at edu.fudan.ml.struct.classifier.Linear.readModel(Linear.java:116)
    at edu.fudan.ml.struct.classifier.Linear.readModel(Linear.java:111)
    at edu.fudan.nlp.tag.POSTagger.<init>(POSTagger.java:42)
    at PartsOfSpeechTag.main(PartsOfSpeechTag.java:20)

Original issue reported on code.google.com by [email protected] on 2 Jun 2011 at 10:12

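WebSphere returns a 500 error

I saw today that WebSphere is reporting errors. When will it be fixed?
I set up a Java service of my own, but it needs a lot of memory, which is hard to afford. Please restart the server.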

Original issue reported on code.google.com by [email protected] on 27 May 2013 at 8:54

Model file missing from FudanNLP-bin-0.95.zip


./example-data/text-classification/model.gz

This file is missing; only the training text train.txt is there

Original issue reported on code.google.com by [email protected] on 11 Jul 2011 at 2:35

Newbie: cannot find examples for the new version

The examples on the web page are clearly outdated; could you provide new ones?

For example, SEG ---> MYSEG

Original issue reported on code.google.com by [email protected] on 1 Jul 2011 at 5:46

How can POS tagging be made to use a user dictionary as well?

How can POS tagging be made to use a user dictionary as well?

Original issue reported on code.google.com by [email protected] on 30 May 2012 at 10:50

What do the numbers in the parsing output mean?

In version 1.5, what do the numbers contained in the Tree (syntax tree) mean?

Original issue reported on code.google.com by [email protected] on 20 Dec 2012 at 5:22

What do the numbers printed in the dependency parsing example mean?

What do the numbers printed in the dependency parsing example mean?

Original issue reported on code.google.com by [email protected] on 13 May 2011 at 2:56

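How many entries are in the dictionary used for word segmentation?

Is the dictionary used for segmentation one you compiled yourselves or one taken from another source, and roughly how many entries does it have?
Also, can users add their own dictionary with POS tags?

Looking forward to a reply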

Original issue reported on code.google.com by [email protected] on 31 May 2013 at 2:06

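The downloaded source code has errors

The source code has errors after download; I hope you can provide a version without errors. Thanks!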

Original issue reported on code.google.com by [email protected] on 25 Aug 2010 at 3:21

Dependency relations are not given in the example?

Since the heads returned by YamadaParser are meaningless, how is the syntax tree obtained?

Original issue reported on code.google.com by [email protected] on 15 Oct 2011 at 11:08

Version 0.95 has no source code

I hope the author can share the source code.

Also, what license is this project under?

Original issue reported on code.google.com by [email protected] on 30 Jan 2011 at 4:09

POS tagging

At the start of every paragraph there is an empty POS annotation (a POS tag is present, but no word)

Original issue reported on code.google.com by [email protected] on 27 Mar 2013 at 2:07

Segmentation suggestion

"天龙八部委托销售"
is segmented incorrectly

Original issue reported on code.google.com by [email protected] on 29 Dec 2012 at 4:55

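What does hierarchical classification in the machine learning part refer to?

What does hierarchical classification in the machine learning part refer to?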

Original issue reported on code.google.com by [email protected] on 1 Jul 2011 at 7:14

1.05 is missing source files

After importing the project into Eclipse, it reports that two folders, test and lab, are missing

Original issue reported on code.google.com by [email protected] on 6 Nov 2011 at 5:09

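Is there a memory leak in the version 1.5 parsing test?

When running the DepParser class, it reports
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
After raising the memory setting on my machine to 512, the error above still occurs. Is there a memory leak, or is there some other cause?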

Original issue reported on code.google.com by [email protected] on 15 Mar 2013 at 8:11
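
The stack traces in these out-of-memory reports fail inside ObjectInputStream while the model is being deserialized, so it helps to confirm the heap limit the JVM is actually running with. In Eclipse, for instance, -Xmx only takes effect when it is placed in the VM arguments of the run configuration, not in the program arguments. The following is a small sketch for checking this; the class name HeapCheck is hypothetical, and the sketch says nothing about how much memory a given FudanNLP model needs.

```java
final class HeapCheck {
    // Largest heap the JVM will attempt to use, in MiB (controlled by -Xmx).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with e.g. `java -Xmx1024m HeapCheck` and compare the printed value
        // with the setting that was intended.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

If the printed value is far below the intended setting, the option is not reaching the JVM that loads the model.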

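Is it possible to extract all the word information in the dictionary?

Can the words in the dictionary be extracted from the current .m file?

For example, listing every word with its POS, one per line?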

Original issue reported on code.google.com by [email protected] on 10 Jul 2013 at 6:25

Cannot find the file ner.p110722.gz

Has the file ner.p110722.gz been developed yet?
Thanks!

Original issue reported on code.google.com by [email protected] on 18 Nov 2011 at 7:42

Error when calling dependency parsing

Dependency parsing in version 1.5 fails. Calling JointParser parser = new JointParser("models/dep.m"); reports an out-of-memory error (-Xmx2048m is already set).

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.lang.reflect.Array.newArray(Native Method)
    at java.lang.reflect.Array.newInstance(Unknown Source)
    at java.io.ObjectInputStream.readArray(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readArray(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at edu.fudan.nlp.parser.dep.YamadaParser.loadModel(Unknown Source)
    at edu.fudan.nlp.parser.dep.YamadaParser.<init>(Unknown Source)
    at edu.fudan.nlp.parser.dep.JointParser.<init>(Unknown Source)
    at com.netease.wordseg.SegDemo.testFudanNLP(SegDemo.java:48)
    at com.netease.wordseg.SegDemo.main(SegDemo.java:72)

Original issue reported on code.google.com by [email protected] on 22 Nov 2012 at 9:34

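Are custom POS tags supported?

After adding a custom POS tag to dict_pos.txt, the test throws an exception:
edu.fudan.util.exception.LoadModelException: 自定义词性标签只能在下面列表中：... ("Custom POS tags must be in the list below: ...")

Are custom POS tags supported?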

Original issue reported on code.google.com by [email protected] on 3 Jun 2013 at 4:40

Congratulations on the 1.0 release


Please introduce the new features of 1.0

Can custom dictionaries and stop words be supported?

Original issue reported on code.google.com by [email protected] on 1 Aug 2011 at 7:22

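Testing with edu.fudan.example.nlp.ChineseWordSegmentation from the examples in the download package produces the following errors:

Segmentation without a dictionary:
媒体 计算 研究所 成立 了 , 高级 数据 挖掘 ( data mining ) 很 难 。
媒体 计算 研究所 成立 了 , 高级 数据 挖掘 ( data mining ) 很 难 。

Setting a temporary dictionary: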
java.lang.ArrayIndexOutOfBoundsException: 1

    at edu.fudan.nlp.cn.tag.format.FormatCWS.toString(FormatCWS.java:82)
    at edu.fudan.nlp.cn.tag.CWSTagger.tag(CWSTagger.java:146)
    at edu.fudan.example.nlp.ChineseWordSegmentation.main(ChineseWordSegmentation.java:39)

Segmentation with a dictionary:
媒体计算研究所 成立 了 , 高级 数据挖掘 很 难

Segmentation with a non-strict dictionary:
媒体计算研究所 成立 了 , 高级 数据挖掘 很 难
我 送给 力学系 的 同学 一 个 玩具 ( 送给 给力 力学 力学系 都 在 词典 中 )

Processing a file:
java.lang.ArrayIndexOutOfBoundsException: 1
    at edu.fudan.nlp.cn.tag.format.FormatCWS.toString(FormatCWS.java:82)
    at edu.fudan.nlp.cn.tag.CWSTagger.tag(CWSTagger.java:146)
    at edu.fudan.nlp.cn.tag.CWSTagger.tag(CWSTagger.java:1)
    at edu.fudan.nlp.cn.tag.AbstractTagger.tagFile(AbstractTagger.java:124)
    at edu.fudan.nlp.cn.tag.AbstractTagger.tagFile(AbstractTagger.java:109)
    at edu.fudan.example.nlp.ChineseWordSegmentation.main(ChineseWordSegmentation.java:61)
(the identical exception and stack trace are printed 21 more times)


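However, once English preprocessing is enabled, the errors above no longer occur.
That is, after commenting out the statement on line 32: tag.setEnFilter(false);
What is the reason?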

Original issue reported on code.google.com by [email protected] on 3 May 2013 at 3:18

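Is the feature that automatically extracts the main text from a news or blog URL open source?

Is the feature that automatically extracts the main text from the URL of a news article or blog post open source?
Thanks!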

Original issue reported on code.google.com by [email protected] on 24 Oct 2011 at 9:32

Are there plans to integrate with UIMA?

As in the title: http://uima.apache.org/

Original issue reported on code.google.com by [email protected] on 1 Dec 2011 at 8:33

demo cli fails

java -classpath fudannlp.jar;lib/commons-cli-1.2.jar;lib/trove
jar; edu.fudan.nlp.cn.tag.CWSTagger -s models/seg.m 
"复旦自然语言处理是垃圾。"

https://code.google.com/p/fudannlp/wiki/fudannlp_cli

Original issue reported on code.google.com by [email protected] on 9 May 2013 at 7:56

Where can the tag set for syntactic parsing be downloaded?

If you know, please reply.

Feel free to get in touch, QQ: 7570599

Original issue reported on code.google.com by [email protected] on 9 May 2011 at 6:53

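How can custom POS tags be used?

While testing POSTagger I added a custom POS tag to dict_pos.txt, and running it produced this error:
自定义词性标签只能在下面列表中：... ("Custom POS tags must be in the list below: ...")

How can custom POS tags be used?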

Original issue reported on code.google.com by [email protected] on 5 Jun 2013 at 8:05

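Can it process large amounts of text?

I don't know whether it can handle large volumes of data.
I would like to feed files into it
and then use the results for humanities research.

Thanks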

Original issue reported on code.google.com by [email protected] on 19 Mar 2013 at 2:21

hangge works badly

I've never seen such a stupid robot.

Original issue reported on code.google.com by [email protected] on 1 Aug 2011 at 5:14

PHP

Is there any documentation for calling the interface from PHP?

Original issue reported on code.google.com by [email protected] on 25 Jul 2013 at 3:27

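Segmentation errors when a sentence contains Latin letters

Using FudanNLP 1.05, or the online demo (http://jkx.fudan.edu.cn/nlp/fudannlp.do), segment the following sentence:

 VB对动态网页支持不够好

Expected result: at the very least there should be a boundary after the word VB: VB 对 动态 网页 支持 不够 好
Actual result: VB对 动态 网页 支持 不 够 好

If this error comes from the training corpus, then preprocessing or postprocessing should be added that uses rules to split sentences mixing different character sets.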

Original issue reported on code.google.com by [email protected] on 5 Jun 2013 at 1:28
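
The rule-based preprocessing suggested in this report could look roughly like the following sketch, which inserts a space wherever an ASCII letter or digit touches a Han character before the text is handed to a segmenter. The class ScriptBoundary and the method separateScripts are hypothetical names, not part of FudanNLP, and whether the segmenter treats the inserted space as a hard boundary is an assumption that would need checking.

```java
final class ScriptBoundary {
    // Inserts a space at every position where an ASCII letter or digit
    // is directly adjacent to a Han character, in either order.
    static String separateScripts(String text) {
        return text.replaceAll(
            "(?<=[A-Za-z0-9])(?=\\p{IsHan})|(?<=\\p{IsHan})(?=[A-Za-z0-9])",
            " ");
    }

    public static void main(String[] args) {
        // The sentence from the issue: a Latin token followed by Chinese text.
        System.out.println(separateScripts("VB\u5bf9\u52a8\u6001\u7f51\u9875\u652f\u6301\u4e0d\u591f\u597d"));
    }
}
```

The rule only separates scripts; the Chinese part is still left for the statistical segmenter to split.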

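NERTagger throws an exception on text that starts with a space

Version: 1.0

Steps to reproduce:
1. Build a text that starts with a space (half-width or full-width).
2. Create a NERTagger object and load the model.
3. Process the text with this NERTagger's tag method.

Actual result:
The tag function returns an empty hash and throws no exception, but the following stack trace is written to standard error: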
java.lang.ArrayIndexOutOfBoundsException: -1
    at edu.fudan.ml.inf.struct.LinearViterbi.getPath(LinearViterbi.java:100)
    at edu.fudan.ml.inf.struct.AbstractViterbi.getBest(AbstractViterbi.java:21)
    at edu.fudan.ml.classifier.Linear.predict(Linear.java:42)
    at edu.fudan.nlp.tag.NERTagger.tag(NERTagger.java:32)
    at com.github.wks.tdtutils.segment.FudanNER.tag(FudanNER.java:20)
    at nerdiagnose.NerDiagnose.main(NerDiagnose.java:22)

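Expected result:
1. Leading spaces should be ignored.
2. NERTagger may require well-formed sentences or documents as input, but if the input is invalid, then:
   1. If the error can be tolerated, the correct result should be returned and no exception should be shown. If it needs to be recorded, a logging framework (such as slf4j) should be used.
   2. If the error is fatal, the exception should be thrown immediately and the program should not continue.

In short, handling an exception with e.printStackTrace() inside a catch block and then letting the program continue is unreliable.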

Original issue reported on code.google.com by [email protected] on 7 Sep 2011 at 2:51
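
Until the library itself ignores leading spaces, a caller could remove them before calling tag. The sketch below is a caller-side workaround under that assumption. The names LeadingSpace and stripLeadingSpaces are hypothetical and not part of FudanNLP. An explicit loop is used because String.trim() does not remove the full-width space U+3000.

```java
final class LeadingSpace {
    // Removes leading whitespace, including the full-width space U+3000
    // and non-breaking spaces, which String.trim() leaves in place.
    static String stripLeadingSpaces(String text) {
        int i = 0;
        while (i < text.length()
                && (Character.isWhitespace(text.charAt(i))
                    || Character.isSpaceChar(text.charAt(i)))) {
            i++;
        }
        return text.substring(i);
    }

    public static void main(String[] args) {
        // A half-width space followed by a full-width space, then text.
        String cleaned = stripLeadingSpaces(" \u3000\u67cf\u62c9\u56fe\u662f\u4eba");
        System.out.println(cleaned);
    }
}
```

An input consisting only of spaces becomes the empty string, which the caller may want to skip rather than pass to the tagger.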

Person name recognition is inaccurate

What steps will reproduce the problem?
1.人是会死的，柏拉图是人

Person name recognition is not very accurate

Original issue reported on code.google.com by [email protected] on 25 Jan 2013 at 6:53

All tags become NULL after SetTagType("en")

        POSTagger posTagger = new POSTagger("./models/seg.m", "./models/pos.m");
        posTagger.SetTagType("en");
        System.out.println(posTagger.tag("Paper is a thin material mainly used for writing upon, printing upon, drawing or for packaging."));

The POS result is as follows
Paper/null is/null a/null thin/null material/null mainly/null used/null for/null writing/null upon/null ,/null printing/null upon/null ,/null drawing/null or/null for/null packaging/null ./null

Original issue reported on code.google.com by [email protected] on 29 Mar 2013 at 3:01

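Can it be called from Python?

I would like to call fudannlp from Python.
I know that Java can be called from Python,
but I don't know whether the fudannlp package can be imported directly in Python.
Thanks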

Original issue reported on code.google.com by [email protected] on 12 Jun 2013 at 6:44

How are the .gz files in models created?

Can a custom dictionary be added? Thanks

Original issue reported on code.google.com by [email protected] on 19 Mar 2012 at 5:59

Segmentation could be improved

Steps to reproduce
1. Run Chinese word segmentation on “穿上日本和服装嫩”

Expected result
“穿上 日本 和服 装嫩”

Actual result
“穿 上 日本 和 服装嫩”


Version used
webservice: http://jkx.fudan.edu.cn/fudannlp/

Original issue reported on code.google.com by [email protected] on 6 Apr 2011 at 1:44

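Segmentation bug in the version 1.05 segmenter

The version 1.05 segmenter does not handle punctuation and English words particularly well

        CWSTagger tag = new CWSTagger("./models/seg.c7.110918.gz", "./models/dict.txt");
        System.out.println("\n使用词典");
        String str = "今天的#NEXT WAVE#新星是一位“天之骄子”";
        String s = tag.tag(str);
        System.out.println(s);
今天的#NEXT WAVE#新星是一位“天之骄子”
Here #NEXT WAVE# is split into #NEXT/WAVE#

今天的NEXT WAVE新星是一位“天之骄子”
Here NEXT WAVE is segmented as NEXTWAVE

The custom dictionary does not contain these words. Does segmentation need some other special configuration?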

Original issue reported on code.google.com by [email protected] on 30 May 2012 at 6:33
