
nlp-lang's Issues

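Why not publish to the Maven Central repository?

You already have the nlpcn domain, so why not publish the jar to the Maven Central repository? Then it could be downloaded from Maven repositories everywhere; otherwise users have to add your extra repository to their pom.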

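Problem annotating the pinyin of “日往月来”

Hello,

The following code

Pinyin.pinyinWithoutTone("日往月来")

returns two strings, "ri" and "wang yue lai", rather than the expected four strings: ri wang yue lai.
Your demo site gives the same result when annotating this word:
[ri4, wang yue lai2]

Am I misunderstanding your API, or is this a bug?

Many thanks to everyone for your work on open-source NLP!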

object of the same name as the administrative directory

svn: Failed to add directory 'E:\Search\eclipse-luna\workspace\nlp-lang\src\main\java\org\nlpcn\commons\lang\finger\util.svn': object of the same name as the administrative directory. This concerns the .svn files under src\main\java\org\nlpcn\commons\lang\finger\util\.

How do I load a custom dictionary?

Loading a custom dictionary in ansj

I tried modifying library.properties and added custom dictionaries:

#path of userLibrary this is default library
dic=library/default.dic

# user defined dictionary

dic_name=library/name.dic
dic_company=library/company.dic
dic_term=library/term.dic

#redress dic file path
ambiguityLibrary=library/ambiguity.dic
stop_dic1=library/stop.dic
synonymsLibrary=library/synonyms.dic
#set real name
isRealName=true

#isNameRecognition default true
isNameRecognition=true

#isNumRecognition default true
isNumRecognition=true

#digital quantifier merge default true
isQuantifierRecognition=true

When the test program starts, it also finds the corresponding dictionary files:

六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init dic_term to env value is : library/term.dic
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init stop_dic1 to env value is : library/stop.dic
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init dic_name to env value is : library/name.dic
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init ambiguityLibrary to env value is : library/ambiguity.dic
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init isQuantifierRecognition to env value is : true
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init dic_company to env value is : library/company.dic
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init isRealName to env value is : true
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init synonymsLibrary to env value is : library/synonyms.dic
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init isNumRecognition to env value is : true
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init isNameRecognition to env value is : true
六月 17, 2017 2:49:30 下午 org.ansj.util.MyStaticValue info
信息: init dic to env value is : library/default.dic
六月 17, 2017 2:49:30 下午 org.ansj.dic.impl.File2Stream info
信息: path to stream library/ambiguity.dic
六月 17, 2017 2:49:30 下午 org.ansj.library.AmbiguityLibrary info
信息: load dic use time:1 path is : library/ambiguity.dic
六月 17, 2017 2:49:30 下午 org.ansj.dic.impl.File2Stream info
信息: path to stream library/default.dic
六月 17, 2017 2:49:31 下午 org.ansj.library.DicLibrary info
信息: load dic use time:1249 path is : library/default.dic
六月 17, 2017 2:49:32 下午 org.ansj.library.DATDictionary info
信息: init core library ok use time : 736
六月 17, 2017 2:49:32 下午 org.ansj.library.NgramLibrary info
信息: init ngram ok use time :510

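Segmentation uses DicAnalysis:
Result terms = DicAnalysis.parse(sent1);

But the segmentation result never changes. I really cannot find the cause and would appreciate any help.

Environment:
OS: macOS 10.12.5
JDK: 1.8.0_65
ansj-seg: 5.1.2
nlp-lang: 1.7.2

BTW, adding new words with DicLibrary.insert works fine.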

SmartForest.contains(char c) method

public boolean contains(char c) {
	if (this.branches == null) {
		return false;
	}
	// Should this be changed to: return Arrays.binarySearch(this.branches, new SmartForest<T>(c)) > -1;
	return Arrays.binarySearch(this.branches, c) > -1;
}
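
The question above turns on what type of key is handed to Arrays.binarySearch. Below is a minimal sketch, assuming a hypothetical Node class that stands in for SmartForest<T> and is ordered only by its character (the real class may compare differently). The probe has the same type as the array elements, and the result is tested with >= 0, which is equivalent to > -1. With this stand-in, passing a boxed char as the key instead would fail inside compareTo with a ClassCastException.

```java
import java.util.Arrays;

public class BinarySearchProbeDemo {
    // Hypothetical stand-in for a trie node; ordered by its character only.
    static final class Node implements Comparable<Node> {
        final char c;

        Node(char c) {
            this.c = c;
        }

        @Override
        public int compareTo(Node other) {
            return Character.compare(this.c, other.c);
        }
    }

    // Helper for the demo: builds a branch array from already-sorted characters.
    static Node[] nodes(String sortedChars) {
        Node[] result = new Node[sortedChars.length()];
        for (int i = 0; i < result.length; i++) {
            result[i] = new Node(sortedChars.charAt(i));
        }
        return result;
    }

    // Searches with a probe of the element type, so compareTo sees a Node.
    static boolean contains(Node[] branches, char c) {
        if (branches == null) {
            return false;
        }
        return Arrays.binarySearch(branches, new Node(c)) >= 0;
    }

    public static void main(String[] args) {
        Node[] branches = nodes("ace");
        System.out.println(contains(branches, 'c')); // prints true
        System.out.println(contains(branches, 'd')); // prints false
        System.out.println(contains(null, 'a'));     // prints false
    }
}
```

The array must already be sorted by the same ordering that compareTo defines; otherwise the result of binarySearch is undefined.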

Problem reading the Simplified/Traditional dictionaries (UTF-8)

Everything works when run from the IDE, but after packaging and release, reading the Simplified/Traditional dictionaries fails.
The cause turned out to be the client's default encoding; the dictionaries need to be read as UTF-8.
org/nlpcn/commons/lang/dic/DicManager.java
Starting at line 92
Before:
private static Forest init(String dicName, InputStream is) {
	return init(dicName, new BufferedReader(new InputStreamReader(is)));
}
After:
private static Forest init(String dicName, InputStream is) {
	BufferedReader reader = null;
	try {
		// Read the dictionary as UTF-8 instead of the platform default encoding
		reader = IOUtil.getReader(is, IOUtil.UTF8);
		return init(dicName, reader);
	} catch (UnsupportedEncodingException e) {
		e.printStackTrace();
	} finally {
		if (reader != null) {
			try {
				reader.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
	}
	return null;
}
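
The same idea can be sketched with only the JDK. This is an illustration, not the project's code: the class and method names are hypothetical, and it assumes Java 7 or later for StandardCharsets and try-with-resources. The charset is named explicitly, so decoding does not depend on the platform default.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Utf8ReaderDemo {
    // Reads all lines from the stream, always decoding as UTF-8.
    static List<String> readLines(InputStream is) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(is, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is a two-character Chinese string, encoded here as UTF-8 bytes.
        byte[] utf8 = "\u4e2d\u6587\n".getBytes(StandardCharsets.UTF_8);
        List<String> lines = readLines(new ByteArrayInputStream(utf8));
        System.out.println(lines.get(0).equals("\u4e2d\u6587")); // prints true
    }
}
```

Passing StandardCharsets.UTF_8 rather than a charset name also removes the need to handle UnsupportedEncodingException.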

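The maven-compiler-plugin configuration in pom.xml needs to be changed

source and target are currently both set to 1.6, but the code uses try-with-resources syntax (for example in DoubleArrayTire.java), which requires Java 7 and makes commands such as mvn install fail.
Changing source and target to 7 or 8, or leaving them unspecified, lets the project compile locally.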

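WordAlert does not handle the symbol .

As the title says.
I am not sure what this character is; a full-width period?
In any case, I patched the source on my side to add it.
Add this after line 69: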
CHARCOVER['.'] = '.';
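
For comparison, here is a minimal sketch of a range-based conversion that covers the full-width period without a per-character table entry. It is not WordAlert's implementation and the class and method names are hypothetical. It relies on the Unicode layout in which the full-width forms U+FF01 to U+FF5E sit exactly 0xFEE0 above their ASCII counterparts, so U+FF0E (full-width period) maps to U+002E.

```java
public class FullWidthDemo {
    // Maps full-width ASCII variants (U+FF01..U+FF5E) and the
    // ideographic space (U+3000) to their half-width equivalents.
    static char toHalfWidth(char c) {
        if (c >= 0xFF01 && c <= 0xFF5E) {
            return (char) (c - 0xFEE0);
        }
        if (c == 0x3000) {
            return ' ';
        }
        return c;
    }

    // Applies toHalfWidth to every character of the input.
    static String normalize(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(toHalfWidth(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\uFF0E" is the full-width period discussed in this issue.
        System.out.println(normalize("v1\uFF0E0")); // prints v1.0
    }
}
```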

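User-defined dictionary cannot be imported

1. After specifying the user-defined dictionary path in library.properties, the custom words are not imported.
2. Using UserDefineLibrary.loadFile does not work either.
Only user words inserted with UserDefineLibrary.insertWord can be used.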

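BF (Bloom filter) implementation

The BF implementation could follow the one in Guava, which appears to be better.
I suggest using Guava as the project's base library. It has many excellent features that simplify code and improve readability and stability.
Guava is Google's core Java library and has no other dependencies.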

FAILED: mvn test

$ java -version
java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)

$ mvn compile
OK.

$ mvn test: FAILED; see below.

[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.nlpcn:nlp-lang:jar:0.2
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-jar-plugin is missing. @ line 78, column 12
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building nlp-lang 0.2
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ nlp-lang ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ nlp-lang ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ nlp-lang ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Users/william/Documents/AtWork/github/nlp-lang/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ nlp-lang ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ nlp-lang ---
[INFO] Surefire report directory: /Users/william/Documents/AtWork/github/nlp-lang/target/surefire-reports


T E S T S

Running org.nlpcn.commons.lang.dat.DATMakerTest
Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.106 sec <<< FAILURE!
test(org.nlpcn.commons.lang.dat.DATMakerTest) Time elapsed: 0.043 sec <<< ERROR!
java.lang.NullPointerException
at org.nlpcn.commons.lang.dat.DATMaker.maker(DATMaker.java:76)
at org.nlpcn.commons.lang.dat.DATMakerTest.test(DATMakerTest.java:14)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

loadTest(org.nlpcn.commons.lang.dat.DATMakerTest) Time elapsed: 0.007 sec <<< ERROR!
java.io.FileNotFoundException: 生成模型的路径 (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:131)
at java.io.FileInputStream.<init>(FileInputStream.java:87)
at org.nlpcn.commons.lang.dat.DoubleArrayTire.load(DoubleArrayTire.java:31)
at org.nlpcn.commons.lang.dat.DATMakerTest.loadTest(DATMakerTest.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

Running org.nlpcn.commons.lang.dat.DATTest
Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 0.051 sec <<< FAILURE!
loadTextTest(org.nlpcn.commons.lang.dat.DATTest) Time elapsed: 0.012 sec <<< ERROR!
java.lang.NullPointerException
at java.io.Reader.<init>(Reader.java:78)
at java.io.InputStreamReader.<init>(InputStreamReader.java:97)
at org.nlpcn.commons.lang.util.IOUtil.getReader(IOUtil.java:63)
at org.nlpcn.commons.lang.util.FileIterator.<init>(FileIterator.java:24)
at org.nlpcn.commons.lang.util.IOUtil.instanceFileIterator(IOUtil.java:200)
at org.nlpcn.commons.lang.dat.DoubleArrayTire.loadText(DoubleArrayTire.java:70)
at org.nlpcn.commons.lang.dat.DoubleArrayTire.loadText(DoubleArrayTire.java:57)
at org.nlpcn.commons.lang.dat.DoubleArrayTire.loadText(DoubleArrayTire.java:93)
at org.nlpcn.commons.lang.dat.DATTest.loadTextTest(DATTest.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

loadTest(org.nlpcn.commons.lang.dat.DATTest) Time elapsed: 0.016 sec <<< ERROR!
java.io.FileNotFoundException: /home/ansj/公共的/pinyin.obj (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:131)
at java.io.FileInputStream.<init>(FileInputStream.java:87)
at org.nlpcn.commons.lang.dat.DoubleArrayTire.load(DoubleArrayTire.java:31)
at org.nlpcn.commons.lang.dat.DATTest.loadTest(DATTest.java:21)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

makerTest(org.nlpcn.commons.lang.dat.DATTest) Time elapsed: 0.02 sec <<< ERROR!
java.lang.NullPointerException
at org.nlpcn.commons.lang.dat.DATMaker.maker(DATMaker.java:76)
at org.nlpcn.commons.lang.dat.DATMaker.maker(DATMaker.java:48)
at org.nlpcn.commons.lang.dat.DATTest.makerTest(DATTest.java:11)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

Running org.nlpcn.commons.lang.finger.FingerprintServiceTest
76cebd01faa63f38b45ea9756d26872c
76cebd01faa63f38b45ea9756d26872c
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.558 sec
Running org.nlpcn.commons.lang.index.MemoryIndexTest
[**]
[**]
[**]
[**]
init ok use time 547
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.538 sec
Running org.nlpcn.commons.lang.jianfan.JianFanTest
草莓是红色的
士多啤棃是紅色的
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.015 sec
Running org.nlpcn.commons.lang.pinyin.PinyinTest
[ma3, wan2, dai4, ma3, ,, ta1, qi3, shen1, guan1, shang4, dian4, nao3, ,, yong4, gun3, tang4, de5, kai1, shui3, wei2, zi4, ji3, pao4, zhi4, yi1, wan3, teng2, zhe5, re4, qi4, de5, lao3, tan2, suan1, cai4, mian4, 。, zhong1, guo2, de5, cheng2, xu4, yuan2, geng4, pian1, ai4, la1, shang4, chuang1, lian2, ,, zai4, hei1, an4, zhong1, xiang3, shou4, zhei4, du2, te4, de, mei3, shi2, 。, zhei4, shi4, xian4, dai4, gong1, ye4, ji3, yi1, tian1, xin1, ku3, lao2, zuo4, de5, ren2, zui4, hao3, de5, kui4, zeng4, 。, nan2, fang1, yi1, dai4, sheng1, chang2, de5, cheng2, xu4, yuan2, sui1, ran2, zai4, jing1, cheng2, duo1, nian2, ,, dan4, reng2, kou3, wei4, qing1, dan4, ,, ta1, men2, wang3, wang3, bu4, jia1, liao4, bao1, ,, you2, lian3, jia2, zi4, ran2, tang3, xia4, de, re4, lei4, bu3, chong1, qia4, dang1, de5, yan2, fen1, 。, ta1, men2, xiang1, xin4, ,, yong4, zhei4, zhong3, fang1, shi4, ,, neng2, gou4, mo3, ping2, si1, kao3, zhe5, xian4, zai4, shi4, bu4, shi4, guo4, qu4, xiang3, yao4, de5, wei4, lai2, er2, dai4, lai2, de5, da4, bu4, fen1, you1, shang1, …, xiao3, li3, de5, fu4, qin1, zai4, nian2, qing1, de5, shi2, hou4, ye3, shi4, cong2, ye2, ye2, shou3, li3, jie1, shou1, le5, zu3, chuan2, de5, dai4, ma3, ,, bu4, guo4, ling4, ren2, jing1, ya4, de, shi4, ,, dao4, le, xiao3, li3, zhei4, yi1, dai4, ,, hen3, duo1, dong1, xi1, dou1, yi2, shi1, le5, ,, dan4, shi4, cheng2, xu4, yuan2, ku3, bi1, de5, wei4, dao4, bao3, cun2, de, shi4, ru2, ci3, de5, wan2, zheng3, 。, , jiu4, zai4, 2, 4, xiao3, shi2, zhi1, qian2, ,, zui4, xin1, de5, xu1, qiu2, cong2, P, M, chu3, chuan2, lai2, ,, wei2, le, de2, dao4, zhei4, fen4, zi4, ran2, de5, kui4, zeng4, ,, ma3, nong2, men5, kai1, ji1, 、, xie3, ma3, 、, diao4, shi4, 、, zhong4, gou4, ,, si4, ji4, lun2, hui2, de5, deng3, dai4, huan4, lai2, zhei4, nan2, de2, de5, feng1, shou1, shi2, ke4, 。, ma3, nong2, zhi1, dao4, ,, xu1, qiu2, de, bao3, xian1, qi1, zhi1, you3, duan3, duan3, de5, liang3, tian1, ,, ma3, nong2, men5, yao4, yi3, zui4, kuai4, de5, su4, du4, dui4, dai4, 
ma3, jin4, xing2, jing1, zhi4, de5, jia1, gong1, ,, ren4, he2, yi1, ge4, xu1, qiu2, dou1, ke3, neng2, zai4, 2, 4, xiao3, shi2, zhi1, hou4, shi1, qu4, yuan2, ben3, de5, huo2, li4, ,, bian4, cheng2, yi1, wen2, bu4, zhi2, de5, la1, ji1, chuang4, yi4, 。]
[ma, wan, dai, ma, ,, ta, qi, shen, guan, shang, dian, nao, ,, yong, gun, tang, de, kai, shui, wei, zi, ji, pao, zhi, yi, wan, teng, zhe, re, qi, de, lao, tan, suan, cai, mian, 。, zhong, guo, de, cheng, xu, yuan, geng, pian, ai, la, shang, chuang, lian, ,, zai, hei, an, zhong, xiang, shou, zhei, du, te, de, mei, shi, 。, zhei, shi, xian, dai, gong, ye, ji, yi, tian, xin, ku, lao, zuo, de, ren, zui, hao, de, kui, zeng, 。, nan, fang, yi, dai, sheng, chang, de, cheng, xu, yuan, sui, ran, zai, jing, cheng, duo, nian, ,, dan, reng, kou, wei, qing, dan, ,, ta, men, wang, wang, bu, jia, liao, bao, ,, you, lian, jia, zi, ran, tang, xia, de, re, lei, bu, chong, qia, dang, de, yan, fen, 。, ta, men, xiang, xin, ,, yong, zhei, zhong, fang, shi, ,, neng, gou, mo, ping, si, kao, zhe, xian, zai, shi, bu, shi, guo, qu, xiang, yao, de, wei, lai, er, dai, lai, de, da, bu, fen, you, shang, …, xiao, li, de, fu, qin, zai, nian, qing, de, shi, hou, ye, shi, cong, ye, ye, shou, li, jie, shou, le, zu, chuan, de, dai, ma, ,, bu, guo, ling, ren, jing, ya, de, shi, ,, dao, le, xiao, li, zhei, yi, dai, ,, hen, duo, dong, xi, dou, yi, shi, le, ,, dan, shi, cheng, xu, yuan, ku, bi, de, wei, dao, bao, cun, de, shi, ru, ci, de, wan, zheng, 。, , jiu, zai, 2, 4, xiao, shi, zhi, qian, ,, zui, xin, de, xu, qiu, cong, P, M, chu, chuan, lai, ,, wei, le, de, dao, zhei, fen, zi, ran, de, kui, zeng, ,, ma, nong, men, kai, ji, 、, xie, ma, 、, diao, shi, 、, zhong, gou, ,, si, ji, lun, hui, de, deng, dai, huan, lai, zhei, nan, de, de, feng, shou, shi, ke, 。, ma, nong, zhi, dao, ,, xu, qiu, de, bao, xian, qi, zhi, you, duan, duan, de, liang, tian, ,, ma, nong, men, yao, yi, zui, kuai, de, su, du, dui, dai, ma, jin, xing, jing, zhi, de, jia, gong, ,, ren, he, yi, ge, xu, qiu, dou, ke, neng, zai, 2, 4, xiao, shi, zhi, hou, shi, qu, yuan, ben, de, huo, li, ,, bian, cheng, yi, wen, bu, zhi, de, la, ji, chuang, yi, 。]
[m, w, d, m, ,, t, q, s, g, s, d, n, ,, y, g, t, d, k, s, w, z, j, p, z, y, w, t, z, r, q, d, l, t, s, c, m, 。, z, g, d, c, x, y, g, p, a, l, s, c, l, ,, z, h, a, z, x, s, z, d, t, d, m, s, 。, z, s, x, d, g, y, j, y, t, x, k, l, z, d, r, z, h, d, k, z, 。, n, f, y, d, s, c, d, c, x, y, s, r, z, j, c, d, n, ,, d, r, k, w, q, d, ,, t, m, w, w, b, j, l, b, ,, y, l, j, z, r, t, x, d, r, l, b, c, q, d, d, y, f, 。, t, m, x, x, ,, y, z, z, f, s, ,, n, g, m, p, s, k, z, x, z, s, b, s, g, q, x, y, d, w, l, e, d, l, d, d, b, f, y, s, …, x, l, d, f, q, z, n, q, d, s, h, y, s, c, y, y, s, l, j, s, l, z, c, d, d, m, ,, b, g, l, r, j, y, d, s, ,, d, l, x, l, z, y, d, ,, h, d, d, x, d, y, s, l, ,, d, s, c, x, y, k, b, d, w, d, b, c, d, s, r, c, d, w, z, 。, , j, z, 2, 4, x, s, z, q, ,, z, x, d, x, q, c, P, M, c, c, l, ,, w, l, d, d, z, f, z, r, d, k, z, ,, m, n, m, k, j, 、, x, m, 、, d, s, 、, z, g, ,, s, j, l, h, d, d, d, h, l, z, n, d, d, f, s, s, k, 。, m, n, z, d, ,, x, q, d, b, x, q, z, y, d, d, d, l, t, ,, m, n, m, y, y, z, k, d, s, d, d, d, m, j, x, j, z, d, j, g, ,, r, h, y, g, x, q, d, k, n, z, 2, 4, x, s, z, h, s, q, y, b, d, h, l, ,, b, c, y, w, b, z, d, l, j, c, y, 。]
[zheng4, pin3, xing2, huo4, !]
[zheng4, pin3, hang2, huo4, !]
[ma3, wan2, dai4, ma3, ,, ta1, qi3, shen1, guan1, shang4, dian4, nao3, ,, yong4, gun3, tang4, de5, kai1, shui3, wei2, zi4, ji3, pao4, zhi4, yi1, wan3, teng2, zhe5, re4, qi4, de5, lao3, tan2, suan1, cai4, mian4, 。, zhong1, guo2, de5, cheng2, xu4, yuan2, geng4, pian1, ai4, la1, shang4, chuang1, lian2, ,, zai4, hei1, an4, zhong1, xiang3, shou4, zhei4, du2, te4, de, mei3, shi2, 。, zhei4, shi4, xian4, dai4, gong1, ye4, ji3, yi1, tian1, xin1, ku3, lao2, zuo4, de5, ren2, zui4, hao3, de5, kui4, zeng4, 。, nan2, fang1, yi1, dai4, sheng1, chang2, de5, cheng2, xu4, yuan2, sui1, ran2, zai4, jing1, cheng2, duo1, nian2, ,, dan4, reng2, kou3, wei4, qing1, dan4, ,, ta1, men2, wang3, wang3, bu4, jia1, liao4, bao1, ,, you2, lian3, jia2, zi4, ran2, tang3, xia4, de, re4, lei4, bu3, chong1, qia4, dang1, de5, yan2, fen1, 。, ta1, men2, xiang1, xin4, ,, yong4, zhei4, zhong3, fang1, shi4, ,, neng2, gou4, mo3, ping2, si1, kao3, zhe5, xian4, zai4, shi4, bu4, shi4, guo4, qu4, xiang3, yao4, de5, wei4, lai2, er2, dai4, lai2, de5, da4, bu4, fen1, you1, shang1, …, xiao3, li3, de5, fu4, qin1, zai4, nian2, qing1, de5, shi2, hou4, ye3, shi4, cong2, ye2, ye2, shou3, li3, jie1, shou1, le5, zu3, chuan2, de5, dai4, ma3, ,, bu4, guo4, ling4, ren2, jing1, ya4, de, shi4, ,, dao4, le, xiao3, li3, zhei4, yi1, dai4, ,, hen3, duo1, dong1, xi1, dou1, yi2, shi1, le5, ,, dan4, shi4, cheng2, xu4, yuan2, ku3, bi1, de5, wei4, dao4, bao3, cun2, de, shi4, ru2, ci3, de5, wan2, zheng3, 。, , jiu4, zai4, 2, 4, xiao3, shi2, zhi1, qian2, ,, zui4, xin1, de5, xu1, qiu2, cong2, P, M, chu3, chuan2, lai2, ,, wei2, le, de2, dao4, zhei4, fen4, zi4, ran2, de5, kui4, zeng4, ,, ma3, nong2, men5, kai1, ji1, 、, xie3, ma3, 、, diao4, shi4, 、, zhong4, gou4, ,, si4, ji4, lun2, hui2, de5, deng3, dai4, huan4, lai2, zhei4, nan2, de2, de5, feng1, shou1, shi2, ke4, 。, ma3, nong2, zhi1, dao4, ,, xu1, qiu2, de, bao3, xian1, qi1, zhi1, you3, duan3, duan3, de5, liang3, tian1, ,, ma3, nong2, men5, yao4, yi3, zui4, kuai4, de5, su4, du4, dui4, dai4, 
ma3, jin4, xing2, jing1, zhi4, de5, jia1, gong1, ,, ren4, he2, yi1, ge4, xu1, qiu2, dou1, ke3, neng2, zai4, 2, 4, xiao3, shi2, zhi1, hou4, shi1, qu4, yuan2, ben3, de5, huo2, li4, ,, bian4, cheng2, yi1, wen2, bu4, zhi2, de5, la1, ji1, chuang4, yi4, 。]
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec
Running org.nlpcn.commons.lang.standardization.SentencesUtilTest
**
123.1
你好。
123
hello
word
.
hello
word
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec
Running org.nlpcn.commons.lang.tire.splitWord.SmartGetWordTest
android 3
java 3
**人 3
0
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.nlpcn.commons.lang.util.StringUtilTest
true
hello ansj
'ansj','2134','123','123','123'
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.nlpcn.commons.lang.util.WordAlertTest
az az az az 09·
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.088 sec

Results :

Tests in error:
test(org.nlpcn.commons.lang.dat.DATMakerTest)
loadTest(org.nlpcn.commons.lang.dat.DATMakerTest): 生成模型的路径 (No such file or directory)
loadTextTest(org.nlpcn.commons.lang.dat.DATTest)
loadTest(org.nlpcn.commons.lang.dat.DATTest): /home/ansj/公共的/pinyin.obj (No such file or directory)
makerTest(org.nlpcn.commons.lang.dat.DATTest)

Tests run: 17, Failures: 0, Errors: 5, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.831 s
[INFO] Finished at: 2014-07-14T18:04:35+08:00
[INFO] Final Memory: 7M/184M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project nlp-lang: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/william/Documents/AtWork/github/nlp-lang/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

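Polyphonic characters

Would you consider supporting polyphonic characters? At present, the pinyin output for a polyphonic character contains only one reading.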

TagContent has a problem when highlighting mixed Chinese and English text

The test code is as follows:

TagContent tw = new TagContent("<em>", "</em>");
String content = "abc123";
List<Keyword> keywords = new ArrayList<Keyword>();
keywords.add(new Keyword("abc12", 1.0));
System.out.println(tw.tagContent(keywords, content));

The output is: <em>abc12</em> (the trailing "3" of the content is dropped)
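
To make the expected behavior concrete, here is a minimal sketch of keyword highlighting that keeps the text around each match. It is not TagContent's algorithm: it handles a single keyword with plain String.indexOf, and the class and method names are hypothetical.

```java
public class HighlightDemo {
    // Wraps every occurrence of keyword in begin/end tags and copies
    // all other characters of content through unchanged.
    static String highlight(String content, String keyword, String begin, String end) {
        if (keyword.isEmpty()) {
            return content;
        }
        StringBuilder sb = new StringBuilder();
        int from = 0;
        int idx;
        while ((idx = content.indexOf(keyword, from)) >= 0) {
            sb.append(content, from, idx);            // text before the match
            sb.append(begin).append(keyword).append(end);
            from = idx + keyword.length();
        }
        sb.append(content, from, content.length());   // text after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same inputs as the test code in this issue.
        System.out.println(highlight("abc123", "abc12", "<em>", "</em>"));
        // prints <em>abc12</em>3
    }
}
```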

Help updating simp.txt and trad.txt

NLPchina/nlp-lang/src/main/resources/simp.txt and NLPchina/nlp-lang/src/main/resources/trad.txt have not been updated since July 2015.

I found some problems. For example, 藴 (T2 below) is a variant form of 蘊 (T1 below), and 蕴 (S below) is the simplified form of both 蘊 (T1) and 藴 (T2). See http://www.cojak.org/index.php?function=code_lookup&term=8574

Therefore, to match the variant and simplified/traditional relationships above,
14197 蘊 (T1) 14197 蕴 (S)
14325 藴 (T2) 14325 藴 (T2)
14347 蘊 (T1) 14347 蕴 (S)
should be corrected to
14197 蘊 (T1) 14197 蕴 (S)
14325 藴 (T2) 14325 蕴 (S)
14347 蘊 (T1) 14347 藴 (T2)

This problem has been corrected in Unihan 15.1.0 > Unihan_Variants.txt. In addition, I found other corrections between Unihan 15.0.0 > Unihan_Variants.txt and Unihan 15.1.0 > Unihan_Variants.txt that need to be merged into simp.txt and trad.txt.

Would you agree to let me update trad.txt and simp.txt based on Unihan 15.1.0 > Unihan_Variants.txt?
If so, please let me know the rules used to compile trad.txt and simp.txt. Thank you.

Raymond Jou 周永瑞
FamilySearch International
(801)240-3871 office
(408)568-8989 mobile & text
