Comments (5)
Sorry for the lack of documentation. JATE2 can be used as a library without much effort.
As mentioned in Quick Start, you can either 1) download the jar from the Maven repository or 2) add the following configuration to your Maven project, along with Dragontools.
<dependency>
<groupId>uk.ac.shef.dcs</groupId>
<artifactId>jate</artifactId>
<version>2.0-beta.1</version>
</dependency>
Once you have set up the JATE2 libraries, you can use all the available ATE algorithms in your application or project. Our App* classes show how to use and integrate the ATE algorithms with Apache Solr. All available ATE implementations are subclasses of uk.ac.shef.dcs.jate.algorithm.Algorithm in the package uk.ac.shef.dcs.jate.algorithm.*. The current interface should be fairly straightforward to use: provide a list of candidate terms and their corresponding features, and the method returns ranked terms modelled by uk.ac.shef.dcs.jate.model.JATETerm, with scores and other features/metadata. Since JATE2 relies on Solr for pre-processing and feature extraction, you have to either implement your own method or use Solr or our embedded Solr implementation (i.e., App*) to parse your corpus and extract candidates and features from it.
We will add more documentation in the near future.
Thanks for your interest.
from jate.
I tried
AppCValue.main(("uk.ac.shef.dcs.jate.app.AppCValue -corpusDir " + corpusDir + " -o cvalue-terms.json " + solrDir + "/testdata/solr-testbed ACLRDTEC").split(" "));
but
uk.ac.shef.dcs.jate.JATEException: Cannot find expected field: jate_ngraminfo
at uk.ac.shef.dcs.jate.util.SolrUtil.getTermVector(SolrUtil.java:36)
at uk.ac.shef.dcs.jate.feature.FrequencyTermBasedFBMaster.build(FrequencyTermBasedFBMaster.java:39)
at com.scholarfriend.maven.Epollo.Tools.AppCValue.extract(AppCValue.java:93)
at com.scholarfriend.maven.Epollo.Tools.AppCValue.extract(AppCValue.java:85)
at uk.ac.shef.dcs.jate.app.App.extract(App.java:285)
I have PDF, TXT, and HTML files in the folder.
from jate.
Logger: com.softcorporation.util.Logger
Mon Feb 27 01:33:46 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:46 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:47 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:47 EST 2017 loading done
Mon Feb 27 01:33:47 EST 2017 loading done
Mon Feb 27 01:33:47 EST 2017 loading done
Mon Feb 27 01:33:47 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:48 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:48 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:48 EST 2017 loading done
Mon Feb 27 01:33:48 EST 2017 loading done
2017-02-27 01:33:48 ERROR SolrCore:525 - [jateCore] Solr index directory 'A:\eclipse\lib\jate-master\testdata\solr-testbed\jateCore\data\index/' is locked. Throwing exception.
2017-02-27 01:33:48 ERROR CoreContainer:740 - Error creating core [jateCore]: Index locked for write for core 'jateCore'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
org.apache.solr.common.SolrException: Index locked for write for core 'jateCore'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:727)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438)
at java.util.concurrent.FutureTask.run(Unknown Source)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked for write for core 'jateCore'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:528)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:761)
... 9 more
Mon Feb 27 01:33:48 EST 2017 loading done
2017-02-27 01:33:48 ERROR SolrCore:525 - [GENIA] Solr index directory 'A:\eclipse\lib\jate-master\testdata\solr-testbed\GENIA\data\index/' is locked. Throwing exception.
2017-02-27 01:33:48 ERROR CoreContainer:740 - Error creating core [GENIA]: Index locked for write for core 'GENIA'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
org.apache.solr.common.SolrException: Index locked for write for core 'GENIA'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:727)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438)
at java.util.concurrent.FutureTask.run(Unknown Source)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked for write for core 'GENIA'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:528)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:761)
... 9 more
2017-02-27 01:33:48 INFO AppCValue:72 - Start CValue term ranking and filtering for whole index ...
uk.ac.shef.dcs.jate.JATEException: Cannot find expected field: jate_ngraminfo
at uk.ac.shef.dcs.jate.util.SolrUtil.getTermVector(SolrUtil.java:36)
at uk.ac.shef.dcs.jate.feature.FrequencyTermBasedFBMaster.build(FrequencyTermBasedFBMaster.java:39)
at uk.ac.shef.dcs.jate.app.AppCValue.extract(AppCValue.java:86)
at uk.ac.shef.dcs.jate.app.AppCValue.extract(AppCValue.java:77)
at uk.ac.shef.dcs.jate.app.App.extract(App.java:285)
at uk.ac.shef.dcs.jate.app.AppCValue.main(AppCValue.java:48)
I removed all the files in the data folder but still got these messages.
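The log above asks to "verify locks manually". As a rough diagnostic sketch, the following lists every write.lock file (the file name Lucene uses for its index write lock) below a Solr home directory. The class LockFileFinder and its method are hypothetical helpers, not part of JATE or Solr, and the default path is only a placeholder. Note that with native file locking the mere presence of a write.lock file does not prove the lock is still held; another process or an earlier Solr instance in the same JVM may be holding the index open instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical diagnostic helper (not part of JATE or Solr).
public class LockFileFinder {

    // Returns every regular file named "write.lock" below the given directory.
    public static List<Path> findLockFiles(Path solrHome) throws IOException {
        try (Stream<Path> paths = Files.walk(solrHome)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals("write.lock"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder default; pass your own Solr home as the first argument.
        Path solrHome = Paths.get(args.length > 0 ? args[0] : "testdata/solr-testbed");
        if (!Files.isDirectory(solrHome)) {
            System.out.println("Not a directory: " + solrHome);
            return;
        }
        for (Path lock : findLockFiles(solrHome)) {
            System.out.println("lock file: " + lock);
        }
    }
}
```

If this reports lock files while no Solr or JATE process is running, they are probably leftovers from a run that did not shut down cleanly.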
from jate.
To run AppCValue programmatically, pass the run-time parameters to the main method as a string array, in the same order as on the command line.
The problem with your code is that you should not pass the class name as a parameter when you call AppCValue directly.
So try the following:
AppCValue.main(("-corpusDir " + corpusDir + " -o cvalue-terms.json " + solrDir + "/testdata/solr-testbed ACLRDTEC").split(" "));
To make it clearer, you can try the following code:
String[] cvalueArgs = new String[6];
cvalueArgs[0] = "-corpusDir";
cvalueArgs[1] = <YOUR_CORPUS_DIR>;
cvalueArgs[2] = "-o";
cvalueArgs[3] = <YOUR_JSON_FILE_PATH>;
cvalueArgs[4] = <YOUR_SOLR_HOME_PATH>;
cvalueArgs[5] = <YOUR_SOLR_CORE_NAME>;
AppCValue.main(cvalueArgs);
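One caveat with the split(" ") one-liner is that it breaks any path containing a space, so the explicit array is the safer form. A minimal sketch of that idea, assuming the same six-argument order as above; the class CValueArgs and its build method are hypothetical helpers, not part of JATE, and the paths are placeholders:

```java
import java.util.Arrays;

// Hypothetical helper (not part of JATE): builds the argument array
// in the order shown above without splitting on whitespace.
public class CValueArgs {

    public static String[] build(String corpusDir, String jsonOut,
                                 String solrHome, String coreName) {
        return new String[] {
            "-corpusDir", corpusDir,
            "-o", jsonOut,
            solrHome,
            coreName
        };
    }

    public static void main(String[] args) {
        String corpusDir = "C:/my corpus";                  // placeholder path with a space
        String solrHome = "C:/jate/testdata/solr-testbed";  // placeholder path

        // Naive approach: the space inside corpusDir produces an extra token.
        String[] naive = ("-corpusDir " + corpusDir + " -o cvalue-terms.json "
                + solrHome + " ACLRDTEC").split(" ");

        // Explicit array: each value stays in one element.
        String[] safe = build(corpusDir, "cvalue-terms.json", solrHome, "ACLRDTEC");

        System.out.println("naive (" + naive.length + "): " + Arrays.toString(naive));
        System.out.println("safe  (" + safe.length + "): " + Arrays.toString(safe));

        // With JATE on the classpath, the array would then be passed on:
        // AppCValue.main(safe);
    }
}
```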
Hope it helps.
from jate.
Thank you, it works.
from jate.