skroutz / elasticsearch-skroutz-greekstemmer
Greek Stemmer for elasticsearch
Home Page: https://www.skroutz.gr
Hey, could you tell me what I need to do to update it for ES 6?
Hi,
First of all, I would like to congratulate you on the enhanced Greek stemmer you built for the Elasticsearch platform. I believe a usage example is needed, as well as a test case scenario, so that we can be sure we have configured it correctly.
I can't install the plugin, can you please help?
cd /usr/share/elasticsearch && sudo bin/plugin --install skroutz/elasticsearch-skroutz-greekstemmer/2.4.4.1
-> Installing skroutz/elasticsearch-skroutz-greekstemmer/2.4.4.1...
Trying http://download.elasticsearch.org/skroutz/elasticsearch-skroutz-greekstemmer/elasticsearch-skroutz-greekstemmer-2.4.4.1.zip...
Trying http://search.maven.org/remotecontent?filepath=skroutz/elasticsearch-skroutz-greekstemmer/2.4.4.1/elasticsearch-skroutz-greekstemmer-2.4.4.1.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/skroutz/elasticsearch-skroutz-greekstemmer/2.4.4.1/elasticsearch-skroutz-greekstemmer-2.4.4.1.zip...
Trying https://github.com/skroutz/elasticsearch-skroutz-greekstemmer/archive/2.4.4.1.zip...
Trying https://github.com/skroutz/elasticsearch-skroutz-greekstemmer/archive/master.zip...
Failed to install skroutz/elasticsearch-skroutz-greekstemmer/2.4.4.1, reason: failed to download out of all possible locations..., use --verbose to get detailed information
The problem is that UpdateStemmingSamples.java reads the file with UTF-8 encoding and replaces it with a file using the default encoding of the building computer. Subsequent builds fail.
Proposed changes (lines 27, 28):
FileOutputStream fileWriter = new FileOutputStream(file.getAbsoluteFile());
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(fileWriter, StandardCharsets.UTF_8));
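A minimal sketch of the same idea, assuming the goal is simply to read and rewrite the samples file with an explicit charset on both sides. The class name `Utf8RoundTrip`, its helper methods, and the temporary file are hypothetical and are not taken from UpdateStemmingSamples.java.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the plugin: shows an explicit UTF-8 round trip.
final class Utf8RoundTrip {

    // Reads every line as UTF-8, independent of the platform default charset.
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Writes every line as UTF-8, so a later UTF-8 read sees the same text.
    static void writeLines(Path file, List<String> lines) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary file used only for this illustration.
        Path file = Files.createTempFile("stemming-samples", ".txt");
        List<String> samples = List.of("μονόφωτα", "μονόφωτο");
        writeLines(file, samples);
        System.out.println(readLines(file).equals(samples));
        Files.delete(file);
    }
}
```

Because both the reader and the writer name `StandardCharsets.UTF_8`, the file content no longer depends on the default encoding of the build machine.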
You should use parentheses to resolve the ambiguity here:
else if (len > 4 && endsWith(..) ..
to
else if (len > 4 && (endsWith(..) ...
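To illustrate why the grouping matters, here is a small sketch. It assumes the elided part of the condition is an `||` chain of suffix checks, which the issue does not show; the `endsWith` helper and the suffixes "ων" and "ου" are hypothetical stand-ins, not the stemmer's actual code. In Java `&&` binds tighter than `||`, so without parentheses the length guard applies only to the first suffix.

```java
// Hypothetical demo class; the suffixes are illustrative assumptions.
final class PrecedenceDemo {

    // Stand-in for the stemmer's suffix check (assumed signature).
    static boolean endsWith(String word, String suffix) {
        return word.endsWith(suffix);
    }

    // Parsed as: (len > 4 && endsWith(.., "ων")) || endsWith(.., "ου")
    static boolean unparenthesized(String word) {
        int len = word.length();
        return len > 4 && endsWith(word, "ων") || endsWith(word, "ου");
    }

    // The length guard now covers both suffix checks.
    static boolean parenthesized(String word) {
        int len = word.length();
        return len > 4 && (endsWith(word, "ων") || endsWith(word, "ου"));
    }

    public static void main(String[] args) {
        String shortWord = "που"; // 3 characters, ends with "ου"
        System.out.println(unparenthesized(shortWord)); // true: the guard is bypassed
        System.out.println(parenthesized(shortWord));   // false: the guard applies
    }
}
```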
Installation fails on elasticsearch 2
ERROR: Could not find plugin descriptor 'plugin-descriptor.properties' in plugin zip
Hi!
Could you compile a new branch for ES 5.5.2?
Thank you :)
Cannot install on latest ES due to error:
sudo bin/elasticsearch-plugin install gr.skroutz:elasticsearch-skroutz-greekstemmer:5.4.2.1
-> Downloading gr.skroutz:elasticsearch-skroutz-greekstemmer:5.4.2.1 from maven central
[=================================================] 100%
Warning: sha512 not found, falling back to sha1. This behavior is deprecated and will be removed in a future release. Please update the plugin to use a sha512 checksum.
ERROR: This plugin was built with an older plugin structure. Contact the plugin author to remove the intermediate "elasticsearch" directory within the plugin zip.
Any chance for an update here?
Version 1.1, Index :
"index": {
  "analysis": {
    "analyzer": {
      "analyzer_startswith": {
        "tokenizer": "keyword",
        "filter": "lowercase"
      },
      "prefix-test-analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["lowercase", "stem_greek"]
      }
    },
    "filter": {
      "mynGram": {
        "type": "nGram",
        "min_gram": 2,
        "max_gram": 50
      },
      "stem_greek": {
        "type": "skroutz_stem_greek"
      }
    },
    "tokenizer": {
      "prefix-test-tokenizer": {
        "type": "path_hierarchy",
        "delimiter": "."
      }
    }
  }
}
These two words should have the same stem, but "Μονόφωτα" stems to "μονοφω" while "Μονόφωτο" stems to "Μονόφωτ".
How can it be installed on ELK 5.1 or later?
Here is what I get:
sudo /usr/share/elasticsearch/bin/plugin -install skroutz/elasticsearch-skroutz-greekstemmer/0.0.1
-> Installing skroutz/elasticsearch-skroutz-greekstemmer/0.0.1...
Trying http://download.elasticsearch.org/skroutz/elasticsearch-skroutz-greekstemmer/elasticsearch-skroutz-greekstemmer-0.0.1.zip...
Trying http://search.maven.org/remotecontent?filepath=skroutz/elasticsearch-skroutz-greekstemmer/0.0.1/elasticsearch-skroutz-greekstemmer-0.0.1.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/skroutz/elasticsearch-skroutz-greekstemmer/0.0.1/elasticsearch-skroutz-greekstemmer-0.0.1.zip...
Trying https://github.com/skroutz/elasticsearch-skroutz-greekstemmer/zipball/v0.0.1... (assuming site plugin)
Failed to install skroutz/elasticsearch-skroutz-greekstemmer/0.0.1, reason: failed to download out of all possible locations..., use -verbose to get detailed information
rule0 of SkroutzGreekStemmer.java tries to handle special cases for specific word endings.
However, most of those cases concern whole words rather than endings.
E.g. the word περατοσ is handled as an ending, so it also matches υδατοπερατοσ and stems it as υδατοπερ; σαφωσ will match φωσ, etc.
Those cases are false positive matches.
Most of the cases should be handled with string equality (rather than string suffix matching).
This should happen in an extra step before what is now rule0, and rule0 should then have fewer special cases to handle.
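The difference between the two matching strategies can be sketched as follows. This is only an illustration under stated assumptions: the class `WholeWordExceptions`, its method names, and the two-entry exception set are hypothetical, and the real list would come from the cases currently handled in rule0.

```java
import java.util.Set;

// Hypothetical sketch; not the stemmer's actual code or exception list.
final class WholeWordExceptions {

    // Assumed sample entries, taken from the examples in this issue.
    static final Set<String> WHOLE_WORDS = Set.of("περατοσ", "φωσ");

    // Suffix matching: also fires on longer words that merely end with an entry.
    static boolean matchesBySuffix(String word) {
        for (String entry : WHOLE_WORDS) {
            if (word.endsWith(entry)) {
                return true;
            }
        }
        return false;
    }

    // Equality matching: fires only when the whole word is an entry.
    static boolean matchesExactly(String word) {
        return WHOLE_WORDS.contains(word);
    }

    public static void main(String[] args) {
        System.out.println(matchesBySuffix("υδατοπερατοσ")); // true (false positive)
        System.out.println(matchesExactly("υδατοπερατοσ"));  // false
        System.out.println(matchesBySuffix("σαφωσ"));        // true (false positive)
        System.out.println(matchesExactly("φωσ"));           // true
    }
}
```

Running an equality check like `matchesExactly` as a separate step before the suffix rules would keep whole-word exceptions from leaking onto unrelated longer words.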
The exceptional cases of the various analysis steps are not uniformly handled.
Some are static variables and some are coded into if clauses.
All of them are hardcoded and can only change by altering the source files.
We can make an effort to
Hi, I want to install it on my current Elasticsearch installation, which is v7.17.6.
Installation fails with the following message:
Exception in thread "main" java.lang.IllegalArgumentException: Plugin [elasticsearch-skroutz-greekstemmer] was built for Elasticsearch version 7.7.0 but version 7.17.6 is running
How can I update the code for my current elasticsearch version?