Comments (6)
Hi,
The DictionaryExporter in LT expects the speller dictionary to be inside a "hunspell" folder:
if (new File(filename).getAbsolutePath().contains("hunspell")) {
  FSADumpTool.main("--raw-data", "-d", args[0]);
} else {
  FSADumpTool.main("--raw-data", "-x", "-d", args[0]);
}
If I take the Polish dictionary from the hunspell folder, I can dump it, but I'm not sure whether the output is correct.
from languagetool.
Jaume, I tried to dump the dictionary from the current folder, and that is when the error appears. I simply wanted to see if it was encoded properly, because there is an encoding-related bug I discovered:
I don't think hardcoding the folder helps, and -x should work for frequency dictionaries. Otherwise, we cannot say we supply the source, which violates Debian principles - this is why we have documented all decoding procedures so that one could get the original sources. This means, however, that the decoding procedure has to produce readable frequency files, I'm afraid.
See also morfologik/morfologik-stemming#15
from languagetool.
Also see morfologik/morfologik-stemming#35
from languagetool.
So I understand that the problem is that we add the `-x` option depending on the hard-coded directory name. Instead we need to look inside the `.info` file, see if the `fsa.dict.encoder` option is set, and only use the `-x` option if that is the case. Is that correct?
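A minimal sketch of that check, for illustration only. It assumes the `.info` file sits next to the dictionary with the same base name and is a plain Java properties file, and that a missing `fsa.dict.encoder` key or the value `none` means no encoder is in use. The class and method names (`DumpArgs`, `forDictionary`) are hypothetical and not part of LanguageTool or morfologik; the code only builds the argument list and does not call `FSADumpTool` itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: decides whether to pass -x based on the .info file
// instead of the directory name.
final class DumpArgs {

    static String[] forDictionary(Path dict) throws IOException {
        // Assumption: foo.dict is described by foo.info in the same folder.
        String name = dict.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot < 0 ? name : name.substring(0, dot);
        Path info = dict.resolveSibling(base + ".info");

        Properties props = new Properties();
        if (Files.exists(info)) {
            try (InputStream in = Files.newInputStream(info)) {
                props.load(in);
            }
        }

        // Assumption: an absent key or the value "none" means "not encoded".
        String encoder = props.getProperty("fsa.dict.encoder", "none").trim();
        boolean encoded = !encoder.isEmpty() && !encoder.equalsIgnoreCase("none");

        List<String> args = new ArrayList<>();
        args.add("--raw-data");
        if (encoded) {
            args.add("-x");
        }
        args.add("-d");
        args.add(dict.toString());
        return args.toArray(new String[0]);
    }
}
```

The resulting array could then be handed to the dump tool in place of the two hard-coded calls. This only covers the choice of flag; it does nothing about how frequency data is dumped.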
from languagetool.
@milekpl Could you maybe help with this, i.e. reply to my question above from 2014-09-24?
from languagetool.
@danielnaber: it won't help. The encoder will be set but frequency dictionaries have more data. These data are not dumped properly. I tried to persuade Jaume to add code to dump frequency data but this is not a trivial thing to do, as the source format is XML.
from languagetool.