
opentaal-hunspell's People

Contributors

kroeckx, pandermusubi, panderopentaal

opentaal-hunspell's Issues

improvement of suggestions

Adding the rule REP i ij to the .aff file might give better suggestions for 'bizonder'.
Likewise, REP iele iële might help for 'industriele' and similar words.
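
To illustrate the idea, here is a minimal Python sketch of how a replacement table of this kind can produce suggestions: each rule is applied at each position where its pattern occurs, and only results found in the word list are kept. This is a simplified model, not Hunspell's actual suggestion code; the function names and the two-word toy dictionary are assumptions made for the example.

```python
# Simplified model of REP-style suggestions (not Hunspell's real implementation).

def rep_candidates(word, rep_table):
    """Return every string obtained by applying one rule at one position."""
    candidates = set()
    for pattern, replacement in rep_table:
        start = word.find(pattern)
        while start != -1:
            candidates.add(word[:start] + replacement + word[start + len(pattern):])
            start = word.find(pattern, start + 1)
    return candidates

# The two rules proposed in this issue.
REP_TABLE = [("i", "ij"), ("iele", "iële")]

# Toy word list standing in for the real dictionary (assumption).
DICTIONARY = {"bijzonder", "industriële"}

def suggest(word):
    """Keep only the candidates that are known words."""
    return sorted(rep_candidates(word, REP_TABLE) & DICTIONARY)

print(suggest("bizonder"))     # ['bijzonder']
print(suggest("industriele"))  # ['industriële']
```

The rules also generate non-words such as 'ijndustriele'; these are discarded because they are not in the word list.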

Which form is preferred in Dutch: "i j" vs. "ij"

(I'm not a native speaker of Dutch; I know almost nothing about the language.)

Just one general question about ICONV and OCONV: did I understand the Hunspell source code correctly?

In input words that are to be spell-checked, the two characters i j are replaced by the single character ij.

Conversely, in the suggestions for a typo, the single ij letter is replaced by i j.

Do you prefer the two letters instead of the single letter if you write texts? When will the single letter be used?
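
The round trip described in this question can be sketched as two replacement tables, one applied to input and one to output. The sketch below mirrors the poster's reading only. It assumes the "single character" is U+0133 (LATIN SMALL LIGATURE IJ), and the table contents are illustrative rather than copied from nl.aff.

```python
# Sketch of input/output conversion as the question describes it.
# Assumption: the single-character form is U+0133, LATIN SMALL LIGATURE IJ.
IJ_LIGATURE = "\u0133"

def convert(text, table):
    """Apply each (source, target) replacement in order."""
    for source, target in table:
        text = text.replace(source, target)
    return text

# Illustrative tables, not taken from the actual nl.aff.
ICONV = [("ij", IJ_LIGATURE)]   # applied to words before checking
OCONV = [(IJ_LIGATURE, "ij")]   # applied to suggestions before display

internal = convert("bijzonder", ICONV)
shown = convert(internal, OCONV)

print(len("bijzonder"), len(internal))  # 9 8
print(shown)                            # bijzonder
```

Under this reading, the user types and sees the two-letter form throughout, and the single character is used only inside the checker.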

Misspelling wrongly reported when a word has a / character next to it

I've run into the following problem. When a word starts with or ends with a / or + character, hunspell reports this as an error.
Example:
echo 'Een boer en een jongen. /jongen/ /boer/' | hunspell -l -d nl_NL /dev/stdin

With a dictionary such as en_US, this problem doesn't occur. Any idea why this is happening?

EDIT: I see these characters are defined in WORDCHARS '’0123456789ij.-\/+₂²€@ while in en_US this is just WORDCHARS 0123456789’. When using hunspell-nl in org-mode in Emacs, this definition causes wrongly reported misspellings. Luckily the syntax table can be edited to work around this problem.
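
The effect of the two WORDCHARS settings can be shown with a toy tokenizer in Python. It treats letters plus the listed extra characters as part of a word. This is a simplification written for this example and does not reproduce Hunspell's real text parser; the helper strip_edges is a hypothetical workaround, not an existing option.

```python
# Toy tokenizer: letters plus the WORDCHARS characters form a word.
# The two strings are copied from the WORDCHARS lines quoted above.
NL_WORDCHARS = "'’0123456789ij.-\\/+₂²€@"
EN_WORDCHARS = "0123456789’"

def tokenize(text, wordchars):
    """Split text into runs of letters and extra word characters."""
    tokens, current = [], []
    for ch in text:
        if ch.isalpha() or ch in wordchars:
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens

def strip_edges(token, edge_chars="/+"):
    """Hypothetical workaround: drop / and + at the edges before checking."""
    return token.strip(edge_chars)

text = "Een boer en een jongen. /jongen/ /boer/"
nl_tokens = tokenize(text, NL_WORDCHARS)
en_tokens = tokenize(text, EN_WORDCHARS)

print("/jongen/" in nl_tokens)  # True: the slashes stay attached to the word
print("/jongen/" in en_tokens)  # False: the slashes act as separators
print(strip_edges("/boer/"))    # boer
```

With the nl setting the slashes become part of the token, so the token no longer matches a dictionary entry.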

hyphenation dictionary

The previous version (2.10-g) also contained a hyphenation dictionary. Are there plans to include it again in 2.20+?

Latest version on GitHub releases but not on opentaal.org/bestanden

Some friends and I are working on instructions for how a TeXstudio user can easily import dictionaries on different operating systems. While hunspell is simply available on Linux, on macOS and Windows an oxt has to be assembled.

We happened to notice that 2.20 is already available under GitHub releases (except, we think, the hyphen files, as noted in #9), but not yet under Downloads on the website https://www.opentaal.org/bestanden . Is that actually intended (for example because the hyphen files are still missing)?

Spell checker produces unexpected suggestions once in a while

I'm no expert on the matter and have no clue about the algorithm, but sometimes I wonder: why? I also don't know whether this comes from the hunspell compounding rules or from something else. German shows the same tendency: https://bugs.documentfoundation.org/show_bug.cgi?id=139319

gedetalleerde [getallenteerde; getabelleerde]
opinipeilingen [Peilopening]
overengekomen [bovengekomen]
vacinatieprogramma [innovatieprogramma/ navigatieprogramma]
bovnmatige [boonmatige]

hoofdlettergevoligheid [hoofdlettergevoeligheid]

In certain cases I even wonder whether there is a 'likelihood' prediction, i.e. the chance of a certain word being used. Certain suggestions are just curious. Boonmatige?

I might have touched on multiple different topics here, but feel free to untangle them.
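
One way to look at the 'bovnmatige' case is plain edit distance. The Python sketch below computes the Levenshtein distance and shows that the odd suggestion is exactly as close to the typo as the word the writer presumably meant. The intended word 'bovenmatige' is an assumption, since the report does not state it. Plain edit distance is also only one possible ranking signal, not necessarily the one Hunspell uses.

```python
# Levenshtein distance: minimum number of single-character insertions,
# deletions and substitutions needed to turn a into b.

def levenshtein(a, b):
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]

typo = "bovnmatige"
print(levenshtein(typo, "boonmatige"))   # 1 (v replaced by o)
print(levenshtein(typo, "bovenmatige"))  # 1 (e inserted); assumed intended word
```

Because both candidates are one edit away, a ranking based on edit distance alone cannot prefer one over the other. Choosing between them would need extra information such as word frequency, which is the 'likelihood' idea raised above.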

Review ij

Review the dictionary for occurrences of ij that are not a digraph, such as in Fiji and bijou.
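
A first pass at such a review could be a small Python filter that collects every entry containing ij and separates the ones already known not to be a digraph from those that still need a human decision. The function name and the hand-maintained exception set are hypothetical; deciding whether ij is a digraph in a given word remains manual work.

```python
# Collect entries containing "ij" and split them by a hand-maintained list
# of known non-digraph words (hypothetical; to be filled in by reviewers).

def split_ij_entries(words, known_non_digraph):
    """Return (non_digraph, needs_review) for words containing 'ij'."""
    known = {word.lower() for word in known_non_digraph}
    non_digraph, needs_review = [], []
    for word in words:
        if "ij" not in word.lower():
            continue
        if word.lower() in known:
            non_digraph.append(word)
        else:
            needs_review.append(word)
    return non_digraph, needs_review

words = ["Fiji", "bijou", "bijzonder", "prijs", "boer"]
non_digraph, needs_review = split_ij_entries(words, {"fiji", "bijou"})

print(non_digraph)   # ['Fiji', 'bijou']
print(needs_review)  # ['bijzonder', 'prijs']
```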

release 2.20.19: dictionary does not work with LibreOffice on macOS; hyphenation library missing

i installed release 2.20.19 on macOS 10.15 (Catalina) for LibreOffice 6.4.7, following the instructions on opentaal.org [1]:

  1. copy files nl.aff, nl.dic from the release to the /Users/<username>/Library/Spelling/ directory.
  2. register the dictionary system-wide
    1. from the Apple Menu, open System Preferences...
    2. in System Preferences, choose Keyboard
    3. in Keyboard, choose tab Text
    4. under Spelling, choose Set Up...
      (Set Up... is at the bottom of the list of languages)
    5. for language Nederlands, select 'Nederlands (Library)' and click Done to close the list of dictionaries
    6. close System Preferences
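
Step 1 above can also be scripted. The Python sketch below copies the two files into the per-user Spelling directory. The function name and its parameters are assumptions made for this example. Step 2, registering the dictionary in System Preferences, is a GUI action and is not covered.

```python
# Sketch of step 1: copy nl.aff and nl.dic into ~/Library/Spelling.
import shutil
from pathlib import Path

def install_dictionary(release_dir, spelling_dir=None, names=("nl.aff", "nl.dic")):
    """Copy the dictionary files and return the list of destination paths."""
    source = Path(release_dir)
    # Default target as given in step 1; pass spelling_dir to override it.
    target = Path(spelling_dir) if spelling_dir else Path.home() / "Library" / "Spelling"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        source_file = source / name
        if not source_file.is_file():
            raise FileNotFoundError(f"missing dictionary file: {source_file}")
        copied.append(Path(shutil.copy2(source_file, target / name)))
    return copied

# Example call (the release path is a placeholder):
# install_dictionary("/path/to/unpacked/release")
```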

however, this does not register the OpenTaal Dutch dictionary with LibreOffice, as the following steps show.

  1. from the LibreOffice menu, open Preferences...
  2. in category Language Settings, choose Writing Aids
  3. next to the list of Available language modules, click Edit...
  4. in the Edit Modules dialog, choose Dutch (Netherlands) for Language

the Edit Modules dialog now lists the available modules for Dutch Spelling, Grammar, Hyphenation and Thesaurus. after installing the OpenTaal Dutch dictionary as described above, it shows:

  • Spelling
    macOS Spell Checker
  • Grammar
    LanguageTool
  • Hyphenation
  • Thesaurus

or, in other words: the only available Dutch dictionary is the native one from Apple; Dutch grammar checking comes from the LanguageTool extension that i installed; there are no Dutch hyphenation and thesaurus libraries available.

workaround
install the previous dictionary and hyphenation extension [2]. after that, the Edit Modules dialog for Dutch (Netherlands) lists:

  • Spelling
    macOS Spell Checker
    Hunspell SpellChecker
  • Grammar
    LanguageTool
  • Hyphenation
    Libhyphen Hyphenator
  • Thesaurus

or, in other words: there are two Dutch dictionaries available (enable/disable, or change the order to your liking); LanguageTool is available for Dutch grammar checking; Dutch hyphenation is handled by the Hyphenator library.

although i admire your work, i think it may be too early for a release.


[1] OpenTaal 2.0 voor Apple OS X (2.0). (2016). [MacOS]. OpenTaal. https://www.opentaal.org/bestanden/file/5-woordenlijst-2-0-voor-apple-snow-leopard-en-lion

[2] Woordenlijst v 2.10g voor OpenOffice.org 3 (2.10). (2016). [Computer software]. OpenTaal. https://www.opentaal.org/bestanden/file/4-woordenlijst-v-2-10g-voor-openoffice-org-3

How to contribute in extending the dictionary?

Hello!

Is there any way I can contribute to extending the dictionary?

I am using a spellchecker based on https://github.com/wooorm/dictionaries that seems to get its resources for the Dutch language from this repository. While using it, I quite often notice that pretty basic Dutch words are not present in the dictionary (in nl.dic).

For example words like :

  • prijs
  • hierbij
  • risico
  • bij
  • zijn
  • omschrijving

I looked on the website https://www.opentaal.org but wasn't able to find much up-to-date info there. There is a page to report suggestions (https://www.opentaal.org/suggesties) but unfortunately I was unable to use it.

I would love to contribute, but am unsure what the best way is to do so.
Any reply would be much appreciated!
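
A check like the one described in this issue can be sketched in Python: read the stems from a .dic file and report which words have no stem entry. The sample lines and their flags below are invented for illustration and do not reflect the real nl.dic. The parsing is also simplified: it ignores escaped slashes and only strips flags and tab-separated fields.

```python
# Report which words have no stem entry in a Hunspell .dic file.
# Simplified parsing: first line may be an entry count, flags follow "/".

def load_stems(dic_lines):
    """Return the set of stems found in the given .dic lines."""
    stems = set()
    for index, line in enumerate(dic_lines):
        entry = line.strip()
        if not entry or (index == 0 and entry.isdigit()):
            continue  # skip blank lines and the leading entry count
        stems.add(entry.split("\t", 1)[0].split("/", 1)[0])
    return stems

def missing_words(words, stems):
    """Return the words that do not appear as a stem."""
    return [word for word in words if word not in stems]

# Invented sample, NOT the contents of the real nl.dic.
sample_dic = ["4", "prijs/Zz", "risico/S", "bij", "zijn"]
stems = load_stems(sample_dic)

reported = ["prijs", "hierbij", "risico", "bij", "zijn", "omschrijving"]
print(missing_words(reported, stems))  # ['hierbij', 'omschrijving']
```

A word that is missing from the stem list is not necessarily rejected by the spellchecker, because affix and compounding rules in the .aff file may still generate it. Whether a given tool applies those rules is worth checking before reporting a word as missing.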

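Language glitches in the "Beschrijving van de overige bestanden" (description of the other files)

The "Beschrijving van de overige bestanden" reads as follows; the Dutch text is quoted verbatim because the remarks concern its wording: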

elements/archaic.tsv (archaïsch), deze woorden zijn die nog wel gebruikt worden, alle zitten in de woordenlijst

"deze woorden zijn die"?

elements/excluded.tsv, deze woorden moeten worden uitgesloten van de spellingcontrole omdat ze verwarrend zijn met een ander woord dat ook correct is en in de meeste gevallen bedoeld is

"verwarrend zijn met"?

elements/inflections.tsv, zijn flexies met hun basiswoorden (soms zijn dat er meerdere) en een flexie kan zelf ook een basiswoord zijn voor een andere flexie als suggestie gegeven worden

"voor een andere flexie als suggestie gegeven worden"?

elements/nosuggest.txt, deze woorden mogen niet als suggestie gegeven worden
elements/objectionable.txt (verwerpelijk), deze woorden zijn verwerpelijk omdat ze (buiten de studie naar dit woord) ze als discriminerend of rasistisch worden ervaren

"ze (...) ze"?
"rasistisch"?

elements/obsolete.tsv (ongebruik), deze woorden zijn in ongebruik geraakt, sommige zitten nog in de woordenlijst (weeuw), sommige niet meer (arre) en sommige zijn fout omdat er een andere spelling van is (pannekoek) of een ander woord voor is plaats is gekomen (chocozoen)

"of een ander woord voor is plaats is gekomen"?

-- Ruud
