Comments (4)
Interesting, I can't seem to find any Hunspell-compatible dictionary files out there. This project is just a port of the original Hunspell, which can be found at hunspell/hunspell. That said, if my assumptions are correct and you have a .NET background, my port is going to be a lot easier to follow along with. I'll be honest, I don't completely understand how it all works, but let me see if I can dig up some clues for you.
So first up, I know nothing about the language, but it appears to be written right to left, which may need some special treatment for complex affixes. Within Hunspell this seems to be referred to as a "Complex Affix" language, and it will set a ton of string reversals in motion. Another thing to consider: be sure to encode any files you make as UTF-8. It just makes it all so much simpler!
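To make the UTF-8 advice concrete, here is a minimal sketch in Python (not part of this port) that writes a bare-bones `.aff`/`.dic` pair. The `SET UTF-8` line and the entry count on the first line of the `.dic` file are standard Hunspell conventions. `COMPLEXPREFIXES` is a real Hunspell directive and is presumably what "Complex Affix" refers to above, but whether a given language needs it is an assumption. The helper name `write_dictionary` and the sample words are made up for illustration.

```python
from pathlib import Path


def write_dictionary(directory, name, words, complex_prefixes=False):
    """Write a bare-bones <name>.aff / <name>.dic pair encoded as UTF-8.

    Hypothetical helper for illustration; a real dictionary also needs
    affix rules (PFX/SFX), which are left out here.
    """
    directory = Path(directory)

    # SET declares the encoding that both files are written in.
    aff_lines = ["SET UTF-8"]
    if complex_prefixes:
        # Assumption: the language needs Hunspell's COMPLEXPREFIXES option.
        aff_lines.append("COMPLEXPREFIXES")

    # The first line of a .dic file is the (approximate) number of entries.
    dic_lines = [str(len(words)), *words]

    aff_path = directory / f"{name}.aff"
    dic_path = directory / f"{name}.dic"
    for path, lines in ((aff_path, aff_lines), (dic_path, dic_lines)):
        # newline="\n" keeps line endings identical on every platform.
        with open(path, "w", encoding="utf-8", newline="\n") as handle:
            handle.write("\n".join(lines) + "\n")
    return aff_path, dic_path
```

Calling `write_dictionary(".", "sample", ["سڵاو", "کتێب"], complex_prefixes=True)` would produce `sample.aff` and `sample.dic` that a Hunspell-compatible loader should at least be able to read as UTF-8.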
Regarding the Levenshtein distance, I don't know if exactly that is implemented for suggestions, but there is a whole lot of code that runs as part of the suggestions that is at least very similar in how it operates. It's not pretty, but it all starts around here: https://github.com/aarondandy/WeCantSpell.Hunspell/blob/master/src/WeCantSpell.Hunspell/WordList.QuerySuggest.cs#L504 . If you have a test runner that includes test coverage, such as NCrunch or the new Visual Studio test runner, you can use that to find tests that cover interesting areas and step through them. The test coverage is pretty decent and can be a huge aid in understanding how it all works.
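For reference, this is what a plain textbook Levenshtein distance looks like, sketched in Python. To be clear, this is not what Hunspell or this port actually does for suggestions; it is only the standard dynamic-programming version, and `rank_suggestions` with its `max_distance` cutoff is a hypothetical helper added for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible

    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def rank_suggestions(word, candidates, max_distance=2):
    """Hypothetical helper: keep candidates within max_distance edits,
    closest first, with ties broken alphabetically."""
    scored = sorted((levenshtein(word, c), c) for c in candidates)
    return [c for distance, c in scored if distance <= max_distance]
```

Something like `rank_suggestions("helo", ["hello", "help", "world", "halo"])` keeps the near misses and drops `"world"`. A real suggester does a lot more than this, which is why stepping through the tests is the better way to see what actually happens.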
Hope that helps. Getting a new language into Hunspell would be pretty cool.
from wecantspell.hunspell.
Another thought: again, I'm no linguist and have no idea what I am talking about, but the German language may have some similarities in the way it forms what Hunspell refers to as "Compound Words".
@aarondandy thank you very much for your reply; your port is definitely a huge help for me. The problem is that there is not much documentation about Hunspell or about creating dictionaries for it. I'll take a look at the German dictionary to see what I can understand.
I'm not sure if you have come across this yet, but maybe it will help: https://github.com/sinaahmadi/KurdishHunspell . This issue is pretty stale and I'm not going to be much help with it, so I'm going to close it. If you are still trying to solve this, the larger community of users at https://github.com/hunspell/hunspell/issues will be more helpful than me.