cessen / kobo_jp_dict
A Japanese-English dictionary builder for Kobo e-readers.
License: Apache License 2.0
Firstly, I want to say that this project is fantastic. I spent a lot of time trying to install custom Japanese dictionaries using other programs, but this one was the only one that worked flawlessly. Being able to use a well-known and comprehensive Japanese-Japanese dictionary like 大辞林 is a large improvement over the built-in Japanese-Japanese dictionary. Also a big thumbs up for using Rust!
I was wondering if I could request a command-line flag for not adding the English grammar tags to the dictionary entries (e.g. "verb, ichidan, intransitive" or "i-adjective, irregular"). That way, the Japanese-Japanese dictionary can be completely in Japanese.
The code for generating the final dictionary may be useful to more projects than just this one, especially if we make it non-language-specific. The resulting crate would likely be pretty straightforward, API-wise. It would just take a list (or iterator) of `([terms], definition)` pairs and a `Writer`, and write the dictionary out.
The `terms` part of the pair should be a list of some kind, so that all the conjugations of a word can be included as search terms. The `definition` part would simply be a big `String` containing the html of the entry that gets displayed to the user.
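To make the proposed shape concrete, here is a minimal sketch of what such an API could look like. Everything in it is an assumption for illustration: the function name `write_dictionary` is hypothetical, and the tab-separated output is only a placeholder, not the real Kobo dictionary format or this project's actual code.

```rust
use std::io::{self, Write};

/// Hypothetical API sketch: takes (terms, definition) pairs and a writer.
/// The output here is a placeholder line format (terms joined by '|',
/// then a tab, then the HTML definition), NOT the real Kobo format.
/// Returns the number of entries written.
pub fn write_dictionary<W, I>(entries: I, mut out: W) -> io::Result<usize>
where
    W: Write,
    I: IntoIterator<Item = (Vec<String>, String)>,
{
    let mut count = 0;
    for (terms, definition) in entries {
        // An entry with no search terms could never be looked up, so skip it.
        if terms.is_empty() {
            continue;
        }
        writeln!(out, "{}\t{}", terms.join("|"), definition)?;
        count += 1;
    }
    Ok(count)
}
```

Accepting any `IntoIterator` means callers could pass a `Vec` or a lazily generated stream of entries, and accepting any `Write` means the same code could target a file, a zip entry, or an in-memory buffer.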
(Based on prior email discussion, migrated to this issue)
I am looking to convert a 研究社 Yomichan dictionary for potential use with my Kobo. However, I've run into some difficulties.
cargo run kobo_jp_dict -y 研究社 新和英大辞典 第5版.zip dicthtml-ja-en.zip
On running this command in Cygwin, I am met with the following error:
cargo run kobo_jp_dict -y 研究社 新和英大辞典 第5版.zip
On running this command, I am met with:
Unlike with the first command, which I based on the example in the readme, this one seems to work(?) in that the program appears to spend roughly 3 minutes doing something—however, at the end of the process, there doesn't seem to be an output file anywhere.
By sorting files by Date Modified in Everything, I'm able to see files such as MARISA-BUILD.EXE-908E7366.pf and KOBO_JP_DICT.EXE-CF27C84A.pf appearing/being accessed while the command is running, which implies to me that the program itself and Marisa are functioning throughout. Whether or not as intended, I'm not sure.
Attempts at troubleshooting so far:
For reference, here are what my file paths look like, in case that might help:
(Sorry for the image wall, not sure what will and won't be useful.)
Any and all guidance would be appreciated. Thank you for your time.
The idea is to get the same results I have with Yomichan (I'm using JMNedict, JMDict, KireiCake, Shinmeikai and Daijisen). I'm curious whether it would be possible to combine multiple dictionaries into one and have the same setup on my Kobo.
I'm also not sure whether I would be able to use them even as separate dictionaries, since they're JSON, not XML.
I've never worked with dictionaries, so sorry if this is a dumb question!
Hello, I was trying to use the tool to generate my own kobo dictionary but I'm running into the following error:
$ kobo_jp_dict -y jmdict_english.zip test.zip
Extracting bundled data...
Metadata entries: 191598
Pitch Accent entries: 124135
Loading dictionaries...
jmdict_english.zip entries: 267130
Writing dictionary to disk...
thread 'main' panicked at 'assertion failed: process_output.status.success()', src/kobo.rs:75:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
I'm running the program on Ubuntu 22.04 with kobo_jp_dict version 0.1.0 and MARISA 0.2.6.
Right now we have to call the `marisa-build` executable as part of the dictionary build process, which is annoying because it requires the user to install a completely separate tool. It would be great to get rid of that dependency.
At the moment, I don't think there are any existing crates for handling the marisa format, so this would be a somewhat major undertaking--we'd need to either write bindings for the marisa library or write our own marisa file builder.
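For context on what calling an external executable involves, here is a minimal sketch of shelling out to a tool from Rust and reporting a missing binary or a non-zero exit as an error rather than a panic. The helper name `run_tool` is hypothetical and is not taken from this project's source; the program name and its arguments are left entirely to the caller, so no particular `marisa-build` flags are assumed.

```rust
use std::io;
use std::process::{Command, Output};

/// Hypothetical helper: runs an external program and converts both
/// "could not be started" and "exited with failure" into an error message.
pub fn run_tool(program: &str, args: &[&str]) -> Result<Output, String> {
    // Spawning fails with an io::Error if, for example, the binary is not installed.
    let output = Command::new(program)
        .args(args)
        .output()
        .map_err(|e: io::Error| format!("could not start `{}`: {}", program, e))?;

    if output.status.success() {
        Ok(output)
    } else {
        // Include the tool's stderr so the user can see why it failed.
        Err(format!(
            "`{}` exited with {}: {}",
            program,
            output.status,
            String::from_utf8_lossy(&output.stderr).trim()
        ))
    }
}
```

A caller would pass the tool name (for example `"marisa-build"`) plus whatever arguments that tool expects, and could print the returned message instead of aborting on an assertion.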
Hey there,
Super cool project, and something I've been looking for! I added the dictionary you compiled, but wanted to make my own as well, and have been having some trouble doing that. In a perfect world, I'd love a dictionary that's JA-JA, and scrolls down to an English definition, but it doesn't seem like that's possible, correct? As such, I was hoping to make a monolingual dictionary, and just swap to an English one if need be.
I downloaded this repo as a zip and unzipped it in my downloads folder. I'm running MacOS 13.2. I've got a bunch of yomichan dictionaries in my downloads folder, as well. I navigated into the `kobo_jp_dict-master` folder in terminal, and tried to run `cargo run kobo_jp_dict -y ../明鏡国語辞典.zip -y ../大辞林第三版.zip dicthtml-ja.zip`, as the dictionaries are a folder up, and this seemed to be the correct notation based on the other issue as well as the documentation under `--help`.
However, running that gives me this error:
error: Found argument 'dicthtml-ja.zip' which wasn't expected, or isn't valid in this context
Where did I go wrong? Tried to change the directory the dictionaries were stored in, as well as compiling from only one dictionary instead of two, but same deal.
Additionally, these sorts of frequency dictionaries aren't supported, are they? I like to have some sort of reference on how common a word might be, so I can know whether or not it's worth making a new flashcard for it, but it's not a huge deal if not. If I can get this working with the monolingual side though, I'll try and do a JA-EN version with JMDict, KanjiDic, KireiCake, etc.
Thanks!