
jmdict-kindle's Introduction

About

This is a Japanese-English dictionary for e-Ink Kindle devices, based on the JMdict, JMnedict, and Tatoeba databases.

Features:

  • Lookup of inflected verbs
  • Lookup of Japanese names
  • Example sentences
  • Pronunciation
  • The dictionaries can be downloaded as separate files or as one big dictionary

Screenshots: inflection lookup, sentence lookup, word lookup, name lookup.

Supported Devices

The dictionary has been tested on Kindle Paperwhite and Kindle Oasis. It should also work well with other e-ink Kindle devices.

The dictionary will not work well on Kindle Fire, the Kindle Android App, or any other Android-based Kindle, because the Kindle software on those platforms does not support inflection lookups.

Download

You can download the latest version of the dictionary from here.

Install

e-Ink Kindle

There are three dictionaries in total:

  • jmdict.mobi: Contains data from the JMdict database, with additional examples. It does not contain proper names.
  • jmnedict.mobi: Contains Japanese proper names from the JMnedict database.
  • combined.mobi: Contains the data from both of the above dictionaries. Please note that many features (sentences, pronunciation, ...) are missing from the combined dictionary due to size constraints. Using this dictionary is therefore not recommended.

To install any of the dictionaries on your device (you can also install all three), follow these steps:

  • for 1st-generation Kindle Paperwhite devices, ensure you have firmware version 5.3.9 or higher, as it includes improved homonym lookup for Japanese;
  • connect your Kindle device via USB;
  • copy the .mobi file for the dictionary you want to use to the documents/dictionaries sub-folder;
  • eject the USB device;
  • on your device, go to Home > Settings > Device Options > Language and Dictionaries > Dictionaries and set JMdict Japanese-English Dictionary as the default dictionary for Japanese.
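
The copy step can also be scripted. The following is a minimal sketch and not part of the project: the function name `install_dictionary` is made up for illustration, and the Kindle's mount point has to be supplied by the caller because it differs from system to system. Only the documents/dictionaries sub-folder is taken from the steps above.

```python
import shutil
from pathlib import Path


def install_dictionary(mobi_file, kindle_root):
    """Copy a .mobi dictionary into documents/dictionaries on a mounted Kindle.

    kindle_root is wherever the Kindle's USB storage is mounted; that path
    is an assumption of this sketch and must be provided by the caller.
    """
    source = Path(mobi_file)
    if source.suffix != ".mobi":
        raise ValueError(f"expected a .mobi file, got {source.name}")
    target_dir = Path(kindle_root) / "documents" / "dictionaries"
    # Create the sub-folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir / source.name))
```

For example, `install_dictionary("jmdict.mobi", "/media/user/Kindle")` would copy the file if the device happened to be mounted at that (hypothetical) path. Ejecting the device and selecting the default dictionary still have to be done by hand.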

Kindle Android App

NOTE: Unfortunately, the Kindle Android App does not support dictionary inflections, which makes verb lookup practically impossible. There is no known workaround.

  • rename jmdict.mobi (or either of the other two dictionaries) to B005FNK020_EBOK.prc

  • connect your Android device via USB

  • copy B005FNK020_EBOK.prc into Internal Storage/Android/data/com.amazon.kindle/files/ or /sdcard/android/data/com.amazon.kindle/files

This will override the default Japanese-Japanese dictionary.
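
The rename step can be sketched as follows. This is an illustration under stated assumptions, not the project's tooling: the helper name `prepare_android_dictionary` is hypothetical, and the sketch writes a renamed copy so that the original .mobi file stays untouched. The target file name is the one given in the steps above.

```python
import shutil
from pathlib import Path

# File name used in the installation steps above.
ANDROID_DICTIONARY_NAME = "B005FNK020_EBOK.prc"


def prepare_android_dictionary(mobi_file, output_dir):
    """Write a copy of the dictionary under the name used by the Android steps."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / ANDROID_DICTIONARY_NAME
    # copyfile copies the contents only; the source file is left as it is.
    shutil.copyfile(mobi_file, target)
    return target
```

The resulting file can then be copied to the com.amazon.kindle/files folder on the device as described above.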

Kindle iOS App

The steps for the iOS App are similar to those for the Android App above. Unfortunately, the Kindle iOS App seems to suffer from the same limitations regarding inflections.

Pitch accent information

The pitch accent information is encoded in the following way:

  • Underline for Low
  • No Formatting for High
  • ꜜ for a sudden Drop in pitch
  • ° for a Nasal sound
  • If no formatting whatsoever is present then we do not have pitch information for that particular entry

Examples:

  • たい means L-H-H
  • が°ꜜ means L-Hꜜ-L
  • んしん means L-H-H-H
  • ひとꜜ means L-Hꜜ-(L) [The (L) means that the next sound after ひと will be low, e.g. ひとが (L-H-L)]

For more information see Japanese pitch accent - Wikipedia
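
To make the L/H patterns above concrete, here is a small sketch that derives them from the number of morae and the position of the drop. It is not code from this project. It assumes the usual textbook description of Tokyo-dialect pitch accent, and the function name `pitch_pattern` and its parameters are hypothetical: `drop_after` is the mora after which the pitch drops, with 0 meaning no drop.

```python
def pitch_pattern(mora_count, drop_after):
    """Return an L/H pattern such as 'L-H-H' or 'L-Hꜜ-(L)'.

    Assumes textbook Tokyo-dialect rules: the first mora is low unless the
    drop comes right after it, and every mora after the drop is low.
    """
    if mora_count < 1 or not 0 <= drop_after <= mora_count:
        raise ValueError("drop_after must be between 0 and mora_count")
    labels = []
    for position in range(1, mora_count + 1):
        if drop_after == 0:
            high = position > 1
        elif drop_after == 1:
            high = position == 1
        else:
            high = 1 < position <= drop_after
        label = "H" if high else "L"
        if position == drop_after:
            label += "ꜜ"  # the drop follows this mora
        labels.append(label)
    if drop_after == mora_count:
        # The drop only becomes audible on whatever follows the word.
        labels.append("(L)")
    return "-".join(labels)
```

With these assumptions, `pitch_pattern(3, 0)` gives `L-H-H` and `pitch_pattern(2, 2)` gives `L-Hꜜ-(L)`, matching the examples above.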

Building from source

Build

Requirements:

  • Linux, or Windows with Cygwin or WSL (might also work on macOS with a few changes)

  • Kindle Previewer, if building on Windows or WSL

    • Kindle Previewer has to be on the PATH. If it was installed in the default location, add it by running the following PowerShell command (close all cmd and PowerShell windows afterwards so that the change takes effect):
    Set-ItemProperty -Path 'Registry::HKEY_CURRENT_USER\Environment' -Name PATH -Value ((Get-ItemProperty -Path 'Registry::HKEY_CURRENT_USER\Environment' -Name PATH).path + ";$env:APPDATA\Amazon")
  • Python version 3
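
Before building, it can help to check that the required tools are reachable from the PATH. The helper below is a hedged sketch rather than something the project ships; `missing_tools` is a hypothetical name, and the executable name of Kindle Previewer is not stated in this README, so it is left for the caller to supply.

```python
import shutil


def missing_tools(tool_names, which=shutil.which):
    """Return the tools from tool_names that cannot be found on PATH.

    which defaults to shutil.which; it is a parameter so that the lookup
    can be replaced, for example in tests.
    """
    return [name for name in tool_names if which(name) is None]
```

For example, `missing_tools(["make", "python3"])` returns an empty list when both commands are available.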

Inside of the makefile you can change the max number of sentences per entry, compression, as well as which sentences to include:

# The Kindle Publishing Guidelines recommend -c2 (huffdic compression),
# but it is excruciatingly slow. That's why -c1 is selected by default.
# Compression currently is not officially supported by Kindle Previewer according to the documentation
COMPRESSION ?= 1

# Sets the maximum number of sentences per entry, for jmdict.mobi only.
# It is ignored by combined.mobi due to size restrictions.
# If there are too many sentences for the combined dictionary,
# it will not build (it exceeds the 650MB size limit). The amount is limited to 0 in this makefile for combined.mobi.
SENTENCES ?= 5

# This flag determines whether only good and verified sentences are used in the
# dictionary. Set it to TRUE if you only want those sentences.
# It is only used by jmdict.mobi.
# It is ignored by combined.mobi, where it is always true.
# This is due to size constraints.
ONLY_CHECKED_SENTENCES ?= FALSE

# If true adds pronunciations to entries. The combined dictionary ignores this flag due to size constraints
PRONUNCIATIONS ?= TRUE

# If true adds additional information to entries. The combined dictionary ignores this flag due to size constraints
ADDITIONAL_INFO ?= TRUE

Build with make to create all 3 dictionaries (Note the combined dictionary will not build with Kindle Previewer due to size constraints):

make

or use any of the following commands to create a specific one:

make jmdict.mobi
make jmnedict.mobi
make combined.mobi

If you build on WSL, the commands are as follows:

make ISWSL=TRUE

or use any of the following commands to create a specific one:

make jmdict.mobi ISWSL=TRUE
make jmnedict.mobi ISWSL=TRUE
make combined.mobi ISWSL=TRUE
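
The makefile options shown earlier are assigned with ?=, so they can also be set on the make command line instead of editing the file. The following sketch assembles such a command. It is an illustration only: `make_command` and `build` are hypothetical helper names, while the target names, ISWSL=TRUE, and the option names come from this README.

```python
import subprocess


def make_command(target=None, wsl=False, **overrides):
    """Build the argument list for a make invocation.

    overrides are passed as NAME=value arguments, e.g. SENTENCES=3.
    """
    command = ["make"]
    if target:
        command.append(target)
    if wsl:
        command.append("ISWSL=TRUE")
    command.extend(f"{name}={value}" for name, value in overrides.items())
    return command


def build(target=None, wsl=False, **overrides):
    """Run make in the current directory and raise if it fails."""
    subprocess.run(make_command(target, wsl, **overrides), check=True)
```

For example, `make_command("jmdict.mobi", SENTENCES=3)` produces the arguments for `make jmdict.mobi SENTENCES=3`, and `build("jmdict.mobi", wsl=True)` would run the WSL variant from the repository directory.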

Create a Pull Request

Before making a pull request, please ensure that your Python code is formatted correctly. To do this, install black and run

black .

To do

  • Leverage more of the JMdict data:

    • cross references
  • Add Furigana to example sentences

  • Create better covers

Credits

Alternatives

jmdict-kindle's People

Contributors

jrfonseca, mymro, oldmerkum, seanblue

jmdict-kindle's Issues

Invert order of combined dictionary

@jrfonseca
I'm using the combined dictionary on a Kindle, and I noticed that the name lookups are always given as first result, before the regular dictionary lookups.

This is annoying in the case where a common word also happens to be a name. This is because 9 times out of 10, the regular dictionary definition is what you are actually looking for, and the name match is just a coincidence. After a few hours reading, I can report that this happens very frequently, forcing you to swipe after the name definition(s) until you get to the actual intended result.

It would be much better if the common words had precedence over the names, so the word is first looked up in the regular dictionary, and if it is not found there, as a fallback, it is looked up in the name dictionary.

Please, let me know your thoughts. Thanks for the great job!
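
The ordering requested in this issue can be illustrated with a short sketch. It is not how the Kindle performs lookups and not code from this project: plain Python dicts stand in for the two dictionaries, and `lookup` is a hypothetical name.

```python
def lookup(word, common_words, names):
    """Return definitions for word, preferring the regular dictionary.

    common_words and names are plain dicts mapping a headword to a list of
    definitions; they stand in for the two dictionaries.
    """
    if word in common_words:
        return common_words[word]
    # Fall back to the name dictionary only when there is no regular entry.
    return names.get(word, [])
```

With this ordering, a word that is both a common noun and a name returns the regular definition, and the name entry appears only when no regular entry exists.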

Issue with reading for some words

Lookups for some words don't provide the common reading, e.g.
人間の住める環境とはいえない場所に...
The lookup for 人間 gives じんかん instead of にんげん

Making a translation other than English

Hello, could you help me with that? I don't understand programming.
I would like to end up with a Japanese > Polish dictionary, but I don't know how to edit the source files so that I can replace the English with Polish and create a new dictionary that works on the Kindle. Could you explain it to me so that I can do the translation myself?

Pitch information README

I love the new addition of pitch information in the dictionary, but I have trouble understanding it.
The up and down arrows are very clear, but the underscore really confuses me.
I tried to google but it is not helping.

I think it would be useful if you could add a section in the readme that briefly explains the notation.

Remove combined dictionary from Windows build

The Windows build sometimes fails at random due to the size of the combined dictionary. We might have to remove the combined dictionary from the Windows build in the future if the failed builds happen too often.

Does JMdict work properly on the iOS Kindle app?

Hello,

I am a heavy user of your dictionary on a Kindle e-ink device and I am thinking of getting a tablet, so I want to know whether this part is also true for the iOS Kindle app.

"NOTE: Unfortunately the Kindle Android App does not support dictionary inflections, yielding verbs lookup practically impossible. No known workaround."

Thank you for the dictionary, awesome work!

Improve pitch diagram

I understand if this isn't something you want to consider, but I find the pitch notation used in the dictionary unintuitive and think it can be improved.

For example:
image
image

I think the reason I find this difficult is because I'm used to seeing the lines on the top rather than the bottom, so what's shown in this dictionary is the opposite of my expectation. For example, here's what Yomichan does:
image
image

Is it possible to show the lines on top as Yomichan does, and if so would you consider changing to this? (I feel like this is the more standard way to show it when using lines, so I suspect it's simply not possible, but if it is possible that would be great.)

Regardless of the above, would you consider adding in the other fairly standard notation of [X] in addition to the line notation (as shown in the above screenshots)? For example, [0] represents heiban and in all other cases the number in the brackets represents the mora after which the pitch drops. For those of us familiar with pitch accent, this notation is succinct and can be understood intuitively.

Combined dictionary does not have all features

Some people seem to prefer the combined dictionary without knowing that some features are missing. We either have to include all features by dropping entries (to reduce the size) or make it clearer that the combined dictionary is missing some features.

Thank you!

This is not a real issue, I just wanted to say (as somebody who works with jmdict a bit) that I really appreciate the work you've put into this.

Being able to study Japanese from bed or the pool without a distracting notification-enabled device is really terrific.

Update release

Hello, are you able to post a release of the current version (with the ぐ inflection fix etc)? Thanks for making this btw, I use it all the time

Dictionary does not seem to build correctly (Linux)

Hi,

I've tried to build using make jmdict.mobi, here's the output I get:

wget -nv -N http://ftp.monash.edu.au/pub/nihongo/JMdict_e.gz
2020-06-30 12:09:49 URL:http://ftp.monash.edu.au/pub/nihongo/JMdict_e.gz [8510545/8510545] -> "JMdict_e.gz" [1]
wget -nv -N http://downloads.tatoeba.org/exports/sentences.tar.bz2
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
2020-06-30 12:10:28 URL:https://downloads.tatoeba.org/exports/sentences.tar.bz2 [133869927/133869927] -> "sentences.tar.bz2" [1]
wget -nv -N http://downloads.tatoeba.org/exports/jpn_indices.tar.bz2
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
2020-06-30 12:10:35 URL:http://downloads.tatoeba.org/exports/jpn_indices.tar.bz2 [2813686/2813686] -> "jpn_indices.tar.bz2" [1]
wget -nv -N https://kindlegen.s3.amazonaws.com/kindlegen_linux_2.6_i386_v2_9.tar.gz
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
2020-06-30 12:10:43 URL:https://kindlegen.s3.amazonaws.com/kindlegen_linux_2.6_i386_v2_9.tar.gz [10813137/10813137] -> "kindlegen_linux_2.6_i386_v2_9.tar.gz" [1]
tar -xzf kindlegen_linux_2.6_i386_v2_9.tar.gz kindlegen
touch kindlegen
python3 jmdict.py -a -s 5 -d j
Parsing JMdict_e.gz...
error: ばっちー[adj-i] should end with い, but ends with ー
error: ばっちぃ[adj-i] should end with い, but ends with ぃ
error: んとす[vs-i] should end with 為る/する, but ends with とす
error: むとす[vs-i] should end with 為る/する, but ends with とす
error: おいちー[adj-i] should end with い, but ends with ー
Created 188454 entries
Adding sentences...
Sentences added: 89419
Creating files for JMdict...
./kindlegen JMdict.opf -c1 -verbose -dont_append_source -o jmdict.mobi

*************************************************************
 Amazon kindlegen(Linux) V2.9 build 1028-0897292 
 A command line e-book compiler 
 Copyright Amazon.com and its Affiliates 2014 
*************************************************************

Info:I9006:option: -c1: Standard DOC compression
Info:I9014:option: -verbose: Verbose output
Info:I9018:option: -donotaddsource: Source files will not be added
Info(prcgen):I1047: Added metadata dc:Title        "JMdict Japanese-English Dictionary"
Info(prcgen):I1047: Added metadata dc:Date         "2019-05-08"
Info(prcgen):I1047: Added metadata dc:Creator      "Electronic Dictionary Research & Development Group"
Info(prcgen):I1002: Parsing files  0000245
Info(prcgen):I1003: Parsing file     URL: JMdict-frontmatter.html
Info(prcgen):I1003: Parsing file     URL: entry-JMdict-あ.html
Warning(parser8):W26001: Index not supported for enhanced mobi.
Info(prcgen):I1003: Parsing file     URL: entry-JMdict-い.html
Info(parser8):I12001: Enhanced mobi generation suppressed.
Info(prcgen):I1036: Mobi file built successfully

You will notice that the last file parsed is entry-JMdict-い.html but I would expect it to be entry-JMdict-ン.html. Am I missing something?
