sv-sv

About

Creates a Swedish-Swedish dictionary for Kindle, built from the LEXIN data available from http://spraakbanken.gu.se/

Kindle Previewer is needed for converting the dictionary to a MOBI file.

Usage

To generate the dictionary as HTML, run make with no arguments in this directory.

$ make

The generated HTML file, svsv.html, is referenced in svsv.opf, so the MOBI file can now be created with Kindle Previewer.
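For orientation, here is a minimal sketch of the per-entry markup that ends up in svsv.html. It follows the transform.py variant quoted in the issue further down this page, not necessarily the script currently in the repository. The function name build_entry, the HTML escaping, and the sample words are assumptions added for illustration.

```python
from html import escape


def build_entry(form, inflections, definitions):
    """Return one <idx:entry> block as a string (illustrative layout only)."""
    parts = ["<idx:entry>", "<idx:orth>"]
    # The quoted script strips '~' from the headword before printing it.
    parts.append("<b>" + escape(form.replace("~", "")) + "</b> ")
    if inflections:
        parts.append("<idx:infl>")
        for inflected in inflections:
            # Escaping is an addition here; the quoted script writes raw text.
            parts.append('<idx:iform value="' + escape(inflected) + '" />')
        parts.append("</idx:infl>")
    if len(definitions) > 1:
        # Several senses become a numbered list.
        parts.append("<ol>")
        parts.extend("<li>" + escape(d) + "</li>" for d in definitions)
        parts.append("</ol>")
    else:
        parts.extend(escape(d) for d in definitions)
        parts.append("<br>")
    parts.append("</idx:orth>")
    parts.append("</idx:entry>")
    return "".join(parts)


# Made-up sample values, not taken from LEXIN.
print(build_entry("bil", ["bilen", "bilar"], ["ett motorfordon"]))
```

Building the entry as a string and printing it once keeps the markup easy to inspect before running the full conversion.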

Install Kindle Previewer, start it, and open the svsv.opf file in it.

Export to MOBI from the menu: choose File -> Export and select MOBI as the file format.

Sideload the dictionary onto the e-reader by copying the MOBI file into the Documents folder of the e-reader.

The dictionary should now appear in the list of dictionaries on the e-reader (Settings -> All Settings -> Language and Dictionaries -> Dictionaries) and will be used for books in the dictionary's language.
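If you rebuild the dictionary often, the sideload step can be scripted. This is only a sketch: the function name sideload, the default folder name "documents", and the file name and mount point in the usage note are assumptions, since the README does not say what the exported MOBI file is called or where your e-reader is mounted.

```python
import shutil
from pathlib import Path


def sideload(mobi_file, device_root, documents_dir="documents"):
    """Copy mobi_file into <device_root>/<documents_dir>; return the new path."""
    source = Path(mobi_file)
    if source.suffix.lower() != ".mobi":
        raise ValueError("expected a .mobi file, got: %s" % source.name)
    target_dir = Path(device_root) / documents_dir
    if not target_dir.is_dir():
        # Refuse to create the folder: a missing folder usually means the
        # device is not mounted at device_root.
        raise FileNotFoundError("no %s folder under %s" % (documents_dir, device_root))
    # shutil.copy2 returns the path of the copied file inside target_dir.
    return Path(shutil.copy2(source, target_dir))
```

A call might look like sideload("svsv.mobi", "/media/Kindle"), where both arguments are placeholders for your own file name and mount point.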

Contributors

perkl, gostaj, epatel

sv-sv's Issues

Unicode Issue with the transform file.

Hey, I got this error when I tried to run your script:

python transform.py > svsv.html
Traceback (most recent call last):
  File "transform.py", line 47, in <module>
    if(definition != None):sys.stdout.write(definition.text)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
make: *** [svsv.html] Error 1

Then I made the following change, which seems to have fixed the issue:

`
import xml.etree.ElementTree as ET
import sys

# Python 2 script: every piece of text from the XML is encoded to UTF-8
# explicitly, because stdout falls back to ASCII when redirected to a file.
tree = ET.parse('lexin_utf8.xml')
root = tree.getroot()

sys.stdout.write("""

""")

for lemma in root.iter('lemma-entry'):
    form = lemma.find('form')
    pronunciation = lemma.find('pronunciation')
    inflection = lemma.find('inflection')
    pos = lemma.find('pos')

    sys.stdout.write("<idx:entry>")
    sys.stdout.write("<idx:orth>")
    # Headword, with the '~' markers removed.
    sys.stdout.write(("<b>"+form.text.replace('~','')+"</b> ").encode('utf-8'))

    # Inflected forms, one <idx:iform> per space-separated word.
    if inflection is not None and inflection.text is not None and len(inflection.text) != 0:
        sys.stdout.write("<idx:infl>")
        for s in inflection.text.split(' '):
            sys.stdout.write("<idx:iform value=\"")
            sys.stdout.write(s.encode('utf-8'))
            sys.stdout.write("\" />")
        sys.stdout.write("</idx:infl>")

    lexemes = lemma.findall('lexeme')
    # Use a numbered list only when the lemma has more than one sense.
    makelist = len(lexemes) > 1
    if makelist: sys.stdout.write("<ol>")
    for lexeme in lexemes:
        lexnr = lexeme.find('lexnr')
        definition = lexeme.find('definition')
        usage = lexeme.find('usage')
        comment = lexeme.find('comment')
        valency = lexeme.find('valency')
        grammat_comm = lexeme.find('grammat_comm')
        definition_comm = lexeme.find('definition_comm')
        examples = lexeme.findall('example')
        idioms = lexeme.findall('idiom')
        compounds = lexeme.findall('compound')

        if makelist: sys.stdout.write("<li>")
        if definition is not None: sys.stdout.write(definition.text.encode('utf-8'))
        if makelist: sys.stdout.write("</li>")

    if makelist: sys.stdout.write("</ol>")
    else: sys.stdout.write("<br>")

    sys.stdout.write("</idx:orth>")
    sys.stdout.write("</idx:entry>")

sys.stdout.write("""

""")

sys.stdout.write("\n")
`

I tried to push a PR, but your repo does not allow that access. Let me know if this was helpful!
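Some background on the error, for anyone hitting it: under Python 2, when stdout is redirected to a file, unicode strings written to it are encoded with the ASCII codec, which cannot represent characters such as ö (u'\xf6'). The fix above encodes each string by hand. The sketch below shows an alternative for Python 3, which is an assumption since the traceback comes from Python 2: wrap the binary output stream once in a UTF-8 text writer. The names utf8_writer and demo are illustrative only.

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that str written to it is encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


def demo():
    """Show that ASCII rejects 'ö' while the UTF-8 writer accepts it."""
    try:
        "ö".encode("ascii")
    except UnicodeEncodeError:
        ascii_failed = True
    else:
        ascii_failed = False

    buffer = io.BytesIO()  # stands in for a redirected stdout
    out = utf8_writer(buffer)
    out.write("<b>ö</b>")
    out.flush()
    return ascii_failed, buffer.getvalue()


print(demo())
```

In a real script the wrapper would be applied to the standard output bytes, for example out = utf8_writer(sys.stdout.buffer), after which every out.write(text) call works without a per-call .encode('utf-8').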

Email the dictionary?

Hi Per,

I can't comment on your blog to ask you to email the finished dictionary, and I have neither the knowledge nor the right kind of tools to build it myself.
If you could email it to me, I would be grateful.
[email protected]
