koparadigm's Introduction

KoParadigm: A Korean Conjugation Paradigm Generator

This is the official repo for our paper: KoParadigm: A Korean Conjugation Paradigm Generator

An (inflectional) paradigm is the set of all the inflected forms of a word. For example, the English verb "look" has the inflected forms "look", "look-s", "look-ed", and "look-ing". Paradigms are widely used in corpus linguistics and search engines. Creating the full paradigm set of a language can be tricky, particularly for a morphologically rich language like Korean. The inflection of Korean verbs is notoriously complicated: a Korean verb can typically combine with more than 100 endings, and the combination rules are far from simple. They are determined by the sound of the verb and the ending, and by the part of speech of the verb (action or descriptive). That is probably why, as far as I know, there has been no open-source Korean paradigm generator so far. Here's the first one. With KoParadigm, you can easily get the full paradigm of a Korean verb.
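
To make the "determined by sound" point concrete, here is a minimal sketch of one sound-based rule for regular stems: the conditional ending is 으면 after a stem that ends in a consonant other than ㄹ, and 면 otherwise. This is an illustration only, not KoParadigm's implementation. The names `final_index` and `attach_conditional` are hypothetical, and irregular verbs, which are a large part of what makes real paradigms hard, are deliberately ignored.

```python
HANGUL_BASE = 0xAC00   # code point of the first precomposed syllable, 가
HANGUL_LAST = 0xD7A3   # code point of the last precomposed syllable, 힣
NUM_FINALS = 28        # 27 final consonants plus "no final consonant"
FINAL_RIEUL = 8        # index of the final consonant ㄹ


def final_index(syllable):
    """Return the final-consonant index of a Hangul syllable (0 = none)."""
    code = ord(syllable)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    return (code - HANGUL_BASE) % NUM_FINALS


def attach_conditional(stem):
    """Attach the conditional ending to a REGULAR stem (hypothetical helper)."""
    final = final_index(stem[-1])
    # Open syllables and ㄹ-final stems take 면; other consonants take 으면.
    if final == 0 or final == FINAL_RIEUL:
        return stem + "면"
    return stem + "으면"


for stem in ["먹", "가", "살"]:
    print(stem, "->", attach_conditional(stem))
```

Even this single ending needs a branch on the stem's last sound; a full paradigm repeats that kind of decision across 100+ endings and several irregular classes.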

Dependencies

  • python >=3.6
  • jamo >=0.4.1
  • xlrd == 1.2.0

Installation

pip install koparadigm

Usage

>>> from koparadigm import Paradigm, prettify
>>> p = Paradigm()
>>> verb = "곱" # Note that you must drop the final ending 다
>>> paradigms = p.conjugate(verb) # this returns a list of lists
>>> print(paradigms)
[['Action Verb', [('거나', '곱거나'), ('거늘', '곱거늘'), ('거니', '곱거니') ...]]]
>>> prettify(paradigms)
POS = Action Verb
• ending = 거나 form = 곱거나
• ending = 거늘 form = 곱거늘
• ending = 거니 form = 곱거니
...
==================== 2 ====================
POS = Descriptive Verb
• ending = 거나 form = 곱거나
• ending = 거늘 form = 곱거늘
• ending = 거니 form = 곱거니
• ending = 거니와 form = 곱거니와
...
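
If you need to look up a single form rather than print everything, the nested list can be reshaped into dictionaries. The following is a small sketch that assumes the return value has exactly the shape shown above (a list of `[POS, [(ending, form), ...]]` entries). `to_lookup` is a hypothetical helper, not part of koparadigm, and the sample data is copied from the truncated output above rather than produced by a real call.

```python
def to_lookup(paradigms):
    """Map each POS label to an {ending: form} dict (hypothetical helper).

    If the same POS label or ending occurs twice, the later entry wins.
    """
    return {pos: dict(pairs) for pos, pairs in paradigms}


# Sample data in the shape shown in the usage session above.
sample = [
    ["Action Verb", [("거나", "곱거나"), ("거늘", "곱거늘"), ("거니", "곱거니")]],
    ["Descriptive Verb", [("거나", "곱거나"), ("거니와", "곱거니와")]],
]

lookup = to_lookup(sample)
print(lookup["Descriptive Verb"]["거니와"])
```

With koparadigm installed, `to_lookup(p.conjugate(verb))` should work the same way, provided the real output matches the shape assumed here.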

References

If you use our software for research, please cite:

@article{park2020KoParadigm,
  author = {Park, Kyubyong },
  title={KoParadigm: A Korean Conjugation Paradigm Generator},
  journal={arXiv preprint arXiv:2004.13221},
  year={2020}
}

koparadigm's Issues

Excel xlsx file (koparadigm.xlsx) not supported in latest xlrd (version 2.0.1)

About 4 to 5 hours ago, the xlrd library was updated from version 1.2.0 to 2.0.1.
With the latest version of xlrd, the following error occurs, so I am filing this issue.

  • xlrd.biffh.XLRDError: Excel xlsx file; not supported

It appears that xlsx is no longer supported.

I suggest pinning xlrd to the version that is already verified to work (1.2.0).

REQUIRED_PACKAGES = [
    'jamo',
    'xlrd==1.2.0',
]
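
Besides pinning in the package requirements, a caller could check the installed version up front and fail with a clearer message. Below is a minimal sketch of that check, assuming only that xlrd dropped xlsx support as of its 2.x releases, which is what the error above indicates. `xlrd_reads_xlsx` is a hypothetical helper and takes the version as a plain string.

```python
def xlrd_reads_xlsx(version):
    """Return True if this xlrd version string predates 2.0 (hypothetical helper)."""
    major = int(version.split(".")[0])
    return major < 2


for version in ["1.2.0", "2.0.1"]:
    status = "can read xlsx" if xlrd_reads_xlsx(version) else "cannot read xlsx"
    print(version, status)
```

On Python 3.8 or later, the installed version string can be obtained with `importlib.metadata.version("xlrd")` and passed to this function.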

Some patterns are missing?

It seems some patterns are missing?

What I found is that "오르" (pattern 25) does not have the pattern for the past tense. Is this intentional?

=> Sorry, this was my misunderstanding. I could find the pattern after all. Closed.

feature request - reverse conjugation

hi folks,

i've been looking for a library that performs the following:

먹었습니다 # => 먹다
할게요 # => 하다

are there plans to support something like this? alternatively, do you think leveraging this library would be useful to build a reverse conjugator myself?

cheers,
Ryan
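
One way the library could be leveraged for this is to generate paradigms for a known list of stems and invert them into a form-to-lemma index. The sketch below is an assumption-laden illustration, not an existing koparadigm feature: `build_reverse_index` is a hypothetical helper, it only recovers forms the generator actually produces for the stems supplied, and the demo uses a stub generator so it runs without koparadigm installed.

```python
from collections import defaultdict


def build_reverse_index(stems, conjugate):
    """Invert generated paradigms into {form: {(dictionary_form, pos), ...}}.

    `conjugate` is any callable that returns a list of
    [pos, [(ending, form), ...]] entries for a stem.
    """
    index = defaultdict(set)
    for stem in stems:
        for pos, pairs in conjugate(stem):
            for _ending, form in pairs:
                # Dictionary form = stem + 다, mirroring the "drop 다" rule in Usage.
                index[form].add((stem + "다", pos))
    return dict(index)


def stub_conjugate(stem):
    """Stand-in generator for the demo; NOT koparadigm's output."""
    return [["Action Verb", [("고", stem + "고"), ("지", stem + "지")]]]


index = build_reverse_index(["먹", "가"], stub_conjugate)
print(index["먹고"])
```

With koparadigm installed, `Paradigm().conjugate` could be passed in place of `stub_conjugate`. Ambiguous forms naturally map to several lemmas, which is why each value is a set.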
