pytrovich's Introduction

Pytrovich

pytrovich is a Python 3.6+ port of the petrovich library, which inflects Russian names into a given grammatical case. It supports inflection of first names, last names and middle names. Since version 0.0.2, gender detection is also available.

petrovich-java was the main inspiration.

An alternative (earlier) port exists: Petrovich (@alexeyev was not aware of it when porting petrovich to Python). The only meaningful difference we have found is that it does not support gender detection.

Installation

Should be as simple as:

pip install pytrovich

Usage

Inflection

from pytrovich.enums import NamePart, Gender, Case
from pytrovich.maker import PetrovichDeclinationMaker

maker = PetrovichDeclinationMaker()
print(maker.make(NamePart.FIRSTNAME, Gender.MALE, Case.GENITIVE, "Иван"))  # Ивана
print(maker.make(NamePart.LASTNAME, Gender.MALE, Case.INSTRUMENTAL, "Иванов"))  # Ивановым
print(maker.make(NamePart.MIDDLENAME, Gender.FEMALE, Case.DATIVE, "Ивановна"))  # Ивановне
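Since `make` inflects one name part at a time, a small wrapper can be handy for full names. The following is a minimal sketch, not part of pytrovich: `inflect_full_name` is a hypothetical helper that only relies on the `make(name_part, gender, case, value)` call shown above. The demo at the bottom runs only if pytrovich is installed.

```python
from typing import Iterable, Optional, Tuple


def inflect_full_name(maker, gender, case,
                      parts: Iterable[Tuple[object, Optional[str]]]) -> str:
    """Inflect every given name part and join the results with spaces.

    `maker` is any object with a make(name_part, gender, case, value)
    method, such as PetrovichDeclinationMaker. `parts` is a sequence of
    (name_part, value) pairs; empty or missing values are skipped.
    """
    return " ".join(
        maker.make(name_part, gender, case, value)
        for name_part, value in parts
        if value
    )


try:
    from pytrovich.enums import NamePart, Gender, Case
    from pytrovich.maker import PetrovichDeclinationMaker
except ImportError:
    print("pytrovich is not installed; skipping the demo")
else:
    maker = PetrovichDeclinationMaker()
    # Last name, first name and middle name, all in the dative case.
    print(inflect_full_name(maker, Gender.MALE, Case.DATIVE, [
        (NamePart.LASTNAME, "Иванов"),
        (NamePart.FIRSTNAME, "Иван"),
        (NamePart.MIDDLENAME, "Иванович"),
    ]))
```

Passing the parts as pairs keeps the order under the caller's control, so "first name, last name" and "last name, first name, middle name" are both possible.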

Gender detection

from pytrovich.detector import PetrovichGenderDetector

detector = PetrovichGenderDetector()
print(detector.detect(firstname="Иван"))  # Gender.MALE
print(detector.detect(firstname="Иван", middlename="Семёнович"))  # Gender.MALE
print(detector.detect(firstname="Арзу", middlename="Лутфияр кызы"))  # Gender.FEMALE

Custom rule file

You can replace the default rules file with a custom one. Only the JSON format is supported.

maker = PetrovichDeclinationMaker("/path/to/custom/rules.file.json")

For example, if pytrovich fails when creating PetrovichDeclinationMaker, consider downloading rules.json directly from the petrovich-rules repo and passing its path as shown above (please create an issue if that actually happens).
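The fallback described above could be wrapped as follows. This is a sketch under assumptions: `check_rules_file` and `make_with_fallback` are hypothetical helpers, the rules file is assumed to be already downloaded to a local path, and a broad `except Exception` is used because the exact exception pytrovich raises on a failed creation is not documented here.

```python
import json


def check_rules_file(path: str) -> str:
    """Return `path` if the file parses as JSON, otherwise raise ValueError."""
    with open(path, encoding="utf-8") as f:
        try:
            json.load(f)
        except json.JSONDecodeError as exc:
            raise ValueError(f"{path} is not valid JSON: {exc}") from exc
    return path


def make_with_fallback(factory, fallback_rules_path: str):
    """Call factory() first; on failure retry with a custom rules file.

    `factory` is expected to behave like PetrovichDeclinationMaker:
    callable with no arguments or with a single rules-file path.
    """
    try:
        return factory()
    except Exception as exc:  # assumption: the exact exception type is unknown
        print(f"default rules failed ({exc!r}); trying {fallback_rules_path}")
        return factory(check_rules_file(fallback_rules_path))


# Intended use with pytrovich (not executed here):
# from pytrovich.maker import PetrovichDeclinationMaker
# maker = make_with_fallback(PetrovichDeclinationMaker,
#                            "/path/to/custom/rules.file.json")
```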

How to cite

Not necessary, but greatly appreciated if you use this work.

@misc{Pytrovich,
  title     = {{petrovich/pytrovich: Python3 port of Petrovich, an inflector for Russian anthroponyms}},
  year      = {2020},
  url       = {https://github.com/petrovich/pytrovich},
  language  = {english},
}

More info

For more information on the project, please refer to the other petrovich repos.

TODO

  • efficiency has not been a top priority so far; it is time for faster algorithms, regular expressions and data structures
  • evaluation based on petrovich-eval

License

This project is available under the MIT license.

pytrovich's People

Contributors

alexeyev

pytrovich's Issues

Bug with Dative case

The name Ольга in the genitive case comes out as Ольгы instead of Ольги.

Hyphenated lastname support

Names written in Cyrillic include many double (hyphenated) last names.

This includes some simple cases such as:

  • Петров-Водкин
  • Бестужев-Марлинский

Female last names can also be double, but I cannot recall a famous one at the moment.

These can be handled by naive code: split the lexeme at the hyphen, inflect each part, and join the parts back with a hyphen.
However, this gets messy with exceptions, including Бонч-Бруевич, Мамин-Сибиряк, Муравьёв-Апостол.
Can this be supported by the pytrovich rules model?

Here is the list of the most famous names.
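The naive approach from this issue can be sketched as below. `inflect_hyphenated` and its `keep_unchanged` parameter are hypothetical and not part of pytrovich. The per-part inflector is passed in as a callable so that `maker.make` can be plugged in. The stub inflector and the choice to leave Бонч undeclined are assumptions made only for illustration.

```python
from typing import Callable, Collection


def inflect_hyphenated(lastname: str,
                       inflect_part: Callable[[str], str],
                       keep_unchanged: Collection[str] = ()) -> str:
    """Split on hyphens, inflect each part, join back with hyphens.

    Parts listed in `keep_unchanged` are copied as-is, which is one crude
    way to handle exceptional double last names.
    """
    return "-".join(
        part if part in keep_unchanged else inflect_part(part)
        for part in lastname.split("-")
    )


# Stand-in inflector for the demo only: appends "а", which happens to form
# the genitive of the masculine last names used below.
def naive_genitive(part: str) -> str:
    return part + "а"


print(inflect_hyphenated("Петров-Водкин", naive_genitive))
# Петрова-Водкина
print(inflect_hyphenated("Бонч-Бруевич", naive_genitive, keep_unchanged={"Бонч"}))
# Бонч-Бруевича

# With pytrovich, the inflector could be (not executed here):
# inflect_part = lambda part: maker.make(
#     NamePart.LASTNAME, Gender.MALE, Case.GENITIVE, part)
```

A lookup set like `keep_unchanged` does not scale to every exception, which is why support in the rules model itself is the open question here.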

Лев Толстой in the genitive case: ERROR

For some reason, Лев Толстой in the genitive case (GENITIVE) is converted to Лева Толстого rather than Льва Толстого.
The demo site of the original library (petrovich.nlpub.ru) gives the correct result.
