
engtokor-transliterator's Introduction

English-Korean transliterator (영-한 음차 변환기)

This project is a program that converts English words into their Korean (Hangul) phonetic spelling (e.g. transformer → 트랜스포머).

It uses a pretrained language model for the Hugging Face Text2Text Generation task.

The fine-tuned transliteration model can be found here.
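As a rough illustration of how a fine-tuned text2text model of this kind might be called, the sketch below uses the simpletransformers T5 wrapper that shows up in the training log. The model type (`"t5"`), the `model_dir` path, and the `transliterate` task prefix are placeholders chosen for this example, not values taken from this project.

```python
from typing import List

# Hypothetical task prefix; the prefix used by the actual fine-tuned model is not stated.
PREFIX = "transliterate"


def build_inputs(words: List[str], prefix: str = PREFIX) -> List[str]:
    """Format each word as 'prefix: word', the input shape T5Model.predict expects."""
    return [f"{prefix}: {w.strip()}" for w in words]


def transliterate(words: List[str], model_dir: str) -> List[str]:
    """Load a fine-tuned model from model_dir (a placeholder path) and predict Hangul spellings."""
    # Imported lazily so build_inputs can be used without the library installed.
    from simpletransformers.t5 import T5Model

    # "t5" is an assumed model type; the real checkpoint may need a different one.
    model = T5Model("t5", model_dir, use_cuda=False)
    return model.predict(build_inputs(words))
```

Only `build_inputs` runs without a downloaded checkpoint; `transliterate` needs a real model directory in place of the placeholder.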


Prerequisites

$ pip install -r requirements.txt

You also need to install the torch build that matches your CUDA version.

# CUDA 11.3
$ pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
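The index URL in the command above encodes the CUDA version as `cu113`. For other CUDA versions, a small helper along the following lines can derive the URL. This is a sketch: `torch_index_url` is a name made up for this example, and PyTorch only publishes wheels for selected CUDA versions, so check that the resulting index actually exists before relying on it.

```python
def torch_index_url(cuda_version: str) -> str:
    """Map a CUDA version such as '11.3' to a PyTorch wheel index URL."""
    # Keep only major and minor, so '11.3.1' is treated like '11.3'.
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


print(torch_index_url("11.3"))  # https://download.pytorch.org/whl/cu113
```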

Usage

ver1. Train model

$ python3 transliteration.py --train
('>> total number of data:', 56699)
('>> number of train data:', 51029)
('>> number of test data:', 5670)
('> Preprocessing',)
('> Preprocessing',)
('> Train Model Start...',)
INFO:simpletransformers.t5.t5_model: Training started
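The counts in the log above (56699 total, 51029 train, 5670 test) are consistent with a 90/10 split. The sketch below reproduces those numbers under that assumption. The shuffling, the seed, and the `split_pairs` name are illustrative only, and the actual script may split the data differently.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (English word, Hangul spelling)


def split_pairs(
    pairs: Sequence[Pair], test_ratio: float = 0.1, seed: int = 42
) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle the pairs and split them into train and test lists."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    # Truncate the train size; the remainder becomes the test set.
    n_train = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:n_train], shuffled[n_train:]


# Dummy data with the same size as the dataset reported in the log.
dummy = [(f"word{i}", f"단어{i}") for i in range(56699)]
train, test = split_pairs(dummy)
print(len(train), len(test))  # 51029 5670
```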

ver2. Use pre-trained language model

$ python3 transliteration.py --test
('> Pretrained Model Start...',)
Generating outputs: 100%|██████████████████████████████████████| 1/1 [00:00<00:00, 10.93it/s]
Decoding outputs: 100%|██████████████████████████████████████| 4/4 [00:00<00:00,  6.20it/s]
machinelearning :       머신러닝
deeplearning    :       딥러닝
transformer     :       트랜스포머
attention       :       어텐션

$ python3 transliteration.py --decode
종료는 'q' 입니다.
>> transformer
('> Pretrained Model Start...',)
Generating outputs: 100%|██████████████████████████████████████| 1/1 [00:00<00:00, 10.81it/s]
Decoding outputs: 100%|██████████████████████████████████████| 1/1 [00:00<00:00,  1.47it/s]
트랜스포머
>> kakao
Generating outputs: 100%|██████████████████████████████████████| 1/1 [00:00<00:00, 9.71it/s]
Decoding outputs: 100%|██████████████████████████████████████| 1/1 [00:00<00:00,  5.78it/s]
카카오
>>
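The `--decode` session above behaves like a read-and-transliterate loop that stops when the user enters `q`. A minimal sketch of such a loop follows. `decode_loop` and the small lookup table are hypothetical stand-ins for the script's own code and for the model call, used here so the example runs without a model.

```python
from typing import Callable, Iterable, List


def decode_loop(transliterate: Callable[[str], str], lines: Iterable[str]) -> List[str]:
    """Transliterate each input line until 'q' is seen; return the outputs."""
    outputs = []
    for line in lines:
        word = line.strip()
        if word == "q":  # quit command, as in the session above
            break
        if not word:  # ignore empty input
            continue
        result = transliterate(word)
        print(result)
        outputs.append(result)
    return outputs


# Stand-in lookup used only for this example; the real program calls the model.
demo = {"transformer": "트랜스포머", "kakao": "카카오"}
decode_loop(lambda w: demo.get(w, "?"), ["transformer", "kakao", "q"])
```

For interactive use, the list can be replaced with `iter(lambda: input(">> "), "q")`, which keeps prompting until `q` is entered.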

Run Gradio app

$ python transliteration.py --gradio
Running on local URL:  http://127.0.0.1:7860/

To create a public link, set `share=True` in `launch()`.
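For readers who want to see what such an app might look like, here is a minimal sketch of a Gradio text-to-text interface. It is not the project's own code: `transliterate` is a placeholder backed by a tiny lookup table, where a real app would call the fine-tuned model.

```python
def transliterate(word: str) -> str:
    """Placeholder: a real app would call the fine-tuned model here."""
    demo = {"transformer": "트랜스포머", "kakao": "카카오"}
    return demo.get(word.strip().lower(), "")


def build_app():
    """Wrap transliterate in a simple one-textbox-in, one-textbox-out Gradio interface."""
    # Imported lazily so transliterate can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=transliterate,
        inputs="text",
        outputs="text",
        title="English-Korean transliterator",
    )
```

Calling `build_app().launch()` starts the local server, and `launch(share=True)` requests a public link, as the message above notes.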
