aivivn-vn-diacritic's Introduction

Vietnamese Diacritic Restoration

Vietnamese Diacritic Restoration using a Transformer Sequence-to-Sequence Model

Requirements

tensorflow-gpu==1.14.0
tensor2tensor==1.14.1

Generate data

Generate data for the default problem translate_vndt:

./gen_data.sh

Generate data for a custom problem A:

./gen_data.sh A
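
The contents of gen_data.sh are not shown here. A common way to build training pairs for diacritic restoration, assumed in this sketch, is to strip the diacritics from correctly accented text and use the stripped text as the source and the original as the target. The helper names strip_diacritics and make_pair are hypothetical and not taken from this repository.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove Vietnamese diacritics, e.g. for building the source side of a pair."""
    # đ/Đ have no canonical decomposition, so they are mapped by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (tone marks, circumflex, breve, horn).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def make_pair(sentence: str) -> tuple:
    """Return (source, target): text without diacritics and the original text."""
    return strip_diacritics(sentence), sentence
```

Under this assumption the model learns to map the stripped source back to the accented target, which is the translation-style framing used by the problem defined below.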

Train model

Define problem

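from tensor2tensor.data_generators import problem
from tensor2tensor.data_generators import translate
from tensor2tensor.utils import registry


@registry.register_problem
class TranslateVndt(translate.TranslateProblem):
    """Registered under the snake_case name translate_vndt."""

    @property
    def approx_vocab_size(self):
        return 2**15  # 32768

    def source_data_files(self, dataset_split):
        # _VNDT_TRAIN_DATASETS and _VNDT_DEV_DATASETS must be defined in this module (not shown here).
        train = dataset_split == problem.DatasetSplit.TRAIN
        return _VNDT_TRAIN_DATASETS if train else _VNDT_DEV_DATASETS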

Define hyperparams

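from tensor2tensor.models.transformer import transformer_base
from tensor2tensor.utils import registry


@registry.register_hparams
def transformer_base_h256():
    # Start from the stock transformer_base set and override the hidden size.
    hparams = transformer_base()
    hparams.hidden_size = 256
    return hparams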

To train the transformer model with hparams transformer_base on problem translate_vndt, using GPUs 0 and 1:

./train.sh 0,1 transformer_base transformer translate_vndt
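
The contents of train.sh are not shown here. As a rough sketch only, a wrapper with this argument order could assemble a t2t-trainer call along the following lines. The function name build_trainer_invocation and the directory defaults (usr_dir, data_dir, output_root) are assumptions made for illustration; the flags themselves are standard t2t-trainer flags.

```python
import os


def build_trainer_invocation(gpus, hparams_set, model, problem,
                             usr_dir="./usr_dir", data_dir="./data",
                             output_root="./train"):
    """Return (argv, env) for a t2t-trainer run. All directories are hypothetical."""
    # Restrict TensorFlow to the requested devices, e.g. "0,1".
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpus)
    num_gpus = len(gpus.split(","))
    # One output directory per (problem, model, hparams) combination (an assumption).
    output_dir = os.path.join(output_root, f"{problem}-{model}-{hparams_set}")
    argv = [
        "t2t-trainer",
        f"--t2t_usr_dir={usr_dir}",    # where the custom problem and hparams are registered
        f"--data_dir={data_dir}",
        f"--output_dir={output_dir}",
        f"--problem={problem}",
        f"--model={model}",
        f"--hparams_set={hparams_set}",
        f"--worker_gpu={num_gpus}",
    ]
    return argv, env
```

For example, build_trainer_invocation("0,1", "transformer_base", "transformer", "translate_vndt") returns an argument list and environment that could be passed to subprocess.run(argv, env=env).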

Predict

predict.sh takes the same arguments as train.sh:

./predict.sh 0,1 transformer_base transformer translate_vndt

The output is stored in sub-translate_vndt-transformer-transformer_base.csv
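
The filename above suggests the pattern sub-<problem>-<model>-<hparams>.csv. That pattern is inferred from this single example rather than confirmed by the scripts, and the helper name submission_filename is hypothetical. Under that assumption, the output path for any run can be derived from the arguments passed to predict.sh:

```python
def submission_filename(problem: str, model: str, hparams_set: str) -> str:
    """Build the prediction filename, assuming the sub-<problem>-<model>-<hparams>.csv pattern."""
    return f"sub-{problem}-{model}-{hparams_set}.csv"
```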
