nmt's Introduction

This repo implements a neural machine translation (NMT) encoder-decoder model using Keras. The model takes input sentences from http://www.manythings.org/anki/ and translates each one into the corresponding sentence in the target language. The BLEU score obtained on the test set is 0.509124.

This model has two main sections, the Encoder and the Decoder:

Encoder: This section takes the input text corpus, which is German text, in the form of embedding vectors and encodes it for the decoder.

Decoder: This part of the model translates the encoded input into one-hot vectors, which represent English words in the dictionary.
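As a small illustration of the decoder output described above, the sketch below turns a sequence of per-word vectors back into an English sentence by taking the index of the largest entry in each vector. The function name vectors_to_words, the index_to_word mapping and the convention that index 0 means padding are assumptions made for this example, not code from the repo.

```python
def vectors_to_words(vectors, index_to_word):
    """Map each output vector to the word at its highest-scoring index.

    Assumption: index 0 (or any index missing from index_to_word)
    marks padding, so decoding stops there.
    """
    words = []
    for vec in vectors:
        # argmax over the vocabulary dimension
        best_index = max(range(len(vec)), key=vec.__getitem__)
        word = index_to_word.get(best_index)
        if word is None:
            break
        words.append(word)
    return " ".join(words)


# Hypothetical three-word vocabulary; index 0 is reserved for padding.
index_to_word = {1: "i", 2: "am", 3: "here"}
outputs = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
sentence = vectors_to_words(outputs, index_to_word)
```

The same function works when the vectors hold softmax probabilities instead of exact one-hot values, since only the position of the maximum matters.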

Requirements: This project requires the following Python packages: NumPy, NLTK and Keras. TensorFlow must also be installed, because Keras will not work without it.

The dataset is taken from http://www.manythings.org/anki/. That page lists a number of text files; download deu.txt, which contains the pairs of English-German sentences. The dataset contains 152,820 pairs of English to German phrases.
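A minimal sketch of reading the sentence pairs is shown below. It assumes that each line of deu.txt holds an English sentence and its German translation separated by a tab, possibly followed by extra fields that are ignored. The function name parse_pairs is hypothetical and is not taken from the repo.

```python
def parse_pairs(text):
    """Split raw file contents into (english, german) tuples.

    Assumption: fields are tab-separated, English first, German second.
    Lines with fewer than two fields are skipped.
    """
    pairs = []
    for line in text.strip().split("\n"):
        fields = line.split("\t")
        if len(fields) >= 2:
            pairs.append((fields[0], fields[1]))
    return pairs


# Tiny in-memory sample standing in for the contents of deu.txt.
sample = "Hi.\tHallo!\nRun!\tLauf!\textra field\nmalformed line\n"
pairs = parse_pairs(sample)
```

For the real file, the text could be obtained with open("deu.txt", encoding="utf-8").read() before calling parse_pairs.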

The first phase of this project is preprocessing the dataset. First run pre-process.py to clean the data, then run prepare_dataset.py to split the dataset into smaller training and testing sets. Once these two scripts have run, they will have generated three pickle files: english-german-both.pkl, english-german-train.pkl and english-german-test.pkl.
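The split-and-save step could look roughly like the sketch below. The three file names come from the description above; the function name split_and_save, the n_test parameter and the choice to take the test set from the end of the list are assumptions for illustration, and the actual prepare_dataset.py may differ.

```python
import os
import pickle


def split_and_save(pairs, n_test, out_dir="."):
    """Write the full, training and test pair lists as pickle files.

    Assumption: the last n_test pairs form the test set.
    """
    if not 0 < n_test < len(pairs):
        raise ValueError("n_test must be between 1 and len(pairs) - 1")
    split = len(pairs) - n_test
    datasets = {
        "english-german-both.pkl": pairs,
        "english-german-train.pkl": pairs[:split],
        "english-german-test.pkl": pairs[split:],
    }
    for filename, data in datasets.items():
        with open(os.path.join(out_dir, filename), "wb") as f:
            pickle.dump(data, f)
    return datasets
```

A call such as split_and_save(pairs, n_test=1000) would write the three files into the current directory; the value 1000 is only an example.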

The preprocessing of the data involves:

Removing punctuation marks from the data.
Converting the text corpus to lower case.
Shuffling the sentences, since they were originally sorted in increasing order of length.
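The three preprocessing steps above can be sketched as follows. This is an illustration and not the contents of pre-process.py: the names clean_sentence and clean_and_shuffle and the seed parameter are hypothetical, and string.punctuation covers ASCII punctuation only.

```python
import random
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_sentence(sentence):
    """Remove punctuation, lower-case, and collapse repeated whitespace."""
    return " ".join(sentence.translate(_PUNCT_TABLE).lower().split())


def clean_and_shuffle(pairs, seed=0):
    """Clean both sides of every pair, then shuffle the pair order.

    A fixed seed (an assumption here) keeps the shuffle reproducible.
    """
    cleaned = [(clean_sentence(a), clean_sentence(b)) for a, b in pairs]
    random.Random(seed).shuffle(cleaned)
    return cleaned
```

For example, clean_sentence("Hello, World!") gives "hello world". Umlauts and other non-ASCII letters are left untouched by this version.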

Training the Encoder-Decoder LSTM model

Run model.py to train the model. After successful training, the model will be saved as model.h5 in your current directory.

This model uses Encoder-Decoder LSTMs for NMT. In this architecture, the input sequence is encoded by a front-end model called the encoder, then decoded by a back-end model called the decoder.
It is trained with the Adam optimizer, a variant of stochastic gradient descent, and minimizes the categorical cross-entropy loss.
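One common way to express such an encoder-decoder LSTM in Keras is sketched below. It is a sketch under stated assumptions, not the contents of model.py: the function name build_model, the layer sizes and the use of RepeatVector to feed the encoder output to the decoder are all choices made for this example. The imports sit inside the function so that the snippet can be loaded even where TensorFlow is not installed.

```python
def build_model(src_vocab, tgt_vocab, src_len, tgt_len, units=256):
    """Build and compile a simple encoder-decoder LSTM (illustrative only)."""
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import (
        Dense, Embedding, Input, LSTM, RepeatVector, TimeDistributed,
    )

    model = Sequential([
        Input(shape=(src_len,)),
        # Encoder: embed the source word indices, then summarise the
        # whole sentence in the final LSTM state.
        Embedding(src_vocab, units),
        LSTM(units),
        # Repeat the sentence summary once per target time step.
        RepeatVector(tgt_len),
        # Decoder: emit one vocabulary distribution per target position.
        LSTM(units, return_sequences=True),
        TimeDistributed(Dense(tgt_vocab, activation="softmax")),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model
```

With this layout the targets passed to fit must be one-hot encoded with shape (samples, tgt_len, tgt_vocab), which matches the categorical loss.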

Evaluating the model

Run evaluate_model.py to evaluate the accuracy of the model on both the train and test datasets.

It loads the best saved model, model.h5.
The model performs well on the train set and generalizes well to the test set.
After prediction, we calculate BLEU scores for the predicted sentences to check how well the model generalizes.

Calculating the BLEU scores

BLEU (bilingual evaluation understudy) is an algorithm for comparing machine-translated text with a reference translation produced by a human. A high BLEU score means the predicted translation is close to the reference. More information can be found here. The BLEU scores for both the training set and the test set are reported along with the predicted and target English sentences for each German source sentence.
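To make the metric concrete, here is a small pure-Python sketch of corpus-level BLEU with a single reference per sentence: clipped n-gram precisions combined by a geometric mean and multiplied by a brevity penalty. It is not the repo's evaluation code. NLTK, which is listed in the requirements, provides corpus_bleu in nltk.translate.bleu_score and would normally be used instead. The names ngram_counts and simple_corpus_bleu are hypothetical, and no smoothing is applied.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_corpus_bleu(references, candidates, max_n=4):
    """Unsmoothed corpus BLEU with one reference per candidate."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = cand_len = 0
    for ref, cand in zip(references, candidates):
        ref_len += len(ref)
        cand_len += len(cand)
        for n in range(1, max_n + 1):
            ref_counts = ngram_counts(ref, n)
            cand_counts = ngram_counts(cand, n)
            # Clip each candidate n-gram count by its count in the reference.
            matches[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in cand_counts.items()
            )
            totals[n - 1] += sum(cand_counts.values())
    if cand_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish candidates shorter than their references.
    penalty = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return penalty * math.exp(log_precision)
```

Both arguments are lists of token lists, for example simple_corpus_bleu([["i", "am", "a", "student"]], [["i", "am", "a", "student"]]). Short sentences with no matching 4-grams score zero here, which is why library implementations offer smoothing options.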

nmt's People

Contributors

eddieir
