
Comments (7)

guillaumekln commented on May 21, 2024

Hello,

I don't have specific experience with spell checkers, but if you can frame the task as a classic sequence-to-sequence problem, then yes, you can use OpenNMT-tf as is. Otherwise, the code should be easy enough to customize.

from opennmt-tf.

jsenellart commented on May 21, 2024

Hello, you may want to check this as an entry point: http://nlp.seas.harvard.edu/papers/aesw2016.pdf


commented on May 21, 2024

Hello,
I understand that this is an old topic now, but I wanted to add a short note.
Following the BERT paper, it is possible to build a spell checker with OpenNMT and this seq2seq library by introducing "masking". BERT masks some words and replaces some other words with random words. Using the same approach, we can deliberately replace some characters with other characters to teach the model to fix them. A CNN with 12 encoder-decoder layers did the trick for me.


commented on May 21, 2024

This is the BERT paper I'm referring to.
Basically, the concept I derived from the paper (it is not exactly how the paper does it, but it is what I did) is to purposefully replace some characters with wrong ones. For example:
I love to play with my cat
would be split into characters:
I l o v e t o p l a y w i t h m y c a t
Then I would create 5 training examples from this sentence:
I l o v r t o p l e y w i h h m y c a t
I l u v e t u p l a a w e t h m y c a t
... etc

The target for all 5 examples would be:
I love to play with my cat
which is the same sentence we began with. This way the model learns to reconstruct sentences on its own, and at the same time it learns to fix some characters.
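The noising step described above could be sketched roughly as follows. This is only an illustration, not the code used for the experiment: the function names, the 15% substitution rate, and the choice to substitute only random lowercase letters are all assumptions made for the example.

```python
import random
import string


def to_char_tokens(sentence):
    # One token per character; whitespace is dropped, as in the example above.
    return [ch for ch in sentence if not ch.isspace()]


def corrupt(tokens, noise_prob, rng):
    # Replace each letter with a different random lowercase letter
    # with probability noise_prob (an assumed noise model).
    noisy = []
    for ch in tokens:
        if ch.isalpha() and rng.random() < noise_prob:
            choices = [c for c in string.ascii_lowercase if c != ch.lower()]
            noisy.append(rng.choice(choices))
        else:
            noisy.append(ch)
    return noisy


def make_training_pairs(sentence, n_variants=5, noise_prob=0.15, seed=0):
    # Returns (noisy character-level source, clean target) pairs.
    rng = random.Random(seed)
    tokens = to_char_tokens(sentence)
    return [
        (" ".join(corrupt(tokens, noise_prob, rng)), sentence)
        for _ in range(n_variants)
    ]


if __name__ == "__main__":
    for source, target in make_training_pairs("I love to play with my cat"):
        print(source, "=>", target)
```

Each source line would go into the source training file and the clean sentence into the matching line of the target file.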
In my case the mapping was characters to words, but you can also go word to word at the character level. Example:
I love to play with my cat
becomes
I <sep> l o v e <sep> t o <sep> p l a y <sep> w i t h <sep> m y <sep> c a t <sep> <eos>
This way you teach the model to fix each word on its own, instead of trying to fix the whole sentence. Which one to use will depend on your case. Let me know if you need more clarification.
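For the word-separated variant, a minimal sketch of the formatting and its inverse might look like this. The helper names `to_sep_format` and `from_sep_format` are hypothetical; only the `<sep>` and `<eos>` tokens come from the example above.

```python
def to_sep_format(sentence, sep="<sep>", eos="<eos>"):
    # Emit one token per character, a separator after every word,
    # and an end-of-sentence token at the end.
    tokens = []
    for word in sentence.split():
        tokens.extend(word)
        tokens.append(sep)
    tokens.append(eos)
    return " ".join(tokens)


def from_sep_format(line, sep="<sep>", eos="<eos>"):
    # Rebuild the plain sentence from a character-level line.
    words, current = [], []
    for tok in line.split():
        if tok == eos:
            break
        if tok == sep:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(tok)
    if current:
        words.append("".join(current))
    return " ".join(words)


if __name__ == "__main__":
    encoded = to_sep_format("I love to play with my cat")
    print(encoded)
    print(from_sep_format(encoded))
```

With this layout both source and target can use the same format, so the noise is applied to the source characters only and the word boundaries stay aligned.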


mzeidhassan commented on May 21, 2024

Thanks @guillaumekln and @jsenellart for your replies and advice. I appreciate it.


mzeidhassan commented on May 21, 2024

Thanks @ridhwan-saal for your update. Do you have code to share? This sounds really interesting. Maybe you should write a Medium article about it. It would also be great if you could share the paper you are referring to.


mzeidhassan commented on May 21, 2024

Thanks a million @ridhwan-saal. I really appreciate you taking the time to give such a great and detailed explanation.

