Comments (4)

jesus-seijas-sp commented on May 16, 2024

Well, the SQuAD dataset contains more than 100k question-and-answer pairs. NLP.js targets more modest datasets. I haven't run the test, but training the logistic regression classifier with 100K intents and the number of features (distinct words) in SQuAD is surely too much for CPU training; the feature and label matrices alone may be too large.
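To put rough numbers on the matrix-size concern, here is a back-of-the-envelope sketch. It assumes dense matrices of 8-byte floats and a vocabulary of 50,000 distinct words; both are illustrative assumptions, not measurements of SQuAD or a description of how NLP.js stores its data internally.

```javascript
// Bytes needed for a dense rows x cols matrix of fixed-width numbers.
function denseMatrixBytes(rows, cols, bytesPerValue = 8) {
  return rows * cols * bytesPerValue;
}

// Convert a byte count to gibibytes (2^30 bytes).
function toGiB(bytes) {
  return bytes / 2 ** 30;
}

const samples = 100000;   // "more than 100k" pairs, as stated in the comment
const intents = 100000;   // one intent per pair, as stated in the comment
const vocabulary = 50000; // ASSUMPTION: hypothetical count of distinct words

const featureBytes = denseMatrixBytes(samples, vocabulary);
const labelBytes = denseMatrixBytes(samples, intents);

console.log(`features: ${toGiB(featureBytes).toFixed(1)} GiB`); // features: 37.3 GiB
console.log(`labels:   ${toGiB(labelBytes).toFixed(1)} GiB`);   // labels:   74.5 GiB
```

Under these assumptions either matrix alone exceeds the memory of a typical machine. A sparse representation would need far less, so treat the figures as an upper bound for a naive dense layout.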

Perhaps TensorFlow and NLTK with GPUs would be a better approach. Who knows, perhaps in the future we will go down the path of TensorFlow.js and NLTK, but right now it is not on the roadmap.

from nlp.js.

MrPeker commented on May 16, 2024

I tried training on ~100K Q&A pairs on an AWS instance with 32 GB of memory, 8 vCPUs, and 75 GB of SSD space, but it didn't create a model.nlp file and didn't show any error. No error, no model.nlp :/ I don't know what the problem is. While training on the dataset, memory usage is around 16 GB.

jesus-seijas-sp commented on May 16, 2024

We have an example supporting 10,000 intents, extracted from SQuAD v2.
Open Question with BERT has been added, but it needs a Python app to work.

Right now I'm working on making OpenQuestion fully JavaScript. The first problem is that the tokenizers package only works in Node 11 or 13 and is a bridge to Rust; I solved that by building the BERT WordPiece tokenizer myself. Now I'm working on the TensorFlow open question model and runtime.
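For readers unfamiliar with WordPiece, the core idea can be sketched in a few lines of plain JavaScript. This is a minimal illustration of the greedy longest-match-first algorithm used by BERT-style tokenizers, not the implementation in NLP.js; the function name `wordPieceTokenize` and the toy vocabulary are hypothetical.

```javascript
// Greedy longest-match-first WordPiece tokenization of a single word.
// `vocab` is a Set of known pieces; pieces that continue a word carry
// the "##" prefix, following the BERT vocabulary convention.
function wordPieceTokenize(word, vocab, unknownToken = '[UNK]', maxChars = 100) {
  if (word.length > maxChars) return [unknownToken];
  const pieces = [];
  let start = 0;
  while (start < word.length) {
    let end = word.length;
    let match = null;
    // Shrink the candidate from the right until it is found in the vocabulary.
    while (start < end) {
      const candidate = (start > 0 ? '##' : '') + word.slice(start, end);
      if (vocab.has(candidate)) {
        match = candidate;
        break;
      }
      end -= 1;
    }
    // If no prefix of the remainder is known, the whole word is unknown.
    if (match === null) return [unknownToken];
    pieces.push(match);
    start = end;
  }
  return pieces;
}

// Toy vocabulary for illustration only (hypothetical, not a real BERT vocab).
const toyVocab = new Set(['un', '##aff', '##able', 'play', '##ing']);

console.log(wordPieceTokenize('unaffable', toyVocab)); // [ 'un', '##aff', '##able' ]
console.log(wordPieceTokenize('playing', toyVocab));   // [ 'play', '##ing' ]
console.log(wordPieceTokenize('xyz', toyVocab));       // [ '[UNK]' ]
```

A full tokenizer would also lowercase, strip accents, and split on whitespace and punctuation before this step. Because the sketch depends only on a Set lookup, it needs no native bindings, which is what makes a pure JavaScript version practical.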

jesus-seijas-sp commented on May 16, 2024

Closing, as integration with the BERT API is now provided.
