naivebayes

A ladjs naivebayes package forked from surmon-china/naivebayes

What can I use this for

Naive-Bayes classifier for JavaScript.

naivebayes takes a document (piece of text), and tells you what category that document belongs to.

You can use this for categorizing any text content into any arbitrary set of categories. For example:

  • Is an email spam, or not spam?
  • Is a news article about technology, politics, or sports?
  • Is a piece of text expressing positive emotions, or negative emotions?

Install

npm

npm install @ladjs/naivebayes

yarn

yarn add @ladjs/naivebayes

Usage

const NaiveBayes = require('@ladjs/naivebayes')

const classifier = new NaiveBayes()

// teach it positive phrases
classifier.learn('amazing, awesome movie!! Yeah!! Oh boy.', 'positive')
classifier.learn('Sweet, this is incredibly, amazing, perfect, great!!', 'positive')

// teach it a negative phrase
classifier.learn('terrible, cruddy thing. Damn. Sucks!!', 'negative')

// now ask it to categorize a document it has never seen before
classifier.categorize('awesome, cool, amazing!! Yay.')
// => 'positive'

// serialize the classifier's state as a JSON string.
const stateJson = classifier.toJson()

// load the classifier back from its JSON representation.
const revivedClassifier = NaiveBayes.fromJson(stateJson)
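
You can also supply your own tokenizer. The following example uses the segment package to split Chinese text into words:

const NaiveBayes = require('@ladjs/naivebayes')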

const Segment = require('segment')
const segment = new Segment()

segment.useDefault()

const classifier = new NaiveBayes({
    tokenizer(sentence) {
        // replace characters outside the allowed set with spaces before segmenting
        const sanitized = sentence.replace(/[^(a-zA-Z\u4e00-\u9fa50-9_)+\s]/g, ' ')
        return segment.doSegment(sanitized, { simple: true })
    }
})

API

Class

const classifier = new NaiveBayes([options])

Returns an instance of a Naive-Bayes Classifier.

Options

  • tokenizer(text) - (type: function) - A custom tokenizer that takes the text and returns an array of tokens.
  • vocabularyLimit - (type: number default: 0) - Maximum number of words in the vocabulary. The default of 0 means no limit.
  • stopwords - (type: boolean default: false) - Set to true to remove stopwords from the text.

For example:

const classifier = new NaiveBayes({
    tokenizer(text) {
        return text.split(' ')
    }
})
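
Splitting on a single space leaves punctuation attached to words and can produce empty strings. As a minimal sketch of a slightly more careful tokenizer, the helpers below lowercase the text, strip punctuation, and optionally drop stopwords. The names simpleTokenizer, tokenizerWithoutStopwords, and STOPWORDS are hypothetical and not part of the package; the only documented piece they rely on is the tokenizer option.

```javascript
// Hypothetical helper, not part of the package: lowercases the text,
// replaces anything that is not a letter, digit, or whitespace with a space,
// then splits on runs of whitespace and drops empty strings.
function simpleTokenizer(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, ' ')
    .split(/\s+/)
    .filter(Boolean)
}

// Hypothetical stopword list, for illustration only.
const STOPWORDS = new Set(['a', 'an', 'the', 'is', 'this'])

// Same tokenizer, with stopwords filtered out afterwards.
function tokenizerWithoutStopwords(text) {
  return simpleTokenizer(text).filter((token) => !STOPWORDS.has(token))
}

// With the documented `tokenizer` option this would be passed as:
// const classifier = new NaiveBayes({ tokenizer: simpleTokenizer })
```

Whichever tokenizer you choose is worth testing on a few sample documents first, since the tokens it produces determine what the classifier can learn.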

Learn

classifier.learn(text, category)

Teach your classifier what category the text belongs to. The more you teach your classifier, the more reliable it becomes. It will use what it has learned to identify new documents that it hasn't seen before.

Probabilities

classifier.probabilities(text)

Returns an array of { category, probability } objects with probability calculated for each category. Its judgement is based on what you have taught it with .learn().
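
To make the idea behind per-category probabilities concrete, here is a minimal sketch of a multinomial Naive-Bayes model with add-one smoothing. It is not the package's implementation: TinyBayes is a hypothetical class, its tokenizer is simplistic, and it normalizes scores so they sum to 1, which the package may or may not do.

```javascript
// Illustrative only: `TinyBayes` is a hypothetical name, not the package's class.
class TinyBayes {
  constructor() {
    // category -> { docs, tokens, freq: Map(token -> count) }
    this.categories = new Map()
    this.vocabulary = new Set()
    this.totalDocs = 0
  }

  tokenize(text) {
    return text.toLowerCase().split(/\W+/).filter(Boolean)
  }

  learn(text, category) {
    if (!this.categories.has(category)) {
      this.categories.set(category, { docs: 0, tokens: 0, freq: new Map() })
    }
    const stats = this.categories.get(category)
    stats.docs += 1
    this.totalDocs += 1
    for (const token of this.tokenize(text)) {
      this.vocabulary.add(token)
      stats.freq.set(token, (stats.freq.get(token) || 0) + 1)
      stats.tokens += 1
    }
    return this
  }

  // Returns [{ category, probability }] sorted best-first.
  probabilities(text) {
    const tokens = this.tokenize(text)
    const scores = [...this.categories].map(([category, stats]) => {
      // log P(category) + sum over tokens of log P(token | category)
      let logScore = Math.log(stats.docs / this.totalDocs)
      for (const token of tokens) {
        const count = stats.freq.get(token) || 0
        // add-one (Laplace) smoothing so unseen tokens do not zero the score
        logScore += Math.log((count + 1) / (stats.tokens + this.vocabulary.size))
      }
      return { category, logScore }
    })
    // Convert log scores to probabilities that sum to 1 (subtract max for stability).
    const max = Math.max(...scores.map((s) => s.logScore))
    const total = scores.reduce((sum, s) => sum + Math.exp(s.logScore - max), 0)
    return scores
      .map((s) => ({ category: s.category, probability: Math.exp(s.logScore - max) / total }))
      .sort((a, b) => b.probability - a.probability)
  }
}

const model = new TinyBayes()
  .learn('amazing, awesome movie!! Yeah!! Oh boy.', 'positive')
  .learn('Sweet, this is incredibly, amazing, perfect, great!!', 'positive')
  .learn('terrible, cruddy thing. Damn. Sucks!!', 'negative')

const results = model.probabilities('awesome, cool, amazing!! Yay.')
// results[0].category is 'positive' for this toy training data
```

Working in log space avoids numeric underflow when many small token probabilities are multiplied together.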

Categorize

classifier.categorize(text, [probability])

Returns the category it thinks text belongs to. Its judgement is based on what you have taught it with .learn().
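
One way to relate this to .probabilities() is to pick the entry with the largest probability value yourself. The helper below is a sketch under that assumption. The name topCategory is hypothetical, and the source does not state whether .probabilities() returns its array sorted or how its values are scaled, so the helper assumes neither.

```javascript
// Hypothetical helper: picks the category with the largest `probability`
// from an array shaped like the documented output of classifier.probabilities().
// Returns null for an empty array and does not assume the array is sorted.
function topCategory(results) {
  if (results.length === 0) return null
  return results.reduce((best, current) =>
    current.probability > best.probability ? current : best
  ).category
}

// Example with hand-written values (not real classifier output):
const picked = topCategory([
  { category: 'negative', probability: 0.2 },
  { category: 'positive', probability: 0.8 }
])
```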

ToJson

classifier.toJson()

Returns the JSON representation of a classifier. This is the same as JSON.stringify(classifier.toJsonObject()).

ToJsonObject

classifier.toJsonObject()

Returns a JSON-friendly representation of the classifier as an object.

FromJson

const classifier = NaiveBayes.fromJson(jsonObject)

Returns a classifier instance from the JSON representation. Use this with the JSON representation obtained from classifier.toJson().
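
A common reason to serialize a classifier is to train it once and reuse it later. The sketch below writes the output of toJson() to a file and restores it with fromJson(). The helpers saveClassifier and loadClassifier are hypothetical and rely only on those two documented methods; the NaiveBayes class is passed in as an argument so that no module name is hard-coded.

```javascript
const fs = require('fs')

// Hypothetical helper: writes the classifier's JSON string to disk.
function saveClassifier(classifier, filePath) {
  fs.writeFileSync(filePath, classifier.toJson(), 'utf8')
}

// Hypothetical helper: reads the JSON string back and revives a classifier.
// `NaiveBayes` is whatever class exposes the documented static fromJson().
function loadClassifier(NaiveBayes, filePath) {
  return NaiveBayes.fromJson(fs.readFileSync(filePath, 'utf8'))
}

// Intended usage (the file name is an arbitrary example):
// saveClassifier(classifier, 'classifier.json')
// const revived = loadClassifier(NaiveBayes, 'classifier.json')
```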

Debug

To run naivebayes in debug mode, set DEBUG=naivebayes when running your script.

Contributors

Name          Website
Surmon        http://surmon.me/
Shaun Warman  https://shaunwarman.com/
