A ladjs naivebayes package forked from surmon-china/naivebayes
Naive-Bayes classifier for JavaScript.
naivebayes takes a document (a piece of text) and tells you which category that document belongs to.
You can use this for categorizing any text content into any arbitrary set of categories. For example:
- Is an email spam, or not spam?
- Is a news article about technology, politics, or sports?
- Is a piece of text expressing positive emotions, or negative emotions?
npm install @ladjs/naivebayes
yarn add @ladjs/naivebayes
const NaiveBayes = require('@ladjs/naivebayes')
const classifier = new NaiveBayes()
// teach it positive phrases
classifier.learn('amazing, awesome movie!! Yeah!! Oh boy.', 'positive')
classifier.learn('Sweet, this is incredibly, amazing, perfect, great!!', 'positive')
// teach it a negative phrase
classifier.learn('terrible, cruddy thing. Damn. Sucks!!', 'negative')
// now ask it to categorize a document it has never seen before
classifier.categorize('awesome, cool, amazing!! Yay.')
// => 'positive'
// serialize the classifier's state as a JSON string.
const stateJson = classifier.toJson()
// load the classifier back from its JSON representation.
const revivedClassifier = NaiveBayes.fromJson(stateJson)
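// custom tokenizer for text that mixes Chinese and English, using the segment package
const NaiveBayes = require('@ladjs/naivebayes')
const Segment = require('segment')
const segment = new Segment()
segment.useDefault()
const classifier = new NaiveBayes({
  tokenizer(sentence) {
    // replace every character outside the bracketed set (ASCII letters, digits,
    // underscore, CJK ideographs, whitespace, and the literal ( ) + characters) with a space
    const sanitized = sentence.replace(/[^(a-zA-Z\u4e00-\u9fa50-9_)+\s]/g, ' ')
    return segment.doSegment(sanitized, { simple: true })
  }
})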
const classifier = new NaiveBayes([options])
Returns an instance of a Naive-Bayes classifier. The supported options are:
- tokenizer(text) (type: function) - Configure your own tokenizer.
- vocabularyLimit (type: number, default: 0) - Maximum word count, where 0 is the default, meaning no limit.
- stopwords (type: boolean, default: false) - Whether to remove stopwords from the text.
For example:
const classifier = new NaiveBayes({
  tokenizer(text) {
    return text.split(' ')
  }
})
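
A tokenizer is a plain function from a string to an array of tokens, so it can be written and checked on its own before being passed as the tokenizer option. The sketch below is one possible version; the name simpleTokenizer and the small STOPWORDS set are hypothetical and are not part of the package.

```javascript
// Hypothetical stopword list for illustration; not the list used by the package.
const STOPWORDS = new Set(['a', 'an', 'the', 'is', 'this', 'of'])

// Lowercase the text, replace anything that is not an ASCII letter, digit,
// or whitespace with a space, split on whitespace, and drop stopwords.
function simpleTokenizer(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, ' ')
    .split(/\s+/)
    .filter((token) => token.length > 0 && !STOPWORDS.has(token))
}

console.log(simpleTokenizer('This is an AWESOME movie!!'))
// => [ 'awesome', 'movie' ]
```

A function like this could then be supplied as new NaiveBayes({ tokenizer: simpleTokenizer }).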
classifier.learn(text, category)
Teach your classifier what category the text belongs to. The more you teach your classifier, the more reliable it becomes. It will use what it has learned to identify new documents that it hasn't seen before.
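
Conceptually, training a Naive-Bayes text classifier amounts to counting documents and words per category. The following is a minimal sketch of that idea under simplifying assumptions (a naive lowercase-and-split tokenizer); it is not the package's actual implementation, and createModel, learn, and the field names are hypothetical.

```javascript
// Minimal sketch of Naive-Bayes training as counting.
// All names here are hypothetical, not the package's internals.
function createModel() {
  return {
    docCount: new Map(),   // category -> number of documents learned
    wordCount: new Map(),  // category -> total number of tokens seen
    wordFreq: new Map(),   // category -> Map(token -> occurrences)
    vocabulary: new Set()  // every distinct token seen so far
  }
}

function learn(model, text, category) {
  // Naive tokenizer: lowercase, split on non-word characters, drop empties.
  const tokens = text.toLowerCase().split(/\W+/).filter(Boolean)
  model.docCount.set(category, (model.docCount.get(category) || 0) + 1)
  model.wordCount.set(category, (model.wordCount.get(category) || 0) + tokens.length)
  if (!model.wordFreq.has(category)) model.wordFreq.set(category, new Map())
  const freq = model.wordFreq.get(category)
  for (const token of tokens) {
    model.vocabulary.add(token)
    freq.set(token, (freq.get(token) || 0) + 1)
  }
  return model
}

const model = createModel()
learn(model, 'amazing, awesome movie!!', 'positive')
learn(model, 'amazing, perfect, great!!', 'positive')
learn(model, 'terrible, cruddy thing.', 'negative')

console.log(model.docCount.get('positive'))                 // => 2
console.log(model.wordFreq.get('positive').get('amazing'))  // => 2
```

Every additional call only increments counters, which is why more training examples give the classifier better estimates to work with.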
classifier.probabilities(text)
Returns an array of { category, probability } objects with the probability calculated for each category. Its judgement is based on what you have taught it with .learn().
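
To illustrate how per-category probabilities can be derived from learned counts, here is a sketch of multinomial Naive-Bayes scoring with add-one (Laplace) smoothing. It is an assumption about the general technique, not a description of the package's internals: the counts object is hand-written sample data, every name is hypothetical, and the normalization and descending sort are choices made for this sketch.

```javascript
// Hand-written illustrative counts (hypothetical shape, not the package's state).
const counts = {
  totalDocs: 3,
  vocabularySize: 6,
  categories: new Map([
    ['positive', { docs: 2, tokens: 4, freq: new Map([['amazing', 2], ['awesome', 1], ['great', 1]]) }],
    ['negative', { docs: 1, tokens: 3, freq: new Map([['terrible', 1], ['cruddy', 1], ['thing', 1]]) }]
  ])
}

function probabilities(counts, tokens) {
  // Work in log space to avoid numeric underflow on long documents.
  const logScores = []
  for (const [category, c] of counts.categories) {
    let logScore = Math.log(c.docs / counts.totalDocs) // log prior
    for (const token of tokens) {
      const seen = c.freq.get(token) || 0
      // Add-one smoothing keeps unseen tokens from zeroing out the score.
      logScore += Math.log((seen + 1) / (c.tokens + counts.vocabularySize))
    }
    logScores.push({ category, logScore })
  }
  // Normalize with the log-sum-exp trick so the probabilities sum to 1.
  const max = Math.max(...logScores.map((s) => s.logScore))
  const total = logScores.reduce((sum, s) => sum + Math.exp(s.logScore - max), 0)
  return logScores
    .map((s) => ({ category: s.category, probability: Math.exp(s.logScore - max) / total }))
    .sort((a, b) => b.probability - a.probability)
}

console.log(probabilities(counts, ['awesome', 'amazing']))
```

With this sample data the positive category comes out well ahead, because both tokens were seen in positive documents and never in negative ones.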
classifier.categorize(text, [probability])
Returns the category it thinks text belongs to. Its judgement is based on what you have taught it with .learn().
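
One natural reading is that the chosen category is the entry with the highest probability. Assuming that, the selection step can be sketched over an array shaped like the output of .probabilities(); pickCategory is a hypothetical helper, not a package function, and the sample values are made up.

```javascript
// Pick the most probable entry from an array of { category, probability }
// objects. Returns undefined for an empty array.
function pickCategory(results) {
  if (results.length === 0) return undefined
  return results.reduce((best, r) => (r.probability > best.probability ? r : best)).category
}

// Made-up sample data in the documented { category, probability } shape.
const sample = [
  { category: 'negative', probability: 0.09 },
  { category: 'positive', probability: 0.91 }
]

console.log(pickCategory(sample)) // => positive
```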
classifier.toJson()
Returns the JSON representation of a classifier. This is the same as JSON.stringify(classifier.toJsonObject()).
classifier.toJsonObject()
Returns a JSON-friendly representation of the classifier as an object.
const classifier = NaiveBayes.fromJson(jsonObject)
Returns a classifier instance from the JSON representation. Use this with the JSON representation obtained from classifier.toJson().
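
Because the serialized state is an ordinary JSON string, it can be stored anywhere a string can. The sketch below writes such a string to a temporary file and reads it back using Node's built-in fs module. The stateJson value is a stand-in for the output of classifier.toJson(), and the file name is a hypothetical choice.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Stand-in for the string returned by classifier.toJson(); the real state
// has its own shape, this object is purely illustrative.
const stateJson = JSON.stringify({ example: 'state' })

// Write the serialized state to a file (hypothetical location).
const file = path.join(os.tmpdir(), 'naivebayes-state.json')
fs.writeFileSync(file, stateJson, 'utf8')

// Read it back later, for example in another process.
const restoredJson = fs.readFileSync(file, 'utf8')
console.log(restoredJson === stateJson) // => true
```

The restored string is what would then be handed to NaiveBayes.fromJson().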
To run naivebayes in debug mode, simply set DEBUG=naivebayes when running your script.
Name | Website |
---|---|
Surmon | http://surmon.me/ |
Shaun Warman | https://shaunwarman.com/ |