my $cl = AI::Classifier::Text->new(
    training_data => [
        {
            attributes => _hash(qw(sheep very valuable farming)),
            labels     => ['farming']
        },
        {
            attributes => _hash(qw(farming requires many kinds animals)),
            labels     => ['farming']
        },
        {
            attributes => _hash(qw(vampires drink blood vampires may staked)),
            labels     => ['vampire']
        },
    ],
);
# the above creates a default AI::NaiveBayes classifier and feeds it the training data
my $res = $cl->classify("I would like to begin farming sheep" );
$res = $cl->classify("I would like to begin farming sheep", { new_user => 1 });
print $res->best_category;
$cl->store('some-file');
# later
my $cl = AI::Classifier::Text->load('some-file');
my $res = $cl->classify("do cats eat sheep?");
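The synopsis calls `_hash` without defining it. A minimal sketch of such a helper follows, assuming it only turns a word list into a hash reference of word counts. The name comes from the synopsis, but the body is an assumption; the module's own helper may, for example, store presence flags rather than counts.

```perl
use strict;
use warnings;

# Hypothetical helper (not defined in the synopsis): build a feature
# vector as a hash reference mapping each word to its occurrence count.
sub _hash {
    my %count;
    $count{$_}++ for @_;
    return \%count;
}

my $features = _hash(qw(vampires drink blood vampires may staked));

# Print the features in sorted order, e.g. "blood=1, drink=1, ..."
print join(', ', map { "$_=$features->{$_}" } sort keys %$features), "\n";
```

With counts, a repeated word such as "vampires" carries more weight than a word that appears once.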
AI::Classifier::Text combines a lexical analyzer (by default AI::Classifier::Text::Analyzer) and a compatible classifier to perform text classification.
The constructor requires either a compatible trained classifier (such as AI::NaiveBayes) or a training_data parameter with a list of training examples. In the latter case it creates the default AI::NaiveBayes classifier and trains it before constructing the AI::Classifier::Text object.
If your training data does not fit into memory, or you want to use a different classifier, then train the classifier first and pass it to the AI::Classifier::Text constructor.
This is partially based on AI::TextCategorizer.
classifier
-
An object that performs classification of the supplied feature vectors. It has to define a classify() method, which accepts a hash reference. The return value of AI::Classifier::Text->classify() is the return value of the classifier's classify() method. This attribute has to be supplied to the new() method during object creation, unless training_data is given instead.
analyzer
-
The class performing lexical analysis of the text in order to produce a feature vector. This defaults to AI::Classifier::Text::Analyzer.
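The classifier contract described above is small: any object with a classify() method that takes a hash reference should qualify. The sketch below shows one such object. The class name My::KeywordClassifier and its keyword-overlap scoring are hypothetical, invented purely for illustration, and are not part of the distribution.

```perl
use strict;
use warnings;

# Hypothetical classifier used only for illustration: it picks the label
# whose keyword list overlaps most with the supplied feature vector.
package My::KeywordClassifier;

sub new {
    my ($class, %args) = @_;
    return bless { keywords => $args{keywords} }, $class;
}

# The one method the interface requires: takes a hash reference of features.
sub classify {
    my ($self, $features) = @_;
    my ($best, $best_score) = (undef, 0);
    for my $label (sort keys %{ $self->{keywords} }) {
        my $score = 0;
        $score += $features->{$_} // 0 for @{ $self->{keywords}{$label} };
        ($best, $best_score) = ($label, $score) if $score > $best_score;
    }
    return $best;    # undef when no keyword matches
}

package main;

my $classifier = My::KeywordClassifier->new(
    keywords => {
        farming => [qw(sheep farming animals)],
        vampire => [qw(vampires blood staked)],
    },
);
print $classifier->classify({ farming => 1, sheep => 1, begin => 1 }), "\n";
```

Passed as `classifier => $classifier` to the constructor, an object like this would presumably make AI::Classifier::Text->classify() return the plain label string, since the return value is handed through unchanged. Result methods such as best_category would then not be available.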
new(classifier => $foo)
-
Creates a new AI::Classifier::Text object. It requires either the classifier or the training data to be passed to it.
classify($document, $features)
-
Categorizes the given document. A lexical analyzer is used to extract features from $document, and the features from the $features hash reference are added to them. The return value comes directly from the classifier object's classify method.
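To make the feature handling in classify($document, $features) concrete, here is a rough sketch of how text-derived features and caller-supplied features might be combined before they reach the classifier. Both analyze and build_features are hypothetical stand-ins, and the merge rule (extra features overwrite analyzed ones) is an assumption; the real analyzer may tokenize, filter, or combine features differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lexical analyzer: lowercase the text,
# split it into words and count them.
sub analyze {
    my ($text) = @_;
    my %features;
    $features{ lc $_ }++ for $text =~ /(\w+)/g;
    return \%features;
}

# Merge caller-supplied features into the analyzed ones (an assumption:
# the real module may combine them differently).
sub build_features {
    my ($document, $extra) = @_;
    my $features = analyze($document);
    $features->{$_} = $extra->{$_} for keys %{ $extra || {} };
    return $features;
}

my $features = build_features(
    'I would like to begin farming sheep',
    { new_user => 1 },
);
print join(' ', sort keys %$features), "\n";
```

The resulting hash reference is the kind of feature vector that would then be handed to the classifier's classify method, with new_user sitting alongside the words extracted from the text.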
AI::NaiveBayes(3), AI::Categorizer(3)