Comparing hot topics in two cities (Hong Kong and Singapore)
Hi, everyone. I completed this project while studying the naive Bayes algorithm in machine learning. Here is a short explanation of how the project works.
(1) Data collection: To collect content from RSS feeds, you need to build an interface to an RSS source.
(2) Data preparation: Parse the text into token (word) vectors.
(3) Data analysis: Inspect the tokens to make sure the parsing is correct.
(4) Training the algorithm: Use the trainNB0(trainMatrix, trainCategory) function.
(5) Testing the algorithm: Check the error rate to make sure the classifier actually works.
(6) Using the algorithm: Build a complete program that wraps everything together. Given two RSS feeds, the program displays the most commonly used words in each feed.
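The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: only the trainNB0(trainMatrix, trainCategory) signature comes from the description above, while text_parse, create_vocab_list, bag_of_words, classifyNB and top_words are hypothetical helper names, and the two tiny in-memory document lists stand in for text that would really come from the Hong Kong and Singapore RSS feeds. The smoothing (counts start at 1, denominators at 2) and the use of log probabilities are assumptions as well.

```python
import math
import re
from collections import Counter


def text_parse(text):
    """Split raw text into lowercase tokens longer than two characters."""
    return [tok.lower() for tok in re.split(r"\W+", text) if len(tok) > 2]


def create_vocab_list(docs):
    """Return a sorted list of every distinct token in the documents."""
    return sorted(set(tok for doc in docs for tok in doc))


def bag_of_words(vocab, doc):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]


def trainNB0(trainMatrix, trainCategory):
    """Estimate log P(word | class) for classes 0 and 1, plus P(class = 1).

    Word counts start at 1 and denominators at 2 (an assumed smoothing
    choice) so that unseen words never produce log(0).
    """
    num_words = len(trainMatrix[0])
    p_class1 = sum(trainCategory) / len(trainMatrix)
    word_counts = [[1.0] * num_words, [1.0] * num_words]
    totals = [2.0, 2.0]
    for vec, label in zip(trainMatrix, trainCategory):
        for i, count in enumerate(vec):
            word_counts[label][i] += count
        totals[label] += sum(vec)
    p0_log = [math.log(c / totals[0]) for c in word_counts[0]]
    p1_log = [math.log(c / totals[1]) for c in word_counts[1]]
    return p0_log, p1_log, p_class1


def classifyNB(vec, p0_log, p1_log, p_class1):
    """Return the class (0 or 1) with the larger log posterior score."""
    score1 = sum(c * p for c, p in zip(vec, p1_log)) + math.log(p_class1)
    score0 = sum(c * p for c, p in zip(vec, p0_log)) + math.log(1.0 - p_class1)
    return 1 if score1 > score0 else 0


def top_words(vocab, p_log, n=3):
    """Return the n words with the highest log probability in one class."""
    ranked = sorted(zip(p_log, vocab), reverse=True)
    return [word for _, word in ranked[:n]]


# Made-up stand-ins for feed entries (class 1 = Hong Kong, class 0 = Singapore).
hk_texts = ["harbour ferry tram harbour", "tram dim sum harbour"]
sg_texts = ["hawker merlion hawker mrt", "mrt hawker garden"]

docs = [text_parse(t) for t in hk_texts + sg_texts]
labels = [1] * len(hk_texts) + [0] * len(sg_texts)

vocab = create_vocab_list(docs)
train_matrix = [bag_of_words(vocab, doc) for doc in docs]
p0_log, p1_log, p_class1 = trainNB0(train_matrix, labels)

print("Hong Kong:", top_words(vocab, p1_log))
print("Singapore:", top_words(vocab, p0_log))
```

To estimate the error rate from step (5), you could hold back a few entries from training, run classifyNB on each one, and count how many predictions disagree with the known city.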
That's all. Enjoy!