Use Bayesian classification and K-means clustering to analyze SMS messages.
This repo contains two IPython (Jupyter) Notebooks reflecting my work on this project. sms2.ipynb contains the second and better attempt at classifying spam. sms-spam.ipynb is still here for my own reference, or for you if you want to watch me fail. :)
Download the SMS Spam collection from the UCI Machine Learning Repository.
Choose a set of features for separating SMS ham from spam.
Write a program to extract the features you want from each SMS message and then classify each SMS as ham or spam. Iterate on your feature extraction until you reach a classification success rate you are comfortable with (above 75% at minimum).
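The steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration of a multinomial naive Bayes classifier with add-one smoothing, not the code from the notebooks. It assumes each line of the downloaded file has the form `label<TAB>message`, with the label being `ham` or `spam`. The function names (`parse_line`, `extract_features`, `train`, `classify`, `accuracy`) and the particular features (lowercased words plus two coarse flags) are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def parse_line(line):
    """Split one 'label<TAB>text' line into (label, text). Format is an assumption."""
    label, text = line.rstrip("\n").split("\t", 1)
    return label, text


def extract_features(text):
    """Hypothetical feature set: lowercased word tokens plus two coarse flags."""
    features = re.findall(r"[a-z']+", text.lower())
    if re.search(r"\d{5,}", text):      # long digit runs, e.g. phone numbers
        features.append("__HAS_LONG_NUMBER__")
    if re.search(r"[£$]", text):        # currency symbols
        features.append("__HAS_CURRENCY__")
    return features


def train(messages):
    """Count features per class. `messages` is an iterable of (label, text)."""
    class_counts = Counter()
    feature_counts = {}
    vocab = set()
    for label, text in messages:
        class_counts[label] += 1
        counts = feature_counts.setdefault(label, Counter())
        for feature in extract_features(text):
            counts[feature] += 1
            vocab.add(feature)
    return class_counts, feature_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior under naive Bayes."""
    class_counts, feature_counts, vocab = model
    total = sum(class_counts.values())
    features = [f for f in extract_features(text) if f in vocab]  # skip unseen
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        counts = feature_counts[label]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        score = math.log(n / total)
        for feature in features:
            score += math.log((counts[feature] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(messages, model):
    """Fraction of (label, text) pairs that the model labels correctly."""
    messages = list(messages)
    correct = sum(classify(text, model) == label for label, text in messages)
    return correct / len(messages)
```

To try it, read the downloaded file line by line, pass each line through `parse_line`, hold some messages back for testing, call `train` on the rest, and check `accuracy` on the held-out messages. Iterating on features then mostly means editing `extract_features` and re-running.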