Project #1 for STAT 517 at the University Of Idaho. The project uses machine learning algorithms to predict labels on unlabled datset
Problem #2 uses various classification methods to determine if a specific job posting on Glassdoor.com will have a salary above 50k or below 50k. Dataset used to build model has case imbalance so area under the ROC curve is used to determine optimal classification model.
Problem #3 uses various classification methods to determine the likelihood that a customer purchased a particular insurance policy. This problem has an even more extreme case imbalance problem than the previous question.