This Assignment involves:
- Converting the dataset 'data.csv' into a class-balanced dataset.
- Creating five samples using five different sampling techniques.
- Applying five different ML models to each sample.
- Determining which sampling technique gives the highest accuracy for each model.
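
The balancing step could be sketched as follows. The assignment does not state which balancing method is used, so random oversampling of the minority class is an assumption here, as are the column names `feature` and `Class` and the toy rows standing in for 'data.csv'.

```python
import random
from collections import Counter, defaultdict

def oversample_minority(rows, label_key="Class", seed=42):
    """Duplicate randomly chosen minority-class rows until all classes have equal size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw the missing rows with replacement from the same class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced

# Toy stand-in for data.csv (hypothetical columns): 90 rows of class 0, 10 of class 1.
rows = [{"feature": i, "Class": 0} for i in range(90)]
rows += [{"feature": i, "Class": 1} for i in range(10)]

balanced = oversample_minority(rows)
counts = Counter(row["Class"] for row in balanced)
print(counts)
```

Undersampling the majority class or a synthetic method such as SMOTE would be alternatives; the rest of the workflow only needs the classes to end up equally represented.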

Sampling techniques:
- Simple Random Sampling: Every element of the population has an equal chance of being picked.
- Systematic Sampling: Elements are chosen at regular intervals (every k-th element), usually from a random starting point.
- Stratified Sampling: The population is divided into subgroups (strata) based on a shared characteristic, and elements are then randomly selected from each stratum.
- Cluster Sampling: The population is divided into smaller groups (clusters), and a random sample of these clusters is selected. The final sample is made up of the elements in the selected clusters.
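
A minimal sketch of the four sampling techniques described above, using only the Python standard library. The function names, the toy population, and the choice of contiguous chunks as clusters are assumptions made for illustration, not details taken from the assignment.

```python
import random

def simple_random_sample(population, n, rng):
    # Every element has the same chance of being picked.
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    # Take every step-th element, starting from a random offset.
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(population, n, key, rng):
    # Group elements into strata, then sample each stratum proportionally.
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for group in strata.values():
        # Rounding can make the total differ slightly from n.
        k = max(1, round(n * len(group) / len(population)))
        sample.extend(rng.sample(group, min(k, len(group))))
    return sample

def cluster_sample(population, n_clusters, n_chosen, rng):
    # Assumption: clusters are contiguous chunks of the population.
    size = -(-len(population) // n_clusters)  # ceiling division
    clusters = [population[i:i + size] for i in range(0, len(population), size)]
    chosen = rng.sample(clusters, n_chosen)
    # One-stage cluster sampling: keep every element of the chosen clusters.
    return [item for cluster in chosen for item in cluster]

rng = random.Random(0)
# Toy balanced population with a hypothetical "Class" column.
population = [{"id": i, "Class": i % 2} for i in range(100)]

srs = simple_random_sample(population, 20, rng)
systematic = systematic_sample(population, 20, rng)
stratified = stratified_sample(population, 20, lambda row: row["Class"], rng)
clustered = cluster_sample(population, n_clusters=10, n_chosen=2, rng=rng)

for name, sample in [("simple random", srs), ("systematic", systematic),
                     ("stratified", stratified), ("cluster", clustered)]:
    print(name, len(sample))
```

With a balanced population, the stratified sample keeps the same class proportions as the full data, whereas the other techniques only preserve them approximately.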

ML models:
- K-Nearest Neighbors (KNN)
- Logistic Regression
- Naive Bayes
- Support Vector Machine
- Decision Tree
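
One way to compare the five models on a single sample is sketched below with scikit-learn. The synthetic data from `make_classification` is a stand-in for a sample drawn from the balanced 'data.csv', and the hyperparameters, the feature scaling, and the 75/25 train/test split are assumptions rather than settings given in the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for one sample of the balanced dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Distance- and margin-based models are wrapped with a scaler.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Naive Bayes": GaussianNB(),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}

# Fit each model and record its accuracy on the held-out test split.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)

for name, score in accuracy.items():
    print(f"{name}: {score:.3f}")
```

Repeating this loop for each sample gives a table of accuracies with one row per sampling technique and one column per model, from which the best technique for each model can be read off.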