- About
- Data Understanding and Exploring
- Data Preparation
- Data Modeling
- Model Evaluation and Conclusion
- Acknowledgments
This project is based on three months of real spending data. We applied three techniques to this data: cluster analysis, association analysis, and neural network analysis. We tried to answer the following questions.
- In which categories and stores do we most frequently buy our daily items?
- What is our buying pattern, and which items are we most likely to buy together?
- How can we optimize our grocery spending?
This is the metadata of the data set.
The histogram above shows the amount spent on different merchandise categories across various stores.
I used K-means clustering to group the data based on category, store name, items, and amount.
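As a rough illustration of the clustering step, here is a minimal Python sketch of K-means (Lloyd's algorithm). It is not the code used in this project. The rows, the `encode` helper, and the choice to one-hot encode the text columns and scale the amount are all assumptions made for the example.

```python
import random

# Hypothetical rows (category, store, amount); not the project's data.
rows = [
    ("Produce", "Store A", 3.50),
    ("Produce", "Store A", 4.00),
    ("Snacks", "Store B", 12.00),
    ("Snacks", "Store B", 11.00),
]

def encode(rows):
    """One-hot encode category and store, and scale amount to the range 0..1."""
    cats = sorted({r[0] for r in rows})
    stores = sorted({r[1] for r in rows})
    max_amount = max(r[2] for r in rows)
    return [
        tuple(float(r[0] == c) for c in cats)
        + tuple(float(r[1] == s) for s in stores)
        + (r[2] / max_amount,)
        for r in rows
    ]

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm on tuples of numbers; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

centroids, labels = kmeans(encode(rows), k=2)
print(labels)
```

K-means needs numeric input, which is why the text columns are encoded first. Without scaling, the amount column would dominate the distance calculation.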
The histogram above shows the most frequent items in the data set.
We created association rules with a minimum support of 0.01 and a minimum confidence of 0.05, then sorted them so that the high-confidence rules come first. A rule with a confidence of 1 implies that whenever the LHS item was purchased, the RHS item was also purchased 100% of the time.
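To make support and confidence concrete, the following Python sketch computes single-item rules with the same thresholds (0.01 and 0.05) and sorts them by confidence. It is an illustration, not the tooling used in this project. The baskets and the `support` and `pair_rules` helpers are hypothetical.

```python
from itertools import permutations

# Hypothetical baskets for illustration only; not the project's data.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"red onions", "tomatoes"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def pair_rules(transactions, min_support=0.01, min_confidence=0.05):
    """Return rules (lhs, rhs, support, confidence), highest confidence first."""
    items = sorted(set().union(*transactions))
    rules = []
    for lhs, rhs in permutations(items, 2):
        both = support({lhs, rhs}, transactions)
        if both < min_support:
            continue
        # Confidence of lhs -> rhs is support(lhs and rhs) / support(lhs).
        confidence = both / support({lhs}, transactions)
        if confidence >= min_confidence:
            rules.append((lhs, rhs, both, confidence))
    return sorted(rules, key=lambda r: r[3], reverse=True)

rules = pair_rules(baskets)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

In this toy data, red onions and tomatoes only ever appear together, so both rules between them have a confidence of 1.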
After splitting the data set into training and testing sets, we ran the neural network to see whether the model fits.
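The split itself can be sketched in a few lines of Python. This is only an illustration of a holdout split. The 30% test fraction, the seed, and the `train_test_split` helper are assumptions, since the ratio actually used is not stated here, and the neural network is not shown.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder records standing in for the spending rows.
records = list(range(100))
train, test = train_test_split(records)
print(len(train), len(test))
```

The model is fitted on the training rows only. The held-out test rows then show how well it handles purchases it has not seen.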
The matrix above shows that 55 of the 59 Needs and 20 of the 21 Wants were predicted correctly.
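These counts can be turned into per-class and overall accuracy figures. The short Python sketch below uses only the four numbers reported above. The variable names are illustrative.

```python
# Counts reported in the text: actual class -> (total, correctly predicted).
results = {"Needs": (59, 55), "Wants": (21, 20)}

# Per-class recall: share of each actual class that was predicted correctly.
recall = {label: correct / total for label, (total, correct) in results.items()}

total_cases = sum(total for total, _ in results.values())
total_correct = sum(correct for _, correct in results.values())
accuracy = total_correct / total_cases

for label, value in recall.items():
    print(f"{label}: {value:.1%} predicted correctly")
print(f"Overall: {total_correct}/{total_cases} predicted correctly")
```

Overall, 75 of the 80 test cases were classified correctly, which is an accuracy of 93.75%.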
The neural net model is used to predict our spending behaviour and optimize our budget, using data loaded for the month of April.
The output above shows that a purchase of red onions from Columbia Store is classified as a Need, meaning red onions are something we can't avoid buying.
Based on our research, our grocery spending can be optimized. We can also see which items are tied together and check our neural net model to see whether a particular item set falls under our needs or our wants. We can then avoid our wants and focus only on our needs.
I would like to thank Sushil Tiwari for contributing to this project.