abhilash-1 / pyspark-project
This is our first project using Apache Spark. We downloaded the loan, customer credit card, and transaction datasets from Kaggle and cleaned the data. We then used tools such as Spark, HDFS, and Hive, among others, to run new use cases on those datasets. Apache Spark is a framework that can quickly process very large datasets.
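The description does not say how the data was cleaned, so the following is only a minimal sketch of what such a step might look like in PySpark, not the project's actual code. The helper names (`normalize_column_name`, `clean`, `run_example`), the file path, and the column names `customer_id` and `amount` are all hypothetical. The sketch assumes cleaning means normalizing column headers, dropping duplicate rows, and dropping rows that lack required fields.

```python
from __future__ import annotations

import re
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed for type hints; Spark itself is imported in run_example().
    from pyspark.sql import DataFrame


def normalize_column_name(name: str) -> str:
    """Lower-case a header and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean(df: DataFrame, required: list[str]) -> DataFrame:
    """Assumed cleaning steps: tidy headers, drop duplicates, drop incomplete rows."""
    # Rename every column to a consistent snake_case form.
    for old in df.columns:
        df = df.withColumnRenamed(old, normalize_column_name(old))
    # Drop exact duplicate rows, then rows with a null in any required column.
    return df.dropDuplicates().na.drop(subset=required)


def run_example(csv_path: str) -> None:
    """Read a CSV (hypothetical path), clean it, and show a few rows."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pyspark-project-cleaning").getOrCreate()
    raw = spark.read.csv(csv_path, header=True, inferSchema=True)
    # "customer_id" and "amount" are assumed column names, not from the dataset.
    cleaned = clean(raw, required=["customer_id", "amount"])
    cleaned.show(5)
    spark.stop()
```

In an environment with Spark installed, calling `run_example("transactions.csv")` would run the sketch end to end. The path can also point at HDFS if the cluster is configured for it.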