This repository contains the assignments for the Coursera course Big Data Analysis with Scala and Spark. It is a useful starting point for learning Scala and Spark.
Processing Wikipedia data to rank programming language names such as Scala.
Data (wikipedia.dat): http://alaska.epfl.ch/~dockermoocs/bigdata/wikipedia.dat
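
The ranking idea can be sketched roughly as follows. This is not the assignment's solution: it assumes a language's rank is the number of articles that mention its name as a whole, case-sensitive word, and it uses a plain Scala Seq in place of a Spark RDD so it runs without a Spark installation. The names Article, mentions, occurrences and rankLangs are hypothetical and chosen for illustration only.

```scala
// Plain-Scala sketch: a Seq stands in for the RDD of Wikipedia articles.
object RankingSketch {
  // Hypothetical article shape (an assumption, not taken from the repository).
  final case class Article(title: String, text: String) {
    // True when the language name appears as a space-separated word.
    def mentions(lang: String): Boolean = text.split(' ').contains(lang)
  }

  // Number of articles that mention the given language at least once.
  def occurrences(lang: String, articles: Seq[Article]): Int =
    articles.count(_.mentions(lang))

  // Pair each language with its article count, highest count first.
  // sortBy is stable, so ties keep the order of the input list.
  def rankLangs(langs: Seq[String], articles: Seq[Article]): Seq[(String, Int)] =
    langs.map(lang => lang -> occurrences(lang, articles)).sortBy(-_._2)

  def main(args: Array[String]): Unit = {
    val articles = Seq(
      Article("A", "Scala runs on the JVM like Java"),
      Article("B", "Java is widely used"),
      Article("C", "Haskell is a functional language")
    )
    rankLangs(Seq("Scala", "Java", "Haskell"), articles).foreach(println)
  }
}
```

With an actual RDD the same shape would apply, with the counting and sorting done through the RDD's own filter, count and sortBy operations.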
K-means clustering on Stack Overflow data using Spark and Scala.
Data (stackoverflow.csv): http://alaska.epfl.ch/~dockermoocs/bigdata/stackoverflow.csv
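
For orientation, a minimal sketch of the k-means loop is shown below. It is an illustration, not the assignment's code: it works on in-memory 2D points with plain Scala collections instead of Spark RDDs, and the names Point, closest, step and kmeans, along with the maxIter and eps parameters, are assumptions made for this example.

```scala
// Plain-Scala sketch of k-means on 2D points (no Spark dependency).
object KMeansSketch {
  type Point = (Double, Double)

  // Squared Euclidean distance between two points.
  def sqDist(a: Point, b: Point): Double = {
    val dx = a._1 - b._1
    val dy = a._2 - b._2
    dx * dx + dy * dy
  }

  // Index of the mean closest to p.
  def closest(p: Point, means: Vector[Point]): Int =
    means.indices.minBy(i => sqDist(p, means(i)))

  // Component-wise average of a non-empty group of points.
  def average(ps: Seq[Point]): Point =
    (ps.map(_._1).sum / ps.size, ps.map(_._2).sum / ps.size)

  // One iteration: assign every point to its closest mean, then recompute
  // each mean. A mean with no assigned points is left unchanged.
  def step(points: Seq[Point], means: Vector[Point]): Vector[Point] = {
    val byCluster = points.groupBy(p => closest(p, means))
    means.indices.map(i => byCluster.get(i).map(average).getOrElse(means(i))).toVector
  }

  // Repeat until the means move by at most eps in total (squared distance)
  // or maxIter iterations have run.
  @annotation.tailrec
  def kmeans(points: Seq[Point], means: Vector[Point], maxIter: Int, eps: Double): Vector[Point] = {
    val updated = step(points, means)
    val moved = means.zip(updated).map { case (a, b) => sqDist(a, b) }.sum
    if (maxIter <= 1 || moved <= eps) updated
    else kmeans(points, updated, maxIter - 1, eps)
  }

  def main(args: Array[String]): Unit = {
    val points = Seq((0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0))
    val start = Vector((1.0, 1.0), (9.0, 9.0))
    kmeans(points, start, maxIter = 10, eps = 0.0).foreach(println)
  }
}
```

In a Spark version the assignment and averaging steps would typically run as distributed operations over an RDD of points, while the small vector of means stays on the driver.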
Spark SQL analysis of the American Time Use Survey.
The dataset atussum.csv is provided by Kaggle and is documented here: https://www.kaggle.com/bls/american-time-use-survey/data