sabareesh19 / sort-on-hadoop-spark
Sorting large dataset files (80 GB) using Hadoop (MapReduce) and Apache Spark in Java, with jobs scheduled on a 4-node virtual cluster using the SLURM scheduler and bash scripting.
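As a rough illustration of how a distributed sort of this kind is commonly organised, the sketch below range-partitions records by key, sorts each partition independently, and concatenates the results. It is plain Java with no Hadoop or Spark dependency and is not taken from this repository. The class name `RangePartitionSort` and its methods are hypothetical, and using the whole input as the sample is an assumption made to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical single-process sketch of a range-partitioned sort.
public class RangePartitionSort {

    // Pick (partitions - 1) split points from a sorted copy of the sample.
    static List<String> splitPoints(List<String> sample, int partitions) {
        List<String> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < partitions; i++) {
            splits.add(sorted.get(i * sorted.size() / partitions));
        }
        return splits;
    }

    // Return the index of the partition whose key range holds the record.
    static int partitionFor(String record, List<String> splits) {
        int pos = Collections.binarySearch(splits, record);
        // Exact match: the record belongs just after that split point.
        // No match: binarySearch encodes the insertion point as -(point) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    static List<String> sort(List<String> records, int partitions) {
        if (records.isEmpty()) {
            return new ArrayList<>();
        }
        // Assumption: the full input doubles as the sample. A cluster job
        // would sample a small fraction of the data instead.
        List<String> splits = splitPoints(records, partitions);

        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            buckets.add(new ArrayList<>());
        }
        // "Map" side: route each record to its key range.
        for (String record : records) {
            buckets.get(partitionFor(record, splits)).add(record);
        }
        // "Reduce" side: sort each range on its own, then concatenate in order.
        List<String> out = new ArrayList<>();
        for (List<String> bucket : buckets) {
            Collections.sort(bucket);
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data =
            Arrays.asList("pear", "apple", "fig", "kiwi", "banana", "cherry");
        System.out.println(sort(data, 4));
    }
}
```

Because every key in one partition is less than or equal to every key in the next, the partitions can be sorted in parallel on separate nodes and written out in partition order without a final merge step.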