
spark_project_web_graphs's Introduction

This project processes web graphs on Spark using a Hadoop cluster.
It was done by 4th-year students of IIIT Dharwad.
Team Members:
Ayush Singh
Avinash Tiwari
Nachiket Ganesh Apte

To use the project, set up a Hadoop cluster of the size of your choice.
Note - install Java 8, as Hadoop may not support newer versions of Java (check the Java requirements of your Hadoop release).
Then clone this repository and open it in IntelliJ or a similar IDE that supports Java with Maven.
Now modify the pom.xml file and update the Hadoop and Spark versions to match your installation. Other dependencies need to be updated accordingly.

Next, download the required datasets and remove everything from them except the adjacency list.
Then change the paths in the classes so that they point to your datasets.
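
The cleanup step described above can be sketched in plain Java. This is only an illustration, not the code used in the project: the class name AdjacencyCleaner is hypothetical, and it assumes that every useful line consists of two or more whitespace-separated non-negative integer ids (a node followed by its neighbours), while headers, comments and blank lines contain something else. Adjust the rule if your dataset uses a different layout.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the repository.
public class AdjacencyCleaner {

    // Keeps only lines made of two or more non-negative integer ids
    // and normalises the separators to a single space.
    public static List<String> clean(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // blank line
            }
            String[] tokens = trimmed.split("\\s+");
            if (tokens.length < 2) {
                continue; // assumption: a lone id carries no edge information
            }
            boolean allIds = true;
            for (String token : tokens) {
                if (!token.matches("\\d+")) {
                    allIds = false; // header, comment or other metadata
                    break;
                }
            }
            if (allIds) {
                kept.add(String.join(" ", tokens));
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java AdjacencyCleaner <input file> <output file>");
            System.exit(1);
        }
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        // Reads the whole file into memory, which is fine for a sketch;
        // very large datasets would need line-by-line streaming instead.
        List<String> cleaned = clean(Files.readAllLines(input, StandardCharsets.UTF_8));
        Files.write(output, cleaned, StandardCharsets.UTF_8);
    }
}
```

The cleaned file can then be copied to HDFS and referenced from the paths in the classes.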

Now run mvn clean package to build the jar file containing all the classes.
After that, run the spark-submit command as given in commands_used.txt.
Note - don't forget to change the path of the jar file accordingly.


Now launch your browser and open master:8088, the YARN ResourceManager web UI (where master is the IP address of your master node).
Also open master:9870, the HDFS NameNode web UI, to view the number of live nodes, the files stored in HDFS, etc.
After submitting, you can see the application and its status at master:8088.
