distributedtensorflowalexnet's Introduction

CS 494 - Cloud Data Center Systems

Homework 2

HW2 Report Link:

HW2 Project Report

Setup Script

To set up the TensorFlow framework for the given network configuration (a cluster of 4 machines), do the following:

  • Clone this repository on your local machine

  • Add the following alias for each cluster machine to your local ~/.ssh/config file:

    Host nodei
            HostName <nodei_IP>

    where nodei is node0, node1, and so on.

  • Run init.sh username, where username is your user name on CloudLab. The script updates the system and installs the required packages.

In our case, the ~/.ssh/config file looks something like this:

Host node0
        HostName node0_id_code.cloudlab.us

Host node1
        HostName node1_id_code.cloudlab.us

Host node2
        HostName node2_id_code.wisc.cloudlab.us

Host node3
        HostName node3_id_code.wisc.cloudlab.us
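Writing the four stanzas by hand is easy to get wrong, so a small helper can generate them. The following is a minimal sketch, not part of this repository: the function name print_ssh_config is hypothetical, and it assumes the aliases are numbered node0, node1, ... in the order the hostnames are given.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print one Host stanza
# per hostname given as an argument, naming the aliases node0, node1, ...
print_ssh_config() {
    i=0
    for host in "$@"; do
        printf 'Host node%d\n        HostName %s\n\n' "$i" "$host"
        i=$((i + 1))
    done
}

# Example (placeholder hostnames): review the output first, then append it
# to your ~/.ssh/config yourself.
# print_ssh_config node0_id_code.cloudlab.us node1_id_code.cloudlab.us
```

The function only prints to standard output, so nothing in ~/.ssh/config changes unless you redirect the output there.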

Note: Do not use the Utah cluster; it seems to have problems with TensorFlow and Python 3.

Running the experiments

Scripts are provided to run each experiment.

  • To run the logistic regression model in asynchronous mode: run-scripts/run-task1-cluster.sh username
  • To run the logistic regression model in synchronous mode: run-scripts/run-task2.sh username
  • To run AlexNet in distributed mode: cd alexnet/alexnet && ./startservers.sh username mode (where mode is one of single, cluster, or cluster2)

The output of each run is logged locally. To profile an experiment with dstat, append the '-profile' flag to any of the commands above. The profiling output is stored in a separate directory.
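As an illustration of the accepted modes and the profiling flag, here is a minimal dry-run sketch. The function alexnet_cmd is hypothetical and not part of this repository; it only prints the command line, and it assumes '-profile' goes after the mode argument.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the AlexNet launch
# command for a user and mode, optionally followed by the -profile flag.
# It only prints the command; it does not start any server.
alexnet_cmd() {
    user=$1
    mode=$2
    profile=${3:-}
    case "$mode" in
        single|cluster|cluster2) ;;
        *)
            echo "unknown mode: $mode (expected single, cluster or cluster2)" >&2
            return 1
            ;;
    esac
    printf 'cd alexnet/alexnet && ./startservers.sh %s %s' "$user" "$mode"
    # Assumption: the flag is appended as the last argument.
    if [ "$profile" = "-profile" ]; then
        printf ' -profile'
    fi
    printf '\n'
}

# Example: show the command for a profiled run in cluster mode.
# alexnet_cmd username cluster -profile
```

Printing the command instead of running it lets you check the arguments before starting anything on the cluster.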