CS182/282 Assignment 2

In this assignment you will implement recurrent networks and apply them to image captioning on Microsoft COCO. You will also explore methods for visualizing the features of a model pretrained on ImageNet, and use this model to implement Style Transfer. The goals of this assignment are as follows:

  • Understand the architecture of recurrent neural networks (RNNs) and how they operate on sequences by sharing weights over time
  • Understand and implement both Vanilla RNNs and Long Short-Term Memory (LSTM) RNNs
  • Understand how to sample from an RNN language model at test-time
  • Understand how to combine convolutional neural nets and recurrent nets to implement an image captioning system
  • Understand how a trained convolutional network can be used to compute gradients with respect to the input image
  • Implement different applications of image gradients, including saliency maps, fooling images, and class visualizations
  • Understand and implement style transfer
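
As a rough illustration of the first goal, the sketch below runs a vanilla RNN over a short sequence and reuses the same weight matrices at every time step. It is a minimal NumPy sketch, not the assignment's reference code: the function names rnn_step and rnn_forward, the argument order, and the (T, N, D) input layout are all assumptions made for this example.

```python
import numpy as np

def rnn_step(x, h_prev, Wx, Wh, b):
    """One vanilla RNN step: h_t = tanh(x_t Wx + h_{t-1} Wh + b)."""
    return np.tanh(x @ Wx + h_prev @ Wh + b)

def rnn_forward(xs, h0, Wx, Wh, b):
    """Apply rnn_step to every time step of xs, shape (T, N, D)."""
    h = h0
    hs = []
    for x in xs:  # the same Wx, Wh, b are used at every time step
        h = rnn_step(x, h, Wx, Wh, b)
        hs.append(h)
    return np.stack(hs)  # shape (T, N, H)

# Toy sizes (hypothetical): T time steps, N sequences, D inputs, H hidden units.
rng = np.random.default_rng(0)
T, N, D, H = 5, 2, 3, 4
xs = rng.standard_normal((T, N, D))
h0 = np.zeros((N, H))
Wx = rng.standard_normal((D, H)) * 0.1
Wh = rng.standard_normal((H, H)) * 0.1
b = np.zeros(H)

hs = rnn_forward(xs, h0, Wx, Wh, b)
print(hs.shape)  # (5, 2, 4)
```

Because the parameters do not depend on the time index, the same function handles sequences of any length.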

Copy Solution from Homework 1

Please copy layers.py and optim.py from your homework 1 solution to the deeplearning directory. We will provide reference files once the homework 1 deadline has passed.

Setup

Make sure your machine is set up with the assignment dependencies.

[Option 1] Install Anaconda and Required Packages

The preferred approach for installing all the assignment dependencies is to use Anaconda, which is a Python distribution that includes many of the most popular Python packages for science, math, engineering and data analysis. Once you install Anaconda you can run the following command inside the homework directory to install the required packages for this homework:

conda env create -f environment.yml

Once the packages are installed, run the following command to activate the environment each time you work on the homework:

conda activate cs182_hw2

[Option 2] Working on a Virtual Machine

A preconfigured VirtualBox image is provided for this assignment. Installation instructions:

  1. Follow the instructions here to install VirtualBox if it is not already installed.
  2. Download the VirtualBox image here.
  3. Load the VirtualBox image using the instructions here.
  4. Start the VM. The username and password are both cs182. Required packages are pre-installed, and the cs182_hw2 environment is activated by default.
  5. Download the assignment code onto the VM yourself.

FAQ

I get an error "AMD-V is disabled in the BIOS" or "Intel-VT is disabled in the BIOS" or similar

Solution: See this link

The virtual machine won't boot

Solutions:

Download Data

Once you have the starter code, you will need to download the data for this assignment. Run the following from the homework 2 directory:

cd deeplearning/datasets
./get_assignment2_data.sh

If you don't have wget installed, you can try the curl-based script instead:

./get_assignment2_data_curl.sh

Start Jupyter Notebook

After you download the data, start the notebook server from the homework 2 directory with the following command:

jupyter notebook

If you are unfamiliar with IPython, you should read our IPython tutorial.

Submitting your work

Once you are done, run the collect_submission.sh script; this will produce a file called assignment2.zip. Upload this file to Gradescope, which will run an autograder on the files you submit. For some test cases, there is a small but nonzero probability that a correct implementation fails due to randomness. If you believe your implementation is correct, you can resubmit to rerun the autograder and check whether the failure was just an unlucky seed.

Q1: Image Captioning with Recurrent Neural Network (34 points)

The IPython notebook RNN_Captioning.ipynb will introduce you to the implementation of vanilla recurrent neural networks for image captioning. Follow the instructions in the notebook to complete this part.
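
To give a feel for how a captioning model can generate text at test time, here is a small sketch of greedy decoding with a vanilla RNN. Everything in it is an assumption made for illustration: the function greedy_caption, the parameter names in the params dictionary, the affine projection from image features to the initial hidden state, and the start and end token ids. The notebook defines its own interface, which may differ.

```python
import numpy as np

def greedy_caption(features, params, start_id, end_id, max_len=10):
    """Greedily decode word ids for one image feature vector (hypothetical API)."""
    # Project the image features to the initial hidden state.
    h = features @ params["W_proj"] + params["b_proj"]
    word = start_id
    caption = []
    for _ in range(max_len):
        x = params["W_embed"][word]  # embed the previously chosen word
        h = np.tanh(x @ params["Wx"] + h @ params["Wh"] + params["b"])
        scores = h @ params["W_vocab"] + params["b_vocab"]
        word = int(np.argmax(scores))  # pick the highest-scoring word
        if word == end_id:
            break
        caption.append(word)
    return caption

# Toy sizes (hypothetical): feature, hidden, embedding, and vocabulary dimensions.
rng = np.random.default_rng(0)
F, H, E, V = 8, 6, 5, 7
params = {
    "W_proj": rng.standard_normal((F, H)), "b_proj": np.zeros(H),
    "W_embed": rng.standard_normal((V, E)),
    "Wx": rng.standard_normal((E, H)),
    "Wh": rng.standard_normal((H, H)),
    "b": np.zeros(H),
    "W_vocab": rng.standard_normal((H, V)), "b_vocab": np.zeros(V),
}
caption = greedy_caption(rng.standard_normal(F), params, start_id=1, end_id=2)
print(caption)
```

Taking the argmax at each step is the simplest choice; drawing from the softmax distribution over scores is a common alternative when more varied captions are wanted.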

Q2: Image Captioning with LSTM (25 points)

The IPython notebook LSTM_Captioning.ipynb will introduce you to the implementation of LSTM for image captioning. Follow the instructions in the notebook to complete this part.
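
For orientation, a single LSTM time step can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: the function name lstm_step, the packing of all four gates into one (N, 4H) activation, and the gate order (input, forget, output, candidate) are choices made for this example and may not match the layout the notebook expects.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, Wx, Wh, b):
    """One LSTM step. Wx: (D, 4H), Wh: (H, 4H), b: (4H,)."""
    H = h_prev.shape[1]
    a = x @ Wx + h_prev @ Wh + b      # all gate pre-activations, shape (N, 4H)
    i = sigmoid(a[:, :H])             # input gate
    f = sigmoid(a[:, H:2 * H])        # forget gate
    o = sigmoid(a[:, 2 * H:3 * H])    # output gate
    g = np.tanh(a[:, 3 * H:])         # candidate cell update
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

# Toy sizes (hypothetical): N sequences, D inputs, H hidden units.
rng = np.random.default_rng(0)
N, D, H = 2, 3, 4
h, c = lstm_step(
    rng.standard_normal((N, D)),
    np.zeros((N, H)),
    np.zeros((N, H)),
    rng.standard_normal((D, 4 * H)) * 0.1,
    rng.standard_normal((H, 4 * H)) * 0.1,
    np.zeros(4 * H),
)
print(h.shape, c.shape)  # (2, 4) (2, 4)
```

The additive update of the cell state c is what distinguishes the LSTM from the vanilla RNN step, where the hidden state is fully recomputed through a tanh at every time step.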

Q3: Network Visualization (18 points)

The IPython notebook NetworkVisualization.ipynb will introduce you to various techniques for visualizing neural network internals. Follow the instructions in the notebook to complete this part. We will use PyTorch for this part.
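
The idea behind a saliency map, the gradient of a class score with respect to the input pixels, can be sketched without a deep learning framework. The example below is only a stand-in: it uses a tiny two-layer NumPy model with a hand-derived gradient, whereas the notebook uses PyTorch and a pretrained network. The names class_score and saliency_map, the model, and the (H, W, C) image layout are assumptions for illustration.

```python
import numpy as np

def class_score(x, W1, W2, label):
    """Score of `label` for a flattened image x under a tiny two-layer model."""
    return (np.tanh(x @ W1) @ W2)[label]

def saliency_map(image, W1, W2, label):
    """Absolute input gradient of the class score, maximized over channels."""
    x = image.ravel()
    h = np.tanh(x @ W1)
    # Chain rule: d score / d x = W1 @ ((1 - h^2) * W2[:, label])
    grad = W1 @ ((1.0 - h ** 2) * W2[:, label])
    return np.abs(grad.reshape(image.shape)).max(axis=2)

# Toy data (hypothetical): a 4x4 RGB image, 5 hidden units, 10 classes.
rng = np.random.default_rng(0)
image = rng.standard_normal((4, 4, 3))
W1 = rng.standard_normal((image.size, 5)) * 0.1
W2 = rng.standard_normal((5, 10))

sal = saliency_map(image, W1, W2, label=3)
print(sal.shape)  # (4, 4)
```

With an automatic differentiation framework the gradient comes from a backward pass instead of a hand-written chain rule, but the final step of taking the absolute value and reducing over channels is the same idea.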

Q4: Style Transfer (23 points)

The IPython notebook StyleTransfer.ipynb will introduce you to image style transfer. Follow the instructions in the notebook to complete this part. We will use PyTorch for this part.
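
This README does not spell out the style transfer loss, so the following is only a sketch of one widely used formulation, in which style is compared through Gram matrices of feature maps. The function names gram_matrix and style_loss, the (C, H, W) feature layout, and the normalization by C * H * W are assumptions for this example; the notebook works in PyTorch and may define these quantities differently.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a (C, H, W) feature map."""
    C, H, W = features.shape
    flat = features.reshape(C, H * W)
    return flat @ flat.T / (C * H * W)  # shape (C, C); normalization is an assumption

def style_loss(gen_features, style_features):
    """Squared difference between the Gram matrices of two feature maps."""
    diff = gram_matrix(gen_features) - gram_matrix(style_features)
    return float(np.sum(diff ** 2))

# Toy feature maps (hypothetical): 3 channels on a 5x5 grid.
rng = np.random.default_rng(0)
gen = rng.standard_normal((3, 5, 5))
style = rng.standard_normal((3, 5, 5))

print(gram_matrix(gen).shape)  # (3, 3)
print(style_loss(gen, gen))    # 0.0
```

The Gram matrix discards spatial positions and keeps only which channels fire together, which is why it is often used to describe texture rather than layout.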
