
pytorch-example's Introduction

Docker

Building the docker container

NOTE: This step is not necessary if you simply want to use an already published image to run the example code on the UA HPC.

docker build -f Dockerfile -t uazhlt/pytorch-example .

Verify PyTorch version

docker run --rm -it uazhlt/pytorch-example python -c "import torch; print(torch.__version__)"

Publish to DockerHub

NOTE: This step is not necessary if you simply want to use an already published image to run the example code on the UA HPC.

# login to dockerhub registry
docker login --username=yourdockerhubusername

docker push org/image-name:taghere

Singularity

Building a Singularity image

Building a Singularity image from a def file requires sudo on a Linux system. This tutorial does not cover installing Singularity. If you're feeling adventurous, take a look at the example def file in this repository and the official Singularity documentation.

Alternatives

Cloud builds

VMs

Docker -> Singularity

Retrieving a published Singularity image

Instead of building from scratch, we'll focus on a shortcut that simply wraps docker images published to DockerHub.

singularity pull uazhlt-pytorch-example.sif docker://uazhlt/pytorch-example:latest

HPC

If you intend to test out the PyTorch example included here, you'll want to clone this repository:

git clone https://github.com/ua-hlt-program/pytorch-example.git

Running Singularity in an interactive PBS job

Next, we'll request an interactive job (tested on El Gato):

qsub -I \
-N interactive-gpu \
-W group_list=mygroupnamehere \
-q standard \
-l select=1:ncpus=2:mem=16gb:ngpus=1 \
-l cput=3:0:0 \
-l walltime=1:0:0

NOTE: If you're unfamiliar with qsub and the many options in the command above seem puzzling, you can find answers in the manual via man qsub.

If the cluster isn't too busy, you should soon see a new prompt formatted something like [netid@gpu\d\d ~].

Now we'll run the singularity image we grabbed earlier. Before that, though, let's ensure we're using the correct version of Singularity and that the correct CUDA version is available to Singularity:

module load singularity/3.2.1
module load cuda10/10.1

Now we're finally ready to run the container:

singularity shell --nv --no-home /path/to/your/uazhlt-pytorch-example.sif

If you run into an error, check that you replaced /path/to/your/ with the correct path to uazhlt-pytorch-example.sif before executing the command.

We're now in our Singularity container! If everything went well, we should be able to see the GPU:

nvidia-smi

You should see output like the following:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20Xm         On   | 00000000:8B:00.0 Off |                    0 |
| N/A   17C    P8    18W / 235W |      0MiB /  5700MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Success (I hope)! Now let's try running PyTorch on the GPU with batching...
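nvidia-smi only shows that the driver and the GPU are visible inside the container. As an extra check, the sketch below asks PyTorch itself which device it would use. The function name describe_torch_device is made up for this illustration, and the script assumes it is run with the container's python.

```python
import importlib.util


def describe_torch_device():
    """Return a short description of the device PyTorch would use."""
    # Guard against running this outside the container by mistake.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch

    if torch.cuda.is_available():
        # Index 0 is the first GPU visible to this process.
        return f"cuda: {torch.cuda.get_device_name(0)}"
    return "cpu: PyTorch found no usable GPU"


print(describe_torch_device())
```

If this reports cpu even though nvidia-smi lists a GPU, a likely cause is that the container was started without --nv or that the loaded CUDA module does not match the PyTorch build.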

PyTorch example

The PyTorch example code can be found under example. The data used in this example comes from Delip Rao and Brian McMahan's Natural Language Processing with PyTorch.

The dataset relates surnames to nationalities. Our version, which includes minor modifications, is nested under examples/data.

train.py houses a command line program for training a classifier. The following invocation will display the tool's help text:

python train.py --help

The simple model architecture is based on deep averaging networks (DANs; see https://aclweb.org/anthology/P15-1162/).

Reading through train.py, you can quickly see how the code is organized. Some parts (e.g., the torchtext data loaders) may be unfamiliar to you.
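To make the idea concrete, here is a minimal sketch of a deep averaging network in PyTorch: embed each symbol, average the embeddings while ignoring padding, and pass the average through a small feed-forward classifier. This is not the code in models.py. The class name, layer sizes, and the choice of index 0 for padding are assumptions made for illustration, and the word dropout described in the DAN paper is left out.

```python
import torch
from torch import nn


class DeepAveragingNetwork(nn.Module):
    """Average the embeddings of a sequence, then classify with a small MLP."""

    def __init__(self, vocab_size, embedding_dim, hidden_dim, num_classes, pad_idx=0):
        super().__init__()
        self.pad_idx = pad_idx
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.classifier = nn.Sequential(
            nn.Linear(embedding_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer ids, padded with pad_idx.
        mask = (token_ids != self.pad_idx).unsqueeze(-1).float()
        embedded = self.embedding(token_ids) * mask
        # Divide by the number of real (non-padding) symbols per sequence.
        lengths = mask.sum(dim=1).clamp(min=1.0)
        averaged = embedded.sum(dim=1) / lengths
        return self.classifier(averaged)


# Hypothetical sizes: 30 symbols in the vocabulary, 4 target classes.
model = DeepAveragingNetwork(vocab_size=30, embedding_dim=8, hidden_dim=16, num_classes=4)
batch = torch.tensor([[5, 6, 7, 0], [9, 2, 0, 0]])
logits = model(batch)
print(logits.shape)  # one row of class scores per input sequence
```

Because padding positions are masked out before averaging, the amount of padding in a batch does not change a sequence's prediction.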

Next steps

Now that you've managed to run some example PyTorch code, there are many paths forward:

  • Experiment with using pretrained subword embeddings (both fixed and trainable). Do you notice any improvements in performance/faster convergence?
  • Try improving or replacing the naive model defined in models.py.
  • Add an evaluation script for a trained model that reports macro P, R, and F1. Feel free to use scikit-learn's classification report.
  • Add an inference script to classify new examples.
  • Monitor validation loss and stop training if you begin to overfit.
  • Adapt the interactive PBS task outlined above to a PBS script that you can submit to the HPC.
  • Address the class imbalance in the data through downsampling, class weighting, or another technique of your choosing.
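As a starting point for the early-stopping item above, here is one possible sketch of a patience-based stopping rule in plain Python. The EarlyStopping class and the list of validation losses are hypothetical and are not part of train.py. In practice you would call should_stop once per epoch with the loss measured on the validation split.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def should_stop(self, validation_loss):
        if validation_loss < self.best_loss - self.min_delta:
            # New best: remember it and reset the counter.
            self.best_loss = validation_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation losses: improving at first, then drifting upward.
validation_losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.71, 0.75]

stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate(validation_losses, start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}, best validation loss {stopper.best_loss}")
```

A larger patience tolerates noisier validation curves at the cost of extra epochs. Saving a checkpoint whenever best_loss improves lets you restore the best model after stopping.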
