
raster-vision-fastai-plugin's Introduction

Raster Vision PyTorch/fastai Plugin

This plugin uses PyTorch and fastai to implement a semantic segmentation backend plugin for Raster Vision.

⚠️ This repo is deprecated, as Raster Vision 0.10 has built-in PyTorch backends. However, it still may be useful as an example of how to construct a backend plugin.

⚠️ The object detection backend is in a partially completed state and does not work.

Setup and Requirements

Docker

You'll need Docker (preferably version 18 or above) installed. After cloning this repo, build the Docker images by running the following command:

> docker/build

Before running the container, set an environment variable to a local directory in which to store data.

> export RASTER_VISION_DATA_DIR="/path/to/data"

To run a Bash console in the Docker container, invoke:

> docker/run

This will mount the following local directories to directories inside the container:

  • $RASTER_VISION_DATA_DIR -> /opt/data/
  • fastai_plugin/ -> /opt/src/fastai_plugin/
  • examples/ -> /opt/src/examples/
  • scripts/ -> /opt/src/scripts/

This script also has options for forwarding AWS credentials (--aws), running Jupyter notebooks (--jupyter), running on a GPU (--gpu), and others. Run docker/run --help for more details.

Debug Mode

For debugging, it can be helpful to use a local copy of the Raster Vision source code rather than the version baked into the Docker image. To do this, you can set the RASTER_VISION_REPO environment variable to the location of the main repo on your local filesystem. If this is set, docker/run will mount $RASTER_VISION_REPO/rastervision to /opt/src/rastervision inside the container. You can then modify your local copy of Raster Vision in order to debug experiments running inside the container.
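
For example, assuming the main repo is checked out at a hypothetical local path:

> export RASTER_VISION_REPO="/path/to/raster-vision"
> docker/run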

(Optional) Setup AWS Batch

This assumes that a Batch stack was created using the Raster Vision AWS Batch setup. To use this plugin, you will need to add a job definition which points to a new tag on the ECR repo, and then publish the image to that tag. You can do this by editing scripts/cpu_job_def.json, scripts/gpu_job_def.json, and docker/publish_image, and then running docker/publish_image outside the container and scripts/add_job_defs inside the container.
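
In outline, and assuming the script names above, the workflow looks something like this (the exact tag and job definition names are whatever you put in docker/publish_image and the JSON files):

> docker/publish_image      # outside the container, after pointing it at the new ECR tag
> scripts/add_job_defs      # inside the container, after editing the job definition JSON files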

Setup profile

Using the plugin requires making a Raster Vision profile which points to the location of the plugin module. You can make such a profile by creating a file at ~/.rastervision/fastai containing something like the following. If using Batch, the AWS_BATCH section should point to the resources created above.

[AWS_BATCH]
job_queue=lewfishRasterVisionGpuJobQueue
job_definition=lewfishFastaiPluginGpuJobDef
cpu_job_queue=lewfishRasterVisionCpuJobQueue
cpu_job_definition=lewfishFastaiPluginCpuJobDef
attempts=5

[AWS_S3]
requester_pays=False

[PLUGINS]
files=[]
modules=["fastai_plugin.semantic_segmentation_backend_config"]

Running an experiment

To test the plugin, you can run an experiment using the ISPRS Potsdam dataset. Info on setting up the data and experiments in general can be found in the examples repo. A test run can be executed locally using something like the following. The -p fastai flag says to use the fastai profile created above.

export RAW_URI="/opt/data/raw-data/isprs-potsdam"
export PROCESSED_URI="/opt/data/fastai/potsdam/processed-data"
export ROOT_URI="/opt/data/fastai/potsdam/local-output"
rastervision -p fastai run local -e examples.semantic_segmentation.potsdam -m *exp_resnet18* \
    -a raw_uri $RAW_URI -a processed_uri $PROCESSED_URI -a root_uri $ROOT_URI \
    -a test True --splits 2

A full experiment can be run on AWS Batch using something like:

export RAW_URI="s3://raster-vision-raw-data/isprs-potsdam"
export PROCESSED_URI="s3://raster-vision-lf-dev/fastai/potsdam/processed-data"
export ROOT_URI="s3://raster-vision-lf-dev/fastai/potsdam/remote-output"
rastervision -p fastai run aws_batch -e examples.semantic_segmentation.potsdam -m *exp_resnet18* \
    -a raw_uri $RAW_URI -a processed_uri $PROCESSED_URI -a root_uri $ROOT_URI \
    -a test False --splits 4

This gets to an average F1 score of 0.87 after 15 minutes of training.

raster-vision-fastai-plugin's People

Contributors

adeelh, lewfish, simonkassel

raster-vision-fastai-plugin's Issues

Add chip classification backend

Before starting this, I recommend getting the semantic segmentation plugin running locally and remotely using the Potsdam example. The chip_classification_backend_config.py file should have very similar content to the semantic segmentation version -- the options shouldn't be any different at this point. You will need to add a chip classification example to this repo that uses the fastai plugin. You could adapt https://github.com/azavea/raster-vision-examples#spacenet-rio-building-chip-classification.

In the backend itself, there will be a few differences. The semantic segmentation backend stores the data as a directory of images and a directory of corresponding label images. The chip classification backend will store images in a set of folders, one for each class. Note that we are now creating debug chips in the train method. Instead of using SegmentationItemList, you will use ImageList. See https://docs.fast.ai/data_block.html for more info on creating the DataBunch. For the metrics, you should let the clas_idx default to -1.
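
As a rough sketch of the data_block pipeline (the directory layout, transforms, and chip size here are assumptions, not what the plugin must use), building a DataBunch from per-class folders with fastai v1 could look like:

from fastai.vision import ImageList, get_transforms, imagenet_stats

# chip_dir is assumed to contain train/ and valid/ subdirectories,
# each with one folder of chips per class.
data = (ImageList.from_folder(chip_dir)
        .split_by_folder(train='train', valid='valid')
        .label_from_folder()
        .transform(get_transforms(), size=256)
        .databunch(bs=32)
        .normalize(imagenet_stats))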

In PyTorch, images are stored as (num_channels, height, width), as opposed to Raster Vision and TensorFlow, which use (height, width, num_channels).
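
For example, a chip read as a (height, width, num_channels) numpy array can be reordered for PyTorch with a transpose:

import numpy as np

chip_hwc = np.zeros((256, 256, 4), dtype=np.float32)   # (H, W, C), e.g. RGB+IR
chip_chw = chip_hwc.transpose(2, 0, 1)                  # (C, H, W) for PyTorch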

Handle non-uint8 pixel values

Many GeoTIFFs contain pixel values that are not uint8 values (they may be uint16 or floats). This is problematic because fastai and most other libraries expect images to be in the form of PNG or JPG files, which can only have 3 bands, and store values as uint8. The way we have gotten around this limitation is to transform the underlying raster values to uint8 using statistics generated by the RV analyze command. We would like to be able to skip this step and directly train models (and make predictions) on the original values.
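
Very roughly, the existing workaround rescales each channel using statistics from the analyze command. A hedged sketch of that kind of transform (the exact clipping rule RV uses may differ) is:

import numpy as np

def to_uint8(chip, means, stds, n_sigma=3):
    # clip each channel to means +/- n_sigma * stds, then map to [0, 255]
    lo, hi = means - n_sigma * stds, means + n_sigma * stds
    scaled = (chip - lo) / (hi - lo)
    return (np.clip(scaled, 0, 1) * 255).astype(np.uint8)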

To do this, we need to save and load training chips in some format that allows >3 channels and floats. The fastai Image class might support some format that does this already, so you should look into this. It uses Pillow to open files, and one of those formats might work. (See https://docs.fast.ai/vision.image.html#The-Image-classes and https://github.com/fastai/fastai/blob/master/fastai/vision/image.py#L393).

In the likely event that one of the builtin formats won't work, you will need to save and load numpy arrays. There are routines built into numpy to save and load arrays and I think there is a compression option which should speed things up, but I'm not sure about that. I don't know the best way to get fastai to load numpy arrays. I think you want to create a custom NumpyImage class that derives from fastai's Image and a custom ItemList, but I'm not sure. See https://docs.fast.ai/tutorial.itemlist.html
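
A minimal sketch of what that might look like (the class name, .npz format, and array layout are assumptions, not an existing implementation):

import numpy as np
import torch
from fastai.vision import Image, ImageList

class NumpyImageList(ImageList):
    # loads float32, possibly >3-channel chips saved with np.savez_compressed
    def open(self, fn):
        arr = np.load(fn)['arr_0']        # assumed shape (H, W, C)
        arr = arr.transpose(2, 0, 1)      # to (C, H, W) for PyTorch/fastai
        return Image(torch.from_numpy(arr).float())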

If you're stuck, the Fastai forums should be helpful: https://forums.fast.ai/ . Also, the Visual Studio Code Python Remote Debugger is likely to come in handy.

You should get this to work in the fastai semantic segmentation plugin first. If it works, we can talk about generalizing it to the other plugins later. This should be tested using the SpaceNet Vegas simple semantic segmentation example: https://github.com/azavea/raster-vision-examples#spacenet-vegas-simple-semantic-segmentation. Note that you will first need to port this example over to using the fastai backend. The current example uses a StatsAnalyzer to convert from uint16 to uint8; you will need to turn this off. To test this, run an experiment and make sure the F1 scores are approximately the same as or better than the ones reported under "Eval Metrics" at https://github.com/azavea/raster-vision-examples#spacenet-vegas-simple-semantic-segmentation.

Upgrade GDAL version

When I tried to build the image, I had the same problem we were running into here. I was able to fix it by making essentially the same changes to the Dockerfile that Rob did. I would submit a PR with the changes but I don't seem to have the appropriate access rights to push up a branch.

Possible PyTorch / fastai version mismatch: AttributeError: module 'torch' has no attribute 'gesv'

I got an error running with the latest Docker image (full stacktrace):

File "/opt/conda/lib/python3.6/site-packages/fastai/vision/transform.py", line 233, in <module> _solve_func = getattr(torch, 'solve', torch.gesv)

AttributeError: module 'torch' has no attribute 'gesv'

The fastai forums report PyTorch / fastai version issues (e.g. see here).

Changing the fastai version to 1.0.57 in the Dockerfile worked for me, i.e. changing

ARG FASTAI_COMMIT=585d107709c9af8d88ddf2e20eb06b4ad7f4f70f

to

ARG FASTAI_COMMIT=1.0.57

Fine-tune pre-trained models on >3 band input

Fine-tuning a model pretrained on ImageNet results in a big improvement in accuracy over training from scratch, even on satellite imagery, which looks quite different from ImageNet. Unfortunately, it is not straightforward to use an ImageNet-pretrained model with images that have >3 bands. We would like to explore whether it is possible to take a pretrained model and modify it in some way so that we get the benefit of the extra bands while maintaining the performance boost from transfer learning. It's not clear how to do this, or if it will work.

The first idea we'd like to try is to load a pretrained model (a UNet with a ResNet-18 encoder) and then modify the first convolutional layer's weight array (i.e. the kernel) so that it can take images with k channels as input. This array will have more numbers in it to account for the extra channels, and it's an open question how best to create it. The most obvious idea is to randomly initialize a new weight array of the appropriate shape using the xavier_init routine (or whichever smart initialization technique is typically used with ResNets) and then copy over the values from the pretrained model.
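
A hedged sketch of that first idea in PyTorch, using torchvision's ResNet-18 directly rather than the fastai UNet, just to show the weight surgery:

import torch
import torch.nn as nn
from torchvision.models import resnet18

def expand_first_conv(model, num_channels):
    old = model.conv1  # Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
    new = nn.Conv2d(num_channels, old.out_channels, kernel_size=old.kernel_size,
                    stride=old.stride, padding=old.padding, bias=False)
    nn.init.xavier_normal_(new.weight)           # smart init for all channels
    with torch.no_grad():
        new.weight[:, :3] = old.weight           # copy over the pretrained RGB filters
    model.conv1 = new
    return model

model = expand_first_conv(resnet18(pretrained=True), num_channels=4)  # e.g. RGB+IR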

Other half-baked ideas:

  • You may need to adjust the learning rate to be higher (or lower?) for the new weights that have been added. Or use a lower learning rate overall so that things don't blow up.
  • You may also need to modify the weights from the pretrained network in some way. One thought is that you may want to scale and shift the weights so that the output of the first convolutional layer has the same statistics as it did before the "model surgery" took place. This is what BatchNormalization layers do -- they shift and scale their input so that the output has mean 0 and std 1. So, if there is a BatchNormalization layer in the right place in the model, this might not be necessary.
  • The new weights may need to be smaller than the weights that are copied over so that they don't overwhelm the output at first. The idea is that we want the network to gradually adapt to the new input coming into the network.
  • If none of this works, we might need to resort to a "dual-encoder" approach (what is referred to as a "late-fusion" approach in [1]). We couldn't get this to work, but maybe there is a better way to do it.

To test this, you should use the Potsdam example (https://github.com/azavea/raster-vision-fastai-plugin/blob/master/examples/semantic_segmentation/potsdam.py) because it has an infrared channel, which we expect to be useful for classifying vegetation, a class in the dataset. Using this technique with all four channels (RGB+IR), the performance should be as good as, and hopefully better than, just training on IR-R-G (infrared, red, green). You may find it helpful to refer back to our original blog post [1] on the Potsdam dataset.

The dataset is stored at s3://raster-vision-raw-data/isprs-potsdam/.

There is an elevation band available for this dataset, but it is stored in a separate file so might require some extra work to use it with Raster Vision (you may need to concatenate the elevation files to the RGBIR files). Also note that the elevation data provided with the dataset contains errors. At some point we fixed these errors and the resulting files are stored somewhere. Rob should know where they are.

Note: #19 is a prerequisite for this issue.

References:
[1] https://www.azavea.com/blog/2017/05/30/deep-learning-on-aerial-imagery/.
