neptune-ai / open-solution-mapping-challenge

Open solution to the Mapping Challenge :earth_americas:

Home Page: https://www.crowdai.org/challenges/mapping-challenge

License: MIT License

Python 55.84% Jupyter Notebook 44.10% Makefile 0.07%
data-science machine-learning deep-learning kaggle python satellite-imagery data-science-learning lightgbm unet unet-image-segmentation

open-solution-mapping-challenge's Introduction

Open Solution to the Mapping Challenge Competition


Note

Unfortunately, we can no longer provide support for this repo. Hopefully it still works, but if it doesn't, we cannot really help.

More competitions 🎇

Check our collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments, and outputs.

Poster 🌍

A poster that summarizes our project is available here.

Intro

Open solution to the CrowdAI Mapping Challenge competition.

  1. Check the live preview of our work on the public projects page: Mapping Challenge 📈.
  2. Source code and issues are publicly available.

Results

0.943 Average Precision 🚀

0.954 Average Recall 🚀

No cherry-picking here, I promise 😉. The results exceeded our expectations. The output from the network is so good that not a lot of morphological shenanigans are needed. Happy days :)

Average Precision and Average Recall were calculated on stage 1 data using pycocotools. Check this blog post for an explanation of average precision.

Disclaimer

In this open source solution you will find references to neptune.ai. It is a platform that is free for community users, and we use it daily to keep track of our experiments. Please note that using neptune.ai is not necessary to proceed with this solution. You may run it as a plain Python script 😉.

Reproduce it!

Check REPRODUCE_RESULTS

Solution write-up

Pipeline diagram

Preprocessing

โœ”๏ธ What Worked

  • Overlay binary masks for each image are produced (code 💻).
  • Distances to the two closest objects are calculated, creating the distance map that is used for weighting (code 💻).
  • Size masks for each image are produced (code 💻).
  • Dropped small masks on the edges (code 💻).
  • We load training and validation data in batches: using torch.utils.data.Dataset and torch.utils.data.DataLoader makes it easy and clean (code 💻; see the sketch after this list).
  • Only some basic augmentations (due to speed constraints) from the imgaug package are applied to images (code 💻).
  • Images are resized before being fed to the network. Surprisingly, this worked better than cropping (code 💻 and config 📑).
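
A minimal sketch of this batching setup, with an illustrative dataset class and dummy data (not the project's actual code):

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class ImageMaskDataset(Dataset):
    """Illustrative dataset pairing images with their overlay masks."""
    def __init__(self, images, masks, transform=None):
        self.images, self.masks, self.transform = images, masks, transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image, mask = self.images[idx], self.masks[idx]
        if self.transform is not None:  # e.g. imgaug augmentations
            image, mask = self.transform(image, mask)
        # CHW float image in [0, 1] and an integer mask, as PyTorch losses expect
        return (torch.from_numpy(image).permute(2, 0, 1).float() / 255.0,
                torch.from_numpy(mask).long())

# Dummy data standing in for the real images and masks
images = [np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8) for _ in range(8)]
masks = [np.random.randint(0, 2, (256, 256), dtype=np.uint8) for _ in range(8)]
loader = DataLoader(ImageMaskDataset(images, masks), batch_size=4, shuffle=True)

for batch_images, batch_masks in loader:
    print(batch_images.shape, batch_masks.shape)  # [4, 3, 256, 256] and [4, 256, 256]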

โœ–๏ธ What didn't Work

  • Ground truth masks are prepared by first eroding each mask to make them non-overlapping, and only then calculating the distances (code 💻).
  • Dilated small objects to increase the signal (code 💻).
  • Network is fed with random crops (code 💻 and config 📑).

🤔 What could have worked but we haven't tried it

Network

โœ”๏ธ What Worked

  • U-Net with ResNet34, ResNet101, and ResNet152 as encoders, where ResNet101 gave us the best results. This approach is explained in the TernausNetV2 paper (our code 💻 and config 📑). Also take a look at our parametrizable implementation of the U-Net.

โœ–๏ธ What didn't Work

  • Network architecture based on dilated convolutions described in this paper.

🤔 What could have worked but we haven't tried it

  • U-Net with contextual blocks explained in this paper.

Loss function

โœ”๏ธ What Worked

  • Distance weighted cross entropy explained in the famous U-Net paper (our code 💻 and config 📑).
  • Using a linear combination of soft dice and distance weighted cross entropy (code 💻 and config 📑; see the sketch after this list).
  • Adding a component weighted by building size (smaller buildings have greater weight) to the weighted cross entropy, penalizing misclassification of pixels belonging to small objects (code 💻).
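
The project's actual loss lives in the linked code; below is a compact, binary-case sketch of the combination described above (function names are illustrative; targets and weights are float tensors of the same shape as logits):

import torch
import torch.nn.functional as F

def soft_dice_loss(logits, targets, eps=1e-7):
    """Soft dice on foreground probabilities (binary case for brevity)."""
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum()
    return 1.0 - (2.0 * intersection + eps) / (probs.sum() + targets.sum() + eps)

def weighted_bce_loss(logits, targets, weights):
    """Cross entropy with a per-pixel weight map (distance and/or size weights)."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    return (weights * bce).mean()

def combined_loss(logits, targets, weights, dice_weight=0.5):
    """Linear combination; dice_weight is raised late in training (see Training)."""
    return (weighted_bce_loss(logits, targets, weights)
            + dice_weight * soft_dice_loss(logits, targets))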

Weights visualization

For both weight maps: the darker the color, the higher the value.

  • distance weights: high values correspond to pixels between buildings.
  • size weights: high values denote small buildings (the smaller the building, the darker the color). Note that no-building areas are fixed to black.

Training

โœ”๏ธ What Worked

  • Use pretrained models!
  • Our multistage training procedure:
    1. train on a 50,000-example subset of the dataset with lr=0.0001 and dice_weight=0.5
    2. train on the full dataset with lr=0.0001 and dice_weight=0.5
    3. train with a smaller lr=0.00001 and dice_weight=0.5
    4. increase the dice weight to dice_weight=5.0 to make results smoother
  • Multi-GPU training
  • Use very simple augmentations

The entire configuration can be tweaked from the config file 📑.
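
A hypothetical sketch of that four-stage schedule; train is a stand-in for one full training run, and the real loop and parameters live in the config and the linked code (stage 4 only changes the dice weight, so the learning rate is assumed unchanged):

def train(model, data, lr, dice_weight):
    """Stand-in for one full training run with the given hyperparameters."""
    print(f'training on {data}: lr={lr}, dice_weight={dice_weight}')

model = None  # placeholder for the pretrained U-Net
stages = [
    dict(data='50k subset', lr=1e-4, dice_weight=0.5),  # 1. warm up on a subset
    dict(data='full',       lr=1e-4, dice_weight=0.5),  # 2. move to the full dataset
    dict(data='full',       lr=1e-5, dice_weight=0.5),  # 3. lower the learning rate
    dict(data='full',       lr=1e-5, dice_weight=5.0),  # 4. raise the dice weight to smooth results
]
for stage in stages:
    train(model, **stage)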

🤔 What could have worked but we haven't tried it

  • Set different learning rates to different layers.
  • Use cyclic optimizers.
  • Use warm start optimizers.

Postprocessing

โœ”๏ธ What Worked

  • Test time augmentation (TTA). Make predictions on image rotations (90, 180, 270 degrees) and flips (up-down, left-right) and take the geometric mean of the predictions (code 💻 and config 📑; see the sketch after this list).
  • Simple morphological operations. At the beginning we used erosion followed by labeling and per-label dilation, with structuring elements chosen by cross-validation. As the models got better, erosion was removed and a very small dilation was the only operation still showing improvements (code 💻).
  • Scoring objects. In the beginning we simply used a score of 1.0 for every object, which was a huge mistake. Changing that to the average probability over the object region improved results. What improved scores even more was weighting those probabilities by the object size (code 💻).
  • Second-level model. We tried LightGBM and Random Forest trained on U-Net outputs and features calculated during postprocessing.
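
A sketch of the rotation/flip TTA with a geometric mean; predict stands in for any function mapping an image to a same-shaped probability map, and the project's actual implementation is in the linked code:

import numpy as np

def tta_predict(predict, image):
    """Average predictions over 90/180/270-degree rotations and both flips,
    undoing each transform before taking the geometric mean."""
    probs = [predict(image)]
    for k in (1, 2, 3):                               # 90, 180, 270 degrees
        probs.append(np.rot90(predict(np.rot90(image, k)), -k))
    for flip in (np.flipud, np.fliplr):               # up-down, left-right
        probs.append(flip(predict(flip(image))))      # flips are self-inverse
    probs = np.clip(np.stack(probs), 1e-6, 1.0)       # avoid log(0)
    return np.exp(np.log(probs).mean(axis=0))         # geometric mean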

โœ–๏ธ What didn't Work

  • Test time augmentations using colors (config 📑).
  • Inference on reflection-padded images was not the way to go. What worked better (but not for the very best models) was replication padding, where the border pixel value is replicated for all the padded regions (code 💻).
  • Conditional Random Fields. It was so slow that we didn't check it for the best models (code 💻).

🤔 What could have worked but we haven't tried it

  • Ensembling
  • Recurrent neural networks for postprocessing (instead of our current approach)

Model Weights

Model weights for the winning solution are available here.

You can use those weights and run the pipeline as explained in REPRODUCE_RESULTS.

User support

There are several ways to seek help:

  1. CrowdAI discussion.
  2. You can submit an issue directly in this repo.
  3. Join us on Gitter.

Contributing

  1. Check CONTRIBUTING for more information.
  2. Browse the issues to see if there is something you would like to contribute to.

open-solution-mapping-challenge's People

Contributors

apyskir, gitter-badger, jakubczakon, kamil-kaczmarek, kant, spmohanty, taraspiotr


open-solution-mapping-challenge's Issues

improve model saving

  • Add the epoch number to the saved model name (currently just model.torch).
  • Save the model as, e.g., model_epoch123.model.
  • Remove the previously saved model (the one with the smaller epoch number).
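
A minimal sketch of the requested behaviour; the function name and paths below are hypothetical, following the naming proposed in this issue:

import os
import torch

def save_checkpoint(model, epoch, checkpoint_dir):
    """Save the model with the epoch number in its name, then drop the
    checkpoint from the previous epoch."""
    path = os.path.join(checkpoint_dir, f'model_epoch{epoch}.model')
    torch.save(model.state_dict(), path)
    previous = os.path.join(checkpoint_dir, f'model_epoch{epoch - 1}.model')
    if os.path.exists(previous):
        os.remove(previous)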

prepare metadata and masks

General requirements:

  • generate metadata file
  • in main.py option to prepare meta and masks (for train)

For the training purposes:

  • prepare overlayed masks

Note that masks for evaluation will be prepared on the fly in the loader from a single JSON file.

Models transform method that returns generators

Right now each steps.pytorch.Model instance returns a dictionary of lists of outputs from the network. This causes problems when working with larger datasets. I think it should return a generator instead.
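
A sketch of the proposed change; the function below is illustrative and not the actual steps.pytorch API:

import torch

def transform_generator(model, loader):
    """Yield per-batch network outputs instead of accumulating them in lists,
    so predictions for a large dataset never sit in memory all at once."""
    model.eval()
    with torch.no_grad():
        for batch in loader:
            yield model(batch).cpu()

# Downstream steps can then consume the outputs lazily:
# for outputs in transform_generator(model, loader):
#     postprocess(outputs)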

train models on small buildings only

  • take only masks with objects smaller than 32^2 pixels
  • check if we can get reasonable results on small buildings only
  • if successful, train a small-buildings specialist

Investigate if the validation set has any rotated images

Using random rotation in augmentation may cause trouble because:

  • if pictures are taken at the same time of day (and presumably within a short period of the year), shadows will be more or less the same
  • the orientation of buildings/roads may be more or less the same throughout the dataset

Actions:

  • go through train/valid/test examples and check if the constant-angle hypothesis is correct
  • drop rotation from augmentations or use only very small rotations

evaluation in chunks

Evaluation is currently not possible on the entire validation set. There is, however, an option to generate predictions in chunks. A similar option for evaluation would help.
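
A hypothetical sketch of such a chunked evaluation; predict_chunk and score stand in for the existing prediction and metric code, and note that averaging per-chunk metrics only approximates the metric computed globally:

def evaluate_in_chunks(predict_chunk, score, dataset, chunk_size=1000):
    """Evaluate chunk by chunk so all predictions never have to be in memory."""
    weighted_scores, total = 0.0, 0
    for start in range(0, len(dataset), chunk_size):
        chunk = dataset[start:start + chunk_size]
        predictions = predict_chunk(chunk)   # reuse the chunked prediction path
        weighted_scores += score(predictions, chunk) * len(chunk)
        total += len(chunk)
    return weighted_scores / total           # size-weighted average over chunks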

Prepare masks for multiclass case

Currently the function overlay_masks_from_annotations in preparation.py assumes the single-class case. It has to be modified and adapted to the multiclass case. Some tuning of the loaders may be necessary.

Experiment with loss parameters

Loss params like:

  • BCE weight
  • DICE weight
  • distance weighted loss params
  • size weighted loss params
  • weight schedule

need to be investigated.

Calculate normalization constants step

  • Currently the normalization constants mean=0, std=1 are hard-coded; they should be calculated on the training set and passed to the loaders.
  • Substitute mean and std with values from pretrained PyTorch models (i.e. ResNet).

Mean and std different from pretrained PyTorch models

It seems that in pipeline_config.py:

MEAN = [0., 0., 0.]
STD = [1., 1., 1.]

But the PyTorch pretrained models use:

transforms.Normalize(mean=[0.485, 0.456, 0.406],
                     std=[0.229, 0.224, 0.225])

I think this may cause suboptimal results.
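
A sketch of computing the constants from the training images; the ImageNet values quoted above are the usual alternative when using pretrained encoders:

import numpy as np

def channel_mean_std(images):
    """Per-channel mean and std over a list of HWC uint8 images scaled to [0, 1].
    For large datasets, accumulate running sums instead of concatenating."""
    pixels = np.concatenate([img.reshape(-1, 3) / 255.0 for img in images])
    return pixels.mean(axis=0), pixels.std(axis=0)

# mean, std = channel_mean_std(train_images)  # then pass these to the loaders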

Adapt creating submission for multiclass case

The current submission generator assumes that predictions is a list of single-channel images (one per test image), where pixels with values 1, 2, 3, ... correspond to building instances 1, 2, 3, ...

The new submission generator has to be adjusted to handle output from MulticlassLabeler (to be prepared in issue #23).

Use HDF5 format to store images

One of the options to speed up the data loaders is to:

  • transform your images and target masks into one large HDF5 file
  • refactor/add pytorch Datasets that read from the HDF5 file
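
A sketch of both steps with h5py; the dataset names and class below are illustrative:

import h5py
import numpy as np
import torch
from torch.utils.data import Dataset

def write_hdf5(path, images, masks):
    """Pack all images and target masks into one HDF5 file."""
    with h5py.File(path, 'w') as f:
        f.create_dataset('images', data=np.stack(images), compression='gzip')
        f.create_dataset('masks', data=np.stack(masks), compression='gzip')

class HDF5Dataset(Dataset):
    """Read samples directly from the HDF5 file."""
    def __init__(self, path):
        self.path, self.file = path, None

    def __len__(self):
        with h5py.File(self.path, 'r') as f:
            return len(f['images'])

    def __getitem__(self, idx):
        if self.file is None:  # open lazily so each DataLoader worker gets its own handle
            self.file = h5py.File(self.path, 'r')
        return (torch.from_numpy(self.file['images'][idx]),
                torch.from_numpy(self.file['masks'][idx]))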

weighted cross entropy loss function

Build a training procedure that implements the following:

  • erode masks (so that all masks are separated and don't touch),
  • train a U-Net with a weighted cross entropy loss function. The idea is described in this kaggle post. While training, add more weight in the loss function to pixels that we would assign to the touching-contours category,
  • make predictions,
  • dilate predictions.

Weights are calculated as in the U-Net paper (page 5, eq. 2).
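
A compact sketch of those border weights, w(x) = w0 * exp(-(d1(x) + d2(x))^2 / (2 * sigma^2)); the class-balancing term w_c from the paper is omitted for brevity, and the repo's actual implementation is in the linked code:

import numpy as np
from scipy.ndimage import distance_transform_edt

def unet_border_weights(instance_masks, w0=10.0, sigma=5.0):
    """d1 and d2 are per-pixel distances to the two nearest (eroded,
    non-touching) instances; instance_masks is a list of binary HW arrays."""
    distances = np.stack([distance_transform_edt(np.logical_not(m))
                          for m in instance_masks])
    distances.sort(axis=0)  # per pixel: distances in ascending order
    if len(instance_masks) > 1:
        d1, d2 = distances[0], distances[1]
    else:
        d1 = d2 = distances[0]
    return w0 * np.exp(-((d1 + d2) ** 2) / (2 * sigma ** 2))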

crop central part of a U-Net prediction

U-Net performs poorly on edges. The effect is described in this blog post.

The idea is to crop the central part of the prediction. Of course, this must be combined with sliding-window predictions over a reflection-padded version of the original image.
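
A hypothetical sketch of the idea; for brevity it assumes a single-channel image whose sides are multiples of the effective step, and predict stands in for the network:

import numpy as np

def sliding_window_predict(predict, image, tile=256, crop=32):
    """Reflection-pad the image, run the network on overlapping tiles, and
    keep only the central part of each tile's prediction, where the U-Net
    is most reliable."""
    step = tile - 2 * crop
    assert image.shape[0] % step == 0 and image.shape[1] % step == 0
    padded = np.pad(image, crop, mode='reflect')
    output = np.zeros_like(image, dtype=np.float32)
    for y in range(0, image.shape[0], step):
        for x in range(0, image.shape[1], step):
            pred = predict(padded[y:y + tile, x:x + tile])  # tile-sized probability map
            output[y:y + step, x:x + step] = pred[crop:crop + step, crop:crop + step]
    return output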

Mosaic padding

Implement mosaic-padding-based inference to tackle the troublesome edge regions in U-Nets.
