Perceptual Image Enhancement on Smartphones


This repository provides guidelines and code for converting your pre-trained models into the appropriate submission format for the AI Challenge organized in conjunction with the ECCV 2018 conference.

1. Prerequisites

2. General model requirements

  • Your model should be able to process images of arbitrary resolution
  • It should require no more than 3.5GB of RAM while processing HD-resolution [1280x720px] photos
  • Maximum model size: 100MB
  • It should be saved as a TensorFlow .pb file (a rough local self-check sketch follows this list)
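
The evaluation scripts from Section 3 measure running time and RAM consumption for you. If you want a rough local sanity check of the requirements above, a sketch along the following lines can be used; it assumes the frozen graph exposes tensors named "input:0" and "output:0" (as in the conversion sketch in Section 3), so adjust these names to your own graph:

import os
import numpy as np
import tensorflow as tf  # TensorFlow 1.x

pb_path = "models_pretrained/model.pb"
assert os.path.getsize(pb_path) <= 100 * 1024 ** 2, "model is larger than 100MB"

# load the frozen graph and run it on one HD-resolution [1280x720px] image
graph_def = tf.GraphDef()
with tf.gfile.GFile(pb_path, "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    with tf.Session(graph=graph) as sess:
        image = np.random.rand(1, 720, 1280, 3).astype(np.float32)
        result = sess.run("output:0", feed_dict={"input:0": image})
        print(result.shape)  # should match the input shape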

3. Model conversion and validation

Your model should be implemented as a function that takes a single input and produces a single output:

  • Input:  a TensorFlow 4-dimensional tensor of shape [batch_size, image_height, image_width, 3]
    In the Super-Resolution task, input images are already bicubically interpolated (x4) and thus have the same size as the target high-resolution photos
  • Output:  a tensor of the same shape as the input
  • Values:  the values of both the original and processed images should lie in the interval [0, 1]

Here is a valid SRCNN function provided for you as a reference:

import tensorflow as tf

def srcnn(images):

    with tf.variable_scope("generator"):

        # three convolution layers: 9x9 (3 -> 64), 5x5 (64 -> 32), 5x5 (32 -> 3)
        weights = {
          'w1': tf.Variable(tf.random_normal([9, 9, 3, 64], stddev=1e-3), name='w1'),
          'w2': tf.Variable(tf.random_normal([5, 5, 64, 32], stddev=1e-3), name='w2'),
          'w3': tf.Variable(tf.random_normal([5, 5, 32, 3], stddev=1e-3), name='w3')
        }

        biases = {
          'b1': tf.Variable(tf.zeros([64]), name='b1'),
          'b2': tf.Variable(tf.zeros([32]), name='b2'),
          'b3': tf.Variable(tf.zeros([1]), name='b3')   # shape [1] is broadcast over the 3 output channels
        }

        conv1 = tf.nn.relu(tf.nn.conv2d(images, weights['w1'], strides=[1,1,1,1], padding='SAME') + biases['b1'])
        conv2 = tf.nn.relu(tf.nn.conv2d(conv1, weights['w2'], strides=[1,1,1,1], padding='SAME') + biases['b2'])
        conv3 = tf.nn.conv2d(conv2, weights['w3'], strides=[1,1,1,1], padding='SAME') + biases['b3']

    # map the output to (approximately) the required [0, 1] range
    return tf.nn.tanh(conv3) * 0.58 + 0.5
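
The evaluation scripts described below perform the actual .pb conversion for you. Purely as an illustration, here is a minimal sketch of how a function like srcnn can be wired to an input placeholder, restored from a checkpoint, and frozen into a .pb graph in TensorFlow 1.x; the "input"/"output" tensor names are our own convention, not something the challenge code requires:

import tensorflow as tf  # TensorFlow 1.x

with tf.Graph().as_default(), tf.Session() as sess:
    # arbitrary-resolution input, as required by the challenge rules
    images = tf.placeholder(tf.float32, [None, None, None, 3], name="input")
    enhanced = tf.identity(srcnn(images), name="output")

    # restore the pre-trained weights, then freeze all variables into constants
    tf.train.Saver().restore(sess, "path/to/your/saved/pre-trained/model")
    frozen = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["output"])

    with tf.gfile.GFile("models_pretrained/model.pb", "wb") as f:
        f.write(frozen.SerializeToString())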

To test and convert your pre-trained models, run the following scripts:

  • Track A, Image Super-Resolution:  evaluate_super_resolution.py
  • Track B, Image Enhancement:  evaluate_enhancement.py

You need to modify two lines in the headers of the above scripts:

from <model_file> import <your_model> as test_model
model_location = "path/to/your/saved/pre-trained/model"

Here, model_file.py should be a Python file containing your model definition, your_model is the actual function that defines your model, and model_location points to your saved pre-trained model file.
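
For example, to re-validate the provided SRCNN baseline for the Super-Resolution track (listed in Section 4 below), the two header lines of evaluate_super_resolution.py would become:

from models import srcnn as test_model
model_location = "models_pretrained/div2k_srcnn"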

When run, these scripts will:

  1. save your model as a model.pb file in the models_pretrained/ folder
  2. compute PSNR/SSIM scores on a subset of validation images/patches
  3. compute the running time and estimated RAM consumption for HD-resolution images
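
The exact evaluation code lives in the scripts themselves; purely for reference, PSNR on images with values in [0, 1] (as required in Section 3) can be computed as in this short sketch:

import numpy as np

def psnr(target, output, max_val=1.0):
    # both arguments are float image arrays with values in [0, 1]
    mse = np.mean((target.astype(np.float64) - output.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)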

4. Provided pre-trained models

Apart from the validation scripts, we also provide several pre-trained models that can be restored and validated using the same scripts. In all cases, the model architectures are defined in the models.py file.

Super-resolution task:

  1. SRCNN, function: srcnn, pre-trained model: models_pretrained/div2k_srcnn
  2. ResNet with one residual block, function: resnet_6_16, pre-trained model: models_pretrained/div2k_resnet_6_16
  3. VGG-19, function: vgg_19, pre-trained model: models_pretrained/div2k_vgg19_vdsr.ckpt

Image Enhancement task:

  1. SRCNN, function: srcnn, pre-trained model: models_pretrained/dped_srcnn
  2. ResNet with 4 residual blocks, function: resnet_12_64, pre-trained model: models_pretrained/dped_resnet_12_64
  3. ResNet with 2 residual blocks, function: resnet_8_32, pre-trained model: models_pretrained/dped_resnet_8_32
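
Any of these baselines can be validated with the corresponding script from Section 3. For example, pointing evaluate_enhancement.py at the provided ResNet baseline would use the following header lines; running the script afterwards (e.g. python evaluate_enhancement.py from the repository root) should report the PSNR/SSIM, runtime, and RAM numbers described above:

from models import resnet_12_64 as test_model
model_location = "models_pretrained/dped_resnet_12_64"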

5. Team registration and model submission

To register your team, send an email to [email protected] with the following information:

Email Subject:  AI Mobile Challenge Registration

Email Text:     Team Name
                Team Member 1 (Name, Surname, Affiliation)
                Team Member 2 (Name, Surname, Affiliation)
                ....

To validate your model, send an email indicating the track, team ID, and a link to the corresponding model.pb file:

Email Subject:  [Track X] [Team ID] [Team Name] Submission

Email Text:     Link to model.pb file

You are allowed to send up to 2 submissions per day for each track. The leaderboard will show the results of your last successful submission. Please make sure that the results provided by our validation scripts are meaningful before sending your submission files.

6. Scoring formulas

The performance of your solution will be assessed based on three metrics: its speed relative to a baseline network, its fidelity score measured by PSNR, and its perceptual score measured by MS-SSIM. Since PSNR and SSIM scores do not always objectively reflect image quality, during the test phase we will additionally conduct a user study in which your final submissions will be evaluated by a large number of people, and the resulting MOS scores will replace the MS-SSIM results. The total score of your solution will be calculated as a weighted combination of these scores:

TotalScore = α * (PSNR_solution - PSNR_baseline) + β * (SSIM_solution - SSIM_baseline) + γ * min(Time_baseline / Time_solution, 4) 

Your results will be evaluated using three different scores. Score A gives preference to the solution with the highest fidelity (PSNR) score, score B is aimed at the solution providing the best visual results (MS-SSIM/MOS scores), and score C targets the best balance between speed and perceptual/quantitative performance. For each challenge track, the above scoring formula is used with different coefficients (a small worked example follows the two coefficient lists below):

Track A (Super-Resolution):

  • PSNR_baseline = 26.5, SSIM_baseline = 0.94
  • (α, β, γ):   score A - (4, 100, 1);   score B - (1, 400, 1);   score C - (2, 200, 1.5)

Track B (Image Enhancement):

  • PSNR_baseline = 21, SSIM_baseline = 0.9
  • (α, β, γ):   score A - (4, 100, 2);   score B - (1, 400, 2);   score C - (2, 200, 2.9)
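
Purely as an illustration (the official scores are computed on our side), here is a minimal sketch of how the formula above can be evaluated, using the Track B / score C coefficients and made-up measurement values; the helper name total_score is our own:

def total_score(psnr, ssim, runtime, runtime_baseline,
                psnr_baseline, ssim_baseline, alpha, beta, gamma):
    # TotalScore = alpha * (PSNR - PSNR_baseline) + beta * (SSIM - SSIM_baseline)
    #              + gamma * min(Time_baseline / Time, 4)
    speedup = min(runtime_baseline / runtime, 4)
    return alpha * (psnr - psnr_baseline) + beta * (ssim - ssim_baseline) + gamma * speedup

# Track B (Image Enhancement), score C coefficients: alpha=2, beta=200, gamma=2.9
print(total_score(psnr=22.0, ssim=0.92, runtime=0.5, runtime_baseline=1.0,
                  psnr_baseline=21, ssim_baseline=0.9, alpha=2, beta=200, gamma=2.9))
# 2 * 1.0 + 200 * 0.02 + 2.9 * min(2, 4) = 2 + 4 + 5.8 ≈ 11.8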

7. Other remarks

  • Note that the provided code is used only for preliminary model validation; all final numbers will be obtained on our side by testing all submissions on the test parts of the datasets (accuracy) and on the same hardware (speed)

  • To check the above RAM requirement, we will run your submissions on a GPU with 3.5GB of RAM.
    If this is not enough for your model, it will be disqualified from the final validation stage
