
Attention center

This repository contains

  • a TensorFlow Lite model that can be used to predict the attention center of an image, i.e. the area where the most salient parts of the image lie.
  • a Python script that can be used to batch encode images using the attention centers. This can be used with the submodule libjxl in order to create JPEG XL images such that decoding of the image will start from the attention center determined by the model.

Using Saliency in progressive JPEG XL images is a blog post from 2021 about progressive JPEG XL images.

Open sourcing the attention center model is a blog post from 2022 about this open source project.

The model center.tflite was trained on images from the Common Objects in Context (COCO) dataset, annotated with saliency from the SALICON dataset.
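
For reference, querying the model directly with the TensorFlow Lite interpreter could look roughly like the sketch below. The preprocessing (resizing to the model's input shape, float conversion) and the reading of the output as a single (x, y) pair are assumptions made for illustration; encode_with_centers.py implements the actual pipeline.

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the attention center model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path="./model/center.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Resize the image to the model's input shape (assumed NHWC float input).
_, height, width, _ = input_details["shape"]
image = Image.open("assets/white_eyes_bee.jpg").convert("RGB")
input_data = np.expand_dims(
    np.asarray(image.resize((width, height)), dtype=np.float32), axis=0)

# Run inference; the output is assumed to hold the predicted center coordinates.
interpreter.set_tensor(input_details["index"], input_data)
interpreter.invoke()
print("predicted attention center:", interpreter.get_tensor(output_details["index"])[0])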

How to use

Make sure you have Python and TensorFlow installed.

  1. Clone this repository from GitHub, including the submodules.
  2. Build libjxl, for example by following the instructions given in the libjxl repo.
  3. Run the encode_with_centers.py script.

Here ${INPUT_IMAGE_DIR} contains images that can be opened with the Python Imaging Library; ${OUTPUT_IMAGE_DIR} will be created if it does not exist, and the encoded JPEG XL images will be placed there.

git clone https://github.com/google/attention-center --recursive --shallow-submodules
cd attention-center/libjxl
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=OFF ..
cmake --build . -- -j$(nproc)
cd ../../
python encode_with_centers.py --lite_model_file=./model/center.tflite \
  --image_dir="${INPUT_IMAGE_DIR}" --output_dir="${OUTPUT_IMAGE_DIR}"

There are the following flags:

  --[no]dry_run: If true, only do a dry run and do not write files.
    (default: 'false')
  --encoder: Location of the encoder binary.
    (default: './libjxl/build/tools/cjxl')
  --image_dir: Name of the directory of input images.
  --lite_model_file: Path to the corresponding TFLite model.
  --new_suffix: File extension of the compressed files.
    (default: 'jxl')
  --output_dir: Name of the directory for the output images.
  --[no]verbose: If true, prints info about the commands executed.
    (default: 'true')

An example of passing extra flags to the encoder after -- would be

python encode_with_centers.py --lite_model_file=./model/center.tflite \
  --image_dir=/tmp/images --output_dir=/tmp/out/ -- --distance 1.1

Here we pass the flag --distance 1.1 to cjxl.

The flags and arguments for --center_x, --center_y and --group_order are automatically injected.

Example attention center calculations

Here we show an example image for which the attention center has been computed. Running

python encode_with_centers.py --lite_model_file=./model/center.tflite --image_dir=./assets --dry_run

gives the following verbose output:

libjxl/build/tools/cjxl_ng -center_x 522 -center_y 1143 assets/white_eyes_bee.jpg white_eyes_bees.jpg

This tells us that the computed attention center is at pixel coordinates (522, 1143). We mark the attention center with a red dot and compare it with the original image.

Figure: the original image (left) and the same image with the attention center marked by a red dot (right).
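
To reproduce such a comparison yourself, a small sketch using the Python Imaging Library could draw the dot at the coordinates printed above (the dot radius and output file name are arbitrary choices for illustration):

from PIL import Image, ImageDraw

# Mark the computed attention center (522, 1143) with a red dot.
image = Image.open("assets/white_eyes_bee.jpg").convert("RGB")
draw = ImageDraw.Draw(image)
x, y, radius = 522, 1143, 15
draw.ellipse((x - radius, y - radius, x + radius, y + radius), fill="red")
image.save("white_eyes_bee_with_center.png")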

Progressive JPEG XL demo in Chrome

Check out how JPEG XL images encoded with the help of the attention center model look at different loading stages in Chrome: google.github.io/attention-center


Authors: Moritz Firsching and Junfeng He
