
maptile-ai's Introduction

Super-resolution for different zoom levels in maps

Final project for the Building AI course

Summary

In online maps like Google Maps and OpenStreetMap one can view the world at different zoom levels. But what if you are already looking at the highest available zoom level of the area or you are not satisfied with the reality? Could AI generate super-resolution versions of the map?

Background

Image-to-image translation means learning the mapping from an input image to a target image. Choosing between existing algorithms and developing a new approach will require further research. However, here we present two existing image-to-image translation concepts to give an idea of the method.

Super-resolution for images using deep learning, GANs, or other AI methods is a fairly recent technology. In existing implementations, a low-resolution image is upscaled by a factor of up to 4 in each direction (16 times as many pixels), and the added pixels are filled with data generated by the AI. The AI has learnt which pixels to use by analysing perhaps millions of images in both high and low resolution.
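To make the pixel arithmetic concrete, here is a minimal sketch of the naive baseline: each pixel is simply repeated 4 times in each direction, so the result has 16 times as many pixels but no new detail. A super-resolution model would replace the repeated values with generated content. The function name and the tiny 2x2 grid of grayscale values are illustrative assumptions, not part of any existing code base.

```python
def upscale_nearest(image, factor):
    """Upscale a 2D grid of pixel values by repeating each pixel factor x factor times."""
    out = []
    for row in image:
        # Repeat every pixel horizontally.
        wide = [px for px in row for _ in range(factor)]
        # Repeat the widened row vertically (copy so rows are independent).
        out.extend(list(wide) for _ in range(factor))
    return out


low = [[0, 1],
       [2, 3]]                      # hypothetical 2x2 grayscale image
high = upscale_nearest(low, 4)      # 8x8 image

pixel_ratio = (len(high) * len(high[0])) // (len(low) * len(low[0]))
print(pixel_ratio)  # 16
```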

Another possible technique is so-called image inpainting (see e.g., https://www.nvidia.com/research/inpainting/ and https://www.theconvolvedblog.vision/#inpaint). In it, damaged or missing parts of an image are reconstructed based on the remaining image. Typically inpainting is used to remove scratches or small objects from photos, but it could be adapted for other use cases as well.

Some very interesting image-to-image translation projects are pix2pix, CycleGAN, and Contrastive Unpaired Translation. The description of pix2pix in particular suggests that it could at least be tested on the problem at hand.

In online maps like Google Maps and OpenStreetMap one can view the world at different zoom levels. Low zoom levels show few details, like country names and borders.

Map view from Turku, Finland, with low-level zoom (from OpenStreetMap): Low-level zoom

As one zooms in, more details, such as cities, rivers, and streets, become visible. In cities, maps typically show all streets, alleys, buildings, and even some statues and memorials.

Some map features are shown at almost all zoom levels, whereas others appear only at suitable (usually high) zoom levels. For example, seas and rivers are visible at more zoom levels than streets. Therefore, zooming in and out of a map is not simply a matter of upscaling and downscaling the image. Instead, the map viewer actively shows, hides, and arranges items in the view.

Map view from Turku, Finland, with high-level zoom (from OpenStreetMap): High-level zoom

Extract from the low-level zoomed image produced by bigjpg.com: Upscaled extract

But what if you are already looking at the highest available zoom level of the area or you are not satisfied with the reality? Could AI generate super-resolution versions of the map?

The uses for AI-generated maps are twofold. First, in limited cases the generated map might match reality rather well. This could happen when the environment is uniform, like a forest containing more forest, or all blocks in a city being alike.

Second, the AI-generated maps could be a source of inspiration or amusement. Perhaps one would wonder what kind of park or open space would be between streets, if the place were (and the AI system were trained with data from) somewhere else on Earth. The built environment is quite different in, say, Asia, the Middle East, Europe, and different parts of the Americas.

How is it used?

The most natural way to use the solution would be to serve its output as a map layer in a map application. The user could pan the map and focus on the areas that interest them. Ideally the solution could be used at any zoom level and anywhere on Earth, but for performance reasons the map tiles might have to be generated in advance and cover only some regions or zoom levels.
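As a rough illustration of serving pregenerated tiles for limited coverage, the sketch below checks whether a requested tile falls inside the generated area and returns its path, or `None` so that the application can fall back to the ordinary map. The coverage table, the `generated/{z}/{x}/{y}.png` directory layout, and the function name are all hypothetical.

```python
# Hypothetical coverage table: zoom level -> inclusive tile ranges
# (x_min, x_max, y_min, y_max) for which tiles were generated in advance.
COVERAGE = {
    19: (0, 10, 0, 10),
}


def layer_tile_path(z, x, y, coverage):
    """Return the path of a pregenerated tile, or None if the tile is not covered."""
    bounds = coverage.get(z)
    if bounds is None:
        return None  # nothing was generated for this zoom level
    x_min, x_max, y_min, y_max = bounds
    if x_min <= x <= x_max and y_min <= y <= y_max:
        return f"generated/{z}/{x}/{y}.png"
    return None  # outside the generated region


print(layer_tile_path(19, 5, 5, COVERAGE))   # generated/19/5/5.png
print(layer_tile_path(18, 5, 5, COVERAGE))   # None
```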

Data sources and AI methods

The main source of data is expected to be OpenStreetMap, since the Open Data Commons Open Database License allows liberal use of the data. One can download the map data for the whole world or for a smaller area, and generate a practically unlimited amount of training data (rendered map tiles) at different zoom levels.
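One way to organise such training data is to pair each tile at zoom level z with the four tiles that cover the same area at zoom level z + 1. The sketch below assumes the common Web Mercator "slippy map" tile numbering, in which zoom level z has 2^z by 2^z tiles; the project itself has not fixed a tiling scheme, and the function names are illustrative.

```python
import math


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Convert WGS84 coordinates to (x, y) tile indices in the slippy map scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def child_tiles(x, y, zoom):
    """Return the four (x, y, zoom) tiles at zoom + 1 covering the same area."""
    return [(2 * x + dx, 2 * y + dy, zoom + 1) for dy in (0, 1) for dx in (0, 1)]


print(latlon_to_tile(0.0, 0.0, 1))  # (1, 1)
print(child_tiles(1, 1, 1))         # [(2, 2, 2), (3, 2, 2), (2, 3, 2), (3, 3, 2)]

# Approximate coordinates of Turku, Finland (an assumption for illustration).
x, y = latlon_to_tile(60.45, 22.27, 10)
targets = child_tiles(x, y, 10)     # higher-zoom tiles paired with tile (x, y)
```

Each low-zoom tile would then serve as the model input and the mosaic of its four children as the target image.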

The actual AI methods will be determined after studying existing image-to-image translation solutions. The two solutions discussed earlier (super-resolution and inpainting) do not exactly match the present problem. However, studying them will provide useful insight into the necessary algorithms.

Technically one could train the AI using the underlying features (knowledge of the roads, buildings, and their coordinates) instead of working on the rendered output (the image we see). However, that should be a separate endeavour with its own challenges and solutions.

Challenges

The biggest challenges in this project are expected in two areas: tile borders and text. When smaller map tiles are joined together, it is possible that the AI-generated content doesn't line up nicely. Perhaps the tiles don't match at all, and one side shows the beginnings of a building while the other side is open space.

It will be interesting to see how text should be handled. When rendering the training material, it is of course possible to remove all text (e.g., street names and numbers). But should text be included, and if so, will the AI learn to keep it intact, even if it is possibly moved elsewhere in the tile?
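The tile border problem could be quantified with a simple seam score, sketched below under the assumption that tiles are equally tall 2D grids of grayscale values. It compares the rightmost pixel column of one tile with the leftmost column of its right-hand neighbour; 0 means the edges agree exactly. The metric and the function name are illustrative, not an established measure.

```python
def seam_mismatch(left_tile, right_tile):
    """Mean absolute difference between the right edge of left_tile
    and the left edge of right_tile (both are lists of pixel rows)."""
    if len(left_tile) != len(right_tile):
        raise ValueError("tiles must have the same height")
    diffs = [abs(l_row[-1] - r_row[0])
             for l_row, r_row in zip(left_tile, right_tile)]
    return sum(diffs) / len(diffs)


left = [[0, 10],
        [0, 20]]
matching = [[10, 0],
            [20, 0]]
open_space = [[0, 0],
              [0, 0]]

print(seam_mismatch(left, matching))    # 0.0
print(seam_mismatch(left, open_space))  # 15.0
```

A score like this only detects abrupt pixel changes at the seam; a building that is cut off cleanly at a tile edge would need a more semantic check.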

There are no ethical issues expected from this project.

What next?

This project needs a great deal of experimentation. Given some experience, computing resources, and time, it should be possible to see how well current general-purpose image-to-image translation code bases perform on this problem. Once that is done, one can plan further improvements.

Acknowledgments
