geosynth's Introduction

GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis

This repository is the official implementation of GeoSynth [CVPRW, EarthVision, 2024]. GeoSynth is a suite of models for synthesizing satellite images with global style and image-driven layout control.

Models available in 🤗 HuggingFace diffusers:

GeoSynth: Hugging Face Model

GeoSynth-OSM: Hugging Face Model

GeoSynth-SAM: Hugging Face Model

GeoSynth-Canny: Hugging Face Model

All model checkpoint (ckpt) files are available in the Model Zoo below.

โญ๏ธ Next

  • Update Gradio demo
  • Release Location-Aware GeoSynth Models to ๐Ÿค— HuggingFace
  • Release PyTorch ckpt files for all models
  • Release GeoSynth Models to ๐Ÿค— HuggingFace

🌏 Inference

Example inference using 🤗 HuggingFace pipeline:

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from PIL import Image

# OSM tile used as the layout (control) image
img = Image.open("osm_tile_18_42048_101323.jpeg")

# load the GeoSynth-OSM ControlNet weights
controlnet = ControlNetModel.from_pretrained("MVRL/GeoSynth-OSM")

# attach the ControlNet to the Stable Diffusion 2.1 base pipeline and move it to the GPU
pipe = StableDiffusionControlNetPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base", controlnet=controlnet)
pipe = pipe.to("cuda:0")

# generate image (fixed seed so the result is repeatable)
generator = torch.manual_seed(10345340)
image = pipe(
    "Satellite image features a city neighborhood",
    generator=generator,
    image=img,
).images[0]

# write the synthesized satellite image to disk
image.save("generated_city.jpg")

📍 Geo-Awareness

Our model can synthesize images conditioned on the high-level geography of a region.
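Conditioning on geography requires knowing where a tile sits on the globe. As a minimal sketch, and assuming the input filename in the inference example (osm_tile_18_42048_101323.jpeg) follows the common slippy-map zoom_x_y convention, which this README does not confirm, the tile indices can be converted to the longitude and latitude of the tile's north-west corner. The helper names parse_tile_name and tile_to_lonlat are hypothetical and are not part of GeoSynth; how GeoSynth itself encodes location is not shown here.

```python
import math
import re


def parse_tile_name(filename):
    """Extract (zoom, x, y) from a name like 'osm_tile_18_42048_101323.jpeg'.

    Assumption: the three trailing integers are zoom, x, y (slippy-map order).
    """
    match = re.search(r"_(\d+)_(\d+)_(\d+)\.\w+$", filename)
    if match is None:
        raise ValueError(f"unrecognized tile name: {filename!r}")
    zoom, x, y = (int(group) for group in match.groups())
    return zoom, x, y


def tile_to_lonlat(zoom, x, y):
    """Return (lon, lat) in degrees for the tile's north-west corner (Web Mercator)."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lon, lat


zoom, x, y = parse_tile_name("osm_tile_18_42048_101323.jpeg")
lon, lat = tile_to_lonlat(zoom, x, y)
print(f"zoom={zoom} lon={lon:.4f} lat={lat:.4f}")
```

Under that naming assumption, the example tile lands at roughly 37.8°N, 122.3°W.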

🧑‍💻 Setup and Training

The style for OSM imagery was created with Mapbox. The style file can be downloaded from here. The dataset can be downloaded from here. See train.md for details on setting up the environment and training models on your own data.

🐨 Model Zoo

Download GeoSynth models from the links below:

| Control | Location | Download URL |
|---------|----------|--------------|
| -       | ❌       | Link         |
| OSM     | ❌       | Link         |
| SAM     | ❌       | Link         |
| Canny   | ❌       | Link         |
| -       | ✅       | Link         |
| OSM     | ✅       | Link         |
| SAM     | ✅       | Link         |
| Canny   | ✅       | Link         |

📑 Citation

@inproceedings{sastry2024geosynth,
  title={GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis},
  author={Sastry, Srikumar and Khanal, Subash and Dhakal, Aayush and Jacobs, Nathan},
  booktitle={IEEE/ISPRS Workshop: Large Scale Computer Vision for Remote Sensing (EARTHVISION)},
  year={2024}
}

🔍 Additional Links

Check out our lab website for other interesting works on geospatial understanding and mapping:

  • Multi-Modal Vision Research Lab (MVRL) - Link
  • Related Works from MVRL - Link

geosynth's Issues

Does the model work?

Hi! This looks super nice, but I can't get it to work.

I downloaded https://huggingface.co/MVRL/GeoSynth/blob/main/sd-base-geosynth.ckpt

It is impossible to load it with A1111:

changing setting sd_model_checkpoint to geosynth_sd_loc-v3.ckpt [6223d68433]: AttributeError
Traceback (most recent call last):
  File "/Users/tmp/code/stable-diffusion-webui/modules/options.py", line 165, in set
    option.onchange()
  File "/Users/tmp/code/stable-diffusion-webui/modules/call_queue.py", line 13, in f
    res = func(*args, **kwargs)
  File "/Users/tmp/code/stable-diffusion-webui/modules/initialize_util.py", line 181, in <lambda>
    shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 869, in reload_model_weights
    state_dict = get_checkpoint_state_dict(checkpoint_info, timer)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 330, in get_checkpoint_state_dict
    res = read_state_dict(checkpoint_info.filename)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 314, in read_state_dict
    sd = get_state_dict_from_checkpoint(pl_sd)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 253, in get_state_dict_from_checkpoint
    pl_sd = pl_sd.pop("state_dict", pl_sd)
AttributeError: 'NoneType' object has no attribute 'pop'
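
The traceback above ends with pop("state_dict", ...) being called on None, which means the step that read the checkpoint file returned nothing instead of a dictionary. The sketch below is not A1111's code. It is a hypothetical helper, unwrap_state_dict, that performs the same unwrapping step with an explicit guard, so a failed load produces a readable error rather than an AttributeError. It assumes a Lightning-style checkpoint, where the weights may be nested under a top-level "state_dict" key.

```python
def unwrap_state_dict(checkpoint):
    """Return the weight mapping from an already-loaded checkpoint object."""
    if checkpoint is None:
        raise ValueError("checkpoint loader returned None; the file was not read")
    if not isinstance(checkpoint, dict):
        raise TypeError(f"expected a dict, got {type(checkpoint).__name__}")
    # Lightning-style files nest the weights; plain files are the mapping itself.
    return checkpoint.get("state_dict", checkpoint)


# Toy stand-ins for loaded checkpoints (real ones hold tensors, not lists).
nested = {"state_dict": {"layer.weight": [1.0, 2.0]}, "epoch": 3}
flat = {"layer.weight": [1.0, 2.0]}

print(sorted(unwrap_state_dict(nested)))
print(sorted(unwrap_state_dict(flat)))
```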

So I first converted it to .safetensors, and then it loaded successfully.

Then I tried a very simple text2image test with only the prompt "Satellite image features a city neighborhood" and default parameters.
I get this kind of error:

      File "/Users/famille/Documents/_Seb/Bidouilles/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
        return F.linear(input, self.weight, self.bias)
    RuntimeError: linear(): input and weight.T shapes cannot be multiplied (77x1024 and 768x320)
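
The shapes in this message can be checked by hand. A linear layer multiplies an input of shape (rows, in_features) by a transposed weight of shape (in_features, out_features), so the inner dimensions must match. Here the input carries 1024 features per token while the layer expects 768. For context, 1024 is the text-embedding width of Stable Diffusion 2.x and 768 is that of Stable Diffusion 1.x, so one plausible reading (an assumption, not a confirmed diagnosis) is that the web UI paired the checkpoint with a configuration for a different base model. The helper below is a hypothetical illustration of the shape rule only.

```python
def linear_shapes_compatible(input_shape, weight_t_shape):
    """True if an input of shape (rows, k) can be multiplied by weight.T of shape (k, out)."""
    _, in_features = input_shape
    expected_features, _ = weight_t_shape
    return in_features == expected_features


# Shapes copied from the error message: 77 tokens x 1024 features into a 768 -> 320 layer.
print(linear_shapes_compatible((77, 1024), (768, 320)))  # False
# The same layer accepts token embeddings that are 768 features wide.
print(linear_shapes_compatible((77, 768), (768, 320)))   # True
```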

How to download the OSM data?

Hello, thanks for your excellent work! Could you share how to download the OSM data and the corresponding RS images?
