GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis

Srikumar Sastry*, Subash Khanal, Aayush Dhakal, Nathan Jacobs (*Corresponding Author)

This repository is the official implementation of GeoSynth [CVPRW, EarthVision, 2024]. GeoSynth is a suite of models for synthesizing satellite images with global style and image-driven layout control.

Models available in 🤗 HuggingFace diffusers:

GeoSynth:

GeoSynth-OSM:

GeoSynth-SAM:

GeoSynth-Canny:

All model checkpoint (.ckpt) files are available in the Model Zoo.

⏭️ Next

Update Gradio demo
Release Location-Aware GeoSynth Models to 🤗 HuggingFace
Release PyTorch ckpt files for all models
Release GeoSynth Models to 🤗 HuggingFace

🌏 Inference

Example inference using 🤗 HuggingFace pipeline:

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from PIL import Image

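# Control image: an OSM tile that defines the layout of the generated scene
img = Image.open("osm_tile_18_42048_101323.jpeg")

# Load the OSM-conditioned ControlNet weights
controlnet = ControlNetModel.from_pretrained("MVRL/GeoSynth-OSM")

# Attach the ControlNet to the Stable Diffusion 2.1 base pipeline and move it to the GPU
pipe = StableDiffusionControlNetPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base", controlnet=controlnet)
pipe = pipe.to("cuda:0")

# Generate an image with a fixed seed for reproducibility
generator = torch.manual_seed(10345340)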
image = pipe(
    "Satellite image features a city neighborhood",
    generator=generator,
    image=img,
).images[0]

image.save("generated_city.jpg")
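The control image name in the example, osm_tile_18_42048_101323.jpeg, looks like a standard slippy-map tile reference. The sketch below assumes the three integers are zoom, x and y in that order (the README does not confirm this) and converts them to the longitude and latitude of the tile's north-west corner using the usual Web Mercator tile formula. The helper names `parse_tile_name` and `tile_to_lonlat` are hypothetical and not part of this repository.

```python
import math
import re


def parse_tile_name(filename):
    """Extract (zoom, x, y) from a name like 'osm_tile_18_42048_101323.jpeg'.

    Assumption: the last three integers are zoom, x, y in that order.
    """
    match = re.search(r"(\d+)_(\d+)_(\d+)\.\w+$", filename)
    if match is None:
        raise ValueError(f"unrecognized tile name: {filename}")
    zoom, x, y = (int(group) for group in match.groups())
    return zoom, x, y


def tile_to_lonlat(zoom, x, y):
    """Return (lon, lat) in degrees of the tile's north-west corner."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / n))))
    return lon, lat


zoom, x, y = parse_tile_name("osm_tile_18_42048_101323.jpeg")
lon, lat = tile_to_lonlat(zoom, x, y)
print(f"zoom={zoom} x={x} y={y} -> lon={lon:.4f}, lat={lat:.4f}")
```

Knowing where a tile sits on the globe may help when picking a text prompt that matches the region, but the pipeline call above does not require it.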

📍 Geo-Awareness

Our model can synthesize imagery conditioned on the high-level geography of a region:

🧑‍💻 Setup and Training

The style for OSM imagery is created using MapBox. The style file can be downloaded from here. The dataset can be downloaded from here. See train.md for details on setting up the environment and training models on your own data.

🐨 Model Zoo

Download GeoSynth models from the given links below:

Control	Location	Download Url
-	❌	Link
OSM	❌	Link
SAM	❌	Link
Canny	❌	Link
-	✅	Link
OSM	✅	Link
SAM	✅	Link
Canny	✅	Link

📑 Citation

@inproceedings{sastry2024geosynth,
  title={GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis},
  author={Sastry, Srikumar and Khanal, Subash and Dhakal, Aayush and Jacobs, Nathan},
  booktitle={IEEE/ISPRS Workshop: Large Scale Computer Vision for Remote Sensing (EARTHVISION)},
  year={2024}
}

🔍 Additional Links

Check out our lab website for other interesting works on geospatial understanding and mapping:

Multi-Modal Vision Research Lab (MVRL) - Link
Related Works from MVRL - Link

Does the model work?

Hi! This looks super nice, but I can't get it to work.

I downloaded https://huggingface.co/MVRL/GeoSynth/blob/main/sd-base-geosynth.ckpt

It fails to load in A1111:

changing setting sd_model_checkpoint to geosynth_sd_loc-v3.ckpt [6223d68433]: AttributeError
Traceback (most recent call last):
  File "/Users/tmp/code/stable-diffusion-webui/modules/options.py", line 165, in set
    option.onchange()
  File "/Users/tmp/code/stable-diffusion-webui/modules/call_queue.py", line 13, in f
    res = func(*args, **kwargs)
  File "/Users/tmp/code/stable-diffusion-webui/modules/initialize_util.py", line 181, in <lambda>
    shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 869, in reload_model_weights
    state_dict = get_checkpoint_state_dict(checkpoint_info, timer)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 330, in get_checkpoint_state_dict
    res = read_state_dict(checkpoint_info.filename)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 314, in read_state_dict
    sd = get_state_dict_from_checkpoint(pl_sd)
  File "/Users/tmp/code/stable-diffusion-webui/modules/sd_models.py", line 253, in get_state_dict_from_checkpoint
    pl_sd = pl_sd.pop("state_dict", pl_sd)
AttributeError: 'NoneType' object has no attribute 'pop'

So I first converted it to .safetensors, after which it loaded successfully.

Then I ran a very simple text2image test with default parameters and only this prompt: Satellite image features a city neighborhood.
I get this kind of error:

      File "/Users/famille/Documents/_Seb/Bidouilles/code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
        return F.linear(input, self.weight, self.bias)
    RuntimeError: linear(): input and weight.T shapes cannot be multiplied (77x1024 and 768x320)
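The shapes in the RuntimeError can be read directly: the input has 1024 features per token, while the layer's weight expects 768. The minimal sketch below only restates that arithmetic; `can_matmul` is a hypothetical helper. As background, 1024 is the text-embedding width used by Stable Diffusion 2.x and 768 is the width used by Stable Diffusion 1.x, so the numbers are consistent with the checkpoint being run under a mismatched model configuration. That reading is an assumption, not a confirmed diagnosis.

```python
def can_matmul(input_shape, weight_t_shape):
    """Return True if an (m x k) input can be multiplied by a (k x n) matrix."""
    return input_shape[1] == weight_t_shape[0]


# Shapes copied from the RuntimeError above.
input_shape = (77, 1024)     # 77 tokens, 1024 features per token
weight_t_shape = (768, 320)  # layer expects 768 input features, outputs 320

if not can_matmul(input_shape, weight_t_shape):
    print(
        f"mismatch: input has {input_shape[1]} features, "
        f"layer expects {weight_t_shape[0]}"
    )
```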
