I can run the example code, but how do I run the model to generate some images and a

How to use ImageBind to generate image or audio? about imagebind HOT 10 OPEN

facebookresearch commented on June 23, 2024 4

How to use ImageBind to generate image or audio?

from imagebind.

Comments (10)

SoftologyPro commented on June 23, 2024 10

Agreed. How can you guys spend all that time training the model and writing the paper and setting up the demo website and not spend a few hours giving working example scripts to show us how to use it?


ikuinen commented on June 23, 2024 5

I am also interested in this. Any news? Also, how can you retrieve an image based on image and audio/text? I am referring to the embedding space arithmetic examples in Figure 4 in the paper. Do you just sum the image embeddings with the audio/text embedding and perform cosine similarity with all the image embeddings and get the most similar image? Thanks!

We made a quick attempt: https://github.com/sail-sg/BindDiffusion


Zeqiang-Lai commented on June 23, 2024 2

See also Anything2Image and InternGPT, it is implemented with Diffusers.


Zeqiang-Lai commented on June 23, 2024 2

I'm rather new to diffusion, but does Imagebind provide any sort of decoder? I thought it was just training an encoder, and if that's the case how are these diffusion methods working?

Maybe this could help Zeqiang-Lai/Anything2Image#4


echo-lalia commented on June 23, 2024 1

I don't think the model can actually generate those things; I think it just 'translates' the information from one form to another. I think it'll have to be built into an extension for SD-WebUI or something, in order to let us play with it more easily.


WilTay1 commented on June 23, 2024

I don't think the model can actually generate those things; I think it just 'translates' the information from one form to another. I think it'll have to be built into an extension for SD-WebUI or something, in order to let us play with it more easily.

But the model can be downloaded and loaded in the script.
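For reference, loading the released checkpoint follows the pattern of the example code in the repository README. The sketch below wraps that pattern in a hypothetical helper, `embed_modalities` (the helper name and its argument handling are assumptions, not part of the library). What it returns is one embedding tensor per modality, not pixels or waveforms, which is why a separate generative model is needed to produce images.

```python
def embed_modalities(text_list=None, image_paths=None, audio_paths=None):
    """Return ImageBind embeddings for whichever inputs are given.

    Hypothetical convenience wrapper around the README usage pattern.
    Heavy imports are done lazily so the function can be defined
    without torch or imagebind installed.
    """
    if not (text_list or image_paths or audio_paths):
        raise ValueError("provide at least one of text_list, image_paths, audio_paths")

    import torch
    from imagebind import data
    from imagebind.models import imagebind_model
    from imagebind.models.imagebind_model import ModalityType

    device = "cuda:0" if torch.cuda.is_available() else "cpu"

    # Downloads the checkpoint on first use, then loads it.
    model = imagebind_model.imagebind_huge(pretrained=True)
    model.eval()
    model.to(device)

    # Only build the entries for the modalities that were supplied.
    inputs = {}
    if text_list:
        inputs[ModalityType.TEXT] = data.load_and_transform_text(text_list, device)
    if image_paths:
        inputs[ModalityType.VISION] = data.load_and_transform_vision_data(image_paths, device)
    if audio_paths:
        inputs[ModalityType.AUDIO] = data.load_and_transform_audio_data(audio_paths, device)

    with torch.no_grad():
        # A dict mapping each modality key to a batch of embedding vectors.
        return model(inputs)
```

Calling `embed_modalities(text_list=["A dog."])` would therefore give back a dictionary of tensors to compare or pass on to another model, with nothing in it that can be saved directly as an image or audio file.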


bakachan19 commented on June 23, 2024

I am also interested in this. Any news?
Also, how can you retrieve an image based on image and audio/text? I am referring to the embedding space arithmetic examples in Figure 4 in the paper.
Do you just sum the image embeddings with the audio/text embedding and perform cosine similarity with all the image embeddings and get the most similar image?
Thanks!
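
The procedure described in the question can be sketched as follows. This is a toy illustration in plain Python with made-up 3-dimensional vectors standing in for real embeddings. The normalisation before and after the sum, the equal weighting of the two embeddings, and all function and file names are assumptions. The thread does not confirm the exact recipe used for Figure 4.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def compose_query(image_emb, other_emb):
    """Sum an image embedding with an audio/text embedding (assumed equal weights)."""
    a, b = l2_normalize(image_emb), l2_normalize(other_emb)
    return l2_normalize([x + y for x, y in zip(a, b)])

def retrieve(query, gallery, top_k=1):
    """Rank gallery images by cosine similarity to the query, best first."""
    scored = sorted(
        ((cosine(query, emb), name) for name, emb in gallery.items()),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]

# Toy stand-ins for embeddings (real ones would come from the encoder).
image_emb = [1.0, 0.0, 0.0]  # e.g. a photo of fruit
audio_emb = [0.0, 1.0, 0.0]  # e.g. the sound of birds
gallery = {
    "fruit_only.jpg": [1.0, 0.0, 0.1],
    "birds_only.jpg": [0.0, 1.0, 0.1],
    "birds_with_fruit.jpg": [0.7, 0.7, 0.0],
    "unrelated.jpg": [0.0, 0.0, 1.0],
}

query = compose_query(image_emb, audio_emb)
best = retrieve(query, gallery, top_k=1)  # ["birds_with_fruit.jpg"]
```

In this toy gallery the summed query lands closest to the image that contains both concepts, which is the behaviour the question describes.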


SoftologyPro commented on June 23, 2024

See also Anything2Image, it is implemented with Diffusers.

This works well and comes with a nice Gradio GUI.


ChloeL19 commented on June 23, 2024

I'm rather new to diffusion, but does Imagebind provide any sort of decoder? I thought it was just training an encoder, and if that's the case how are these diffusion methods working?


celster commented on June 23, 2024

I'm rather new to diffusion, but does Imagebind provide any sort of decoder? I thought it was just training an encoder, and if that's the case how are these diffusion methods working?

Maybe this could help Zeqiang-Lai/Anything2Image#4

This is great!!
I'm also looking for "Image+Text --> Image": for example, take a photo and ask the model to apply some change to the person in it (e.g., makeup).
