Comments (3)
Hi! Then for the demo of Audio-to-Image generation showcased on the website, I'm wondering which generative model is used, and whether you plan to release the corresponding code. Thank you!
Thanks for your question. ImageBind learns a shared embedding space across modalities, so it supports retrieval across modalities. If by conversion you mean generation, ImageBind features can be fed to other generative models (e.g. Stable Diffusion), but ImageBind doesn't generate raw signals on its own.
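To illustrate what retrieval in a shared embedding space looks like, here is a minimal sketch. It does not call ImageBind: the three-dimensional vectors, the file names, and the helpers `cosine_similarity` and `retrieve` are hypothetical stand-ins for real encoder outputs. It only assumes that embeddings from different modalities can be compared directly by cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, gallery):
    """Return gallery keys ranked by similarity to the query, best first."""
    return sorted(
        gallery,
        key=lambda name: cosine_similarity(query, gallery[name]),
        reverse=True,
    )


# Hypothetical toy embeddings (assumption: real ones would come from the
# audio and vision encoders and have many more dimensions).
audio_query = [0.9, 0.1, 0.0]  # stand-in for an audio clip embedding
image_gallery = {
    "dog.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.1, 0.9, 0.2],
    "bird.jpg": [0.0, 0.2, 0.9],
}

ranking = retrieve(audio_query, image_gallery)
print(ranking)
```

The image whose embedding points in the direction closest to the audio query is ranked first. Generation, by contrast, would need a separate model conditioned on the embedding.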
Our model already supports video. Video features can be extracted using load_and_transform_video_data and passed under ModalityType.VISION (line 297 in 0f8620b), in a similar manner to images. Please let us know if you have any other questions.
Hi! Then for the demo of Audio-to-Image generation showcased on the website, I'm wondering which generative model is used, and whether you plan to release the corresponding code. Thank you!
We have a quick application built on top of ImageBind: https://github.com/sail-sg/BindDiffusion. It wires up ImageBind with Stable Diffusion. Go ahead and give it a try.