Comments (3)
Thank you for your attention!
The model used in the paper, the open-source model, and the model used in 'demo.py' are all the same original model version. For the caption task, we set the prompt to "Generate the detailed caption in English:".
Later, we trained a chat version of the model on some publicly available data, but we haven't released its weights yet. To limit maintenance costs, we currently keep only one online demo, demo_chat, which uses the chat version of the model.
The "Generate" button of the online demo_chat is used for the caption task, and its prompt is fixed in our code to "Describe the image in as much detail as possible in English, including as many elements from the image as possible, but without repetition. Answer: ". Only the "Submit" button sends the content of the input box to the model.
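For readers comparing local inference with the online demo, here is a minimal sketch of the prompt selection described above. The two prompt strings are quoted from this reply. The helper names (`build_query`, `demo_chat_prompt`) are hypothetical, and the `<img>...</img>` wrapper is an assumption based on Qwen-VL-style prompting, so check 'demo.py' for the exact format the model expects.

```python
# Prompt strings quoted from the maintainers' reply.
BASE_CAPTION_PROMPT = "Generate the detailed caption in English:"
CHAT_CAPTION_PROMPT = (
    "Describe the image in as much detail as possible in English, "
    "including as many elements from the image as possible, "
    "but without repetition. Answer: "
)


def build_query(image_path: str, prompt: str) -> str:
    """Hypothetical helper: prefix a prompt with an image reference.

    ASSUMPTION: the <img>...</img> wrapper follows Qwen-VL-style prompting;
    it is not confirmed by this thread.
    """
    return f"<img>{image_path}</img> {prompt}"


def demo_chat_prompt(button: str, user_text: str) -> str:
    """Return the prompt the online demo_chat would use for a given button.

    "Generate" ignores the input box and uses the fixed caption prompt;
    "Submit" passes the input box content through unchanged.
    """
    if button == "Generate":
        return CHAT_CAPTION_PROMPT
    if button == "Submit":
        return user_text
    raise ValueError(f"unknown button: {button!r}")


if __name__ == "__main__":
    # "example.jpg" is a placeholder path.
    print(build_query("example.jpg", BASE_CAPTION_PROMPT))
    print(build_query("example.jpg", demo_chat_prompt("Generate", "ignored")))
```

If local results differ from the online demo, it may help to first confirm which of these two prompts was used and which model version (original or chat) produced the output.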
from monkey.
@Yuliang-Liu Is there any plan to release the new model? I think the performance of the online demo is really good.
from monkey.
Thank you for your attention. We have open-sourced the weights for Monkey-Chat. You can find them at https://huggingface.co/echo840/Monkey-Chat
from monkey.
Related Issues (20)
- looking forward to TextMonkey model weight and sample code HOT 2
- Inconsistency in Performance: Inference Code Yields Poor Results Compared to Online Demo HOT 3
- run demo.py error HOT 1
- Does the TextMoney vit has pretrain model? HOT 9
- Training data HOT 1
- Online Demo HOT 2
- Pretrained weight for text monkey HOT 3
- textMonkey data release HOT 3
- TextMonkey issue HOT 1
- Can training run on an A100 40G? Both full-parameter SFT and LoRA hit OOM on my A100 40G; debugging shows the OOM occurs at self.visual.encode(images) HOT 13
- Data Access HOT 2
- TextMonkey RuntimeError HOT 8
- Why is the input for document understanding an image rather than a PDF or DOC document? HOT 1
- Does TextMonkey support multi-image input? HOT 1
- Will Rico data be released? HOT 4
- How to finetune only one subnetwork using Deepspeed + Transformers
- How to finetune certain params via from HF's transformers, a
- VizWiz accuracy is only 37.62? The result in the table is 61.2, and QwenVL is 35.2. Is this a data entry error? HOT 8
- Get the embeddings of the image. HOT 1
- How to set gpu card for the demo project running HOT 5