Comments (12)
same question.
from monkey.
same question.
Have you solved this?
from monkey.
No. I haven't solved it yet. Actually, I gave up🥲 It's difficult to use the translator. Please speak in English...
from monkey.
There are two A100 variants: 40GB and 80GB. The author seems to have trained on the 80GB one.
from monkey.
When I use ZeRO-3, the loss is larger than when I use ZeRO-2. It seems that the model was not initialized correctly. Have you ever encountered this with ZeRO-3?
The first picture shows the ZeRO-3 loss and the second picture shows the ZeRO-2 loss. There is also a parameter mismatch during model initialization, as follows:
from monkey.
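One way to narrow down a parameter mismatch like the one described above is to compare parameter names and shapes between the checkpoint and the instantiated model. The following is a minimal sketch, not the project's own tooling: `diff_param_shapes` is a hypothetical helper, and the parameter names and shapes in the example are made up for illustration. It assumes you have already collected two `{name: shape}` mappings.

```python
def diff_param_shapes(expected, actual):
    """Compare two {name: shape} mappings and report where they disagree.

    Returns (missing, unexpected, mismatched):
      missing    - names in `expected` that are absent from `actual`
      unexpected - names in `actual` that are absent from `expected`
      mismatched - {name: (expected_shape, actual_shape)} for shared names
    """
    missing = sorted(set(expected) - set(actual))
    unexpected = sorted(set(actual) - set(expected))
    mismatched = {
        name: (tuple(expected[name]), tuple(actual[name]))
        for name in expected.keys() & actual.keys()
        if tuple(expected[name]) != tuple(actual[name])
    }
    return missing, unexpected, mismatched


# Hypothetical names and shapes, for illustration only.
checkpoint_shapes = {
    "visual.proj": (1664, 4096),
    "lm_head.weight": (151936, 4096),
}
model_shapes = {
    "visual.proj": (0,),  # placeholder shape, assumed here to mimic a partitioned parameter
    "lm_head.weight": (151936, 4096),
    "lora.A": (16, 4096),
}

missing, unexpected, mismatched = diff_param_shapes(checkpoint_shapes, model_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

With PyTorch, such a mapping can be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`. Under ZeRO-3 the parameters are partitioned across ranks, so shapes inspected locally may not match the full checkpoint shapes even when loading is otherwise correct.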
Training has been stuck at loading the base model, and I haven't been able to train successfully yet.
from monkey.
Are you training from scratch, or fine-tuning from the TextMonkey weights?
from monkey.
Are you training from scratch, or fine-tuning from the TextMonkey weights?
Fine-tuning from the Monkey weights.
from monkey.
That is not good; we need to train from scratch.
from monkey.
Could you please give more detailed information about your training? Our model uses Qwen-VL as the pretrained model, and it works well on 8x A800 80G with ZeRO-2.
from monkey.
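To see why the 40GB versus 80GB difference matters with ZeRO-2, here is a rough back-of-envelope sketch of per-GPU memory for model states only (weights, gradients, Adam optimizer states), using the standard mixed-precision accounting of 2 + 2 + 12 bytes per parameter. The parameter count of 10 billion is an assumed placeholder, not a figure from this thread, and the sketch assumes full-parameter training. Activations, buffers, and fragmentation are not counted.

```python
def zero_model_state_bytes(num_params: float, num_gpus: int, stage: int) -> float:
    """Estimate per-GPU bytes for model states under a given ZeRO stage.

    Assumes mixed-precision Adam: 16-bit weights (2 bytes/param),
    16-bit gradients (2 bytes/param), and 32-bit optimizer states
    (master weights + momentum + variance = 12 bytes/param).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2 or 3")
    weights = 2 * num_params
    grads = 2 * num_params
    optim = 12 * num_params
    if stage >= 1:  # ZeRO-1 partitions optimizer states
        optim /= num_gpus
    if stage >= 2:  # ZeRO-2 also partitions gradients
        grads /= num_gpus
    if stage >= 3:  # ZeRO-3 also partitions the weights
        weights /= num_gpus
    return weights + grads + optim


ASSUMED_PARAMS = 10e9  # placeholder; substitute the real trainable parameter count
NUM_GPUS = 8

for stage in (2, 3):
    gb = zero_model_state_bytes(ASSUMED_PARAMS, NUM_GPUS, stage) / 1e9
    print(f"ZeRO-{stage}: about {gb:.1f} GB of model states per GPU")
```

Under these assumptions, ZeRO-2 on 8 GPUs needs roughly 37.5 GB per GPU for model states alone, which leaves almost no room for activations on a 40GB card but fits comfortably on an 80GB one. ZeRO-3 brings the same estimate down to about 20 GB.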
Could you please give more detailed information about your training? Our model uses Qwen-VL as the pretrained model, and it works well on 8x A800 80G with ZeRO-2.
8x A100 40G.
from monkey.