Comments (4)

Sleepychord commented on September 20, 2024

Hello, this is a PyTorch tensor dtype. Please refer to the PyTorch tutorials.

from visualglm-6b.

iamsile commented on September 20, 2024

@Sleepychord Hello, I may not have been clear. What I meant is: in torch_image.to(self.dtype).to(self.device), what is self.dtype at that point, float16 or float32? Could you print it and check?

Sleepychord commented on September 20, 2024

We want to support several different dtypes, so the cast here follows the dtype of self. The default should be fp16.
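
To check the dtype yourself, something like the following may help. It is only a minimal sketch: `TinyEncoder`, `proj`, and `prepare` are hypothetical names, not the VisualGLM classes, and it assumes `self.dtype` reports the dtype of the module's own parameters. It runs on CPU and only performs the cast, not a forward pass.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for an image encoder; not the VisualGLM class."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 2)

    @property
    def dtype(self):
        # Assumption: the module's dtype is the dtype of its parameters.
        return next(self.parameters()).dtype

    @property
    def device(self):
        return next(self.parameters()).device

    def prepare(self, torch_image):
        # Same pattern as in the question: cast to the module's dtype, then move.
        return torch_image.to(self.dtype).to(self.device)


encoder = TinyEncoder()
image = torch.rand(1, 4)                    # float32 by default
fp32_dtype = encoder.prepare(image).dtype   # follows the float32 parameters
encoder.half()                              # convert the parameters to float16
fp16_dtype = encoder.prepare(image).dtype   # now follows the float16 parameters
print(fp32_dtype, fp16_dtype)
```

Printing `self.dtype` right before the cast in your own run is the direct way to confirm which case applies.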

iamsile commented on September 20, 2024

@Sleepychord Hello, I ran into an error while training a reward model based on visualglm-6b. The framework is deepspeed_chat, and the full error is as follows:
Beginning of Epoch 1/1, Total Micro Batches 30502
Traceback (most recent call last):
File "/xxxxx/deepspeed_chat/training/step2_reward_model_finetuning/main.py", line 472, in <module>
main()
File "/xxxxx/deepspeed_chat/training/step2_reward_model_finetuning/main.py", line 393, in main
outputs = rm_model(**batch, use_cache=False)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1731, in forward
loss = self.module(*inputs, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/xxxxx/deepspeed_chat/training/utils/model/reward_model.py", line 53, in forward
transformer_outputs = self.rwtranrsformer(
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/THUDM/visualglm-6b/533d4ae86f232b1d7d04417398f572f71751c77d/modeling_chatglm.py", line 1462, in forward
image_embeds = self.image_encoder(images)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/THUDM/visualglm-6b/533d4ae86f232b1d7d04417398f572f71751c77d/visual.py", line 69, in forward
enc = self.vit(image)[0]
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/THUDM/visualglm-6b/533d4ae86f232b1d7d04417398f572f71751c77d/visual.py", line 28, in forward
return super().forward(input_ids=input_ids, position_ids=None, attention_mask=attention_mask, image=image)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/sat/model/base_model.py", line 144, in forward
return self.transformer(*args, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/sat/model/transformer.py", line 569, in forward
layer_ret = layer(*args, layer_id=torch.tensor(i), **kw_args, **output_cross_layer,
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/sat/model/transformer.py", line 330, in forward
return HOOKS_DEFAULT['layer_forward'](self, hidden_states, mask, *args, **kw_args)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/sat/transformer_defaults.py", line 127, in layer_forward_default
attention_output = self.attention(attention_input, mask, **kw_args)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/sat/model/transformer.py", line 103, in forward
return HOOKS_DEFAULT['attention_forward'](self, hidden_states, mask, **kw_args)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/sat/transformer_defaults.py", line 63, in attention_forward_default
context_layer = attention_fn(query_layer, key_layer, value_layer, mask, dropout_fn, **kw_args)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/sat/transformer_defaults.py", line 38, in standard_attention
with mpu.get_cuda_rng_tracker().fork():
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 174, in fork
raise Exception('cuda rng state {} is not added'.format(name))
Exception: cuda rng state model-parallel-rng is not added

Could you please take a look? I look forward to your reply.
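
For readers hitting the same message: the last frames of the traceback show `fork()` being entered on an RNG tracker for the state name `model-parallel-rng`, which had not been registered at that point. The sketch below imitates only that check in plain Python. `RngTracker` is a hypothetical, simplified stand-in, not the DeepSpeed or SAT implementation, and it stores a seed instead of a real CUDA RNG state.

```python
import contextlib


class RngTracker:
    """Hypothetical, simplified stand-in for a named RNG-state tracker."""

    def __init__(self):
        self.states = {}

    def add(self, name, seed):
        # Register a named state; here the "state" is just the seed.
        if name in self.states:
            raise Exception('cuda rng state {} already exists'.format(name))
        self.states[name] = seed

    @contextlib.contextmanager
    def fork(self, name='model-parallel-rng'):
        # Entering the context fails if the name was never registered.
        if name not in self.states:
            raise Exception('cuda rng state {} is not added'.format(name))
        yield self.states[name]


tracker = RngTracker()

# Forking before the state is registered reproduces the message.
try:
    with tracker.fork():
        pass
except Exception as exc:
    error_message = str(exc)

# After registration the same call succeeds.
tracker.add('model-parallel-rng', 1234)
with tracker.fork() as seed:
    forked_seed = seed

print(error_message)
```

In the real stack the state would have to be registered by a seeding step before the first forward pass; the thread does not say which step applies in the deepspeed_chat setup.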
