Comments (2)
Hi @sajastu
I looked at the traceback of the issue as well as the code on the Hub. Can you also add `Phi3DecoderLayer` to `no_split_modules`? The error seems to happen here: https://huggingface.co/microsoft/Phi-3-medium-4k-instruct/blob/main/modeling_phi3.py#L899
Hey @younesbelkada, I added `Phi3DecoderLayer` to the `no_split_modules` argument array, but I'm still getting essentially the same error, apparently at a different spot:
```
flash-attention package not found, consider installing for better performance: /home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/flash_attn_2_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c104impl3cow11cow_deleterEPv.
Current flash-attention does not support window_size. Either upgrade or use attn_implementation='eager'.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You are not running the flash-attention implementation, expect numerical differences.
Traceback (most recent call last):
  File "test.py", line 55, in <module>
    outputs = model.generate(batch['input_ids'][0], max_new_tokens=50)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/transformers/generation/utils.py", line 1758, in generate
    result = self._sample(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/transformers/generation/utils.py", line 2397, in _sample
    outputs = self(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 1286, in forward
    outputs = self.model(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 1164, in forward
    layer_outputs = decoder_layer(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 885, in forward
    attn_outputs, self_attn_weights, present_key_value = self.self_attn(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 383, in forward
    key_states, value_states = past_key_value.update(key_states, value_states, self.layer_idx, cache_kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/transformers/cache_utils.py", line 155, in update
    self.key_cache[layer_idx] = torch.cat([self.key_cache[layer_idx], key_states], dim=-2)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument tensors in method wrapper_cat)
```
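The failing line is a `torch.cat` in the KV cache update: the cached key states live on one GPU while the incoming key states arrive on another. The general remedy for this class of error is to align devices before concatenating. A minimal sketch of that pattern (the `safe_cache_update` helper is hypothetical, not a transformers API; run here on CPU tensors only):

```python
import torch

def safe_cache_update(cache: torch.Tensor, new_states: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: move the incoming states onto the cache's
    # device before concatenating along the sequence axis, which is the
    # torch.cat call in transformers' cache_utils that raised the
    # cross-device RuntimeError above.
    new_states = new_states.to(cache.device)
    return torch.cat([cache, new_states], dim=-2)

cache = torch.zeros(1, 4, 2, 8)  # (batch, heads, seq_len, head_dim)
step = torch.ones(1, 4, 1, 8)    # one newly generated position
cache = safe_cache_update(cache, step)
print(cache.shape)  # torch.Size([1, 4, 3, 8])
```

In practice the fix is usually on the caller's side: make sure `input_ids` (and the whole batch) are moved to the model's first device, e.g. `model.generate(batch["input_ids"].to(model.device), ...)`, so the cache is built on consistent devices from the first forward pass.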