Comments (2)
Hi @sajastu
I looked at the traceback of the issue as well as the code on the Hub. Can you also add `Phi3DecoderLayer` to `no_split_modules`? The error seems to happen here: https://huggingface.co/microsoft/Phi-3-medium-4k-instruct/blob/main/modeling_phi3.py#L899
Hey @younesbelkada, I added `Phi3DecoderLayer` to the `no_split_modules` argument array, but I'm still getting essentially the same error, apparently at a different spot:
```
flash-attention package not found, consider installing for better performance: /home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/flash_attn_2_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c104impl3cow11cow_deleterEPv.
Current flash-attention does not support window_size. Either upgrade or use attn_implementation='eager'.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You are not running the flash-attention implementation, expect numerical differences.
Traceback (most recent call last):
  File "test.py", line 55, in <module>
    outputs = model.generate(batch['input_ids'][0], max_new_tokens=50)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/transformers/generation/utils.py", line 1758, in generate
    result = self._sample(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/transformers/generation/utils.py", line 2397, in _sample
    outputs = self(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 1286, in forward
    outputs = self.model(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 1164, in forward
    layer_outputs = decoder_layer(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 885, in forward
    attn_outputs, self_attn_weights, present_key_value = self.self_attn(
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/disk1/sasha/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-medium-4k-instruct/d194e4e74ffad5a5e193e26af25bcfc80c7f1ffc/modeling_phi3.py", line 383, in forward
    key_states, value_states = past_key_value.update(key_states, value_states, self.layer_idx, cache_kwargs)
  File "/home/sasha/anaconda3/envs/myenv-py38/lib/python3.8/site-packages/transformers/cache_utils.py", line 155, in update
    self.key_cache[layer_idx] = torch.cat([self.key_cache[layer_idx], key_states], dim=-2)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument tensors in method wrapper_cat)
```
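The failing line is a `torch.cat` in the KV cache update: the cached key states live on one GPU while the incoming key states arrive on another. The general remedy for this class of error is to align devices before concatenating. A minimal sketch of that pattern (the `safe_cache_update` helper is hypothetical, not a transformers API; run here on CPU tensors only):

```python
import torch

def safe_cache_update(cache: torch.Tensor, new_states: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: move the incoming states onto the cache's
    # device before concatenating along the sequence axis, which is the
    # torch.cat call in transformers' cache_utils that raised the
    # cross-device RuntimeError above.
    new_states = new_states.to(cache.device)
    return torch.cat([cache, new_states], dim=-2)

cache = torch.zeros(1, 4, 2, 8)  # (batch, heads, seq_len, head_dim)
step = torch.ones(1, 4, 1, 8)    # one newly generated position
cache = safe_cache_update(cache, step)
print(cache.shape)  # torch.Size([1, 4, 3, 8])
```

In practice the fix is usually on the caller's side: make sure `input_ids` (and the whole batch) are moved to the model's first device, e.g. `model.generate(batch["input_ids"].to(model.device), ...)`, so the cache is built on consistent devices from the first forward pass.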